CN111933281A - Disease typing determination system, method, device and storage medium - Google Patents

Disease typing determination system, method, device and storage medium

Info

Publication number
CN111933281A
Authority
CN
China
Prior art keywords
target
data
disease
patient
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011061077.1A
Other languages
Chinese (zh)
Other versions
CN111933281B (en
Inventor
黄思皖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202011061077.1A
Publication of CN111933281A
Application granted
Publication of CN111933281B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/60 ICT specially adapted for the handling or processing of medical references relating to pathologies

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Biomedical Technology (AREA)
  • General Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The embodiments of the application disclose a system, a method, a device and a storage medium for determining disease typing, applied to the medical field. The system comprises a storage device and a terminal device. The storage device is used for storing historical medical data of at least one patient associated with a target disease. The terminal device is used for acquiring target historical medical data of a target patient associated with the target disease from the storage device, determining multi-modal data according to the target historical medical data, and determining the target disease type of the target patient according to the multi-modal data. Compared with single-modality data, determining the target disease type of the target patient from the patient's multi-modal data takes into account the relevance among data from a plurality of data sources, which improves the accuracy of determining the target disease type of the target patient. The present application also relates to blockchain technology; for example, the historical medical data of a patient may be written into a blockchain for use in scenarios such as determining the target disease type to which a target patient belongs.

Description

Disease typing determination system, method, device and storage medium
Technical Field
The present application relates to the field of data analysis technology, and more particularly to a system, a method, a device, a terminal device, and a computer-readable storage medium for determining a disease type.
Background
In the medical field, a target disease may have a plurality of disease types, and determining the target disease type of a patient is important for treating that patient. Sepsis, one example of a target disease, is a systemic inflammatory response syndrome caused by infection and can arise from infection at any site. Its primary pathogenesis is unknown. It involves a complex systemic inflammatory network effect, gene polymorphism, immune dysfunction, coagulation dysfunction, tissue injury, and abnormal host responses to different infectious pathogenic microorganisms and their toxins, all of which are closely related to pathophysiological changes in multiple systems and organs of the body, so the pathogenesis of sepsis still needs to be further clarified.
Because the sepsis type of a patient cannot currently be distinguished, existing treatment is uniform and aims mainly at controlling infection. Sepsis is generally severe and has a high fatality rate: about 9 percent of patients with sepsis develop septic shock and multi-organ dysfunction, more than half of the deaths in intensive care units are caused by septic shock and multi-organ dysfunction, and sepsis has become a main cause of death among non-cardiac patients in intensive care units.
Therefore, how to accurately determine the target disease type of the patient becomes a problem to be solved urgently.
Disclosure of Invention
The embodiment of the application provides a disease typing determination system, method and device and a storage medium, which can improve the accuracy of determining the target disease typing of a target patient.
A first aspect of the embodiments of the present application provides a disease typing determination system, where the system includes a storage device and a terminal device, where:
the storage device is used for storing target historical medical data of at least one patient;
the terminal device is used for acquiring target historical medical data of the target patient related to the target disease from the storage device; acquiring structured electronic medical record data from the target historical medical data, and preprocessing the electronic medical record data, wherein the electronic medical record data comprises vital sign data of the target patient and laboratory test data; screening target data associated with preset variables corresponding to the target disease from the preprocessed electronic medical record data; acquiring a medical image from the target historical medical data, and extracting a feature vector corresponding to the medical image through a target neural network; fusing the target data and the feature vector to form multi-modal data; analyzing the multi-modal data through a target disease typing processing model to perform clustering processing on the target patients; and determining a target disease type to which the target patient belongs based on the clustering processing result, wherein the target disease type is any one of at least one disease type corresponding to the target disease.
In a second aspect, an embodiment of the present application provides a method for determining a disease type, including:
acquiring target historical medical data of a target patient associated with a target disease;
acquiring structured electronic medical record data from the target historical medical data, and preprocessing the electronic medical record data, wherein the electronic medical record data comprises vital sign data of the target patient and laboratory test data;
screening target data associated with preset variables corresponding to the target disease from the preprocessed electronic medical record data;
acquiring a medical image from the target historical medical data, and extracting a feature vector corresponding to the medical image through a target neural network;
fusing the target data and the feature vector to form multi-modal data;
analyzing the multi-modal data through a target disease typing processing model to perform clustering processing on the target patients;
and determining a target disease type to which the target patient belongs based on the clustering processing result, wherein the target disease type is any one of at least one disease type corresponding to the target disease.
A third aspect of embodiments of the present application provides a disease typing determination apparatus, including:
an acquisition module for acquiring target historical medical data associated with a target patient and a target disease;
the processing module is used for acquiring structured electronic medical record data from the target historical medical data and preprocessing the electronic medical record data, wherein the electronic medical record data comprises vital sign data of the target patient and laboratory test data;
the processing module is further configured to screen target data associated with a preset variable corresponding to the target disease from the preprocessed electronic medical record data, acquire a medical image from the target historical medical data, and extract a feature vector corresponding to the medical image through a target neural network;
the processing module is further used for performing fusion processing on the target data and the feature vectors to form multi-modal data, and analyzing the multi-modal data through a target disease typing processing model to perform clustering processing on the target patients;
the processing module is further configured to determine a target disease type to which the target patient belongs based on a clustering result, where the target disease type is any one of at least one disease type corresponding to the target disease.
A fourth aspect of the embodiments of the present application provides a terminal device, including a processor and a memory that are connected to each other, where the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the method according to the second aspect.
A fifth aspect of the embodiments of the present application provides a computer-readable storage medium, in which a computer program is stored, the computer program comprising program instructions that, when executed by a processor, cause the processor to perform the method of the second aspect.
In the embodiment of the application, target historical medical data of a target patient associated with a target disease can be acquired, structured electronic medical record data can be acquired from the target historical medical data, and the electronic medical record data can be preprocessed. Furthermore, target data associated with preset variables corresponding to the target disease are screened from the preprocessed electronic medical record data, medical images are obtained from target historical medical data, feature vectors corresponding to the medical images are extracted through a target neural network, and fusion processing is performed on the target data and the feature vectors to form multi-modal data. Further, the multi-modal data is analyzed through a target disease typing processing model to perform clustering processing on the target patients, and the target disease typing of the target patients is determined based on the clustering processing result. In the embodiment of the application, the target disease type of the target patient can be determined through multi-modal data of the target patient, and compared with single-modal data (namely data of a single data source), the relevance between the data of a plurality of data sources can be considered, so that the accuracy of determining the target disease type of the target patient is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings described below are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic structural diagram of a disease typing determination system provided in an embodiment of the present application;
FIG. 2 is a schematic flow chart of a disease typing determination method provided in the embodiments of the present application;
FIG. 3 is a schematic flow chart of another disease typing determination method provided in the embodiments of the present application;
FIG. 4 is a diagram of a quadtree data structure provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a disease typing determination apparatus provided in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, the present application provides a disease typing determination system, which includes a storage device and a terminal device, where the terminal device may be any one of the following: portable devices such as smart phones, tablets, laptops, etc., and desktop computers, etc. Wherein:
a storage device for storing target historical medical data of at least one patient associated with a target disease. The storage device may refer to a server corresponding to the patient monitoring system, or a node device in the blockchain network. In one embodiment, historical medical data generated by each patient for each visit, diagnosis, laboratory test and surgery related to the target disease can be uploaded to a patient monitoring system, the patient monitoring system can store the historical medical data of each patient in a server, and the historical medical data of any patient comprises a plurality of visits related to the target disease and various diagnosis, test, examination, medicine, surgery items and the like contained in each visit.
Alternatively, after receiving the historical medical data of each patient, the patient monitoring system may send the historical medical data of the patient to the node device in the blockchain network, and the node device may write the historical medical data of the patient into the blockchain. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block.
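The linking of data blocks described above can be illustrated with a minimal sketch. It assumes a single in-memory list as the chain, SHA-256 as the cryptographic method, and hypothetical record fields (`patient_id`, `disease`, `creatinine`); none of these details are specified in the application, and a real blockchain network would add consensus and distribution across node devices.

```python
import hashlib
import json


def block_hash(body: dict) -> str:
    """SHA-256 digest of a block body, serialised deterministically."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, records: list) -> dict:
    """Append a block holding a batch of records, linked to the previous block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"index": len(chain), "prev_hash": prev_hash, "records": records}
    block = dict(body, hash=block_hash(body))
    chain.append(block)
    return block


def is_valid(chain: list) -> bool:
    """Recompute every hash and check each link to the previous block."""
    for i, block in enumerate(chain):
        body = {k: v for k, v in block.items() if k != "hash"}
        if block["hash"] != block_hash(body):
            return False
        expected_prev = chain[i - 1]["hash"] if i else "0" * 64
        if block["prev_hash"] != expected_prev:
            return False
    return True


# Hypothetical historical medical data written block by block.
chain = []
append_block(chain, [{"patient_id": "P001", "disease": "sepsis", "creatinine": 1.8}])
append_block(chain, [{"patient_id": "P002", "disease": "sepsis", "creatinine": 0.9}])
```

Because each block stores the hash of its predecessor, altering a stored record changes the recomputed hash and `is_valid` reports the tampering.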
The terminal device is used for acquiring target historical medical data of the target patient related to the target disease from the storage device, acquiring structured electronic medical record data from the target historical medical data, and preprocessing the electronic medical record data, wherein the electronic medical record data comprises vital sign data of the target patient and laboratory test data; screening target data associated with preset variables corresponding to the target disease from the preprocessed electronic medical record data; acquiring a medical image from target historical medical data, and extracting a feature vector corresponding to the medical image through a target neural network; fusing the target data and the feature vector to form multi-modal data; analyzing the multi-modal data through a target disease typing processing model to perform clustering processing on target patients; and determining the target disease type of the target patient based on the clustering processing result, wherein the target disease type is any one of at least one disease type corresponding to the target disease.
In an example, the terminal device is specifically configured to perform data analysis on the multi-modal data through a target disease typing processing model, and cluster the target patients into a target group of a plurality of preset groups based on a data analysis result, where the target group is any one of the plurality of preset groups.
In an example, the terminal device is further specifically configured to, if the clustering result indicates that the target patient is clustered into a target group of a plurality of preset groups, determine, from correspondence between each group and each disease type created in advance, a disease type corresponding to the target group as a target disease type to which the target patient belongs.
In one example, the terminal device is further configured to obtain multi-modal sample data of each sample patient of the at least one sample patient; performing dimensionality reduction on multi-modal sample data of each sample patient to obtain a two-dimensional feature vector corresponding to each sample patient; analyzing the two-dimensional characteristic vectors corresponding to the sample patients through a target disease typing processing model to perform clustering processing on the sample patients to obtain at least one group, wherein each group comprises at least one sample patient; comparing the variable information of the corresponding variables of the sample patients in each group to obtain variable difference information among the groups; and determining the disease types to which the groups belong based on the variable difference information, and establishing the corresponding relation between the groups and the disease types.
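The clustering of sample patients can be sketched as follows. The application does not name the clustering algorithm at this point; K-means is assumed here because it appears in the classification codes, and the two-dimensional vectors are assumed to have already been produced by the dimensionality reduction step. The function name, the farthest-first initialisation and the sample values are illustrative choices only.

```python
import math


def kmeans(points, k, iterations=50):
    """Plain K-means on a list of 2-D tuples; returns (centroids, labels)."""
    # Farthest-first initialisation: deterministic and spreads the seeds out.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                updated.append(centroids[j])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


# Hypothetical two-dimensional feature vectors of nine sample patients.
sample_vectors = [
    (0.1, 0.2), (0.0, 0.0), (0.2, 0.1),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
    (0.1, 5.0), (0.0, 5.2), (0.2, 4.9),
]
centroids, labels = kmeans(sample_vectors, k=3)
```

The resulting labels define the groups whose variable information is then compared to decide which disease type each group represents.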
In one example, the target disease comprises sepsis, and the variables for the sample patients in each group comprise one or more of creatinine, brain natriuretic peptide precursor, and oxygenation index.
In one example, the at least one group includes n groups, where n is an integer greater than 2 and the mth group is any one of the n groups. The terminal device is further specifically configured to: if the variable difference information indicates that the average creatinine value of the sample patients in the mth group and the average creatinine value of the sample patients in the other groups satisfy a preset creatinine difference condition, determine the disease type to which the mth group belongs as the type associated with renal dysfunction;
or, if the variable difference information indicates that the average brain natriuretic peptide precursor value of the sample patients in the mth group and the average brain natriuretic peptide precursor value of the sample patients in the other groups satisfy a preset brain natriuretic peptide precursor difference condition, determine the disease type to which the mth group belongs as the type associated with cardiac dysfunction.
In one example, the at least one group includes n groups, where n is an integer greater than 2 and the mth group is any one of the n groups. The terminal device is further specifically configured to: if the variable difference information indicates that the average oxygenation index of the sample patients in the mth group and the average oxygenation index of the sample patients in the other groups satisfy a preset oxygenation index difference condition, determine the disease type to which the mth group belongs as the type associated with respiratory disorders.
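The group-level comparison can be sketched as below. The application leaves the preset difference conditions unspecified, so this sketch assumes a simple ratio test: the group mean must be at least `RATIO` times the pooled mean of the other groups (or at most 1/`RATIO` of it for the oxygenation index, where lower values are assumed to indicate impairment). The threshold, the variable names and the sample values are all hypothetical.

```python
from statistics import mean

# Hypothetical rules: variable -> (direction of the difference, resulting typing).
RULES = {
    "creatinine": ("higher", "typing associated with renal dysfunction"),
    "bnp_precursor": ("higher", "typing associated with cardiac dysfunction"),
    "oxygenation_index": ("lower", "typing associated with respiratory disorders"),
}
RATIO = 1.5  # hypothetical stand-in for the preset difference condition


def typing_for_group(groups, m):
    """Return the typing whose difference condition group m satisfies, else None."""
    for variable, (direction, typing) in RULES.items():
        own = mean(p[variable] for p in groups[m])
        others = mean(
            p[variable] for i, g in enumerate(groups) if i != m for p in g
        )
        if direction == "higher" and own >= RATIO * others:
            return typing
        if direction == "lower" and own * RATIO <= others:
            return typing
    return None


# Hypothetical sample patients, already clustered into n = 3 groups.
groups = [
    [{"creatinine": 3.0, "bnp_precursor": 300, "oxygenation_index": 380},
     {"creatinine": 3.4, "bnp_precursor": 350, "oxygenation_index": 400}],
    [{"creatinine": 0.9, "bnp_precursor": 4000, "oxygenation_index": 370},
     {"creatinine": 1.0, "bnp_precursor": 5000, "oxygenation_index": 390}],
    [{"creatinine": 1.0, "bnp_precursor": 320, "oxygenation_index": 150},
     {"creatinine": 1.1, "bnp_precursor": 400, "oxygenation_index": 170}],
]
group_typings = [typing_for_group(groups, m) for m in range(len(groups))]
```

Any other preset condition, for example a statistical test on the group difference, could replace the ratio test without changing the surrounding logic.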
In the embodiment of the present application, for the specific implementation of the terminal device, reference may be made to the description of the relevant content in the embodiments corresponding to FIG. 2 and FIG. 3 below, which is not repeated here.
Please refer to fig. 2, which is a flowchart illustrating a method for determining a disease type according to an embodiment of the present application, where the method for determining a disease type is executable by a terminal device in the system for determining a disease type, and the method for determining a disease type includes the following steps:
S201, acquiring target historical medical data of the target patient associated with the target disease.
In one embodiment, the target disease may be sepsis, and the terminal device may perform data interaction with a storage device, which may refer to a server corresponding to the patient monitoring system or a node device in the blockchain network. The storage device stores in advance historical medical data of at least one patient associated with the target disease. The historical medical data of any one patient includes multiple visits of that patient associated with sepsis, as well as data on the various diagnoses, tests, examinations, medications, surgical procedures, etc. of each visit.
In one embodiment, the terminal device runs an application program corresponding to the disease typing processing platform, or can open a webpage corresponding to the disease typing processing platform. Any user (which may refer to a patient or a doctor) can submit a disease typing determination request for a target patient by logging in a disease typing processing platform, wherein the disease typing determination request comprises identity information of the target patient and identification information of the target disease, the identity information of the target patient may refer to a certificate number of the target patient, or a medical record number which can uniquely identify the target patient, and the like.
Further, after the terminal device detects the disease typing determination request submitted by the user, it can acquire the target historical medical data of the target patient from the historical medical data, stored in the storage device, of at least one patient associated with the target disease, based on the identity information of the target patient and the identification information of the target disease.
Or, in another embodiment, the target historical medical data of each patient associated with the target disease may be written into the blockchain in advance, and then the terminal device may obtain the target historical medical data of the target patient associated with the target disease from the blockchain through the node device in the blockchain network.
S202, acquiring structured electronic medical record data from the target historical medical data, and preprocessing the electronic medical record data, wherein the electronic medical record data comprises vital sign data of a target patient and laboratory test data.
Electronic medical records, also called computerized medical record systems or computer-based patient records, are digitized medical records that are stored, managed, transmitted and reproduced by electronic devices (computers, health cards, etc.) to replace handwritten paper medical records. Their contents include all the information of the paper medical record. Generally, an electronic medical record stores patient information in a fixed data structure, so electronic medical record data is structured data.
The vital sign data of the target patient is continuous data. It may be sampled periodically (for example, every hour) in advance and uploaded to the patient monitoring system, so that when step S201 is executed the terminal device can obtain the vital sign data of the target patient from the patient monitoring system. The vital sign data may be, for example, heart rate, blood oxygen, respiratory rate, and the like. The laboratory test data may be, for example, brain natriuretic peptide precursor, lactate, and the like.
The preprocessing includes multiple imputation. Tests for certain indices are not performed on every patient: for example, brain natriuretic peptide precursor is rarely tested, and blood gas analysis is not performed on every patient. Because these indices can reflect the cardiopulmonary status of the patient, they affect the result of typing the target disease. To improve the accuracy of determining the target disease type to which the target patient belongs, missing values in the electronic medical record data can be filled in through multiple imputation. Multiple imputation is a method for handling missing values based on repeated simulation: from the electronic medical record data set containing missing values, it generates a group of complete data sets, and the missing data in each data set is filled in by a Monte Carlo method.
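A minimal sketch of this idea follows. It is a simplification under stated assumptions: each missing value is drawn from a normal distribution fitted to the observed values of the same variable, whereas practical multiple imputation usually conditions on the other variables as well. The function name, the use of `None` to mark a missing value, and the variable names are hypothetical.

```python
import random
from statistics import mean, stdev


def multiply_impute(rows, m=5, seed=0):
    """Return m completed copies of rows (list of dicts; None marks a missing value).

    Assumes every variable has at least one observed value.
    """
    rng = random.Random(seed)
    variables = list(rows[0])
    # Fit a normal distribution to the observed values of each variable.
    fitted = {}
    for var in variables:
        observed = [r[var] for r in rows if r[var] is not None]
        spread = stdev(observed) if len(observed) > 1 else 0.0
        fitted[var] = (mean(observed), spread)
    completed = []
    for _ in range(m):
        # Monte Carlo step: each missing cell gets an independent random draw.
        completed.append([
            {var: (r[var] if r[var] is not None else rng.gauss(*fitted[var]))
             for var in variables}
            for r in rows
        ])
    return completed


# Hypothetical electronic medical record rows with missing test results.
records = [
    {"heart_rate": 92.0, "bnp_precursor": 850.0},
    {"heart_rate": 110.0, "bnp_precursor": None},
    {"heart_rate": 101.0, "bnp_precursor": 1200.0},
    {"heart_rate": None, "bnp_precursor": 950.0},
]
imputed_sets = multiply_impute(records, m=3)
# Pool the imputations for one missing cell by averaging across data sets.
pooled_heart_rate = mean(ds[3]["heart_rate"] for ds in imputed_sets)
```

Downstream analysis would normally be run on each completed data set and the results pooled, which preserves the uncertainty introduced by the missing values.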
S203, screening target data related to preset variables corresponding to the target disease from the preprocessed electronic medical record data. Wherein the preset variable can be understood as an important variable for determining the type of the target disease, and the preset variable can be preset by a developer according to experimental data and can also be determined according to a data set of at least one sample patient which is collected in advance, wherein the sample patient refers to a patient suffering from the target disease. Illustratively, when the target disease is sepsis, the preset variables may include: brain natriuretic peptide precursor, blood glucose, oxygenation index, and the like. Correspondingly, the target data associated with the preset variables are the preset variables in the electronic medical record data and the values corresponding to the preset variables.
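As a small illustration of this screening step, the following sketch keeps only the preset variables and their values from one preprocessed record. The variable names and the record layout (a flat dictionary) are assumptions made for the example.

```python
# Hypothetical preset variables for sepsis, following the examples in the text.
PRESET_VARIABLES = {"bnp_precursor", "blood_glucose", "oxygenation_index"}


def screen_target_data(emr_record, preset_variables=PRESET_VARIABLES):
    """Keep only the preset variables and their values from one record."""
    return {name: value for name, value in emr_record.items()
            if name in preset_variables}


# Hypothetical preprocessed electronic medical record of the target patient.
emr_record = {
    "heart_rate": 104.0,
    "bnp_precursor": 1250.0,
    "blood_glucose": 7.8,
    "oxygenation_index": 210.0,
    "sex": "F",
}
target_data = screen_target_data(emr_record)
```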
Specifically, the preset variables may be determined from the pre-acquired data set of at least one sample patient as follows. The sample patients are divided into two groups according to outcome (survival or death). A t-test is performed on the continuous variables (the variables corresponding to vital sign data) in the two groups, and a chi-square test is performed on the categorical variables (for example, sex, or variables corresponding to indexes that characterize positivity or negativity in the laboratory test data), to check whether each variable (continuous or categorical) differs statistically between the groups. If the difference between the two groups for any variable is significant, that variable is an important variable. For example, if the brain natriuretic peptide precursor of patients who ultimately died is significantly higher than that of patients who ultimately survived, brain natriuretic peptide precursor can be determined to be an important variable. The criterion for judging whether the difference between the two groups for a variable is significant may be a preset determination condition, which is not specifically limited in the present application.
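The two tests can be sketched with the standard library alone. The sketch assumes a significance level of 0.05 as the preset determination condition, uses a normal approximation for the p-value of Welch's t statistic (reasonable only for larger samples), and uses the uncorrected chi-square statistic for a 2x2 table; the data values and variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

ALPHA = 0.05  # hypothetical preset determination condition


def welch_t_pvalue(a, b):
    """Two-sided p-value for a difference in means (normal approximation to Welch's t)."""
    t = (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))
    return 2 * (1 - NormalDist().cdf(abs(t)))


def chi_square_2x2_pvalue(table):
    """P-value of the chi-square test on a 2x2 table [[a, b], [c, d]] (1 degree of freedom)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom, chi2 is the square of a standard normal variate.
    return 2 * (1 - NormalDist().cdf(sqrt(chi2)))


# Hypothetical continuous variable: brain natriuretic peptide precursor by outcome.
survivors_bnp = [300, 420, 380, 350, 410, 390]
deceased_bnp = [2100, 1800, 2500, 1950, 2300, 2200]
# Hypothetical categorical variable: rows = outcome, columns = test positive/negative.
culture_table = [[18, 2], [3, 17]]

p_values = {
    "bnp_precursor": welch_t_pvalue(survivors_bnp, deceased_bnp),
    "blood_culture_positive": chi_square_2x2_pvalue(culture_table),
}
important_variables = [name for name, p in p_values.items() if p < ALPHA]
```

For small samples an exact t distribution would be preferable to the normal approximation used here.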
S204, acquiring a medical image from the target historical medical data, and extracting a feature vector corresponding to the medical image through a target neural network.
The medical image may refer to an ultrasound image, a Computed Tomography (CT) scan image, and the like, where CT is a tomography method that combines X-ray photography with reconstruction mathematics and computer technology. CT collects information from X-ray scanning of body structures and, through A/D conversion, computer processing, D/A conversion and other processing, generates a reconstructed cross-sectional image of the scanned structure.
The target neural network may be determined from an initial model such as a CNN (Convolutional Neural Network), a DNN (Deep Neural Network), or an RNN (Recurrent Neural Network).
Taking a CNN as the initial model as an example, the target neural network is obtained by deleting the classification calculation layer (softmax layer) of the CNN. In this case, the feature vector corresponding to the medical image is extracted through the target neural network as follows: the medical image is input into the CNN, the convolution layers preliminarily extract features of the medical image, and the pooling layers extract its main features, so as to construct the feature vector of the medical image. It is understood that, for a CNN whose classification calculation layer has been deleted, the data directly output by the CNN is the feature vector of the input medical image.
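The convolution and pooling path can be illustrated with a toy sketch in plain Python. It assumes one convolution layer with ReLU activation followed by 2x2 max pooling and no classification layer, so the flattened output is the feature vector. The two edge-detecting kernels stand in for learned weights and, like the 6x6 image, are purely hypothetical; a real target neural network would be a trained deep model.

```python
def conv2d_relu(image, kernel):
    """Valid 2-D convolution (cross-correlation) followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [max(0.0, sum(image[i + u][j + v] * kernel[u][v]
                      for u in range(kh) for v in range(kw)))
         for j in range(out_w)]
        for i in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling that keeps the strongest response per window."""
    return [
        [max(feature_map[i + u][j + v] for u in range(size) for v in range(size))
         for j in range(0, len(feature_map[0]) - size + 1, size)]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


def extract_features(image, kernels):
    """Convolution + pooling with no softmax layer: the output is the feature vector."""
    features = []
    for kernel in kernels:
        pooled = max_pool(conv2d_relu(image, kernel))
        features.extend(value for row in pooled for value in row)
    return features


# Hypothetical 6x6 grayscale image with a vertical edge in the middle.
image = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(6)]
# Hypothetical kernels standing in for learned convolution weights.
kernels = [
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],   # responds to vertical edges
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],   # responds to horizontal edges
]
feature_vector = extract_features(image, kernels)
```

Here the first four entries of `feature_vector` carry the vertical-edge response and the last four stay at zero, since the image contains no horizontal edge.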
S205, performing fusion processing on the target data and the feature vector to form multi-modal data.
S206, analyzing the multi-modal data through the target disease typing processing model so as to perform clustering processing on the target patient.
S207, determining the target disease type of the target patient based on the clustering processing result, wherein the target disease type is any one of at least one disease type corresponding to the target disease.
In a specific implementation, a plurality of groups (i.e., preset groups) associated with a target disease are preset, each preset group corresponds to a disease type of the target disease, and a corresponding relationship between each group and each disease type is preset. In this case, if the clustering result indicates that the target patient is clustered into a target group among a plurality of preset groups, the disease type corresponding to the target group is determined as the target disease type to which the target patient belongs, from the correspondence relationship between each group and each disease type created in advance.
Exemplarily, assuming that the target disease is sepsis, the preset group includes a first group, a second group, and a third group, and the correspondence relationship of each group created in advance to each disease classification is shown in table 1. In this case, when the clustering result indicates that the target patient is clustered into the first group of the plurality of preset groups, the terminal device may determine the target disease classification to which the target patient belongs as a classification associated with renal dysfunction, and accordingly, subsequent treatment of the target patient may be directed to renal function rather than a simple anti-infection strategy.
TABLE 1
Grouping | Disease typing
First group | Typing associated with renal dysfunction
Second group | Typing associated with cardiac dysfunction
Third group | Typing associated with respiratory disorders
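The correspondence in Table 1 amounts to a lookup from the group a target patient is clustered into to a disease typing. A minimal sketch, with a hypothetical dictionary and function name:

```python
# Correspondence between preset groups and disease typings, taken from Table 1.
GROUP_TO_TYPING = {
    "first group": "typing associated with renal dysfunction",
    "second group": "typing associated with cardiac dysfunction",
    "third group": "typing associated with respiratory disorders",
}

def determine_typing(target_group):
    """Return the disease typing for the group a target patient was clustered into."""
    if target_group not in GROUP_TO_TYPING:
        raise KeyError(f"no disease typing registered for group {target_group!r}")
    return GROUP_TO_TYPING[target_group]

print(determine_typing("first group"))
```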
In the embodiment of the application, target historical medical data of a target patient related to a target disease can be acquired, structured electronic medical record data can be acquired from the target historical medical data, and the electronic medical record data can be preprocessed. Furthermore, target data associated with preset variables corresponding to the target disease are screened from the preprocessed electronic medical record data, medical images are obtained from target historical medical data, feature vectors corresponding to the medical images are extracted through a target neural network, and fusion processing is performed on the target data and the feature vectors to form multi-modal data. Further, the multi-modal data is analyzed through a target disease typing processing model to perform clustering processing on the target patients, and the target disease typing of the target patients is determined based on the clustering processing result. In the embodiment of the application, the target disease type of the target patient can be determined through multi-modal data of the target patient, and compared with single-modal data (namely data of a single data source), the relevance between the data of a plurality of data sources can be considered, so that the accuracy of determining the target disease type of the target patient is improved.
Please refer to fig. 3, which is a flowchart illustrating another method for determining a disease type according to an embodiment of the present application, where the method for determining a disease type can be executed by a terminal device, and the method for determining a disease type includes the following steps:
S301, obtaining multi-modal sample data of each sample patient in at least one sample patient.
In a specific implementation, medical sample data associated with a target disease and a plurality of sample patients may be acquired, where the content type included in the medical sample data is the same as the content type included in the target historical medical data in step S201, and only the specific data corresponding to different patients is different.
Further, structured electronic medical record sample data can be obtained from the medical sample data of each sample patient, the electronic medical record sample data is preprocessed, and target data associated with the preset variables corresponding to the target disease is screened from the preprocessed electronic medical record sample data. Medical images are also acquired from the medical sample data, the feature vector corresponding to the medical image of each sample patient is extracted through the target neural network, and the target data and the feature vector of each sample patient are fused to form the multi-modal sample data of that sample patient.
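The text does not state which fusion operation is used. One simple possibility, shown here purely as an assumption, is to concatenate the screened structured values (in a fixed variable order) with the image feature vector. The variable names and values are hypothetical.

```python
# Hypothetical names for the preset variables screened from the medical record data.
PRESET_VARIABLES = ["creatinine", "bnp_precursor", "oxygenation_index"]

def fuse(record, image_features):
    """Concatenate structured values (in a fixed variable order) with image features."""
    structured = [float(record[name]) for name in PRESET_VARIABLES]
    return structured + [float(value) for value in image_features]

# Made-up record; "heart_rate" is not a preset variable and is therefore dropped.
record = {"creatinine": 1.2, "bnp_precursor": 450, "oxygenation_index": 280, "heart_rate": 90}
multimodal = fuse(record, [0.1, 0.7])
print(multimodal)
```

Because the structured values and the image features usually live on very different scales, a practical pipeline would likely normalize each component before any distance-based step.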
S302, performing dimensionality reduction on the multi-modal sample data of each sample patient to obtain a two-dimensional feature vector corresponding to each sample patient.
In one embodiment, taking the multi-modal sample data of any sample patient as an example, the indexes characterizing the multi-modal sample data (such as brain natriuretic peptide precursor, blood glucose, oxygenation index, and ultrasound feature vector) can be denoted x_1, x_2, x_3, …, x_N. A linear mapping, such as PCA (Principal Component Analysis) or LDA (Linear Discriminant Analysis), can then be applied to x_1, x_2, x_3, …, x_N to reduce their dimension and obtain the two-dimensional feature vectors corresponding to x_1, x_2, x_3, …, x_N.
Alternatively, in another embodiment, still taking the multi-modal sample data of any sample patient as an example, x_1, x_2, x_3, …, x_N can be reduced in dimension by t-distributed stochastic neighbor embedding (t-SNE) to obtain the two-dimensional feature vectors corresponding to x_1, x_2, x_3, …, x_N. t-SNE is a machine learning method for dimension reduction that can be used to identify association patterns, and its main advantage is that it preserves local structure: points that are close together in the high-dimensional data space are projected close together in the low-dimensional space.
The traditional t-distributed stochastic neighbor embedding (t-SNE) method reduces dimension by minimizing the inconsistency between two distributions: one that measures the pairwise similarity of the input variables, and one that measures the pairwise similarity of the corresponding points in the embedding, which have been reduced to two dimensions. Taking the multi-modal sample data of any sample patient as an example, the data include indexes such as brain natriuretic peptide precursor, blood glucose, oxygenation index, and ultrasound feature vector, denoted x_1, x_2, x_3, …, x_N respectively, and the distance d(x_i, x_j) between two indexes is calculated as the Euclidean distance, where d(x_i, x_j) = ||x_i - x_j||.
Figure DEST_PATH_IMAGE003
In the expression above, y_1, y_2, y_3, …, y_N are the corresponding points obtained by reducing the dimensionality of x_1, x_2, x_3, …, x_N.
First, a joint probability p_{i,j} is defined. This joint probability is obtained by averaging two conditional probabilities (p_{j|i} and p_{i|j}, respectively) and measures the pairwise similarity of x_i and x_j. The formulas corresponding to the joint probability p_{i,j} and the two conditional probabilities p_{j|i} and p_{i|j} are shown as formula 1.1 to formula 1.3, and in formula 1.2
Figure 309211DEST_PATH_IMAGE004
is the perplexity of the distribution; the greater the data density, the smaller the perplexity.
Figure 322166DEST_PATH_IMAGE005
Formula 1.1
Figure 266988DEST_PATH_IMAGE006
Formula 1.2
p_{i|i} = 0
Formula 1.3
Secondly, it is necessary to define a pairwise similarity measure for the corresponding reduced-dimension points y_i and y_j in the embedding (obtained by reducing the dimensionality of x_i and x_j), as shown below:
Figure 619472DEST_PATH_IMAGE007
Finally, y is solved by gradient descent so as to minimize the inconsistency between the two distributions, where the inconsistency between the two distributions is defined as:
Figure 624337DEST_PATH_IMAGE008
The gradient used in the gradient descent solution is:
Figure 62272DEST_PATH_IMAGE009
wherein Z is a normalization function:
Figure 46671DEST_PATH_IMAGE010
Both distributions need to traverse N(N-1) pairs of different features, so the time complexity of the algorithm is O(N^2). The computational cost is high, and the computation may take several hours on a large sample data set.
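For reference, the standard t-SNE formulation from the literature, which this passage appears to follow, can be written as below. This is the textbook form, given on the assumption that the formulas referenced above match it; it is not a transcription of them, and no mapping to the formula numbers 1.1 to 1.3 is asserted. The symbol \sigma_i (a per-point kernel bandwidth chosen so that the conditional distribution has a prescribed perplexity) and the cost symbol C are introduced here and do not appear in the text.

```latex
\begin{align*}
p_{j|i} &= \frac{\exp\!\left(-d(x_i, x_j)^2 / 2\sigma_i^2\right)}
                {\sum_{k \neq i} \exp\!\left(-d(x_i, x_k)^2 / 2\sigma_i^2\right)},
           \qquad p_{i|i} = 0 \\
p_{i,j} &= \frac{p_{j|i} + p_{i|j}}{2N} \\
q_{i,j} &= \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}{Z},
           \qquad Z = \sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1} \\
C &= \sum_{i \neq j} p_{i,j} \log \frac{p_{i,j}}{q_{i,j}} \\
\frac{\partial C}{\partial y_i} &= 4 \sum_{j \neq i} \left(p_{i,j} - q_{i,j}\right) q_{i,j}\, Z \left(y_i - y_j\right)
\end{align*}
```

Both the sum defining Z and the gradient range over all pairs of points, which is where the O(N^2) cost comes from.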
This scheme can improve the traditional t-distributed stochastic neighbor embedding (t-SNE) method. First, the conditional probability p_{j|i} is redefined as follows:
Figure 519240DEST_PATH_IMAGE011
where N_i is the set of nearest neighbors of x_i found by a nearest neighbor search method, and u is user-defined according to the data set. Thus, compared with the traditional approach, it is no longer necessary to traverse all N(N-1) pairs. This reduces the processing time needed to reduce the dimension of the multi-modal sample data by t-SNE and obtain the corresponding two-dimensional feature vectors, and improves the processing efficiency of dimension reduction.
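The effect of restricting p_{j|i} to a neighbor set can be sketched as follows. This is a simplified illustration under stated assumptions: the neighbor count k and the single fixed bandwidth sigma are hypothetical parameters (t-SNE normally tunes a bandwidth per point from the perplexity), the Gaussian kernel is the standard t-SNE choice rather than something confirmed by the text, and the neighbor search is brute force for brevity.

```python
from math import dist, exp

def sparse_conditional_probabilities(points, k, sigma=1.0):
    """Return p[i][j] = p_{j|i}, nonzero only for the k nearest neighbours of point i."""
    n = len(points)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Brute-force neighbour search for brevity; a real implementation would use
        # a spatial index so that this step does not itself cost O(N^2).
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: dist(points[i], points[j]))[:k]
        weights = {j: exp(-dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
                   for j in neighbours}
        total = sum(weights.values())
        for j, weight in weights.items():
            p[i][j] = weight / total
    return p

# Made-up two-dimensional inputs forming two tight pairs.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
p = sparse_conditional_probabilities(points, k=1)
print(p[0])
```

Each row then has at most k nonzero entries, so any later sum over the pairs with nonzero p touches on the order of kN terms instead of N(N-1).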
Further, the gradient can be divided into two parts, as shown in the following equation:
Figure 695007DEST_PATH_IMAGE012
where F_attr is the sum of all attractive (similarity) terms and F_rep is the sum of all repulsive (dissimilarity) terms. The calculation formulas corresponding to F_attr and F_rep are as follows:
Figure 620237DEST_PATH_IMAGE013
In this way, the time complexity of computing F_attr is reduced from O(N^2) to O(uN), thereby improving the processing efficiency of dimension reduction.
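In the standard t-SNE literature the split of the gradient into an attractive and a repulsive part is usually written as below. This is offered as an assumed reference form, not as a transcription of the formulas above; in particular, some presentations fold the minus sign into F_rep. Here q_{i,j} Z = (1 + \lVert y_i - y_j \rVert^2)^{-1}.

```latex
\begin{align*}
\frac{\partial C}{\partial y_i} &= 4\left(F_{attr} - F_{rep}\right) \\
F_{attr} &= \sum_{j \neq i} p_{i,j}\, q_{i,j}\, Z \left(y_i - y_j\right) \\
F_{rep} &= \sum_{j \neq i} q_{i,j}^{2}\, Z \left(y_i - y_j\right)
\end{align*}
```

Because p_{i,j} is nonzero only for neighboring pairs after the redefinition of p_{j|i}, the sum in F_attr has only a number of terms per point proportional to u, which gives the O(uN) cost. F_rep still ranges over all pairs and needs a separate approximation.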
Further optionally, the time complexity of computing F_rep can also be reduced by a dual-tree algorithm. Specifically, a quadtree data structure is first adopted, as shown for example in fig. 4. Each node in the quadtree has 4 rectangular sub-blocks, a block of the minimum unit includes at most one point y, and the largest original block includes all the points y (Figure 906862DEST_PATH_IMAGE014). y_cell is the center of a block, and the number of points in the block is denoted N_cell. A quadtree has O(N) nodes.
F_rep is estimated approximately by the dual-tree algorithm. The dual-tree algorithm considers only the influence between blocks, rather than the influence between every pair of points as in the traditional algorithm. Specifically, only if the impact factor between two blocks is smaller than
Figure 30676DEST_PATH_IMAGE016
is the influence between these two blocks considered; the impact factor is given by formula 1.4.
Figure 315027DEST_PATH_IMAGE017
Formula 1.4
where A and B are two isomorphic quadtrees, y_cell-A and y_cell-B are the centers of the corresponding blocks in A and B, r_cell-A and r_cell-B are the diagonals of the two blocks, and
Figure 316405DEST_PATH_IMAGE020
is predefined according to the data set. When a block is small enough and far enough away from y_i, F_rep can be approximated as
Figure 560304DEST_PATH_IMAGE021
The time complexity of the algorithm is thereby reduced from O(N^2) to O(N log N).
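The idea of summarizing a small, distant block by its center and point count can be checked numerically. The sketch below assumes the standard t-SNE Student-t kernel, so each point contributes (1 + ||y_i - y_j||^2)^-2 (y_i - y_j) to the unnormalized repulsive sum, and it takes the block center to be the mean of the points in the block. Both are assumptions for illustration; the function names and coordinates are made up.

```python
from math import dist

def repulsion_exact(y_i, cell_points):
    """Sum of (1 + ||y_i - y_j||^2)^-2 * (y_i - y_j) over every point in the block."""
    total = [0.0, 0.0]
    for y_j in cell_points:
        weight = (1.0 + dist(y_i, y_j) ** 2) ** -2
        total[0] += weight * (y_i[0] - y_j[0])
        total[1] += weight * (y_i[1] - y_j[1])
    return total

def repulsion_approx(y_i, cell_points):
    """Same quantity, with the whole block summarised by its centre and point count."""
    n_cell = len(cell_points)
    y_cell = (sum(p[0] for p in cell_points) / n_cell,
              sum(p[1] for p in cell_points) / n_cell)
    weight = (1.0 + dist(y_i, y_cell) ** 2) ** -2
    return [n_cell * weight * (y_i[0] - y_cell[0]),
            n_cell * weight * (y_i[1] - y_cell[1])]

# A small block of four embedded points far away from y_i (made-up coordinates).
y_i = (0.0, 0.0)
cell = [(10.0, 10.0), (10.1, 10.0), (10.0, 10.1), (10.1, 10.1)]

exact = repulsion_exact(y_i, cell)
approx = repulsion_approx(y_i, cell)
print(exact, approx)
```

For a block this small and this far away the two results agree closely, which is why one evaluation per block can replace one evaluation per point.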
S303, analyzing the two-dimensional feature vectors corresponding to the sample patients through the target disease typing processing model to perform clustering processing on the sample patients and obtain at least one group, wherein each group comprises at least one sample patient. In this typing process, the dimension-reduced feature vectors are used to cluster the sample patients. Compared with traditional clustering methods, this can reduce the number of samples, thereby increasing the difference among samples and improving the accuracy of the clustering result. In traditional clustering methods there are usually many samples, and many sample types tend to appear during clustering, so the difference between samples is low and the accuracy of the typing result is seriously affected.
S304, comparing the variable information of the corresponding variables of the sample patients in each group to obtain the variable difference information among the groups, determining the disease types of the groups based on the variable difference information, and establishing the corresponding relation between the groups and the disease types.
In a specific implementation, the target disease typing processing model may be a model corresponding to a consistency (consensus) K-means classification algorithm, which may be used to perform clustering processing on the sample patients to obtain at least one group. Specifically, consistency K-means classification calculates the consistency between each pair of sample patients. The consistency lies between 0 and 1, and the higher the value, the more consistent the pair of patients. The consistency value is the proportion of times that the data of a pair of sample patients, after being sampled, are classified into the same class, with 0 indicating that they are never classified into the same class and 1 indicating that they are always classified into the same class. Through the consistency K-means classification method, the sample patients can be divided into n classes (n being an integer greater than 2), and the patients of the same class are clustered into the same group, so that n groups are obtained.
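The pairwise consistency value can be sketched as follows: K-means is run repeatedly on random subsamples, and for each pair of patients the number of runs that put them in the same class is divided by the number of runs in which both were sampled. This is a minimal sketch under stated assumptions, not the model of the application: the farthest-point initialization, the subsample fraction, the number of runs, and the made-up two-dimensional points are all illustrative choices.

```python
import random
from math import dist

def init_centres(points, k, rng):
    """Farthest-point initialisation (an assumed choice, not taken from the source)."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(dist(p, c) for c in centres)))
    return centres

def kmeans(points, k, rng, iters=10):
    """Plain Lloyd's algorithm; returns a class label for each point."""
    centres = init_centres(points, k, rng)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centres[c])) for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centre if a class becomes empty
                centres[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

def consensus_matrix(points, k, runs=50, frac=0.8, seed=0):
    """Fraction of runs, among those sampling both patients, that co-cluster them."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(runs):
        idx = rng.sample(range(n), max(k, int(frac * n)))
        labels = kmeans([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]

# Made-up two-dimensional feature vectors forming three well-separated groups.
patients = [(0, 0), (0.3, 0.1), (0.1, 0.4), (0.4, 0.3),
            (10, 0), (10.3, 0.2), (10.1, 0.4), (10.4, 0.1),
            (0, 10), (0.2, 10.3), (0.4, 10.1), (0.1, 10.4)]
consensus = consensus_matrix(patients, k=3)
print(consensus[0][1], consensus[0][4])
```

On real data the matrix entries fall between 0 and 1, and patients can then be grouped by clustering the consensus matrix itself.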
The target disease includes sepsis, and the variables of the sample patients in each group include one or more of creatinine, brain natriuretic peptide precursor, and oxygenation index. In one embodiment, the at least one group includes n groups, n being an integer greater than 2, and determining the disease type to which each group belongs based on the variable difference information includes:
if the variable difference information indicates that the average value of the creatinine of the sample patient in the mth group and the average value of the creatinine of the sample patient in other groups meet the preset creatinine difference condition, determining the disease type to which the mth group belongs as the type associated with the renal dysfunction, wherein the mth group is any one of the n groups
Figure 749977DEST_PATH_IMAGE022
. Accordingly, the target patient may be subsequently treated for renal function rather than simply treated against infection.
The preset creatinine difference condition may be, for example, that the average creatinine value of the sample patients in the mth group is significantly higher than the average creatinine value of the sample patients in each of the other groups. Specifically, the average creatinine value of the sample patients in each other group may be subtracted from the average creatinine value of the sample patients in the mth group. If every resulting difference between the mth group and another group is greater than the preset creatinine threshold, it is determined that the average creatinine value of the sample patients in the mth group and the average creatinine values of the sample patients in the other groups satisfy the preset creatinine difference condition.
Alternatively, in another embodiment, if the variable difference information indicates that the mean value of the brain natriuretic peptide precursor of the sample patients in the mth group and the mean values of the brain natriuretic peptide precursor of the sample patients in the other groups satisfy the preset brain natriuretic peptide precursor difference condition, the disease type to which the mth group belongs is determined to be the type associated with cardiac dysfunction. Accordingly, the target patient may subsequently be treated for cardiac function, rather than simply treated against infection.
The preset brain natriuretic peptide precursor difference condition may be, for example, that the mean value of the brain natriuretic peptide precursor of the sample patients in the mth group is significantly higher than the mean value of the brain natriuretic peptide precursor of the sample patients in each of the other groups. Specifically, the mean value of the brain natriuretic peptide precursor of the sample patients in each other group may be subtracted from the mean value of the brain natriuretic peptide precursor of the sample patients in the mth group. If every resulting difference between the mth group and another group is greater than the preset brain natriuretic peptide precursor threshold, it is determined that the mean value for the mth group and the mean values for the other groups satisfy the preset brain natriuretic peptide precursor difference condition.
Alternatively still, in another embodiment, if the variable difference information indicates that the average value of the oxygenation indexes of the sample patients in the mth group and the average value of the oxygenation indexes of the sample patients in the other groups satisfy a preset oxygenation index difference condition, the disease type to which the mth group belongs is determined to be a type associated with a respiratory system disorder. Accordingly, the target patient may be subsequently treated for respiratory function rather than simply treated against infection.
The preset oxygenation index difference condition may be, for example, that the average oxygenation index of the sample patients in the mth group is significantly higher than the average oxygenation index of the sample patients in each of the other groups. Specifically, the average oxygenation index of the sample patients in each other group may be subtracted from the average oxygenation index of the sample patients in the mth group. If every resulting difference between the mth group and another group is greater than the preset oxygenation index threshold, it is determined that the average oxygenation index of the sample patients in the mth group and the average oxygenation indexes of the sample patients in the other groups satisfy the preset oxygenation index difference condition.
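The three difference conditions above share one pattern: compare the mean of a variable in the mth group with the mean in every other group against a preset threshold. A minimal sketch of that pattern follows; the threshold values, variable names, units, and patient values are hypothetical, since the text only says the thresholds are preset.

```python
from statistics import mean

# (variable, hypothetical preset threshold, typing assigned when the condition holds)
RULES = [
    ("creatinine", 1.0, "typing associated with renal dysfunction"),
    ("bnp_precursor", 500.0, "typing associated with cardiac dysfunction"),
    ("oxygenation_index", 100.0, "typing associated with respiratory system disorder"),
]

def typing_for_group(m, groups):
    """Return the typing of group m, or None if no difference condition is satisfied.

    groups maps a group name to a list of per-patient dicts of variable values.
    """
    for variable, threshold, typing in RULES:
        mean_m = mean(patient[variable] for patient in groups[m])
        other_means = [mean(patient[variable] for patient in members)
                       for name, members in groups.items() if name != m]
        # The mean of group m must exceed the mean of every other group by the threshold.
        if other_means and all(mean_m - other > threshold for other in other_means):
            return typing
    return None

# Made-up sample patients in three groups.
groups = {
    "g1": [{"creatinine": 3.5, "bnp_precursor": 300, "oxygenation_index": 300},
           {"creatinine": 4.0, "bnp_precursor": 320, "oxygenation_index": 310}],
    "g2": [{"creatinine": 1.0, "bnp_precursor": 2000, "oxygenation_index": 305},
           {"creatinine": 1.1, "bnp_precursor": 2200, "oxygenation_index": 295}],
    "g3": [{"creatinine": 0.9, "bnp_precursor": 310, "oxygenation_index": 450},
           {"creatinine": 1.0, "bnp_precursor": 290, "oxygenation_index": 470}],
}

correspondence = {name: typing_for_group(name, groups) for name in groups}
print(correspondence)
```

The resulting dictionary plays the role of the correspondence between groups and disease types that is later used to type a target patient.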
S305, acquiring target historical medical data of the target patient related to the target disease, acquiring structured electronic medical record data from the target historical medical data, and preprocessing the electronic medical record data.
S306, screening target data related to preset variables corresponding to the target disease from the preprocessed electronic medical record data, acquiring medical images from target historical medical data, extracting feature vectors corresponding to the medical images through a target neural network, and fusing the target data and the feature vectors to form multi-modal data.
S307, performing data analysis on the multi-modal data through the target disease typing processing model, and clustering the target patients into target groups in a plurality of preset groups based on the data analysis result, wherein the target groups are any one of the preset groups.
S308, determining the target disease type of the target patient based on the clustering result. For the specific implementation of steps S305 to S308, refer to the description of steps S201 to S207 in the above embodiment, which is not repeated here.
Please refer to fig. 5, which is a schematic structural diagram of a disease typing determination apparatus according to an embodiment of the present application. The apparatus for determining disease classification described in this embodiment may be configured in a terminal device, and includes:
an acquisition module 50 for acquiring target historical medical data of a target patient associated with a target disease;
the processing module 51 is configured to acquire structured electronic medical record data from the target historical medical data and preprocess the electronic medical record data, where the electronic medical record data includes vital sign data of the target patient and laboratory test data;
the processing module 51 is further configured to screen target data associated with a preset variable corresponding to the target disease from the preprocessed electronic medical record data, obtain a medical image from the target historical medical data, and extract a feature vector corresponding to the medical image through a target neural network;
the processing module 51 is further configured to perform fusion processing on the target data and the feature vector to form multi-modal data, and analyze the multi-modal data through a target disease typing processing model to perform clustering processing on the target patient;
the processing module 51 is further configured to determine a target disease type to which the target patient belongs based on a clustering result, where the target disease type is any one of at least one disease type corresponding to the target disease.
In one embodiment, the processing module 51 is specifically configured to perform data analysis on the multi-modal data through a target disease typing processing model; clustering the target patients into a target group of a plurality of preset groups based on the data analysis result, wherein the target group is any one group of the plurality of preset groups.
In an embodiment, the processing module 51 is further specifically configured to, if the clustering result indicates that the target patient is clustered into a target group of a plurality of preset groups, determine, from the pre-created correspondence between each group and each disease type, a disease type corresponding to the target group as the target disease type to which the target patient belongs.
In one embodiment, the processing module 51 is further configured to obtain multi-modal sample data of each sample patient in at least one sample patient before determining a disease classification corresponding to the target group as a target disease classification to which the target patient belongs from a pre-created correspondence between each group and each disease classification; performing dimensionality reduction on the multi-modal sample data of each sample patient to obtain a two-dimensional feature vector corresponding to each sample patient; analyzing the two-dimensional characteristic vectors corresponding to the sample patients through a target disease typing processing model to perform clustering processing on the sample patients to obtain at least one group, wherein each group comprises at least one sample patient; comparing the variable information of the corresponding variables of the sample patients in each group to obtain variable difference information among the groups; and determining the disease types to which the groups belong based on the variable difference information, and establishing the corresponding relation between the groups and the disease types.
In one embodiment, the target disease includes sepsis, and the variables of the sample patients in each group include one or more of creatinine, brain natriuretic peptide precursor, and oxygenation index.
In one embodiment, the at least one group includes n groups, where n is an integer greater than 2, the processing module 51 is further specifically configured to determine the disease type to which the mth group belongs as the type associated with the renal dysfunction if the variable difference information indicates that the average value of the creatinine of the sample patient in the mth group, which is any one of the n groups, and the average value of the creatinine of the sample patient in the other groups satisfy a preset creatinine difference condition
Figure 712117DEST_PATH_IMAGE022
Or, if the variable difference information indicates that the mean value of the brain natriuretic peptide precursors of the sample patients in the mth group and the mean value of the brain natriuretic peptide precursors of the sample patients in other groups meet the preset brain natriuretic peptide precursor difference condition, determining the disease type to which the mth group belongs as the type associated with the cardiac dysfunction.
In one embodiment, the at least one group includes n groups, where n is an integer greater than 2, the processing module 51 is further specifically configured to determine the disease type to which the mth group belongs as the type associated with the respiratory system disorder if the variable difference information indicates that the average value of the oxygenation indexes of the sample patients in the mth group, which is any one of the n groups, and the average value of the oxygenation indexes of the sample patients in the other groups satisfy a preset oxygenation index difference condition
Figure 278227DEST_PATH_IMAGE022
It can be understood that each functional module of the disease typing determination apparatus of this embodiment may be specifically implemented according to the method in the foregoing method embodiment fig. 2 or fig. 3, and the specific implementation process thereof may refer to the related description in the foregoing method embodiment fig. 2 or fig. 3, which is not described herein again.
In the embodiment of the application, the disease typing determining device can acquire target historical medical data of a target patient associated with a target disease, acquire structured electronic medical record data from the target historical medical data, and preprocess the electronic medical record data. Furthermore, target data associated with preset variables corresponding to the target disease are screened from the preprocessed electronic medical record data, medical images are obtained from target historical medical data, feature vectors corresponding to the medical images are extracted through a target neural network, and fusion processing is performed on the target data and the feature vectors to form multi-modal data. Further, the multi-modal data is analyzed through a target disease typing processing model to perform clustering processing on the target patients, and the target disease typing of the target patients is determined based on the clustering processing result. In the embodiment of the application, the target disease type of the target patient can be determined through multi-modal data of the target patient, and compared with single-modal data (namely data of a single data source), the relevance between the data of a plurality of data sources can be considered, so that the accuracy of determining the target disease type of the target patient is improved.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a terminal device according to an embodiment of the present disclosure. The terminal device may include: one or more processors 601; one or more output devices 602 and memory 603. The processor 601, the output device 602, and the memory 603 are connected by a bus. The memory 603 is used for storing a computer program comprising program instructions, and the processor 601 is used for executing the program instructions stored in the memory 603 and performing the following operations:
acquiring target historical medical data of a target patient associated with a target disease;
acquiring structured electronic medical record data from the target historical medical data, and preprocessing the electronic medical record data, wherein the electronic medical record data comprises vital sign data of the target patient and laboratory test data;
screening target data associated with preset variables corresponding to the target disease from the preprocessed electronic medical record data;
acquiring a medical image from the target historical medical data, and extracting a feature vector corresponding to the medical image through a target neural network;
fusing the target data and the feature vector to form multi-modal data;
analyzing the multi-modal data through a target disease typing processing model to perform clustering processing on the target patients;
and determining a target disease type to which the target patient belongs based on the clustering processing result, wherein the target disease type is any one of at least one disease type corresponding to the target disease.
In one embodiment, the processor 601 is specifically configured to perform data analysis on the multi-modal data through a target disease typing process model; clustering the target patients into a target group of a plurality of preset groups based on the data analysis result, wherein the target group is any one group of the plurality of preset groups.
In an embodiment, the processor 601 is further specifically configured to determine, if the clustering result indicates that the target patient is clustered into a target group of a plurality of preset groups, a disease type corresponding to the target group as the target disease type to which the target patient belongs from the pre-created correspondence between each group and each disease type.
In one embodiment, the processor 601 is further configured to obtain multi-modal sample data of each sample patient of at least one sample patient before determining a disease classification corresponding to the target group as a target disease classification to which the target patient belongs from a pre-created correspondence between each group and each disease classification; performing dimensionality reduction on the multi-modal sample data of each sample patient to obtain a two-dimensional feature vector corresponding to each sample patient; analyzing the two-dimensional characteristic vectors corresponding to the sample patients through a target disease typing processing model to perform clustering processing on the sample patients to obtain at least one group, wherein each group comprises at least one sample patient; comparing the variable information of the corresponding variables of the sample patients in each group to obtain variable difference information among the groups; and determining the disease types to which the groups belong based on the variable difference information, and establishing the corresponding relation between the groups and the disease types.
In one embodiment, the target disease includes sepsis, and the variables of the sample patients in each group include one or more of creatinine, brain natriuretic peptide precursor, and oxygenation index.
In one embodiment, the at least one group includes n groups, where n is an integer greater than 2, and the processor 601 is further specifically configured to determine the disease type to which the mth group belongs as the type associated with the renal dysfunction if the variable difference information indicates that the average of creatinine of the sample patient in the mth group, which is any one of the n groups, and the average of creatinine of the sample patient in the other groups satisfy a preset creatinine difference condition
Figure 642213DEST_PATH_IMAGE022
Or, if the variable difference information indicates that the mean value of the brain natriuretic peptide precursors of the sample patients in the mth group and the mean value of the brain natriuretic peptide precursors of the sample patients in other groups meet the preset brain natriuretic peptide precursor difference condition, determining the disease type to which the mth group belongs as the type associated with the cardiac dysfunction.
In one embodiment, the at least one group includes n groups, where n is an integer greater than 2, and the processor 601 is further specifically configured to: if the variable difference information indicates that the mean oxygenation index of the sample patients in the m-th group and the mean oxygenation index of the sample patients in the other groups satisfy a preset oxygenation index difference condition, determine the disease type to which the m-th group belongs as the type associated with a respiratory disorder, where the m-th group is any one of the n groups (1 ≤ m ≤ n).
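By way of a non-limiting illustration, the group-labelling rules of the two preceding embodiments may be sketched as follows. The preset difference conditions themselves are not given here, so the ratio thresholds (1.5 and 0.75), the pooling of all patients in the other groups into one mean, the dictionary keys, and the names `PRESET_RULES` and `types_for_group` are all assumptions made only for this sketch.

```python
from statistics import mean

# Hypothetical preset difference conditions. Each condition receives
# (mean of the m-th group, mean of the patients in all other groups).
# The factors 1.5 and 0.75 are illustrative placeholders, not source values.
PRESET_RULES = [
    ("creatinine", lambda own, rest: own >= 1.5 * rest, "renal dysfunction"),
    ("bnp_precursor", lambda own, rest: own >= 1.5 * rest, "cardiac dysfunction"),
    ("oxygenation_index", lambda own, rest: own <= 0.75 * rest, "respiratory disorder"),
]


def types_for_group(groups, m):
    """Return every disease type whose preset condition the m-th group satisfies.

    `groups` is a list of groups; each group is a list of per-patient dicts.
    `m` is a zero-based index here, unlike the one-based m of the text.
    """
    matched = []
    for variable, condition, disease_type in PRESET_RULES:
        own = mean(p[variable] for p in groups[m])
        rest = mean(p[variable] for i, g in enumerate(groups) if i != m for p in g)
        if condition(own, rest):
            matched.append(disease_type)
    return matched
```

With three groups whose creatinine, brain natriuretic peptide precursor, and oxygenation index means differ markedly from the rest, each group would be mapped to one type, and the resulting mapping serves as the correspondence between groups and disease types.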
It should be understood that, in the embodiments of the present application, the processor 601 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 603 may include both read-only memory and random access memory and provides instructions and data to the processor 601. A portion of the memory 603 may also include non-volatile random access memory.
In a specific implementation, the processor 601, the output device 602, and the memory 603 described in the embodiments of the present application may carry out the implementations described for the disease typing determination method provided in the embodiments of the present application, and may also carry out the implementations of the disease typing determination apparatus described in the embodiments of the present application; details are not repeated here.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program, where the computer program includes program instructions that, when executed by a processor, perform the steps of the above embodiments of the disease typing determination method.
An embodiment of the present application further provides a computer program product comprising computer program code which, when run on a computer, causes the computer to perform the steps of the above embodiments of the disease typing determination method.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like. The computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
A blockchain is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked to one another by cryptographic methods, where each data block contains information about a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and the like.
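By way of a non-limiting illustration, the cryptographic linking of data blocks mentioned above may be sketched as a simple hash chain. The sketch covers only the linking and validity check; distributed storage, peer-to-peer transmission, and consensus are omitted, and the block layout and the function names `block_hash`, `append_block`, and `is_valid` are assumptions made only for this sketch.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder predecessor hash for the first block


def block_hash(index, records, prev_hash):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "records": records, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, records):
    """Create a block that stores the hash of its predecessor and append it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    block = {
        "index": index,
        "records": records,
        "prev_hash": prev_hash,
        "hash": block_hash(index, records, prev_hash),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Check that every block links to its predecessor and matches its own hash."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["records"], block["prev_hash"]):
            return False
        prev_hash = block["hash"]
    return True
```

Because each block's hash covers the previous block's hash, altering the records of an earlier block makes the validity check fail.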
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A disease typing determination system, comprising a storage device and a terminal device, wherein:
the storage device is used for storing historical medical data of at least one patient associated with a target disease;
the terminal device is used for acquiring target historical medical data of a target patient associated with the target disease from the storage device; acquiring structured electronic medical record data from the target historical medical data, and preprocessing the electronic medical record data, wherein the electronic medical record data comprises vital sign data of the target patient and laboratory test data; screening target data associated with preset variables corresponding to the target disease from the preprocessed electronic medical record data; acquiring a medical image from the target historical medical data, and extracting a feature vector corresponding to the medical image through a target neural network; fusing the target data and the feature vector to form multi-modal data; analyzing the multi-modal data through a target disease typing processing model to perform clustering processing on the target patients; and determining a target disease type to which the target patient belongs based on the clustering processing result, wherein the target disease type is any one of at least one disease type corresponding to the target disease.
2. The system according to claim 1, wherein the terminal device is specifically configured to: perform data analysis on the multi-modal data through the target disease typing processing model; and cluster the target patient into a target group of a plurality of preset groups based on the data analysis result, wherein the target group is any one group of the plurality of preset groups.
3. The system of claim 2, wherein the terminal device is further specifically configured to: if the clustering processing result indicates that the target patient is clustered into a target group of the plurality of preset groups, determine, from the pre-established correspondence between each group and each disease type, the disease type corresponding to the target group as the target disease type of the target patient.
4. The system of claim 3, wherein the terminal device is further configured to: obtaining multi-modal sample data of each sample patient in at least one sample patient; performing dimensionality reduction on the multi-modal sample data of each sample patient to obtain a two-dimensional feature vector corresponding to each sample patient; analyzing the two-dimensional characteristic vectors corresponding to the sample patients through a target disease typing processing model to perform clustering processing on the sample patients to obtain at least one group, wherein each group comprises at least one sample patient; comparing the variable information of the corresponding variables of the sample patients in each group to obtain variable difference information among the groups; and determining the disease types to which the groups belong based on the variable difference information, and establishing the corresponding relation between the groups and the disease types.
5. The system of claim 4, wherein the target disease comprises sepsis, and the variables corresponding to the sample patients in each group comprise one or more of creatinine, brain natriuretic peptide precursor, and oxygenation index.
6. The system according to claim 5, wherein the at least one group includes n groups, where n is an integer greater than 2, and the terminal device is further specifically configured to: if the variable difference information indicates that the mean creatinine of the sample patients in the m-th group and the mean creatinine of the sample patients in the other groups satisfy a preset creatinine difference condition, determine the disease type to which the m-th group belongs as the type associated with renal dysfunction, wherein the m-th group is any one of the n groups (1 ≤ m ≤ n); or, if the variable difference information indicates that the mean brain natriuretic peptide precursor of the sample patients in the m-th group and the mean brain natriuretic peptide precursor of the sample patients in the other groups satisfy a preset brain natriuretic peptide precursor difference condition, determine the disease type to which the m-th group belongs as the type associated with cardiac dysfunction.
7. The system according to claim 5, wherein the at least one group includes n groups, where n is an integer greater than 2, and the terminal device is further specifically configured to: if the variable difference information indicates that the mean oxygenation index of the sample patients in the m-th group and the mean oxygenation index of the sample patients in the other groups satisfy a preset oxygenation index difference condition, determine the disease type to which the m-th group belongs as the type associated with a respiratory disorder, wherein the m-th group is any one of the n groups (1 ≤ m ≤ n).
8. A method for determining disease typing, performed by the terminal device in the system according to any one of claims 1 to 7, the method comprising:
acquiring target historical medical data of a target patient associated with a target disease;
acquiring structured electronic medical record data from the target historical medical data, and preprocessing the electronic medical record data, wherein the electronic medical record data comprises vital sign data of the target patient and laboratory test data;
screening target data associated with preset variables corresponding to the target disease from the preprocessed electronic medical record data;
acquiring a medical image from the target historical medical data, and extracting a feature vector corresponding to the medical image through a target neural network;
fusing the target data and the feature vector to form multi-modal data;
analyzing the multi-modal data through a target disease typing processing model to perform clustering processing on the target patients; and determining a target disease type to which the target patient belongs based on the clustering processing result, wherein the target disease type is any one of at least one disease type corresponding to the target disease.
9. An apparatus for determining a disease type, the apparatus comprising:
an acquisition module for acquiring target historical medical data of a target patient associated with a target disease;
the processing module is used for acquiring structured electronic medical record data from the target historical medical data and preprocessing the electronic medical record data, wherein the electronic medical record data comprises vital sign data of the target patient and laboratory test data;
the processing module is further configured to screen target data associated with a preset variable corresponding to the target disease from the preprocessed electronic medical record data, acquire a medical image from the target historical medical data, and extract a feature vector corresponding to the medical image through a target neural network;
the processing module is further used for performing fusion processing on the target data and the feature vectors to form multi-modal data, and analyzing the multi-modal data through a target disease typing processing model to perform clustering processing on the target patients;
the processing module is further configured to determine a target disease type to which the target patient belongs based on a clustering result, where the target disease type is any one of at least one disease type corresponding to the target disease.
10. A computer-readable storage medium, characterized in that the readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to carry out the method as claimed in claim 8.
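By way of a non-limiting illustration, the method steps recited in claim 8 (screening, fusion, clustering of the target patient, and type lookup) may be sketched as follows. The claims do not fix the fusion strategy or the way a single target patient is assigned to a preset group, so concatenation and nearest-centroid assignment are assumptions made only for this sketch; the image feature vector is taken as already extracted by the target neural network, and all function and variable names are hypothetical.

```python
import math


def screen_target_data(emr, preset_variables):
    """Keep only the preset variables for the target disease, in a fixed order."""
    return [float(emr[name]) for name in preset_variables]


def fuse(target_data, image_features):
    """Fuse by concatenation (assumed strategy) to form the multi-modal vector."""
    return list(target_data) + list(image_features)


def assign_group(vector, centroids):
    """Return the index of the nearest group centroid (assumed assignment rule)."""
    return min(range(len(centroids)), key=lambda j: math.dist(vector, centroids[j]))


def determine_type(emr, image_features, preset_variables, centroids, group_to_type):
    """Screen, fuse, assign to a preset group, then look up that group's type."""
    fused = fuse(screen_target_data(emr, preset_variables), image_features)
    return group_to_type[assign_group(fused, centroids)]
```

In practice the fused variables would usually be normalised before distances are computed, since laboratory values and image features can differ greatly in scale; that step is left out of the sketch.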
CN202011061077.1A 2020-09-30 2020-09-30 Disease typing determination system, method, device and storage medium Active CN111933281B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011061077.1A CN111933281B (en) 2020-09-30 2020-09-30 Disease typing determination system, method, device and storage medium

Publications (2)

Publication Number Publication Date
CN111933281A true CN111933281A (en) 2020-11-13
CN111933281B CN111933281B (en) 2021-02-12

Family

ID=73334805

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011061077.1A Active CN111933281B (en) 2020-09-30 2020-09-30 Disease typing determination system, method, device and storage medium

Country Status (1)

Country Link
CN (1) CN111933281B (en)

Also Published As

Publication number Publication date
CN111933281B (en) 2021-02-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant