CN112349412B - Method for predicting probability of illness and electronic device - Google Patents


Publication number
CN112349412B
Authority
CN
China
Prior art keywords
paths
variables
path
models
disease
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910720427.1A
Other languages
Chinese (zh)
Other versions
CN112349412A (en)
Inventor
陈陪蓉
蔡宗宪
陈亮恭
彭莉甯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Acer Inc
National Yang Ming Chiao Tung University NYCU
Original Assignee
Acer Inc
National Yang Ming University NYMU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Acer Inc and National Yang Ming University NYMU
Priority to CN201910720427.1A
Publication of CN112349412A
Application granted
Publication of CN112349412B

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention provides a disease probability prediction method and an electronic device. The method comprises the following steps: determining a path length; obtaining a first path conforming to a path length from medical history data of a specific disease; obtaining a second path positively correlated with the specific disease from the first path; screening the second path to obtain a third path, and establishing a prediction model according to the third path; and inputting the path to be predicted into the prediction model and outputting the probability of suffering from the specific disease.

Description

Method for predicting probability of illness and electronic device
Technical Field
The invention relates to a disease probability prediction method and an electronic device.
Background
In general, a doctor can judge from experience which diseases are likely to follow the onset of a specific disease. For example, dementia is more likely to occur after diabetes. However, there is currently no effective method for studying the risk of a single disease and for determining which disease is more likely to occur after specific diseases have been suffered in sequence.
Disclosure of Invention
The invention provides a disease probability prediction method and an electronic device, which can be used for calculating information such as the proportion or probability of a certain specific disease of a patient according to the sequence of the disease of the patient.
The invention provides a disease probability prediction method for an electronic device, which comprises the following steps: determining a path length, wherein the path length is a number of diseases; obtaining a plurality of first paths conforming to the path length from a plurality of medical history data of a specific disease according to the path length, wherein the first paths are formed by other diseases sequentially suffered before the specific disease is suffered; obtaining a plurality of second paths positively correlated with the specific disease from the plurality of first paths according to the plurality of first paths; screening the plurality of second paths to obtain a plurality of third paths, and establishing a prediction model according to the plurality of third paths; and inputting a path to be predicted into the prediction model and outputting a probability that the path to be predicted suffers from the specific disease, wherein the path to be predicted is composed of a plurality of diseases.
The invention proposes an electronic device comprising: a memory circuit and a processor. The memory circuit records a plurality of modules. The processor accesses and executes the plurality of modules to perform the operations of: determining a path length, wherein the path length is a number of diseases; obtaining a plurality of first paths conforming to the path length from a plurality of medical history data of a specific disease according to the path length, wherein the first paths are formed by other diseases sequentially suffered before the specific disease is suffered; obtaining a plurality of second paths positively correlated with the specific disease from the plurality of first paths according to the plurality of first paths; screening the plurality of second paths to obtain a plurality of third paths, and establishing a prediction model according to the plurality of third paths; and inputting a path to be predicted into the prediction model and outputting a probability that the path to be predicted suffers from the specific disease, wherein the path to be predicted is composed of a plurality of diseases.
Based on the above, the disease probability prediction method and the electronic device of the present invention can find, from medical history data, the order (also referred to as the path) of the diseases suffered before a specific disease (e.g., dementia). People with these paths have a higher probability of developing the specific disease in the future than people without them. In addition, the disease probability prediction method of the present invention may further use these paths to calculate the proportion or probability of suffering from the specific disease.
In order to make the above features and advantages of the present invention more comprehensible, embodiments accompanied with figures are described in detail below.
Drawings
FIG. 1 is a block diagram of an electronic device according to an embodiment of the invention;
FIG. 2 is a flow chart of a method for screening paths for predicting dementia using medical history data according to an embodiment of the present invention;
FIG. 3 is a flow chart of a method of path screening according to an embodiment of the present invention;
FIG. 4 is a flow chart of a method for screening variables using multiple models according to an embodiment of the invention;
FIG. 5 is a flowchart illustrating a method for predicting probability of illness according to an embodiment of the invention.
Description of the reference numerals
100: electronic device
20: processor
22: input/output circuit
24: memory circuit
S201 to S205, S301 to S305, S401 to S405, S501 to S509: steps
Detailed Description
Reference will now be made in detail to exemplary embodiments of the invention, examples of which are illustrated in the accompanying drawings. In addition, wherever possible, the same reference numbers will be used throughout the drawings and the description to refer to the same or like parts.
Fig. 1 is a block diagram of an electronic device according to an embodiment of the invention.
Referring to FIG. 1, the electronic device 100 includes a processor 20, an input/output circuit 22, and a memory circuit 24. The input/output circuit 22 and the memory circuit 24 are each coupled to the processor 20. The electronic device 100 is, for example, an electronic device such as a desktop computer, a server, a mobile phone, a tablet computer, or a notebook computer, and is not limited thereto.
The processor 20 may be a central processing unit (CPU) or another general-purpose or special-purpose microprocessor, a digital signal processor (DSP), a programmable controller, an application-specific integrated circuit (ASIC), another similar element, or a combination thereof.
The input/output circuit 22 is, for example, an input interface or circuit for retrieving relevant data from outside the electronic device 100 or from other sources. The input/output circuit 22 may also serve as an output interface or circuit that transmits data generated by the electronic device 100 to another electronic device, which is not limited herein.
The memory circuit 24 may be any type of fixed or removable random access memory (RAM), read-only memory (ROM), flash memory, or the like, or a combination thereof.
In the present exemplary embodiment, the memory circuit 24 of the electronic device 100 stores a plurality of program code segments, and the program code segments are executed by the processor 20 after being installed. For example, the memory circuit 24 includes a plurality of modules, each of which is composed of one or more program code segments, for performing the respective operations of the disease probability prediction method applied to the electronic device 100. However, the present invention is not limited thereto, and the operations of the electronic device 100 may also be implemented in other hardware forms.
It is noted that, before dementia is diagnosed, the order (also referred to as the path) of the other diseases that a dementia patient has previously suffered from may be related to the later dementia. The disease probability prediction method can find this order of diseases and provide it to doctors as information to assist in preventing and treating dementia and as a tool for evaluating dementia risk. For example, a physician can learn which diseases, suffered in which order, make dementia more or less likely, and how likely dementia is after those diseases have been suffered in sequence.
In particular, the present invention is described using dementia as an example, but the present invention is not limited thereto. In other embodiments, the disease probability prediction method may be applied to predict diseases other than dementia, such as, but not limited to, Parkinson's disease. The following description uses dementia as the example.
FIG. 2 is a flow chart of a method for screening paths for predicting dementia using medical history data according to an embodiment of the present invention. In the present exemplary embodiment, a "path" is defined as a combination of several different diseases arranged in a sequence that follows the order in which each disease first occurred (or was first diagnosed). The "path length" is defined as the number of diseases in such a path. In short, a path is the sequence of diseases that a patient has suffered from (or been diagnosed with), and the path length is the number of diseases in that sequence.
Referring to fig. 2, first, the processor 20 needs to perform a step of medical history data processing (step S201) to convert medical history data into the data of the path.
In detail, the processor 20 determines a path length as required. Then, according to the path length, the processor 20 may obtain a plurality of paths (hereinafter referred to as first paths) conforming to the path length from a plurality of medical history data of patients who developed dementia. In more detail, for each person in the data, the medical history before the onset of dementia is taken, and the diseases in that history are sorted by their earliest occurrence time (or earliest diagnosis time). From each person's ordered disease sequence, all disease combinations that match the path length are taken, with the relative order of the diseases maintained. All paths that appear in the data are then collected.
The steps of the foregoing medical history data processing are exemplified below.
[Example of step S201]
Assume that the diagnosis history of a patient X before the onset of dementia is: A->B->C->A->D->C, where A, B, C and D are different diseases; the diagnosis history may be the visit record of patient X. After the diagnosis history is obtained through the input/output circuit 22, the processor 20 may first sort the diseases by the time each was first diagnosed, resulting in the disease ordering sequence A->B->C->D. Let the previously determined path length be 3. The processor 20 may take combinations of length 3 (i.e., consisting of 3 diseases) from the disease ordering sequence while maintaining the relative order among the diseases, resulting in the following paths: A->B->C, A->B->D, A->C->D and B->C->D.
Assume that all paths taken from the data of all patients (i.e., the aforementioned first paths) are as follows: A->B->C, A->B->D, A->C->D, B->C->D and B->A->C. With these paths as features, the disease ordering sequence of patient X is transformed into the path data shown in Table 1:
Patient   | A->B->C | A->B->D | A->C->D | B->C->D | B->A->C
Patient X | 1       | 1       | 1       | 1       | 0
TABLE 1
Similarly, the paths of length 3 can be obtained from the data of each patient in the manner described above, and the path data of each patient can be recorded in Table 1.
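The conversion in step S201 can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the function names are hypothetical, and the diagnosis history is assumed to be already sorted chronologically.

```python
from itertools import combinations

def disease_order(history):
    """Order diseases by first diagnosis, dropping repeat visits."""
    # dict keys keep first-insertion order, so later repeats are ignored
    return list(dict.fromkeys(history))

def paths_of_length(history, path_length):
    """All selections of `path_length` diseases that keep their relative order."""
    return list(combinations(disease_order(history), path_length))

def to_path_row(history, all_paths, path_length):
    """0/1 indicator per path, i.e. one row of path data."""
    own = set(paths_of_length(history, path_length))
    return [1 if p in own else 0 for p in all_paths]

# Diagnosis history from the example above (assumed chronological).
example_history = ["A", "B", "C", "A", "D", "C"]
first_paths = [("A", "B", "C"), ("A", "B", "D"), ("A", "C", "D"),
               ("B", "C", "D"), ("B", "A", "C")]

print(disease_order(example_history))
print(paths_of_length(example_history, 3))
print(to_path_row(example_history, first_paths, 3))
```

`itertools.combinations` emits elements in input order, which is what keeps the relative order of the diseases intact.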
Referring to FIG. 2 again, the processor 20 then performs a step of finding positively correlated paths (step S203). For example, the processor 20 may obtain, from the plurality of first paths, a plurality of paths (also referred to as second paths) that are positively correlated with dementia. More specifically, the processor 20 may use machine learning or statistical methods to evaluate each path in Table 1 individually and determine whether the path is positively or negatively correlated with dementia. For example, the values in each column of Table 1 (i.e., for each path) are summed over all patients; paths whose sum is greater than a threshold are identified as positively correlated paths, and paths whose sum is not greater than the threshold are identified as negatively correlated paths. The processor 20 may keep the positively correlated paths and delete the negatively correlated paths. In particular, a patient with a positively correlated path may have a higher chance of developing dementia than a patient without one.
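The summation example above can be sketched as follows. The function name, the two extra patient rows, and the threshold value are hypothetical; a real system might use a statistical test or a fitted model instead of a plain sum.

```python
def positively_correlated_paths(path_names, rows, threshold):
    """Keep paths whose indicator sum over all patients exceeds `threshold`."""
    sums = [sum(col) for col in zip(*rows)]  # one sum per path
    return [name for name, total in zip(path_names, sums) if total > threshold]

path_names = ["A->B->C", "A->B->D", "A->C->D", "B->C->D", "B->A->C"]
rows = [
    [1, 1, 1, 1, 0],  # the patient from Table 1
    [1, 0, 0, 0, 1],  # hypothetical patient
    [1, 1, 0, 0, 0],  # hypothetical patient
]

print(positively_correlated_paths(path_names, rows, threshold=1))
```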
After performing step S203, the processor 20 performs a path screening step (step S205), screening the aforementioned positively correlated paths (i.e., the second paths) to obtain a plurality of paths (also referred to as third paths) from them. The processor 20 may build a prediction model according to these third paths. In particular, this step mainly screens the second paths according to how well they predict dementia, in order to find third paths with better predictive ability among the second paths.
Fig. 3 is a flow chart of a method of path screening according to an embodiment of the present invention.
Referring to FIG. 3, the aforementioned step S205 can be subdivided into steps S301 to S305. In more detail, in the process of executing step S205, the processor 20 generates variables through feature engineering (step S301). For example, the processor 20 may generate a plurality of variables corresponding to a plurality of modes (patterns) according to the plurality of second paths.
In more detail, step S301 extracts some or all of the diseases from each of the second paths and generates new variables accordingly. In particular, there are many different modes (patterns) for generating new variables, depending on the positions of the diseases, the order of the diseases, and the number of diseases.
V(Count, Position, Order) is defined herein as the mode (pattern) of a new variable. "Count" represents the number of diseases taken from a path, at least 1 and at most the path length. "Position" represents whether the way of taking the diseases retains their positions in the original path, and its value may be "Preserve Position" (hereinafter PP) or "Ignore Position" (hereinafter IP). "Order" represents whether the way of taking the diseases preserves their order, and its value may be "Preserve Order" (hereinafter PO) or "Ignore Order" (hereinafter IO). In particular, when the number of diseases is 1, the order is meaningless, and the "Order" field may be set to "X".
Taking path length 3 as an example, the new variables can be represented by the following modes (1) to (7).
Mode (1): V(1, PP, X)
Mode (2): V(1, IP, X)
Mode (3): V(2, PP, PO)
Mode (4): V(2, PP, IO)
Mode (5): V(2, IP, PO)
Mode (6): V(2, IP, IO)
Mode (7): V(3, X, IO)
It should be noted that when the count is 3, the position is meaningless for a path of length 3, so the value of the "Position" field is X. Moreover, when the "Order" field is "PO", the variable is the path itself, so the mode V(3, X, PO) is not counted as a new variable.
In this embodiment, the processor 20 performs the feature engineering on all the second paths to obtain all the new variables, and then converts the path data of each person into new variable data. In particular, in one embodiment, the processor 20 may build in advance a lookup table that records the correspondence between paths and variables. The processor 20 may generate the plurality of variables corresponding to the plurality of modes according to the second paths and the lookup table.
The path data is converted into new variable data as follows: for a new variable X, if at least one path in a patient's path data can generate X by the above procedure, the value of the X field in that patient's new variable data is 1; otherwise it is 0.
The following takes one path as an example to describe how new variables are generated by feature engineering and how path data is converted into new variable data.
[Example of step S301]
Assume that a path of length 3 is: A->B->C. Because the path length is 3, in step S301 the new variables are obtained by taking some or all of the 3 diseases (i.e., at least 1 disease and at most 3 diseases) according to the aforementioned modes (1) to (7).
For mode V(1, PP, X), the new variables obtained are: A_1, B_2 and C_3, where the number following the underscore represents the position of the disease in the path.
For mode V(1, IP, X), the new variables obtained are: A, B and C.
For mode V(2, PP, PO), the new variables obtained are: A->B_1 and B->C_2, where "->" represents the order. The variable "A->B_1", for example, indicates that the diseases were suffered in the order A then B, and that A is located at the first position in a path. Likewise, the variable "B->C_2" indicates that the diseases were suffered in the order B then C, and that B is located at the second position in a path.
For mode V(2, PP, IO), the new variables obtained are: A&B_1&2, A&C_1&3 and B&C_2&3. The variable "A&B_1&2", for example, means that the diseases suffered include A and B (in either order) and that the two diseases are located at the first and second positions in a path. In other words, starting from the first position of a path, the diseases may be A then B or B then A. Likewise, the variable "A&C_1&3" means that the diseases suffered include A and C (in either order) and that the two diseases are located at the first and third positions in a path. In other words, the two positions (i.e., the first and third) may hold A then C or C then A. The other variables follow the same pattern and are not described again here.
For mode V(2, IP, PO), the new variables obtained are: A->B and B->C. The variable "A->B", for example, means that the diseases were suffered in the order A then B, with no restriction on the positions of A and B in a path. Likewise, the variable "B->C" means that the diseases were suffered in the order B then C, with no restriction on the positions of B and C in a path.
For mode V(2, IP, IO), the new variables obtained are: A&B, A&C and B&C. The variable "A&B", for example, means that the diseases suffered include A and B (in either order), with no restriction on the positions of A and B in a path. Likewise, the variable "A&C" means that the diseases suffered include A and C (in either order), with no restriction on the positions of A and C in a path. The other variables follow the same pattern and are not described again here.
For mode V(3, X, IO), the new variable obtained is: A&B&C. The variable "A&B&C" means that the diseases suffered include A, B and C (in any order), with no restriction on the positions of A, B and C in a path.
By the modes (1) to (7) described above, 17 new variables can be obtained in total (3 + 3 + 2 + 3 + 2 + 3 + 1).
In one example, if the path A->B->C is present (field value = 1) in the path data of a patient Y, the values of all 17 new variables for patient Y are set to 1. In another example, if the A->B->C field = 0 and the A->B->G field = 1 in the path data of a patient, the values of the new variables involving only A and B are still set to 1 for that patient, because the path A->B->G also generates new variables such as A, B, A->B and A&B.
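A minimal sketch of this feature engineering for paths of length 3 is given below. It is not the patented implementation. The function names are hypothetical, and one point is an assumption read off the example rather than stated in the text: order-preserving pairs (PO) are taken only from adjacent diseases, which matches the two pairs listed for A->B->C, whereas unordered pairs (IO) may be any two diseases.

```python
from itertools import combinations

def generate_variables(path):
    """Variables that one path yields, grouped by mode V(Count, Position, Order)."""
    n = len(path)
    indexed = list(enumerate(path, start=1))    # (position, disease)
    adjacent = list(zip(indexed, indexed[1:]))  # assumption: PO pairs are adjacent
    pairs = list(combinations(indexed, 2))      # IO pairs: any two diseases

    def unordered(items):
        return "&".join(sorted(items))          # canonical name, order ignored

    return {
        "V(1,PP,X)": {f"{d}_{i}" for i, d in indexed},
        "V(1,IP,X)": {d for _, d in indexed},
        "V(2,PP,PO)": {f"{a}->{b}_{i}" for (i, a), (_, b) in adjacent},
        "V(2,PP,IO)": {f"{unordered([a, b])}_{i}&{j}" for (i, a), (j, b) in pairs},
        "V(2,IP,PO)": {f"{a}->{b}" for (_, a), (_, b) in adjacent},
        "V(2,IP,IO)": {unordered([a, b]) for (_, a), (_, b) in pairs},
        f"V({n},X,IO)": {unordered(path)},
    }

def all_variables(path):
    """Union of the variables from every mode."""
    return set().union(*generate_variables(path).values())

def to_variable_row(patient_paths, variables):
    """1 if at least one of the patient's paths generates the variable, else 0."""
    own = set().union(*(all_variables(p) for p in patient_paths))
    return {v: int(v in own) for v in variables}

variables = sorted(all_variables(("A", "B", "C")))
print(len(variables))
print(to_variable_row([("A", "B", "G")], variables))
```

Sorting the disease names inside `unordered` makes B->A and A->B map to the same "A&B" variable, which is what ignoring the order requires.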
Referring to FIG. 3 again, after step S301 is performed, the processor 20 performs a step of screening the variables using multiple models (step S303). In more detail, the processor 20 uses a plurality of models to screen the plurality of variables generated in step S301 and obtain a plurality of optimal variables from them.
FIG. 4 is a flow chart of a method for screening variables using multiple models in accordance with an embodiment of the invention.
Referring to fig. 4, the aforementioned step S303 can be further subdivided into steps S401 to S405.
Referring to FIG. 4, in the step of screening the variables with multiple models, the processor 20 first determines a plurality of machine learning algorithms and a plurality of variable input modes, and combines the determined machine learning algorithms and variable input modes to generate a plurality of models. Then, the processor 20 may input the plurality of variables generated in the aforementioned step S301 into the models so generated to obtain the screened variables (step S401).
The variable input mode refers to how the variables are input to the machine learning model. A single machine learning model can be built with all variables put in at once, and the output of this model used (also called One-model); or a plurality of models can be built according to the mode (pattern) of each variable, and the outputs of these models finally combined (also called By-pattern).
The random forest algorithm and the logistic regression algorithm are described below as examples. The combinations of machine learning method and variable input mode generated by the processor 20 are shown in Table 2 below:
Model number | Machine learning method | Variable input mode | Number of models
M01          | Random forest           | One-model           | 1
M02          | Random forest           | By-pattern          | 7
M03          | Logistic regression     | One-model           | 1
M04          | Logistic regression     | By-pattern          | 7
TABLE 2
Specifically, in generating the models in step S401, the processor 20 may use a machine learning algorithm to build one model for each of the modes (1) to (7) (these are also referred to as first models). Referring to Table 2, when generating the first models, the processor 20 may use the random forest algorithm to build a separate model for each of the modes (1) to (7), generating seven models (i.e., model M02 in Table 2). As another example, the processor 20 may use the logistic regression algorithm to build a separate model for each of the modes (1) to (7), generating seven models (i.e., model M04 in Table 2).
In addition, in generating the models in step S401, the processor 20 may use a machine learning algorithm to build a single model covering all of the modes (1) to (7) (also referred to as a second model). Referring to Table 2, when generating the second model, the processor 20 may use the random forest algorithm to build one model for the modes (1) to (7) together (i.e., model M01 in Table 2). The processor 20 may likewise use the logistic regression algorithm to build one model for the modes (1) to (7) together (i.e., model M03 in Table 2).
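The way the four model groups M01 to M04 arise from combining algorithms with variable input modes can be sketched as below. This only enumerates the configurations and does not train anything; the function name and the dictionary layout are hypothetical.

```python
from itertools import product

MODES = ["V(1,PP,X)", "V(1,IP,X)", "V(2,PP,PO)", "V(2,PP,IO)",
         "V(2,IP,PO)", "V(2,IP,IO)", "V(3,X,IO)"]

def enumerate_model_groups(algorithms, input_modes, modes):
    """One group per (algorithm, input mode) combination."""
    groups = []
    combos = product(algorithms, input_modes)
    for number, (algorithm, input_mode) in enumerate(combos, start=1):
        if input_mode == "One-model":
            inputs = [list(modes)]         # a single model sees every mode
        else:
            inputs = [[m] for m in modes]  # By-pattern: one model per mode
        groups.append({"id": f"M{number:02d}", "algorithm": algorithm,
                       "input_mode": input_mode, "model_inputs": inputs})
    return groups

groups = enumerate_model_groups(["random forest", "logistic regression"],
                                ["One-model", "By-pattern"], MODES)
for g in groups:
    print(g["id"], g["algorithm"], g["input_mode"], len(g["model_inputs"]))
```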
After obtaining the first models and the second model, the processor 20 may input the plurality of variables obtained in step S301 into each of the first models (e.g., the seven models of M02 or the seven models of M04) to obtain the screened variables output by each model (also referred to as first screened variables). In other words, taking the seven models of M02 as an example, the screened variables comprise seven groups. The processor 20 then takes the union of the seven groups of screened variables to obtain one set of screened variables (also referred to as second screened variables). Similarly, taking the seven models of M04 as an example, the screened variables comprise seven groups, and the processor 20 takes the union of the seven groups to obtain one set of screened variables.
In addition, the processor 20 inputs the plurality of variables obtained in step S301 into the second model to obtain screened variables (also referred to as third screened variables). For example, since M01 includes only one model, it yields only one set of screened variables. Similarly, since M03 includes only one model, it also yields only one set of screened variables.
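The difference between the two input modes at screening time can be sketched as follows. The selector here is a stand-in that ranks variables by hypothetical importance scores; the text does not specify how each model selects variables, so `top_one` and `scores` are assumptions for illustration only.

```python
def screen_by_pattern(variables_by_mode, select):
    """Run `select` once per mode, then take the union of the survivors."""
    per_mode = [set(select(vs)) for vs in variables_by_mode.values()]
    return set().union(*per_mode)

def screen_one_model(variables_by_mode, select):
    """Pool the variables of every mode and run `select` once."""
    pooled = [v for vs in variables_by_mode.values() for v in vs]
    return set(select(pooled))

# Hypothetical importance scores; a real selector would come from a fitted model.
scores = {"A": 0.9, "B": 0.2, "A->B": 0.7, "B->C": 0.1, "A&B&C": 0.6}

def top_one(candidates):
    """Stand-in selector: keep the single highest-scoring variable."""
    return sorted(candidates, key=lambda v: scores[v], reverse=True)[:1]

variables_by_mode = {
    "V(1,IP,X)": ["A", "B"],
    "V(2,IP,PO)": ["A->B", "B->C"],
    "V(3,X,IO)": ["A&B&C"],
}

print(screen_by_pattern(variables_by_mode, top_one))
print(screen_one_model(variables_by_mode, top_one))
```

With the same selector, the by-pattern route keeps at least one variable from every mode, while the single model lets variables from different modes compete directly.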
Then, the processor 20 evaluates the prediction performance of the screened variables (e.g., the second screened variables and the third screened variables) using a plurality of third models (step S403), and selects the set with the best prediction performance as the "optimal variables" (step S405).
Taking the models M01 to M04 as an example, four sets of screened variables are obtained through step S401. When evaluating the prediction performance of each set of screened variables, the processor 20 builds a plurality of models for that set using a plurality of machine learning methods, and each model yields a prediction performance (e.g., prediction accuracy). A statistic of these prediction performances (e.g., the average or the maximum) is calculated to represent the performance of that set of screened variables. Then, when selecting the optimal variables, the processor 20 compares the representative performance of each set of screened variables and selects the best set (e.g., the one with the highest prediction performance) as the optimal variables.
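The selection in steps S403 to S405 reduces to summarizing each set's scores and taking the maximum, as in the sketch below. The accuracy figures are invented purely for illustration, and the function name is hypothetical.

```python
from statistics import mean

def best_variable_set(performance_by_set, summarize=mean):
    """Summarize each set's per-model scores and return the top set's label."""
    summary = {label: summarize(scores)
               for label, scores in performance_by_set.items()}
    best = max(summary, key=summary.get)
    return best, summary

# Hypothetical accuracies of three evaluation models on each screened set.
performance_by_set = {
    "M01": [0.70, 0.72, 0.71],
    "M02": [0.74, 0.76, 0.75],
    "M03": [0.69, 0.73, 0.68],
    "M04": [0.72, 0.71, 0.73],
}

best, summary = best_variable_set(performance_by_set)
print(best, summary)
```

Passing `summarize=max` instead of the default switches the statistic from the average to the maximum.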
Referring to FIG. 3 again, after step S303 is performed, the processor 20 restores the plurality of optimal variables to the paths corresponding to them (also referred to as third paths) (step S305). For example, the processor 20 may restore the optimal variables to their corresponding third paths according to the lookup table used in step S301.
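One possible reading of this restoration step is an inverted lookup from variables back to the paths that generate them. The sketch below assumes that a variable shared by several paths restores all of them, which the text does not state; the names and the sample mapping are hypothetical.

```python
from collections import defaultdict

def build_lookup(path_to_variables):
    """Invert path -> variables into variable -> set of paths."""
    lookup = defaultdict(set)
    for path, variables in path_to_variables.items():
        for v in variables:
            lookup[v].add(path)
    return lookup

def restore_paths(optimal_variables, lookup):
    """Every path that generates at least one optimal variable (assumption)."""
    return set().union(*(lookup.get(v, set()) for v in optimal_variables))

# Hypothetical excerpt of the path-to-variable correspondence.
path_to_variables = {
    "A->B->C": {"A", "B", "A->B", "A&B&C"},
    "A->B->D": {"A", "B", "A->B", "A&B&D"},
    "B->C->D": {"B", "C", "B->C"},
}
lookup = build_lookup(path_to_variables)

print(restore_paths({"A&B&C"}, lookup))
print(restore_paths({"A->B", "B->C"}, lookup))
```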
After obtaining the third paths, the processor 20 may build a prediction model based on these third paths. Then, when performing a dementia risk assessment, the processor 20 may input a path to be predicted into the prediction model and output the probability that the path to be predicted leads to the specific disease. The path to be predicted is composed of a plurality of diseases, and the number of diseases is the same as the path length determined in the aforementioned step S201.
An example of dementia risk assessment may be as follows:
After obtaining the path to be predicted, the processor 20 calculates risk information such as the proportion of people having the path to be predicted who develop dementia and the proportion of dementia patients who have the path to be predicted, and builds a lookup table of paths and dementia risk information from this information. Then, given the medical history data of any person, the risk of dementia can be estimated with the lookup table after the medical history data is converted into path data. Table 3 below is a schematic example of the path and risk lookup table:
Path                                       | Proportion of people with the path who develop dementia | Proportion of all dementia patients who have the path
Dizziness -> anxiety disorder -> dementia  | 52%                                                      | 4.3%
Coronary heart disease -> stroke -> dementia | 58%                                                    | 2.8%
TABLE 3
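The two proportions in a path and risk lookup of this kind can be computed as in the sketch below. It assumes a cohort that contains both people who developed the specific disease and people who did not; the cohort data and the function name are hypothetical.

```python
def risk_table(patients):
    """patients: list of (set_of_paths, has_disease) pairs."""
    total_cases = sum(1 for _, sick in patients if sick)
    every_path = set().union(*(paths for paths, _ in patients))
    table = {}
    for path in sorted(every_path):
        outcomes = [sick for paths, sick in patients if path in paths]
        cases_with_path = sum(outcomes)
        table[path] = {
            # share of people with this path who developed the disease
            "disease_rate_given_path": cases_with_path / len(outcomes),
            # share of all disease cases that have this path
            "share_of_all_cases": (cases_with_path / total_cases
                                   if total_cases else 0.0),
        }
    return table

# Hypothetical cohort: (paths present in the history, developed the disease?)
patients = [
    ({"A->B->C"}, True),
    ({"A->B->C"}, False),
    ({"A->B->C", "B->C->D"}, True),
    ({"B->C->D"}, False),
    (set(), True),
    (set(), True),
]

table = risk_table(patients)
for path, info in table.items():
    print(path, info)
```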
Another example of dementia risk assessment may be as follows:
The processor 20 may build a prediction model for each set of screened variables after step S401 and take the prediction model of the optimal variables as the optimal model, which is used to generate risk indicators such as a dementia probability or a predicted label. Thereafter, given the medical history of any person, the processor 20 may convert it into path data, further convert the paths into the optimal variables, and calculate a dementia risk indicator (e.g., probability or label) using the optimal model.
Fig. 5 is a flowchart illustrating a method for predicting probability of illness according to an embodiment of the invention.
Referring to Fig. 5, in step S501, the processor 20 determines a path length, wherein the path length is a number of diseases. In step S503, the processor 20 obtains, according to the path length, a plurality of first paths conforming to the path length from a plurality of medical history data of the specific disease, wherein the first paths are formed by other diseases sequentially suffered before the specific disease. In step S505, the processor 20 obtains, from the plurality of first paths, a plurality of second paths positively correlated with the specific disease. In step S507, the processor 20 screens the second paths to obtain a plurality of third paths, and builds a prediction model according to the third paths. Finally, in step S509, the processor 20 inputs the path to be predicted into the prediction model and outputs the probability that the path to be predicted leads to the specific disease, wherein the path to be predicted is composed of a plurality of diseases.
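Steps S501 to S505 can be illustrated with the following minimal sketch. It is not the method of the embodiment: the function names are hypothetical, a first path is assumed to be the diseases immediately preceding the specific disease, and "positively correlated" is assumed here to mean that people having the path develop the specific disease at a higher rate than the overall population.

```python
def first_paths(histories, target, path_length):
    """S503 (assumed form): the `path_length` diseases right before `target`."""
    paths = []
    for h in histories:
        if target in h:
            prior = h[:h.index(target)]
            if len(prior) >= path_length:
                paths.append(tuple(prior[-path_length:]))
    return paths


def has_path(prior, path):
    """True if `path` occurs as a contiguous run inside `prior`."""
    n = len(path)
    return any(tuple(prior[i:i + n]) == path for i in range(len(prior) - n + 1))


def second_paths(paths, histories, target):
    """S505 (assumed criterion): keep paths whose carriers exceed the base rate."""
    base_rate = sum(target in h for h in histories) / len(histories)
    kept = []
    for path in sorted(set(paths)):
        carriers = hits = 0
        for h in histories:
            # Only diseases before the target count toward the path.
            prior = h[:h.index(target)] if target in h else h
            if has_path(prior, path):
                carriers += 1
                hits += target in h
        if carriers and hits / carriers > base_rate:
            kept.append(path)
    return kept
```

The paths returned by `second_paths` would then be screened further (step S507) before a prediction model is built.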
In summary, the disease probability prediction method and the electronic device of the present invention can find, from medical history data, the order of diseases (also referred to as the path) suffered before a specific disease (e.g., dementia). People with these paths have a higher probability of developing the specific disease in the future than people without them. In addition, the disease probability prediction method of the present invention may further use these paths to calculate the proportion or probability of suffering from the specific disease.
Although the invention has been described with reference to the above embodiments, it should be understood that the invention is not limited thereto, but rather may be modified or altered somewhat by persons skilled in the art without departing from the spirit and scope of the invention.

Claims (8)

1. A method of predicting probability of illness for an electronic device, the method comprising:
determining a path length, wherein a path is a sequence of diseases that a patient sequentially suffers from or is diagnosed with, and the path length is the number of diseases in the path;
obtaining, according to the path length, a plurality of first paths conforming to the path length from a plurality of medical history data of a specific disease, wherein the first paths are formed by other diseases sequentially suffered before the specific disease;
obtaining a plurality of second paths positively correlated with the specific disease from the plurality of first paths according to the plurality of first paths;
screening the plurality of second paths to obtain a plurality of third paths, and building a prediction model according to the plurality of third paths; and
inputting a path to be predicted into the prediction model and outputting the probability that the path to be predicted leads to the specific disease, wherein the path to be predicted is composed of a plurality of diseases,
wherein the step of screening the plurality of second paths to obtain the plurality of third paths comprises:
generating a plurality of variables corresponding to a plurality of patterns according to the plurality of second paths;
screening the plurality of variables using a plurality of models to obtain a plurality of optimal variables from the plurality of variables; and
restoring the plurality of optimal variables into the plurality of third paths corresponding to the plurality of optimal variables,
wherein the plurality of patterns are associated with permutations and combinations of the positions of diseases, the order of diseases, and the number of diseases in each of the plurality of second paths.
2. The method of claim 1, wherein the step of screening the plurality of variables using the plurality of models to obtain the plurality of optimal variables from the plurality of variables comprises:
determining a machine learning algorithm;
determining a plurality of variable input modes; and
generating the plurality of models based on the determined machine learning algorithm and the plurality of variable input modes,
wherein the plurality of variable input modes include: inputting all variables at once to build a single model and using the output of that model; and building a plurality of models, one for each pattern of the original variables, and then performing a union operation on the outputs of the plurality of models.
3. The method of claim 2, wherein the step of generating the plurality of models based on the determined machine learning algorithm and the plurality of variable input modes comprises:
building a model for each of the plurality of patterns using the machine learning algorithm to generate a plurality of first models; and
building one model for all of the plurality of patterns using the machine learning algorithm as a second model.
4. The method of claim 3, wherein the step of screening the plurality of variables using the plurality of models to obtain the plurality of optimal variables from the plurality of variables comprises:
inputting the plurality of variables into the plurality of first models to obtain a first screened variable output by each of the plurality of first models, and performing a union operation on the first screened variables output by the plurality of first models to obtain a second screened variable;
inputting the plurality of variables into the second model to obtain a third screened variable; and
evaluating the predictive performance of the second screened variable and the third screened variable using a plurality of third models, respectively, and selecting whichever of the second screened variable and the third screened variable has the better prediction accuracy as the plurality of optimal variables,
wherein the plurality of third models are obtained by building models for the second screened variable and the third screened variable using a plurality of machine learning algorithms, such that each third model outputs its corresponding prediction accuracy.
5. The method of claim 2, wherein the plurality of machine learning algorithms include a random forest algorithm and a logistic regression algorithm.
6. The method of claim 1, wherein the step of generating the plurality of variables corresponding to the plurality of patterns according to the plurality of second paths comprises:
generating the plurality of variables corresponding to the plurality of patterns according to the plurality of second paths and a comparison table.
7. The method of claim 6, wherein the step of restoring the plurality of optimal variables into the plurality of third paths corresponding to the plurality of optimal variables comprises:
restoring the plurality of optimal variables into the plurality of third paths corresponding to the plurality of optimal variables according to the comparison table.
8. An electronic device, comprising:
a memory circuit that records a plurality of modules; and
a processor accessing and executing the plurality of modules to perform the operations of:
determining a path length, wherein a path is a sequence of diseases that a patient sequentially suffers from or is diagnosed with, and the path length is the number of diseases in the path,
obtaining, according to the path length, a plurality of first paths conforming to the path length from a plurality of medical history data of a specific disease, wherein the first paths are formed by other diseases sequentially suffered before the specific disease,
obtaining a plurality of second paths positively correlated with the specific disease from the plurality of first paths based on the plurality of first paths,
screening the plurality of second paths to obtain a plurality of third paths, and building a prediction model according to the plurality of third paths, and
inputting a path to be predicted into the prediction model and outputting the probability that the path to be predicted leads to the specific disease, wherein the path to be predicted is composed of a plurality of diseases,
wherein the operation of screening the plurality of second paths to obtain the plurality of third paths comprises:
generating a plurality of variables corresponding to a plurality of patterns according to the plurality of second paths;
screening the plurality of variables using a plurality of models to obtain a plurality of optimal variables from the plurality of variables; and
restoring the plurality of optimal variables into the plurality of third paths corresponding to the plurality of optimal variables,
wherein the plurality of patterns are associated with permutations and combinations of the positions of diseases, the order of diseases, and the number of diseases in each of the plurality of second paths.
CN201910720427.1A 2019-08-06 2019-08-06 Method for predicting probability of illness and electronic device Active CN112349412B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910720427.1A CN112349412B (en) 2019-08-06 2019-08-06 Method for predicting probability of illness and electronic device

Publications (2)

Publication Number Publication Date
CN112349412A CN112349412A (en) 2021-02-09
CN112349412B CN112349412B (en) 2024-03-22

Family

ID=74366416

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910720427.1A Active CN112349412B (en) 2019-08-06 2019-08-06 Method for predicting probability of illness and electronic device

Country Status (1)

Country Link
CN (1) CN112349412B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant