CN116364274A - Disease prediction method and system based on causal inference and dynamic integration of multiple labels

Info

Publication number: CN116364274A
Application number: CN202310268757.8A
Authority: CN
Inventors: 张岩波; 杨弘; 田晶; 闫晶晶; 李靓; 何航帜; 杨晓敏
Original assignee: Shanxi University of Chinese Medicine; Shanxi Medical University
Current assignee: Shanxi University of Chinese Medicine; Shanxi Medical University
Priority date: 2023-03-16
Filing date: 2023-03-16
Publication date: 2023-06-30

Abstract

The invention provides a disease prediction method and system based on causal inference and dynamic integration of multiple labels. The method comprises the following steps: acquiring multi-source information of a patient, comprising demographic index information, lifestyle information, physical examination information, complaint symptoms, past medical history information, and past medication information; establishing a causal model to analyze the causal relationships among the features and screening out a feature set with causal effects; training a plurality of multi-label base learning classifiers using the feature set with causal effects, and updating weights through stacking integration to obtain prediction models with optimal performance; and dynamically constructing new multi-label integrated prediction models by combining different numbers and types of the optimal-performance prediction models, and selecting the combined model with the highest prediction performance to predict the disease. The prediction method reflects the causal relationships among the features, avoids misjudgment caused by mere correlation, improves prediction accuracy, helps improve doctors' diagnosis and treatment, and promotes the digital and intelligent development of the medical industry.

Description

Disease prediction method and system based on causal inference and dynamic integration of multiple labels

Technical Field

The invention relates to the technical field of disease prediction, in particular to a disease prediction method and system based on causal inference and dynamic integration of multiple labels.

Background

Currently, predictions for various chronic diseases rely mainly on traditional medical and biometric methods. These methods are typically based on sample data sets, using some machine learning or artificial intelligence algorithms, such as support vector machines, decision trees, neural networks, etc., to predict the risk of onset of the disease. However, these methods have some limitations, such as failure to take into account interactions and time evolution of various factors, and thus, accuracy and reliability of prediction are limited.

Traditional machine learning algorithms typically ignore potential causal relationships in making disease predictions, which may lead to model bias in the predictions. In addition, the progression of the disease is often dynamic, and time factors also play an important role in the prognosis of the disease. Currently, disease prediction plays a vital role in clinical practice. Traditional disease prediction methods typically focus only on specific disease indicators or symptoms, ignoring complex relationships between different factors. Therefore, misjudgment and missed diagnosis often occur during diagnosis and treatment.

Disclosure of Invention

Therefore, the technical problem to be solved by the invention is to overcome the defects existing in the prior art, thereby providing a disease prediction method and a disease prediction system based on causal inference and dynamic integration of multiple labels, and improving the accuracy of disease prediction by analyzing causal relations among different data by using a causal inference algorithm; meanwhile, the robustness and generalization performance of the algorithm are improved by training and predicting by adopting a dynamic integrated multi-label algorithm.

The technical scheme for solving the technical problems is as follows:

in a first aspect, the present invention provides a causal inference and dynamic integration multi-label based disease prediction method, comprising the steps of:

acquiring multi-source information of a patient, comprising: demographic index information, lifestyle information, physical examination information, complaint symptoms, past medical history information, and past medication information;

establishing a causal model to analyze causal relations among all the features, and screening feature sets with causal effects;

training a plurality of multi-label base learning classifiers by using the feature set with causal effects, and updating weights through stacking integration to obtain prediction models with optimal performance;

and dynamically constructing new multi-label integrated prediction models by combining different numbers and different types of prediction models with optimal performance, and selecting a combination model with highest prediction performance to predict the disease.

Optionally, the demographic index information includes: sex, age, height, weight; the lifestyle information includes: history of smoking and history of drinking; physical examination information, comprising: biochemical examination, electrocardiogram information and imaging data; the past medical history information includes: patient's own medical history and family medical history.

Optionally, the step of establishing a causal model to analyze causal relationships among the features and screen feature sets with causal effects includes:

constructing a Bayesian network: let $P(U)$ be the joint probability distribution associated with the outcome $y$, $y \in L$, for patients $n = 1, \ldots, N$, where $N$ is the number of patients and $L = \{l_1, l_2, \ldots, l_q\}$ is the set of $q$ different binary outcome labels; $U$ is the node set of the directed acyclic graph $G$. If $\langle U, G, P(U) \rangle$ satisfies the Markov condition, i.e. each variable is independent of any subset of its non-descendants conditioned on its parents in $G$, the triplet $\langle U, G, P(U) \rangle$ is called a Bayesian network;

training a Markov chain: under the faithfulness assumption, in a Bayesian network $BN = \langle U, G, P(U) \rangle$ the Markov blanket of each feature $F_i \in F$, denoted $MB(F_i)$, is unique, where $MB(F_i) = \{pa(F_i) \cup ch(F_i) \cup sp(F_i)\}$; $F$ denotes the different features; $pa(F_i)$ denotes the parent node set of $F_i$, i.e. the set of variables that directly affect $F_i$; $ch(F_i)$ denotes the child node set of $F_i$, i.e. the set of variables directly affected by $F_i$; $sp(F_i)$ denotes the set of other nodes that have the same parent node as $F_i$, i.e. the set of variables that have an indirect influence relationship with $F_i$;

screening multi-label association features: maximize the outcome probability $P(T_i \mid S)$, wherein the causal feature selection for data set $D$ is defined as:

$$S^{*} = \arg\max |S|,$$

$$\text{s.t.}\quad P^{i}(T_i \mid S) = P'(T_i \mid S) \quad (T_i \in T,\ j \neq i)$$

wherein T represents a disease category that may be output;

repeating the process of training the Markov chain and screening the multi-label association features until the feature distribution probability corresponding to all outcome labels $y \in L$ is maximized, and selecting the feature set with causal effects.

Optionally, the process of training a plurality of multi-label base learning classifiers comprises:

initializing: for all patient individuals $i$, initialize the weight $W_1(i, l)$ of each patient and acquire the initial sample data set $D_1$, where $l$ denotes a label; set the iteration number $t = 1$;

training a base classifier: perform stacking integration on data set $D_1$ using the $m$-th base classifier, and train a single base classifier $h_{m1}(x, l)$ to predict the patient outcome;

weight updating: calculate the Hamming loss of the base classifier, i.e. the proportion of misclassified labels $e_t$; calculate the update weight $\alpha_t$; and use $\alpha_t$ to calculate the updated weights $W_{t+1}(i, l)$ for the next iteration;

repeating the integration iteration: set the iteration number $t = t + 1$ until the preset number of iterations is reached;

weighted voting of the boosted learning classifier: take the weighted vote of the single classifiers $h_{mt}$, $t = 1, \ldots, T$, to obtain the boosted learning classifier $h_m$.

Optionally, the step of dynamically constructing a new multi-label integrated prediction model by combining different numbers and different kinds of base classifiers includes:

initializing: the data set $D_s$ with causal effects is initially an empty set;

classifying the training samples: use the trained base learning classifier $h_m(x, l)$ to classify feature $x_1$, obtaining $c_{1m} = h_m(x_1, l)$;

updating data set $D_s$: $D_s = \{c_{11}, c_{12}, \ldots, c_{1m}, y\}$; repeat the classification of the training samples until all $N$ inpatients have been classified and predicted, obtaining $c_{nm} = h_m(x_n, l)$ and hence a new data set $D_s = \{((c_{i1}, c_{i2}, \ldots, c_{im}), y)\}$;

training the integration part of the model: train the model on the new data set $D_s$; models are dynamically selected from the base learner pool and finally stacked and integrated; a new learning algorithm $Z$ is used in this part, so that $H = Z(D_s)$;

output: $H(x) = H(h_1(x, l), h_2(x, l), \ldots, h_m(x, l))$.

Optionally, after acquiring the multi-source information of the patient, the method further includes:

and preprocessing and cleaning the multi-source information, removing noise and abnormal values, and performing feature selection and dimension reduction operation.

In a second aspect, embodiments of the present invention provide a causal inference and dynamic integration multi-labeled disease prediction system, the system comprising:

a data collection module for obtaining multi-source information of a patient, comprising: demographic index information, lifestyle information, physical examination information, complaint symptoms, past medical history information, and past medication information;

the causal inference module is used for analyzing causal relations among all the features to establish a causal model, and screening feature sets with causal effects based on the causal inference model;

the dynamic integrated multi-label algorithm module is used for training a plurality of multi-label base learning classifiers by utilizing the feature set with causal effects, and updating weights through stacking integration to obtain prediction models with optimal performance;

and the disease prediction model determining module is used for dynamically constructing a new multi-label integrated prediction model through different numbers and different types of prediction model combination modes with optimal performance, and selecting a combination model with highest prediction performance to predict the disease.

In a third aspect, an embodiment of the present invention provides a computer apparatus, including: the system comprises a memory and a processor, wherein the memory and the processor are in communication connection, the memory stores computer instructions, and the processor executes the computer instructions, thereby executing the method in the first aspect or any optional implementation manner of the first aspect.

In a fourth aspect, embodiments of the present invention provide a computer-readable storage medium storing computer instructions for causing a computer to perform the method of the first aspect, or any one of the alternative embodiments of the first aspect.

The disease prediction method and system based on causal inference and dynamic integration of multiple labels provided by the embodiment of the invention can reflect causal relationships among various characteristics by analyzing causal relationships and dynamic changes among different patient characteristics and combining a multiple label classification algorithm, avoid misjudgment caused by correlation, improve disease prediction accuracy, help improve diagnosis level and treatment effect of doctors, and facilitate the digitized and intelligent development of the medical industry. Meanwhile, the disease prediction method and the disease prediction system provided by the invention have wide application prospects, and have important significance in the fields of medical care, health management, medical scientific research and the like.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic flow chart of a disease prediction method based on causal inference and dynamic integration of multiple labels according to an embodiment of the present invention;

FIG. 2 is a schematic block diagram of a disease prediction method based on causal inference and dynamic integration of multiple labels according to an embodiment of the present invention;

FIG. 3 is a flowchart showing key steps of a disease prediction method based on causal inference and dynamic integration of multiple labels according to an embodiment of the present invention;

FIG. 4 is a schematic block diagram of a disease prediction system based on causal inference and dynamic integration of multiple labels according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a computer device according to an embodiment of the present invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

In addition, the technical features of the different embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.

The embodiment of the invention provides a disease prediction method based on causal inference and dynamic integration of multiple labels, wherein a flow chart of the method is shown in fig. 1, and a schematic block diagram is shown in fig. 2:

step S1: acquiring multi-source information of a patient, comprising: demographic index information, lifestyle information, physical examination information, complaint symptoms, past medical history information, and past medication information.

Specifically, in the embodiment of the present invention, the demographic index information includes: sex, age, height, weight, occupation; the lifestyle information includes: history of smoking and history of drinking; the physical examination information includes: biochemical examination (white blood cells, red blood cells, red blood cell distribution width, hemoglobin, platelets, etc.), electrocardiographic information, and imaging data; the past medical history information includes: the patient's own medical history and family medical history. The complaint symptoms are clinical manifestations reported by the patient on their own initiative, such as chest pain, nausea, palpitations, etc.
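As an illustration only, the multi-source information of step S1 could be held in a single record per patient and flattened into numeric features before modelling. The sketch below is a minimal example under stated assumptions: the class name `PatientRecord`, its field names, and the `to_features` helper are all hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PatientRecord:
    """Multi-source information for one patient (all field names are hypothetical)."""
    # Demographic index information
    sex: str
    age: int
    height_cm: float
    weight_kg: float
    # Lifestyle information
    smoking_history: bool
    drinking_history: bool
    # Physical examination: biochemical values keyed by test name
    biochemical: Dict[str, float] = field(default_factory=dict)
    # Complaint symptoms reported by the patient
    complaints: List[str] = field(default_factory=list)
    # Past medical history (family history only, to keep the sketch short)
    family_history: bool = False

    def to_features(self, symptom_vocab: List[str]) -> Dict[str, float]:
        """Flatten the record into a numeric feature dictionary."""
        features = {
            "sex_female": 1.0 if self.sex == "female" else 0.0,
            "age": float(self.age),
            "height_cm": self.height_cm,
            "weight_kg": self.weight_kg,
            "smoking_history": float(self.smoking_history),
            "drinking_history": float(self.drinking_history),
            "family_history": float(self.family_history),
        }
        features.update(self.biochemical)
        # One indicator per known symptom: 1.0 if the patient reported it.
        for symptom in symptom_vocab:
            features["symptom_" + symptom] = float(symptom in self.complaints)
        return features


# Illustrative values only.
record = PatientRecord(
    sex="female", age=45, height_cm=160.0, weight_kg=70.0,
    smoking_history=False, drinking_history=False,
    biochemical={"hemoglobin": 131.0},
    complaints=["palpitations"],
)
features = record.to_features(["chest pain", "palpitations"])
```

Electrocardiographic and imaging data are omitted here; how they would be encoded is left open by the description.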

Step S2: and establishing a causal model to analyze causal relation among all the features, and screening out a feature set with causal effect.

Specifically, in the embodiment of the present invention, the collected patient information is used to construct a multi-label causal feature selection framework according to the causal invariance principle, and the concept of the Markov blanket (MB) in a Bayesian network is used to obtain the final feature set with causal effects for the multi-label data source, as follows:

step S21: constructing a Bayesian network: let $P(U)$ be the joint probability distribution associated with the outcome $y$, $y \in L$, for patients $n = 1, \ldots, N$, where $N$ is the number of patients and $L = \{l_1, l_2, \ldots, l_q\}$ is the set of $q$ different binary outcome labels; $U$ is the node set of the directed acyclic graph $G$. If $\langle U, G, P(U) \rangle$ satisfies the Markov condition, i.e. each variable is independent of any subset of its non-descendants conditioned on its parents in $G$, the triplet $\langle U, G, P(U) \rangle$ is called a Bayesian network;

step S22: training a Markov chain: under the faithfulness assumption, in a Bayesian network $BN = \langle U, G, P(U) \rangle$ the Markov blanket of each feature $F_i \in F$, denoted $MB(F_i)$, is unique, where $MB(F_i) = \{pa(F_i) \cup ch(F_i) \cup sp(F_i)\}$; $F$ denotes the different features; $pa(F_i)$ denotes the parent node set of $F_i$, i.e. the set of variables that directly affect $F_i$; $ch(F_i)$ denotes the child node set of $F_i$, i.e. the set of variables directly affected by $F_i$; $sp(F_i)$ denotes the set of other nodes that have the same parent node as $F_i$, i.e. the set of variables that have an indirect influence relationship with $F_i$.

Step S23: screening multi-label association features: maximize the outcome probability $P(T_i \mid S)$, wherein the causal feature selection for data set $D$ is defined as:

$$S^{*} = \arg\max |S|,$$

$$\text{s.t.}\quad P^{i}(T_i \mid S) = P'(T_i \mid S) \quad (T_i \in T,\ j \neq i)$$

wherein T represents a disease category that may be output;

step S24: repeating steps S22 and S23 until the feature distribution probability corresponding to all outcome labels $y \in L$ is maximized, and selecting the feature set with causal effects.

Steps S21 and S22 construct causal chains by building a causal network graph and select the features that may have a causal effect on the predicted diseases; step S23 combines the potential disease labels with the screened variables and continues to screen for features that have a causal effect on the disease labels. This process effectively reduces the dimensionality of the patient information, screens out the features with causal effects for subsequent model construction, and ensures that the screened features have a real causal effect on the labels.
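To make the Markov blanket used in step S22 concrete, the following minimal sketch reads pa, ch and sp off a directed acyclic graph that is already known. It is an illustration under assumptions, not the embodiment's procedure: the graph is given as a hypothetical parent map rather than learned from data, the node names and edges are invented, and sp is computed with the usual textbook definition (the other parents of a node's children), which may differ from the wording of step S22.

```python
from typing import Dict, Set


def markov_blanket(parents: Dict[str, Set[str]], node: str) -> Set[str]:
    """Return pa(node) | ch(node) | sp(node) for a DAG given as a parent map."""
    pa = set(parents.get(node, set()))
    # Children: every node that lists `node` among its parents.
    ch = {child for child, ps in parents.items() if node in ps}
    # Spouses (assumed textbook definition): other parents of the children.
    sp: Set[str] = set()
    for child in ch:
        sp |= parents[child] - {node}
    return pa | ch | sp


# Hypothetical toy graph; the edges are illustrative, not clinical findings.
toy_dag = {
    "age": set(),
    "smoking": set(),
    "blood_pressure": {"age", "smoking"},
    "arrhythmia": {"age"},
    "hypertension": {"blood_pressure"},
    "palpitations": {"hypertension", "arrhythmia"},
}

selected = markov_blanket(toy_dag, "hypertension")
print(sorted(selected))
```

In this toy graph the blanket of the label node contains its parent, its child, and the child's other parent; every other variable could be dropped for that label, which is the dimensionality reduction described above.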

Step S3: and training a plurality of multi-label-base learning classifiers by using the feature set with causal effect, and updating weights through stacking integration to obtain a prediction model with optimal performance.

The goal of step S3 is to find base multi-label classifiers that improve the predictive performance of the model. Because many multi-label algorithms exist, choosing the base classifiers is difficult. In the embodiment of the invention, the base classifiers are built by combining four multi-label models with relatively stable performance, namely BR (binary relevance), CC (classifier chains), LP (label powerset) and RAkEL (random k-labelsets), using C4.5 as the meta-classifier, and the optimal prediction model is obtained by updating weights through stacking integration (Stacking). For m = 1, …, M, the embodiment of the invention trains M multi-label base learning classifiers through the following steps:

step S31: initializing: for all patient individuals $i$, initialize the weight of each patient as $W_1(i, l) = 1/N$, acquire the initial sample data set $D_1$, and set the iteration number $t = 1$;

step S32: training a base classifier: perform stacking integration on data set $D_1$ using the $m$-th base classifier, and train a single base classifier $h_{m1}(x, l)$ to predict the patient outcome;

step S33: weight updating: calculate the Hamming loss of the base classifier, i.e. the proportion of misclassified labels $e_t$ (the smaller the value, the better the model's prediction); calculate the update weight $\alpha_t$; and use $\alpha_t$ to calculate the updated weights $W_{t+1}(i, l)$ for the next iteration, with a normalization factor applied, where $y[l]$ indicates whether label $l$ belongs to instance $(x, y)$.

Step S34: repeating the integration iteration: set the iteration number $t = t + 1$ until the preset number of iterations $T$ is reached;

step S35: weighted voting of the boosted learning classifier: take the weighted vote of the single classifiers $h_{mt}$, $t = 1, \ldots, T$, to obtain the boosted learning classifier $h_m$, which represents the optimal prediction model.
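Steps S31 to S35 can be sketched as a single boosting round followed by a weighted vote. The formulas for $\alpha_t$ and $W_{t+1}(i, l)$ are not reproduced in the text, so the sketch below assumes an AdaBoost-style rule, $\alpha_t = \tfrac{1}{2}\ln\frac{1 - e_t}{e_t}$ with an exponential reweighting and labels coded as +1 / -1. These choices, the function names, and the toy data are assumptions for illustration and may differ from the embodiment.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]  # rows = patients i, columns = labels l


def weighted_hamming_loss(W: Matrix, Y: Matrix, H: Matrix) -> float:
    """Weighted proportion e_t of (patient, label) pairs that are misclassified."""
    wrong = sum(w for Wi, Yi, Hi in zip(W, Y, H)
                for w, y, h in zip(Wi, Yi, Hi) if y != h)
    total = sum(w for Wi in W for w in Wi)
    return wrong / total


def update_weights(W: Matrix, Y: Matrix, H: Matrix) -> Tuple[float, Matrix]:
    """One boosting round: return alpha_t and the normalized W_{t+1}(i, l)."""
    e_t = weighted_hamming_loss(W, Y, H)
    e_t = min(max(e_t, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
    alpha_t = 0.5 * math.log((1 - e_t) / e_t)  # assumed AdaBoost-style rule
    # Labels are coded +1 / -1, so y * h is -1 exactly when the pair is wrong.
    raw = [[w * math.exp(-alpha_t * y * h) for w, y, h in zip(Wi, Yi, Hi)]
           for Wi, Yi, Hi in zip(W, Y, H)]
    old_total = sum(w for Wi in W for w in Wi)
    # Normalization factor: keeps the total weight unchanged between rounds.
    z_t = sum(w for Ri in raw for w in Ri) / old_total
    return alpha_t, [[w / z_t for w in Ri] for Ri in raw]


def weighted_vote(alphas: List[float], rounds: List[Matrix]) -> Matrix:
    """Combine the per-round predictions h_mt into h_m by weighted voting."""
    n, q = len(rounds[0]), len(rounds[0][0])
    votes = [[sum(a * H[i][l] for a, H in zip(alphas, rounds))
              for l in range(q)] for i in range(n)]
    return [[1.0 if v >= 0 else -1.0 for v in row] for row in votes]


# Toy data: N = 4 patients, q = 2 labels, coded +1 (present) / -1 (absent).
Y = [[1, -1], [1, 1], [-1, -1], [-1, 1]]
H1 = [[1, -1], [1, -1], [-1, -1], [-1, 1]]  # one wrong pair: patient 1, label 1
N = len(Y)
W1 = [[1.0 / N] * 2 for _ in range(N)]  # W_1(i, l) = 1/N as in step S31

alpha1, W2 = update_weights(W1, Y, H1)
h_m = weighted_vote([alpha1], [H1])
```

After the round, the single misclassified (patient, label) pair carries more weight than any correctly classified pair, so the next base classifier is pushed to correct it.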

Step S4: and dynamically constructing new multi-label integrated prediction models by combining different numbers and different types of prediction models with optimal performance, and selecting a combination model with highest prediction performance to predict the disease.

The embodiment of the invention dynamically builds a new multi-label integrated prediction model, which comprises the following steps:

step S41: initializing: the data set $D_s$ with causal effects is initially an empty set;

step S42: classifying the training samples: use the trained base learning classifier $h_m(x, l)$ to classify feature $x_1$, obtaining $c_{1m} = h_m(x_1, l)$;

Step S43: updating data set $D_s$: $D_s = \{c_{11}, c_{12}, \ldots, c_{1m}, y\}$; repeat the classification of the training samples until all $N$ inpatients have been classified and predicted, obtaining $c_{nm} = h_m(x_n, l)$ and hence a new data set $D_s = \{((c_{i1}, c_{i2}, \ldots, c_{im}), y)\}$;

Step S44: training the integration part of the model: train the model on the new data set $D_s$; models are dynamically selected from the base learner pool and finally stacked (Stacking); a new learning algorithm $Z$ is used in this part, so that $H = Z(D_s)$;

Step S45: output: $H(x) = H(h_1(x, l), h_2(x, l), \ldots, h_m(x, l))$.

This process dynamically builds new multi-label integrated prediction models from combinations of different numbers and types of base classifiers, selects the base classifiers with the optimal performance, and uses them for the final disease diagnosis of the patient, so that the prediction accuracy is as high as possible.
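Steps S41 to S45 can be sketched as follows: build $D_s$ from the base classifier outputs, fit a meta-learner $Z$ on it, and search over subsets of the base learner pool. Several parts of this sketch are assumptions made for illustration: $Z$ is a simple per-label lookup table with a majority-vote fallback, the selection criterion is Hamming loss on held-out data, the search is exhaustive over all non-empty subsets, and the three classifiers in `pool` are hypothetical stand-ins for trained $h_m$.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Callable, Dict, Sequence, Tuple

Labels = Tuple[int, ...]  # one 0/1 entry per disease label
Classifier = Callable[[Sequence[float]], Labels]


def build_meta_dataset(models: Sequence[Classifier], X, Y):
    """D_s = {((c_i1, ..., c_im), y_i)}: base predictions paired with true labels."""
    return [(tuple(h(x) for h in models), y) for x, y in zip(X, Y)]


def fit_meta_learner(D_s) -> Dict:
    """Hypothetical Z: per label, map the pattern of base votes to the majority truth."""
    counts = defaultdict(Counter)
    for preds, y in D_s:
        for l, y_l in enumerate(y):
            counts[(l, tuple(c[l] for c in preds))][y_l] += 1
    return {key: ctr.most_common(1)[0][0] for key, ctr in counts.items()}


def predict(models, table, x) -> Labels:
    """H(x) = H(h_1(x, l), ..., h_m(x, l)); unseen patterns fall back to majority vote."""
    preds = tuple(h(x) for h in models)
    out = []
    for l in range(len(preds[0])):
        pattern = tuple(c[l] for c in preds)
        fallback = int(2 * sum(pattern) >= len(pattern))
        out.append(table.get((l, pattern), fallback))
    return tuple(out)


def hamming_loss(models, table, X, Y) -> float:
    """Proportion of (patient, label) pairs predicted incorrectly."""
    wrong = sum(p != t for x, y in zip(X, Y)
                for p, t in zip(predict(models, table, x), y))
    return wrong / (len(Y) * len(Y[0]))


def select_best_combination(pool: Dict[str, Classifier], X_tr, Y_tr, X_val, Y_val):
    """Try every non-empty subset of the pool; keep the lowest validation loss."""
    best = None
    for k in range(1, len(pool) + 1):
        for names in combinations(sorted(pool), k):
            models = [pool[n] for n in names]
            table = fit_meta_learner(build_meta_dataset(models, X_tr, Y_tr))
            loss = hamming_loss(models, table, X_val, Y_val)
            if best is None or loss < best[0]:
                best = (loss, names, table)
    return best


# Hypothetical base classifiers standing in for trained h_m (2 features, 2 labels).
pool = {
    "a": lambda x: (int(x[0] > 0.5), int(x[1] > 0.5)),
    "b": lambda x: (int(x[0] > 0.2), 0),
    "c": lambda x: (1, int(x[1] > 0.8)),
}
X = [(0.1, 0.1), (0.9, 0.2), (0.3, 0.9), (0.8, 0.7)]
Y = [(0, 0), (1, 0), (0, 1), (1, 1)]
# The same toy data is reused for validation only to keep the example short.
loss, names, table = select_best_combination(pool, X, Y, X, Y)
```

Exhaustive search grows exponentially with the size of the pool, so with more than a handful of base classifiers a greedy or bounded search would be the more practical choice.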

A flowchart of the key steps of the disease prediction method based on causal inference and dynamic integration of multiple labels provided by the embodiment of the invention is shown in FIG. 3. The method has good generality and adaptability, is applicable to a variety of disease prediction scenarios, and can adaptively update the prediction model as the data change, maintaining prediction accuracy and efficiency.

An embodiment of the present invention provides a disease prediction system based on causal inference and dynamic integration of multiple tags, as shown in fig. 4, the system includes:

a data collection module for obtaining multi-source information of a patient, comprising: demographic index information, lifestyle information, physical examination information, complaint symptoms, past medical history information, and past medication information. Details refer to the related description of step S1 in the above method embodiment, and will not be described herein.

The data processing module is used for preprocessing and cleaning the multi-source information, removing noise and abnormal values, performing feature selection and dimension reduction operation, and preparing for the construction of a subsequent disease prediction model.
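The description does not say which cleaning rules the data processing module applies. As one possible reading, the sketch below removes abnormal values with the interquartile-range (IQR) rule and performs a very simple feature selection by dropping columns that never vary. Both rules, the function names, and the sample values are assumptions chosen for illustration.

```python
import statistics
from typing import Dict, List, Tuple

Record = Dict[str, float]


def iqr_bounds(values: List[float], k: float = 1.5) -> Tuple[float, float]:
    """Return (low, high) fences computed from the interquartile range."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def remove_outliers(records: List[Record], column: str) -> List[Record]:
    """Drop records whose value in `column` lies outside the IQR fences."""
    low, high = iqr_bounds([r[column] for r in records])
    return [r for r in records if low <= r[column] <= high]


def drop_constant_features(records: List[Record],
                           min_variance: float = 1e-12) -> List[Record]:
    """Very simple feature selection: remove columns that never vary."""
    keep = [c for c in records[0]
            if statistics.pvariance([r[c] for r in records]) > min_variance]
    return [{c: r[c] for c in keep} for r in records]


# Illustrative values: one implausible heart rate and one constant column.
records = [{"heart_rate": float(v), "unit": 1.0} for v in (70, 72, 68, 75, 71, 300)]
cleaned = drop_constant_features(remove_outliers(records, "heart_rate"))
```

In a clinical setting, domain-specific plausibility ranges would usually be preferable to a purely statistical fence, since a genuinely abnormal value can be exactly the signal of interest.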

And the causal inference module is used for analyzing causal relations among all the features to establish a causal model, and screening feature sets with causal effects based on the causal inference model. For details, refer to the related description of step S2 in the above method embodiment, and no further description is given here.

And the dynamic integration multi-label module is used for training a plurality of multi-label base learning classifiers by utilizing the feature set with causal effects, and updating weights through stacking integration to obtain prediction models with optimal performance. For details, refer to the related description of step S3 in the above method embodiment, and no further description is given here.

And the disease prediction model determining module is used for dynamically constructing a new multi-label integrated prediction model through different numbers and different types of prediction model combination modes with optimal performance, and selecting a combination model with highest prediction performance to predict the disease. For details, see the description of step S4 in the above method embodiment, and the details are not repeated here.

The interface display module is used for displaying the disease prediction result to the user, so that the user can more conveniently obtain the prediction result through an intuitive user interface and make a corresponding clinical decision.

In a preferred embodiment, the system provided in the embodiment of the present invention further includes: a disease prediction model update module comprising: the incremental learning unit adjusts the weights of the new sample and the old sample by adopting a sample importance-based method, and the transfer learning unit predicts the new disease by utilizing the existing prediction model and can continuously ensure the disease prediction accuracy by learning and receiving new data.

Application examples of the disease prediction system provided by the embodiment of the present invention are as follows:

1. Patient A, female, 45 years old, height 160 cm, weight 70 kg, no smoking history, drinking history, no family history; the chief complaint is recent palpitations, shortness of breath and similar symptoms; an electrocardiographic examination found arrhythmia; the biochemical examination results were as follows: white blood cells 8.5 × 10⁹/L, red blood cells 4.1 × 10¹²/L, red blood cell distribution width 13.6%, hemoglobin 131 g/L, platelets 226 × 10⁹/L. The imaging data show left ventricular hypertrophy.

Using the system provided by the embodiment of the invention, the doctor inputs patient A's sex, age, height, weight, smoking history, drinking history, family history, clinical manifestations (symptoms such as palpitations and shortness of breath), biochemical examination results (white blood cells, red blood cell distribution width, hemoglobin, platelets and the like), electrocardiogram information, imaging information and the like. The system screens out the features related to the diseases, then carries out model training and prediction, and finally obtains the following disease prediction results:

hypertension: the probability of illness is 70%

Arrhythmia: the probability of illness is 60%

Coronary heart disease: the probability of illness is 40%

The doctor further evaluates the disease prediction results in combination with patient A's symptoms, signs, examination results and other information, and finally determines the diagnosis to be arrhythmia. A corresponding treatment plan is then formulated, including drug treatment, lifestyle adjustment and the like.

2. In the intelligent medical device, the device can monitor physiological indexes (such as heart rate, blood pressure and blood oxygen saturation) of a patient by using a sensor, transmit the data to a cloud server integrated with the disease prediction system provided by the embodiment for analysis and diagnosis, and automatically send an alarm or reminder to a doctor when necessary according to the diagnosis result.

FIG. 5 shows a schematic structural diagram of a computer device according to an embodiment of the present invention, including a processor 901 and a memory 902, wherein the processor 901 and the memory 902 may be connected by a bus or otherwise; connection by a bus is taken as an example in FIG. 5.

The processor 901 may be a central processing unit (Central Processing Unit, CPU). The processor 901 may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or a combination thereof.

The memory 902 is used as a non-transitory computer readable storage medium for storing a non-transitory server program, a non-transitory computer executable program, and modules, such as program instructions/modules corresponding to the methods in the above method embodiments. The processor 901 executes various functional applications of the processor and data processing, i.e., implements the methods in the above-described method embodiments, by running non-transitory server programs, instructions, and modules stored in the memory 902.

The memory 902 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for a function; the storage data area may store data created by the processor 901, and the like. In addition, the memory 902 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 902 optionally includes memory remotely located relative to processor 901, which may be connected to processor 901 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

One or more modules are stored in the memory 902 that, when executed by the processor 901, perform the methods of the method embodiments described above.

The specific details of the computer device may be correspondingly understood by referring to the corresponding related descriptions and effects in the above method embodiments, which are not repeated herein.

It will be appreciated by those skilled in the art that implementing all or part of the above-described methods in the embodiments may be implemented by a computer program for instructing relevant hardware, and the implemented program may be stored in a computer readable storage medium, and the program may include the steps of the embodiments of the above-described methods when executed. The storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a Flash Memory (Flash Memory), a Hard Disk (HDD), or a Solid State Drive (SSD); the storage medium may also comprise a combination of memories of the kind described above.

Although embodiments of the present invention have been described in connection with the accompanying drawings, various modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention, and such modifications and variations are within the scope of the invention as defined by the appended claims.

Claims

1. A disease prediction method based on causal inference and dynamic integration of multiple tags, comprising the steps of:

2. The causal inference and dynamic integration multi-labeled disease prediction method according to claim 1, wherein the demographic index information comprises: sex, age, height, weight; the lifestyle information includes: history of smoking and history of drinking; physical examination information, comprising: biochemical examination, electrocardiogram information and imaging data; the past medical history information includes: patient's own medical history and family medical history.

3. The causal inference and dynamic integration multi-label based disease prediction method of claim 1, wherein the step of establishing a causal model to analyze causal relationships between each feature, and screening feature sets with causal effects comprises:

training a Markov chain: under the faithfulness assumption, in a Bayesian network $BN = \langle U, G, P(U) \rangle$ the Markov blanket of each feature $F_i \in F$, denoted $MB(F_i)$, is unique, where $MB(F_i) = \{pa(F_i) \cup ch(F_i) \cup sp(F_i)\}$; $F$ denotes the different features; $pa(F_i)$ denotes the parent node set of $F_i$, i.e. the set of variables that directly affect $F_i$; $ch(F_i)$ denotes the child node set of $F_i$, i.e. the set of variables directly affected by $F_i$; $sp(F_i)$ denotes the set of other nodes that have the same parent node as $F_i$, i.e. the set of variables that have an indirect influence relationship with $F_i$;

The causal feature selection for data set $D$ is defined as:

$$S^{*} = \arg\max |S|,$$

$$\text{s.t.}\quad P^{i}(T_i \mid S) = P'(T_i \mid S) \quad (T_i \in T,\ j \neq i)$$

wherein T represents a disease category that may be output;

4. The causal inference and dynamic integration multi-labeled disease prediction method of claim 3, wherein the process of training a plurality of multi-labeled based learning classifiers comprises:

weight updating: calculating the Hamming loss of the base classifier, i.e. the proportion of misclassified labels $e_t$; calculating the update weight $\alpha_t$; and using $\alpha_t$ to calculate the updated weights $W_{t+1}(i, l)$ for the next iteration;

repeating the integration iteration: setting the iteration number $t = t + 1$ until the preset number of iterations is reached;

5. The causal inference and dynamic integration multi-labeled disease prediction method according to claim 4, wherein the step of dynamically constructing a new multi-labeled integration prediction model by combining different numbers and different kinds of base classifiers comprises:

initializing: the data set $D_s$ with causal effects is initially an empty set;

classifying the training samples: using the trained base learning classifier $h_m(x, l)$ to classify feature $x_1$, obtaining $c_{1m} = h_m(x_1, l)$;

output: $H(x) = H(h_1(x, l), h_2(x, l), \ldots, h_m(x, l))$.

6. The causal inference and dynamic integration multi-labeled disease prediction method according to claim 5, further comprising, after obtaining the multi-source information of the patient:

7. A causal inference and dynamic integration multi-tag based disease prediction system, comprising:

8. The causal inference and dynamic integration multi-tag based disease prediction system of claim 7, further comprising:

the data processing module is used for preprocessing and cleaning the multi-source information, removing noise and abnormal values, and performing feature selection and dimension reduction operation;

the interface display module is used for displaying the disease prediction result to a user;

a disease prediction model update module comprising: the device comprises an increment learning unit and a transfer learning unit, wherein the increment learning unit adopts a method based on the importance of samples to adjust the weights of new samples and old samples, and the transfer learning unit utilizes the existing prediction model to predict new diseases.

9. An electronic device, comprising:

a memory and a processor in communication with each other, the memory having stored therein computer instructions that, upon execution, perform the causal inference and dynamic integrated multi-label based disease prediction method of any of claims 1-6.

10. A computer readable storage medium storing computer instructions for causing the computer to perform the causal inference and dynamically integrated multi-labeled disease prediction method of any of claims 1-6.