CN116629311A - Integrated adaptive neuro-fuzzy system applied to diabetes analysis - Google Patents

Integrated adaptive neuro-fuzzy system applied to diabetes analysis

Info

Publication number
CN116629311A
CN116629311A
Authority
CN
China
Prior art keywords
causal
model
feature
anfis
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310396053.9A
Other languages
Chinese (zh)
Inventor
于鉴麟
王曦
李祯浩
李熙
孙成林
何丽莉
白洪涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin University
Original Assignee
Jilin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jilin University
Priority to CN202310396053.9A
Publication of CN116629311A
Legal status: Pending (current)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/043: Architecture, e.g. interconnection topology based on fuzzy logic, fuzzy membership or fuzzy inference, e.g. adaptive neuro-fuzzy inference systems [ANFIS]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H 50/30: ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for calculating health indices; for individual health risk assessment
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A 90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Public Health (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Automation & Control Theory (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Analysis (AREA)
  • Pathology (AREA)
  • Mathematical Optimization (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an integrated adaptive neuro-fuzzy system applied to diabetes analysis, which comprises the following steps. Step S1: the dataset is input into the model and standardized by the model. Step S2: the causal coefficient of each feature with respect to the prediction result is calculated by causal inference and used as the weight of the feature. Step S3: several different subsets are selected at random, an adaptive fuzzy inference system is constructed for each subset, and the prediction of each inference system is obtained. Step S4: the prediction of the whole model is obtained by ensemble learning. The invention relates to the technical field of computer algorithms. Its beneficial effects are that causal inference is introduced into feature selection and the causal coefficients are used as weights to randomly generate data subsets for training, which reduces the training cost of the model and can achieve better model performance.

Description

Integrated adaptive neuro-fuzzy system applied to diabetes analysis
Technical Field
The invention relates to the field of computer algorithms, in particular to an integrated adaptive neuro-fuzzy system applied to diabetes analysis.
Background
The interpretability of the adaptive neuro-fuzzy inference system (ANFIS) makes it a popular disease-prediction tool today. However, the structure of ANFIS means that as the number of input features increases, the number of parameters grows geometrically. Simply reducing the number of features lowers the accuracy of the resulting model, while adding more features increases the training cost and reduces the interpretability of the ANFIS, which severely limits its application in situations where a high level of interpretability is required, especially disease prediction. In this context, an integrated stochastic causal inference-adaptive neuro-fuzzy system (CIR-ANFIS) model is presented. It uses causal inference to determine the causal relationships between features and results, uses randomly generated subsets to train different ANFIS sub-models, and then develops various bagging strategies for comparison based on the unique characteristics of ANFIS. Comparative data experiments show that the model performs significantly better on diabetes and other disease-prediction datasets than traditional ANFIS models, other feature-selection techniques, and feature selection using causal inference alone. The CIR-ANFIS model described in this study can also be used to classify binary disease-prediction scenarios, particularly those based on assay indicators.
To construct the CIR-ANFIS model to predict diabetes and apply it to other medical datasets, a new attempt is made to perform feature selection by causal inference (CI) and to use ensemble learning to reduce the under-fitting and over-fitting of individual ANFIS sub-models. Ensemble learning and causal inference (CI) are discussed in the following sections.
Technical defects:
First, although ensemble learning is more accurate than a single model, it is less efficient because of its more complex training process. Especially when the number of features is high, the number of sub-models increases significantly, and the training time of CIR-ANFIS increases with it.
Second, the ensemble learning mode used is the semi-persuasion mode; more ensemble learning methods will be tried in the future to obtain better results. In addition, in this work the result is obtained by summing and averaging the outputs of the sub-models, with the same weight for each model, and a better combination method remains to be found.
Disclosure of Invention
The invention aims to solve the above problems, and designs an integrated adaptive neuro-fuzzy system applied to diabetes analysis.
The technical scheme by which the invention achieves this aim is an integrated adaptive neuro-fuzzy system applied to diabetes analysis, comprising the following steps. Step S1: the dataset is input into the model and standardized by the model;
step S2: the causal coefficient of each feature with respect to the prediction result is calculated by causal inference and used as the weight of the feature;
step S3: several different subsets are selected at random, an adaptive fuzzy inference system is constructed for each subset, and the prediction of each inference system is obtained;
step S4: the prediction of the whole model is obtained by ensemble learning.
In step S1, the read-in dataset must be standardized before the CIR-ANFIS model is trained;
in this step, all features are converted to a form with mean 0 and variance 1. The method is as follows:
(1) For each feature, calculate its mean and subtract it from the feature;
(2) Calculate the standard deviation of each feature and divide the result of (1) by it.
The weight of each feature in step S2 can be determined by the causal relationship between it and the result, this indicator being called the causal effect; in statistics, causal inference is the experimental study of determining causal relationships; the four steps of causal inference are as follows:
step 1, modeling the problem from assumptions;
since the causal model is developed from assumptions about the problem, prior knowledge is used in its construction; the model can incorporate past information and be expressed as a causal graph;
without prior knowledge of the dataset, it is assumed that there is no causal relationship among the features and that all features are affected by unobserved confounders; when the causal relationship between a particular feature and the result is determined, a causal relationship is assumed to exist if the feature is above its average and the result is positive; to this end, the standardized features are converted to 0-1 variables and the causal effect is computed from these variables; if prior knowledge of the dataset is available, it can be used to build a finer causal-inference graph, which helps to obtain more accurate results;
step 2, determining the causal relationship (causal estimation);
the definition of causal analysis shows that, when the other variables remain unchanged, a change in the intervention has an effect on the outcome; after the causal effect of the causal model is determined, it can be estimated under the previous assumptions; in this study, the back-door criterion is mainly used;
step 3, estimating with statistical methods;
in this step, the causal relationship between each feature and the result can be calculated; many techniques have been used to determine statistical causal effects, such as propensity score matching, propensity score stratification, inverse probability weighting based on propensity scores, linear regression, generalized linear models (e.g., logistic regression), instrumental variables, and regression discontinuity;
the causal effect ranges between -1 and 1, where a negative value means that there is a negative correlation between the feature and the result;
step 4, verifying the credibility of the causal effects with several robustness tests;
although statistical estimates from the data are used in the third step to calculate the causal relationships, the causal relationships themselves rest on the previous assumptions rather than on the data; several robustness tests are therefore used to support those assumptions; the three robustness tests used in this study are as follows:
1) Random confounding factors;
in this test, independent random variables are introduced as common causes of the dataset; if the assumption is accurate, the estimate should not change;
2) Verifying the data subset;
in this test, a randomly selected subset is used instead of the provided dataset; if the assumption is true, the estimate will not change significantly;
3) Placebo intervention;
in this test, independent random variables were used instead of actual intervention variables; if the assumption is valid, the estimation will be close to zero.
The causal effect can be regarded as the influence of each feature on classification; features with a high causal effect are likely to contribute more than features with a low one; using the causal effects as feature weights, random subsets are generated by the following procedure:
in the CIR-ANFIS model, n-3 sub-models are used, where n is the number of input features and each sub-model has four input features; for the 2CIR-ANFIS model, twice as many (2n-6) sub-models are used; the following is the algorithm that selects the input features of each sub-model:
(1) For each feature, a weight score WS_i is computed from |Cas_i|, the absolute value of the causal effect between the feature and the result, and r, a random number from 0 to 1.
(2) The four features with the highest weight scores are selected.
(3) If this is not the first sub-model, the selected features are compared with those of the previous sub-models; if all four features match an earlier sub-model, the selection is invalid and step (1) is executed again; otherwise the calculation continues with each subsequent sub-model until all sub-datasets have been generated;
through the above procedure, the step of preparing the sub-datasets for sub-model training is completed; to demonstrate that this approach can use more features, it is compared with the traditional simple selection of four features;
in CIR-ANFIS, for a dataset with n features, the probability p_i that feature i is not selected by a given sub-model can be approximated from its weight relative to the other features; the probability P_all_not that the feature is selected by none of the n-3 sub-models is then approximately P_all_not ≈ p_i^(n-3);
the analysis continues for each case;
(1) |Cas_i| is significantly higher than that of most other features, ranking in the top 4; such a feature is likely to be added to the sub-datasets many times, meaning that it contributes to many sub-models and clearly contributes substantially to the overall CIR-ANFIS model;
(2) |Cas_i| is lower or higher than the top 4, but the gap is not large; in this case the probability of being selected by a given sub-model is roughly 0.1 to 0.2 and p_i ranges from 0.4 to 0.6, which indicates that the feature can still be added to some sub-datasets, i.e. the CIR-ANFIS model allows more features to be incorporated rather than selecting only a limited number of features;
(3) |Cas_i| is significantly lower than the top 4, e.g. one-twentieth or even one-hundredth of it; in this case the probability of being selected by a given sub-model is below 0.05 and p_i is above 0.8, which means that, owing to the weak causal effect between the feature and the result, the feature may join only one or a few sub-datasets, or none at all;
from the above calculations and analysis, CIR-ANFIS can use causal inference to create sub-datasets that reflect the relative importance of the various features, with the goal of using more data from more features and improving the performance of the overall model.
The ANFIS is a classical TSK fuzzy model that adopts Takagi-Sugeno-Kang type first-order rules, defined as
if x_1 is A_1 and ... and x_n is A_n,
then y = c_0 + c_1·x_1 + ... + c_n·x_n,
where x_i is the i-th feature of the data, A_i is a fuzzy set, y is the result of the rule, and c_i are the consequent coefficients of the TSK rule; the consequent part of the ANFIS is a linear function of the input vector rather than a Mamdani-type fuzzy rule, and the use of a linear function makes the ANFIS easier to interpret than other, more complex functions;
typical ANFIS consists of 5 layers;
blurring the input samples using membership functions in a first layer, the number of membership functions being adjustable prior to training the ANFIS model, in which various function shapes are widely used; in this study, triangular membership functions were used, i.e
Wherein a, b, c are three vertices of a triangle, and a feature has three membership functions describing the low, medium, and high levels of the feature;
the first layer may be described as
The second layer uses membership to calculate excitation intensity of the sample according to all rules, in this context, all sub-ANFIS models have 4 input features and the model has rules; the output of the second layer is the product of the membership of all features:
the third layer completes normalization of the output of the previous layer:
is the normalized excitation intensity;
the fourth layer computes the results of all rules, using a linear function to combine the inputs:
finally, the fifth layer completes the deblurring work:
for ANFIS, the result of a sample calculation is a floating point number, but a binary 0-1 variable is required, so rounding can solve this problem.
Ensemble learning combines multiple independent models to improve generalization performance; the three strategies of today's ensemble learning technology are bagging, boosting and stacking;
as noted above, the output of an ANFIS sub-model is a floating-point number, with values above 0.5 judged positive and values below 0.5 judged negative; several bagging strategies will be tried:
(1) Voting mode
each sub-model's output is rounded and the majority vote is taken, where m is the number of sub-models (in CIR-ANFIS, m is the number of features minus 3) and res is the prediction of each sub-model; the result is the prediction of the whole model, and if at least 50% of the sub-models predict the positive class for a sample, the final result is the positive class;
(2) Persuasion mode
with the same variables as above, the raw floating-point outputs are combined directly; under this strategy, a sub-model that strongly favors the positive or negative class can sway the outcome of the whole model;
(3) Persuasion mode with a suppression function
a continuous function is used to reduce the adverse effect of extreme values on classification performance; square-root and logarithmic functions are tried;
(4) Semi-persuasion mode
where a and b are upper and lower limits that bound each sub-model's ability to "persuade" the others; they may be floating-point numbers or infinity; in this context two semi-persuasion variants will be tried: a = 0 with b = infinity, and a = 0 with b = 1;
after the five steps above, the CIR-ANFIS model is obtained.
The integrated adaptive neuro-fuzzy system produced by the above technical scheme and applied to diabetes analysis (1) introduces causal inference into feature selection and uses the causal coefficients as weights to randomly generate data subsets for training, which reduces the training cost of the model and can achieve better model performance;
(2) unlike conventional ensemble methods, uses the "semi-persuasion" method to combine the predictions of all sub-models;
(3) uses data experiments to demonstrate the performance of the model.
Drawings
FIG. 1 is a flow chart of the method of the integrated adaptive neuro-fuzzy system applied to diabetes analysis according to the present invention;
Detailed Description
The invention is described in detail below with reference to the accompanying drawings. As shown in FIG. 1, the invention is an integrated adaptive neuro-fuzzy system applied to diabetes analysis; this patent realizes an integrated stochastic causal inference-adaptive neuro-fuzzy system (CIR-ANFIS) model. The dataset is input into the model and standardized by the model; the causal coefficient of each feature with respect to the prediction result is calculated by causal inference and used as the weight of the feature; several different subsets are selected at random; an adaptive fuzzy inference system is constructed for each subset and the prediction of each inference system is obtained; finally, the prediction of the whole model is obtained by ensemble learning.
The method comprises the following steps:
the read-in data set needs to be normalized before the CIR-ANFIS model is trained.
In this section, all features are converted to a form with an average value of 0 and a variance of 1. The method comprises the following steps:
(1) For each feature, calculating its average value and then subtracting its average value;
(2) The variance of each feature is calculated and then divided by the (1) result.
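As a concrete illustration, the following is a minimal Python sketch of this standardization (the function name and the NumPy representation of the dataset are illustrative assumptions; the patent does not prescribe an implementation):

```python
import numpy as np

def standardize(X: np.ndarray) -> np.ndarray:
    """Convert every feature (column) to mean 0 and variance 1."""
    mean = X.mean(axis=0)      # (1) per-feature average, then centering
    std = X.std(axis=0)        # per-feature standard deviation
    std[std == 0] = 1.0        # guard against constant features
    return (X - mean) / std    # divide the centered values by the std
```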
Causal reasoning
The weight of each feature can be determined by the causal relationship between it and the result; this indicator is called the causal effect. In statistics, causal inference is the experimental study of determining causal relationships. The four steps of causal inference are as follows:
problem modeling based on assumptions
Since the causal model is developed using problem assumptions, a priori knowledge will be used to construct. Models can be used to contain past information and create a chart.
Without prior knowledge of the dataset, all features are affected by unobserved confounding factors, assuming no causal relationship between the features. When determining a causal relationship between a particular feature and a result, it is assumed that there is a causal relationship if the feature is higher than the average and the result is positive. To this end, the normalized features are converted to a 0-1 variable and the causal relationship is calculated using the variable. If there is a priori knowledge of the dataset, this knowledge can be used to make finer causal inference graphs, which help to get more accurate results.
Determining the causal relationship (causal estimation)
The definition of causal analysis shows that, when the other variables remain unchanged, a change in the intervention has an effect on the outcome. After the causal effect of the causal model is determined, it can be estimated under the previous assumptions. In this study, the back-door criterion is mainly used.
Estimation using statistical methods
In this step, the causal relationship between each feature and the result can be calculated. Many techniques have been used to determine statistical causal effects, such as propensity score matching, propensity score stratification, inverse probability weighting based on propensity scores, linear regression, generalized linear models (e.g., logistic regression), instrumental variables, and regression discontinuity.
The causal effect ranges between -1 and 1, where a negative value means that there is a negative correlation between the feature and the result.
Verifying the credibility of the causal effects with several robustness tests
Although statistical estimates from the data are used in the third step to calculate the causal relationships, the causal relationships themselves rest on the previous assumptions rather than on the data. Several robustness tests are therefore used to support those assumptions. The three robustness tests used in this study are as follows:
1) Random confounding factor
In this test, independent random variables are introduced as common causes of the dataset. If the assumption is accurate, the estimate should not change.
2) Data subset verification
In this test, a randomly selected subset is used instead of the provided data set; if the assumption is true, the estimation will not change significantly.
3) Placebo intervention
In this test, independent random variables were used instead of actual intervention variables; if the assumption is valid, the estimation will be close to zero.
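The patent does not name a causal-inference library; the sketch below uses the open-source DoWhy package, whose workflow happens to follow the same four steps (model, identify via the back-door criterion, estimate, refute). The column names, the treatment binarization, and the use of the remaining columns as common causes are illustrative assumptions:

```python
import pandas as pd
from dowhy import CausalModel

def causal_coefficient(df: pd.DataFrame, feature: str, outcome: str) -> float:
    data = df.copy()
    # Step 1: binarize the standardized feature (1 if above its mean, i.e. > 0)
    data["treated"] = (data[feature] > 0).astype(int)
    model = CausalModel(                      # Step 1: build the causal model
        data=data, treatment="treated", outcome=outcome,
        common_causes=[c for c in df.columns if c not in (feature, outcome)])
    estimand = model.identify_effect(         # Step 2: back-door identification
        proceed_when_unidentifiable=True)
    estimate = model.estimate_effect(         # Step 3: statistical estimation
        estimand, method_name="backdoor.propensity_score_matching")
    for refuter in ("random_common_cause",    # Step 4: the three robustness
                    "data_subset_refuter",    # tests named above
                    "placebo_treatment_refuter"):
        model.refute_estimate(estimand, estimate, method_name=refuter)
    return estimate.value
```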
Generating random subsets
The causal effect can be regarded as the influence of each feature on classification. Features with a high causal effect are likely to contribute more than features with a low one. Using the causal effects as feature weights, random subsets are generated by the following operations.
In the CIR-ANFIS model, n-3 sub-models are used, where n is the number of input features and each sub-model has four input features. For the 2CIR-ANFIS model, twice as many (2n-6) sub-models are used. The following is the algorithm that selects the input features of each sub-model:
(1) For each feature, a weight score WS_i is computed from |Cas_i|, the absolute value of the causal effect between the feature and the result, and r, a random number from 0 to 1.
(2) The four features with the highest weight scores are selected.
(3) If this is not the first sub-model, the selected features are compared with those of the previous sub-models. If all four features match an earlier sub-model, the selection is invalid and step (1) is executed again; otherwise the calculation continues with each subsequent sub-model until all sub-datasets have been generated.
Through the above procedure, the step of preparing the sub-datasets for sub-model training is completed. To demonstrate that this approach can use more features, it is compared with the traditional simple selection of four features.
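A minimal sketch of this selection procedure follows. The patent's weight-score formula is not reproduced in the text above, so the product WS_i = |Cas_i| · r used here is an assumption consistent with the stated roles of |Cas_i| and r:

```python
import numpy as np

def select_subsets(cas: np.ndarray, n_models: int, k: int = 4,
                   seed: int | None = None) -> list[tuple[int, ...]]:
    """Pick k input features for each sub-model, weighted by |causal effect|."""
    rng = np.random.default_rng(seed)
    chosen: list[tuple[int, ...]] = []
    while len(chosen) < n_models:
        ws = np.abs(cas) * rng.random(len(cas))   # assumed WS_i = |Cas_i| * r
        top = tuple(sorted(np.argsort(ws)[-k:]))  # four highest weight scores
        if top in chosen:       # all four features match an earlier sub-model:
            continue            # the selection is invalid, so redraw
        chosen.append(top)
    return chosen

# For example, n = 10 features gives n - 3 = 7 sub-models:
# subsets = select_subsets(cas_coefficients, n_models=7)
```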
In CIR-ANFIS, for a dataset with n features, the probability p_i that feature i is not selected by a given sub-model can be approximated from its weight relative to the other features. The probability P_all_not that the feature is selected by none of the n-3 sub-models is then approximately P_all_not ≈ p_i^(n-3). For instance, with n = 10 features there are 7 sub-models, so a feature with p_i = 0.6 has P_all_not ≈ 0.6^7 ≈ 0.028 and is almost certain to appear in at least one sub-dataset.
The discussion continues for the various situations.
(1) |Cas_i| is significantly higher than that of most other features, ranking in the top 4. Such a feature is likely to be added to the sub-datasets many times, meaning that it contributes to many sub-models and clearly contributes substantially to the overall CIR-ANFIS model.
(2) |Cas_i| is lower or higher than the top 4, but the gap is not large. In this case the probability of being selected by a given sub-model is roughly 0.1 to 0.2, and p_i ranges from 0.4 to 0.6, which indicates that the feature can still be added to some sub-datasets; that is, the CIR-ANFIS model allows more features to be incorporated rather than selecting only a limited number of features.
(3) |Cas_i| is significantly lower than the top 4, e.g. one-twentieth or even one-hundredth of it. In this case the probability of being selected by a given sub-model is below 0.05 and p_i is above 0.8, which means that, owing to the weak causal effect between the feature and the result, the feature may join only one or a few sub-datasets, or none at all.
From the above calculations and analysis, CIR-ANFIS can use causal inference to create sub-datasets that reflect the relative importance of the various features, with the goal of using more data from more features and improving the performance of the overall model.
Training ANFIS submodels
ANFIS is a classical TSK fuzzy model. The model adopts Takagi-Sugeno-Kang first-order rules, defined as
if x_1 is A_1 and ... and x_n is A_n,
then y = c_0 + c_1·x_1 + ... + c_n·x_n,
where x_i is the i-th feature of the data, A_i is a fuzzy set, y is the result of the rule, and c_i are the consequent coefficients of the TSK rule. The consequent part of ANFIS is a linear function of the input vector, not a Mamdani-type fuzzy rule, and the use of a linear function makes ANFIS easier to interpret than other, more complex functions. The structure of a typical ANFIS is shown in FIG. 1.
A typical ANFIS consists of 5 layers.
The input samples are fuzzified in the first layer using membership functions. The number of membership functions can be adjusted before the ANFIS model is trained, and various function shapes are widely used in this layer. In this study, triangular membership functions are used, i.e.
μ(x) = max(min((x - a)/(b - a), (c - x)/(c - b)), 0),
where a, b, c are the three vertices of the triangle, and each feature has three membership functions describing the low, medium and high levels of the feature.
The first layer can therefore be described as O1_{i,j} = μ_{A_{i,j}}(x_i), the membership of feature x_i in its j-th fuzzy set.
The second layer uses the memberships to calculate the firing strength of the sample under every rule. In this context, all sub-ANFIS models have 4 input features, so each model has 3^4 = 81 rules. The output of the second layer is the product of the memberships of all features:
w_k = μ_{A_{k,1}}(x_1) · μ_{A_{k,2}}(x_2) · μ_{A_{k,3}}(x_3) · μ_{A_{k,4}}(x_4).
The third layer normalizes the output of the previous layer:
w̄_k = w_k / Σ_j w_j,
where w̄_k is the normalized firing strength.
The fourth layer computes the result of each rule, using a linear function to combine the inputs:
O4_k = w̄_k · y_k = w̄_k · (c_{k,0} + c_{k,1}·x_1 + ... + c_{k,4}·x_4).
Finally, the fifth layer completes the defuzzification:
O5 = Σ_k w̄_k · y_k.
For ANFIS, the result computed for a sample is a floating-point number, but a binary 0-1 variable is required, so rounding solves this problem.
Ensemble learning
Multiple independent models are combined through ensemble learning to improve generalization performance. The three strategies of today's ensemble learning technology are bagging, boosting and stacking.
As noted above, the output of an ANFIS sub-model is a floating-point number, with values above 0.5 judged positive and values below 0.5 judged negative. In this context, several bagging strategies will be tried:
(1) Voting mode
Each sub-model's output is rounded and the majority vote is taken. Here m is the number of sub-models; in CIR-ANFIS, m is the number of features minus 3. res is the prediction of each sub-model, and the result is the prediction of the entire model. If at least 50% of the sub-models predict the positive class for a sample, the final result is the positive class.
(2) Persuasion mode
With the same variables as above, the raw floating-point outputs are combined directly. Under this strategy, a sub-model that strongly favors the positive or negative class can sway the outcome of the overall model.
(3) Persuasion mode with a suppression function
A continuous function is used to reduce the adverse effect of extreme values on classification performance; square-root and logarithmic functions are tried.
(4) Semi-persuasion mode
Here a and b are upper and lower limits that bound each sub-model's ability to "persuade" the others. They may be floating-point numbers or infinity. In this context, two semi-persuasion variants will be tried: a = 0 with b = infinity, and a = 0 with b = 1.
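The combination formulas are given above only in prose, so the sketch below is one reading of the four strategies, where `raw` holds the floating-point outputs of the m sub-models. Treating persuasion as averaging the raw outputs, and semi-persuasion as clipping each output's deviation from 0.5 into [a, b], are assumptions consistent with the descriptions above:

```python
import numpy as np

def vote(raw: np.ndarray) -> int:
    """(1) Majority vote over the rounded sub-model predictions."""
    return int(np.round(raw).mean() >= 0.5)

def persuade(raw: np.ndarray) -> int:
    """(2) Average the raw outputs: an extreme sub-model can sway the result."""
    return int(raw.mean() >= 0.5)

def persuade_suppressed(raw: np.ndarray, fn=np.sqrt) -> int:
    """(3) Shrink each deviation from 0.5 (sqrt or log1p) before averaging."""
    d = raw - 0.5
    return int((np.sign(d) * fn(np.abs(d))).mean() >= 0.0)

def semi_persuade(raw: np.ndarray, a: float = 0.0, b: float = 1.0) -> int:
    """(4) Clip each deviation's magnitude into [a, b]; a=0, b=inf recovers (2)."""
    d = raw - 0.5
    return int((np.sign(d) * np.clip(np.abs(d), a, b)).mean() >= 0.0)
```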
After the five steps above, the CIR-ANFIS model is obtained.
The characteristics of this embodiment are that (1) the CIR-ANFIS model is provided, which can be applied to disease diagnosis and prediction for diabetes and can also be used to classify binary disease-prediction situations, especially those based on assay indicators.
(2) Causal inference is introduced into feature selection, and the causal coefficients are used as weights to randomly generate data subsets for training, which reduces the training cost of the model and can achieve better model performance. Some of the attributes in disease prediction based on assay indicators are likely to have causal relationships with the results, and causal inference can help the model perform better while preserving its interpretability.
(3) Unlike conventional ensemble methods, the "semi-persuasion" method is used to combine the predictions of all sub-models.
(4) Data experiments are used to demonstrate the performance of the model.
In this embodiment, the dataset construction, causal inference, and standardization (the first through third parts) are implemented in Python. For the remainder, the ANFIS sub-models are trained, ensemble learning is performed, and the results are calculated in MATLAB. All tests were performed on a Hua-Chen computer with an Intel Core i7-10510 processor operating at 2.30 GHz and 8 GB of memory.
Data set
Herein, the performance of CIR-ANFIS is tested using 9 UCI datasets. The features and sample counts of each dataset are shown in Table 1. All datasets were evaluated with 10-fold cross-validation.
Table 1 data set overview
TABLE 2 experiment accuracy
TABLE 3 model G-average
TABLE 4 model F1-score
Table 5 model recall
TABLE 6 model accuracy
Evaluation index and parameter setting
To fully evaluate the performance of CIR-ANFIS, five metrics are used to describe it: accuracy, F1 score, G-mean, precision, and recall. Considering binary classification, the five indices can be calculated from the confusion matrix as follows.
Accuracy is the percentage of correctly predicted samples among all samples.
Precision is the proportion of true positive-class instances among all samples predicted as positive.
Recall is the number of correctly predicted positive-class samples divided by the number of actual positive-class samples.
The F1 score is the harmonic mean of precision and recall, and the G-mean is the geometric mean of the recalls of the two classes (sensitivity and specificity).
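For reference, the five metrics can be computed from the confusion-matrix counts as below. The accuracy, precision, recall, and F1 formulas are standard; G-mean is taken here as the geometric mean of sensitivity and specificity, its usual definition in imbalanced classification (the patent's own formula images are not reproduced in the text):

```python
import math

def metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0           # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    g_mean = math.sqrt(recall * specificity)              # assumed definition
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "F1": f1, "G-mean": g_mean}
```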
five ensemble learning methods listed in the CIR-ANFIS methodology section were applied and compared to using causal inference only on all data sets. No 2CIR-ANFIS is created for the remaining two data sets, as there is not enough functionality to create two separate subsets.
Experimental results
Tables 2 to 6 show the accuracy, precision, recall, G-mean and F1 scores of the models over all 9 datasets.
The tables show that the semi-persuasion CIR-ANFIS with a = 0 and b = 1 has the best accuracy on 7 of the 11 datasets, and on one dataset in particular it improves the classification accuracy by more than a factor of 10. For the remaining 4 datasets, the difference between the best score and the 0-1 semi-persuasion variant is negligible. In general, when creating an ANFIS model by causal inference, using CIR-ANFIS can improve efficiency while maintaining interpretability.
2CIR-ANFIS, which uses twice the number of sub-models, was also compared with CIR-ANFIS. According to the experimental results, 2CIR-ANFIS performed slightly better than CIR-ANFIS on datasets with more than 10 features, while CIR-ANFIS performed better on datasets with fewer than 10 features.
The above technical solution represents only the preferred embodiment of the present invention; changes that those skilled in the art may make to parts of the technical solution under the principles of the present invention still fall within the scope of protection of the present invention.

Claims (6)

1. An integrated adaptive neuro-fuzzy system applied to diabetes analysis, characterized by comprising the following steps: step S1: the dataset is input into the model and standardized by the model;
step S2: the causal coefficient of each feature with respect to the prediction result is calculated by causal inference and used as the weight of the feature;
step S3: several different subsets are selected at random, an adaptive fuzzy inference system is constructed for each subset, and the prediction of each inference system is obtained;
step S4: the prediction of the whole model is obtained by ensemble learning.
2. The integrated adaptive neuro-fuzzy system applied to diabetes analysis according to claim 1, wherein in step S1 the read-in dataset is standardized before the CIR-ANFIS model is trained;
in this step, all features are converted to a form with mean 0 and variance 1, as follows:
(1) For each feature, calculate its mean and subtract it from the feature;
(2) Calculate the standard deviation of each feature and divide the result of (1) by it.
3. The integrated adaptive neuro-fuzzy system applied to diabetes analysis according to claim 1, wherein the weight of each feature in step S2 is determined by the causal relationship between it and the result, this indicator being called the causal effect; in statistics, causal inference is the experimental study of determining causal relationships; the four steps of causal inference are as follows:
step 1, modeling the problem from assumptions;
since the causal model is developed from assumptions about the problem, prior knowledge is used in its construction; the model can incorporate past information and be expressed as a causal graph;
without prior knowledge of the dataset, it is assumed that there is no causal relationship among the features and that all features are affected by unobserved confounders; when the causal relationship between a particular feature and the result is determined, a causal relationship is assumed to exist if the feature is above its average and the result is positive; to this end, the standardized features are converted to 0-1 variables and the causal effect is computed from these variables; if prior knowledge of the dataset is available, it can be used to build a finer causal-inference graph, which helps to obtain more accurate results;
step 2, determining the causal relationship (causal estimation);
the definition of causal analysis shows that, when the other variables remain unchanged, a change in the intervention has an effect on the outcome; after the causal effect of the causal model is determined, it can be estimated under the previous assumptions; in this study, the back-door criterion is mainly used;
step 3, estimating with statistical methods;
in this step, the causal relationship between each feature and the result can be calculated; many techniques have been used to determine statistical causal effects, such as propensity score matching, propensity score stratification, inverse probability weighting based on propensity scores, linear regression, generalized linear models (e.g., logistic regression), instrumental variables, and regression discontinuity;
the causal effect ranges between -1 and 1, where a negative value means that there is a negative correlation between the feature and the result;
step 4, verifying the credibility of the causal effects with several robustness tests;
although statistical estimates from the data are used in the third step to calculate the causal relationships, the causal relationships themselves rest on the previous assumptions rather than on the data; several robustness tests are therefore used to support those assumptions; the three robustness tests used in this study are as follows:
1) Random confounding factors;
in this test, independent random variables are introduced as common causes of the dataset; if the assumption is accurate, the estimate should not change;
2) Verifying the data subset;
in this test, a randomly selected subset is used instead of the provided dataset; if the assumption is true, the estimate will not change significantly;
3) Placebo intervention;
in this test, independent random variables were used instead of actual intervention variables; if the assumption is valid, the estimation will be close to zero.
4. The integrated adaptive neuro-fuzzy system applied to diabetes analysis according to claim 1, wherein the causal effect is regarded as the influence of each feature on classification, features with a high causal effect are likely to contribute more than features with a low one, and, using the causal effects as feature weights, random subsets are generated by the following procedure:
in the CIR-ANFIS model, n-3 sub-models are used, where n is the number of input features and each sub-model has four input features; for the 2CIR-ANFIS model, twice as many (2n-6) sub-models are used; the following is the algorithm that selects the input features of each sub-model:
(1) For each feature, a weight score WS_i is computed from |Cas_i|, the absolute value of the causal effect between the feature and the result, and r, a random number from 0 to 1.
(2) The four features with the highest weight scores are selected.
(3) If this is not the first sub-model, the selected features are compared with those of the previous sub-models; if all four features match an earlier sub-model, the selection is invalid and step (1) is executed again; otherwise the calculation continues with each subsequent sub-model until all sub-datasets have been generated;
through the above procedure, the step of preparing the sub-datasets for sub-model training is completed; to demonstrate that this approach can use more features, it is compared with the traditional simple selection of four features;
in CIR-ANFIS, for a dataset with n features, the probability p_i that feature i is not selected by a given sub-model can be approximated from its weight relative to the other features; the probability P_all_not that the feature is selected by none of the n-3 sub-models is then approximately P_all_not ≈ p_i^(n-3);
the analysis continues for each case;
(1) |Cas_i| is significantly higher than that of most other features, ranking in the top 4; such a feature is likely to be added to the sub-datasets many times, meaning that it contributes to many sub-models and clearly contributes substantially to the overall CIR-ANFIS model;
(2) |Cas_i| is lower or higher than the top 4, but the gap is not large; in this case the probability of being selected by a given sub-model is roughly 0.1 to 0.2 and p_i ranges from 0.4 to 0.6, which indicates that the feature can still be added to some sub-datasets, i.e. the CIR-ANFIS model allows more features to be incorporated rather than selecting only a limited number of features;
(3) |Cas_i| is significantly lower than the top 4, e.g. one-twentieth or even one-hundredth of it; in this case the probability of being selected by a given sub-model is below 0.05 and p_i is above 0.8, which means that, owing to the weak causal effect between the feature and the result, the feature may join only one or a few sub-datasets, or none at all;
from the above calculations and analysis, CIR-ANFIS can use causal inference to create sub-datasets that reflect the relative importance of the various features, with the goal of using more data from more features and improving the performance of the overall model.
5. The integrated adaptive neuro-fuzzy system applied to diabetes analysis according to claim 1, wherein the ANFIS is a classical TSK fuzzy model that adopts Takagi-Sugeno-Kang type first-order rules, defined as
if x_1 is A_1 and ... and x_n is A_n,
then y = c_0 + c_1·x_1 + ... + c_n·x_n,
where x_i is the i-th feature of the data, A_i is a fuzzy set, y is the result of the rule, and c_i are the consequent coefficients of the TSK rule; the consequent part of the ANFIS is a linear function of the input vector rather than a Mamdani-type fuzzy rule, and the use of a linear function makes the ANFIS easier to interpret than other, more complex functions;
a typical ANFIS consists of 5 layers;
the input samples are fuzzified in the first layer using membership functions, the number of membership functions being adjustable before the ANFIS model is trained, and various function shapes are widely used in this layer; in this study, triangular membership functions are used, i.e.
μ(x) = max(min((x - a)/(b - a), (c - x)/(c - b)), 0),
where a, b, c are the three vertices of the triangle, and each feature has three membership functions describing the low, medium and high levels of the feature;
the first layer can be described as O1_{i,j} = μ_{A_{i,j}}(x_i), the membership of feature x_i in its j-th fuzzy set;
the second layer uses the memberships to calculate the firing strength of the sample under every rule; in this context, all sub-ANFIS models have 4 input features, so each model has 3^4 = 81 rules; the output of the second layer is the product of the memberships of all features:
w_k = μ_{A_{k,1}}(x_1) · μ_{A_{k,2}}(x_2) · μ_{A_{k,3}}(x_3) · μ_{A_{k,4}}(x_4);
the third layer normalizes the output of the previous layer:
w̄_k = w_k / Σ_j w_j,
where w̄_k is the normalized firing strength;
the fourth layer computes the result of each rule, using a linear function to combine the inputs:
O4_k = w̄_k · y_k = w̄_k · (c_{k,0} + c_{k,1}·x_1 + ... + c_{k,4}·x_4);
finally, the fifth layer completes the defuzzification:
O5 = Σ_k w̄_k · y_k;
for ANFIS, the result computed for a sample is a floating-point number, but a binary 0-1 variable is required, so rounding solves this problem.
6. The integrated adaptive neuro-fuzzy system applied to diabetes analysis according to claim 1, wherein ensemble learning combines multiple independent models to improve generalization performance, and the three strategies of today's ensemble learning technology are bagging, boosting and stacking;
as noted above, the output of an ANFIS sub-model is a floating-point number, with values above 0.5 judged positive and values below 0.5 judged negative; several bagging strategies will be tried:
(1) Voting mode
each sub-model's output is rounded and the majority vote is taken, where m is the number of sub-models, in CIR-ANFIS m being the number of features minus 3, and res is the prediction of each sub-model; the result is the prediction of the whole model, and if at least 50% of the sub-models predict the positive class for a sample, the final result is the positive class;
(2) Persuasion mode
with the same variables as above, the raw floating-point outputs are combined directly; under this strategy, a sub-model that strongly favors the positive or negative class can sway the outcome of the whole model;
(3) Persuasion mode with a suppression function
a continuous function is used to reduce the adverse effect of extreme values on classification performance; square-root and logarithmic functions are tried;
(4) Semi-persuasion mode
where a and b are upper and lower limits that bound each sub-model's ability to "persuade" the others; they may be floating-point numbers or infinity; in this context two semi-persuasion variants will be tried: a = 0 with b = infinity, and a = 0 with b = 1;
after the five steps above, the CIR-ANFIS model is obtained.
CN202310396053.9A 2023-04-14 2023-04-14 Integrated adaptive neuro-fuzzy system applied to diabetes analysis Pending CN116629311A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310396053.9A CN116629311A (en) 2023-04-14 2023-04-14 Integrated adaptive neuro-fuzzy system applied to diabetes analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310396053.9A CN116629311A (en) 2023-04-14 2023-04-14 Integrated adaptive neuro-fuzzy system applied to diabetes analysis

Publications (1)

Publication Number Publication Date
CN116629311A 2023-08-22

Family

ID=87608759

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310396053.9A Pending CN116629311A (en) Integrated adaptive neuro-fuzzy system applied to diabetes analysis

Country Status (1)

Country Link
CN (1) CN116629311A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination