CN117409963A

CN117409963A - Premature infant feeding intolerance risk prediction method and system

Info

Publication number: CN117409963A
Application number: CN202310123452.8A
Authority: CN
Inventors: 徐惠; 周瑞; 付连国; 杨丽娟; 陈信
Original assignee: First Affiliated Hospital of Bengbu Medical College
Current assignee: First Affiliated Hospital of Bengbu Medical College
Priority date: 2023-02-07
Filing date: 2023-02-07
Publication date: 2024-01-16

Abstract

The invention discloses a method for predicting the risk of feeding intolerance in premature infants, comprising the following steps: obtaining gastrointestinal feeding intolerance information from preset premature infant case information; obtaining an intolerance variable according to whether gastrointestinal feeding is tolerated, performing correlation analysis between each group of characteristic variables and the intolerance variable, and taking the highly correlated characteristic variables as sample characteristic variables; calculating the Shapley value of each group of sample characteristic variables and selecting those with large Shapley values as input characteristic variables; dividing the input characteristic variables into a training set and a test set by 10-fold cross-validation, training a model on the training set based on the XGBoost algorithm, and constructing the XGBoost model function; presetting a classification threshold, calculating the prediction probability of a sample to be predicted with the model function, and, if the prediction probability is greater than the classification threshold, judging that the premature infant corresponding to the sample is intolerant to gastrointestinal feeding. The invention also discloses a system adopting the method.

Description

Premature infant feeding intolerance risk prediction method and system

Technical Field

The invention relates to the technical field of intelligent prediction, in particular to a premature infant feeding intolerance risk prediction method and system.

Background

Feeding intolerance in premature infants is a clinically common group of digestive system symptoms. Because the gastrointestinal tract of premature infants is immature, and gastrointestinal motility develops more slowly than digestive and absorptive function, premature infants are more prone to feeding intolerance than term infants. It manifests as vomiting, abdominal distension, gastric retention and the like after gastrointestinal feeding begins, which seriously affects early nutritional support and poses great challenges to the reasonable feeding of premature infants. Studies show that the incidence of feeding intolerance in premature infants is 33.80-53.45% in China and about 25% abroad. Feeding intolerance leads to insufficient nutrient intake and extrauterine growth retardation, and prolonged parenteral nutrition also increases the incidence of complications such as nosocomial infection, metabolic disorders and liver damage. It also prolongs hospitalization, increases the economic burden on families and society, and affects the survival rate and quality of life of premature infants.

Feeding intolerance is a common clinical symptom with a complex pathogenesis and numerous influencing factors; accurately identifying the high-risk factors and applying targeted prevention is a key measure for reducing it. In China, Chen Qiong et al. used logistic regression modeling to predict the occurrence of feeding intolerance in premature infants; Li Yan et al. used Spearman correlation and regression analysis to examine the relationship between changes in superior mesenteric artery blood flow before and after feeding and feeding tolerance in premature infants, in the hope of predicting feeding intolerance from changes in gastrointestinal dynamics. Abroad, Carlo sought early effective biomarkers for prediction, measuring the visceral tissue oxygenation fraction with near-infrared spectroscopy to predict feeding tolerance in premature infants; Valentina used a generalized linear model to evaluate the relationship between visceral blood oxygen saturation, superior mesenteric artery Doppler blood flow velocity, and feeding tolerance; Bozzetti used logistic regression modeling to predict feeding tolerance in neonates with intrauterine growth restriction.

As can be seen from the above, most domestic and foreign studies that build models with data mining use only simple data mining methods, so the resulting models may not be optimal. Moreover, most models rely on biomarkers, which are costly, complicated to measure, and demanding in manpower and material resources; some assessment tools must be combined with imaging examinations, which increases the economic burden on patients and does not meet the public's expectation of a convenient and fast screening method. A systematic, convenient and accurate prediction method is therefore needed to make up for the shortcomings of existing research.

Disclosure of Invention

An object of the present invention is to provide a method that accurately predicts the risk of feeding intolerance in premature infants.

A method of predicting risk of feeding intolerance in premature infants comprising the steps of:

obtaining gastrointestinal feeding intolerance information according to preset premature infant case information, and obtaining a plurality of groups of characteristic variables in the gastrointestinal feeding intolerance information;

according to whether gastrointestinal feeding is tolerated, assigning a value to each premature infant case to obtain the intolerance variable, and performing correlation analysis between each group of characteristic variables and the intolerance variable to obtain the highly correlated characteristic variables, namely the sample characteristic variables;

calculating the Shapley value of each group of sample characteristic variables, and selecting the sample characteristic variables with large Shapley values as the input characteristic variables;

dividing the input characteristic variables into a training set and a test set by 10-fold cross-validation, training a model on the training set based on the XGBoost algorithm, and constructing the XGBoost model function;

presetting a classification threshold, calculating the prediction probability of a sample to be predicted by using a model function, and if the prediction probability is larger than the classification threshold, judging that the premature infant corresponding to the sample to be predicted is intolerant to gastrointestinal feeding.

According to the premature infant feeding intolerance risk prediction method provided by the invention, the features are ranked by importance according to the SHAP value of each feature and then selected for model training, which overcomes the curse of dimensionality and the noise caused by an excessive number of features, as well as the overfitting caused by increased model complexity.

In addition, the premature infant feeding intolerance risk prediction method provided by the invention can also have the following additional technical characteristics:

further, the characteristic variables include the following sets:

body weight, gestational age, 1-minute Apgar score, resuscitation history, neonatal asphyxia, NRDS, infection, PDA, PS use, probiotic use, blood transfusion, apnea, hyperthermia, abnormal interval between bowel movements, time of first feeding, and mechanical ventilation.

Further, the step of assigning a value to each of the premature infant case information based on whether the gastrointestinal feeding is intolerant, respectively, comprises:

if the premature infant is intolerant to feeding, assigning a first identification value;

otherwise, the second identification value is assigned.

Further, the step of performing correlation analysis on each group of characteristic variables and intolerant variables to obtain characteristic variables with high correlation, namely sample characteristic variables, includes:

inputting each group of characteristic variables and the intolerance variable into statistical software and performing Spearman correlation analysis to obtain a plurality of correlation coefficients ρi;

acquiring the characteristic variables with ρi < 0.05 and identifying them as the highly correlated characteristic variables, namely the sample characteristic variables.

Further, the step of calculating the Shapley value of each group of sample characteristic variables and selecting the sample characteristic variables with large Shapley values as the input characteristic variables includes:

calculating the Shapley value of each group of sample characteristic variables, and sorting all the Shapley values in descending order;

acquiring the top n sample characteristic variables in the Shapley value ranking and taking them as the input characteristic variables, wherein n is a positive integer.

Further, the step of presetting the classification threshold value includes:

calculating the Youden index P of the prediction model on different data sets using Youden's rule;

determining the cutoff point by maximizing the Youden index, and taking the average value of the Youden index P as the optimal classification threshold Bestp of the model.

Further, the step of calculating the Youden index P of the prediction model on different data sets using Youden's rule includes:

calculating the sensitivity and specificity of the prediction model using a binary confusion matrix;

calculating the Youden index P as: P = sensitivity + specificity - 1.

Another object of the invention is to propose a system for predicting the risk of feeding intolerance of premature infants, comprising:

the characteristic variable acquisition module is used for acquiring gastrointestinal feeding intolerance information according to preset premature infant case information and acquiring a plurality of groups of characteristic variables in the gastrointestinal feeding intolerance information;

the sample characteristic variable acquisition module is used for assigning a value to each premature infant case according to whether gastrointestinal feeding is tolerated, to obtain the intolerance variable, and for performing correlation analysis between each group of characteristic variables and the intolerance variable to obtain the highly correlated characteristic variables, namely the sample characteristic variables;

the input characteristic variable acquisition module is used for calculating the Shapley value of each group of sample characteristic variables and selecting the sample characteristic variables with large Shapley values as the input characteristic variables;

the model construction module is used for dividing the input characteristic variables into a training set and a test set by 10-fold cross-validation, training a model on the training set based on the XGBoost algorithm, and constructing the XGBoost model function;

the prediction module is used for presetting a classification threshold, calculating the prediction probability of the sample to be predicted by using a model function, and judging that the premature infant corresponding to the sample to be predicted is intolerant to gastrointestinal feeding if the prediction probability is larger than the classification threshold.

The beneficial effects of the invention are as follows: XGBoost is a Boosting ensemble learning algorithm that integrates multiple CART regression tree models into a strong classifier; it offers high accuracy and fast running speed, reduces overfitting through regularization, and avoids interference from outliers.

Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Drawings

The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:

FIG. 1 is a schematic flow chart of a first embodiment of the present invention;

FIG. 2 is a flow chart showing the steps of a method for predicting feeding intolerance of premature infants based on SHAP feature selection and XGBoost in accordance with a first embodiment of the present invention;

FIG. 3 is a schematic diagram showing SHAP additivity results;

FIG. 4 is a schematic diagram of feature importance ranking based on SHAP values;

FIG. 5 is a schematic diagram of the SHAP-based feature summary;

FIG. 6 is a schematic diagram of a 10-fold cross-validation principle;

FIG. 7 is a schematic representation of the ROC curve of the method of the invention;

FIG. 8 is a block diagram of a second embodiment of the present invention.

Detailed Description

In order that the objects, features and advantages of the invention will be readily understood, a more particular description of the invention will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. Several embodiments of the invention are presented in the figures. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.

Referring to fig. 1 and 2, a first embodiment of the present invention provides a method for predicting feeding intolerance risk of premature infants, comprising the following steps.

S1, obtaining gastrointestinal feeding intolerance information according to preset premature infant case information, and obtaining multiple groups of characteristic variables in the gastrointestinal feeding intolerance information.

In this embodiment, the preset premature infant case information is the case information of premature infants hospitalized in the neonatal intensive care unit (NICU), retrieved from the hospital's electronic medical record system; infants with gastrointestinal feeding intolerance (FI) and infants with gastrointestinal feeding tolerance are selected.

Further, the inclusion criteria for the preset premature infant case information are: (1) gestational age <37 weeks; (2) admission within 24 hours after birth; (3) hospital stay ≥7 days. The exclusion criteria are: (1) severe digestive tract malformation, congenital heart disease, inherited metabolic disease and the like; (2) infants who were never fed or whose treatment was voluntarily abandoned.

In this embodiment, the multiple groups of characteristic variables in the gastrointestinal feeding intolerance information may be obtained by collecting risk factors that affect feeding intolerance in premature infants. The risk factors are: (1) general condition of the infant (sex, gestational age, birth weight, gestational time, Apgar score one minute after birth, whether conceived by in vitro fertilization, whether there is a history of post-birth resuscitation, body temperature); (2) maternal conditions (pregnancy complications, amniotic fluid abnormality, placental abnormality, umbilical cord abnormality, fetal membrane abnormality, mode of delivery, assisted reproduction, multiple pregnancy, fetal position, maternal age); (3) postnatal diseases of the infant (neonatal asphyxia, neonatal respiratory distress syndrome, neonatal hypoxic-ischemic encephalopathy, neonatal infection, neonatal hyperbilirubinemia, patent ductus arteriosus); (4) drug use (antibiotics, probiotics, pulmonary surfactant (PS), caffeine); (5) others (time of first feeding, interval between bowel movements, ventilator use, blood transfusion, apnea). In other embodiments, the characteristic variables may be selected according to the actual situation.

S2, according to whether gastrointestinal feeding is tolerated, assigning a value to each premature infant case to obtain the intolerance variable, and performing correlation analysis between each group of characteristic variables and the intolerance variable to obtain the highly correlated characteristic variables, namely the sample characteristic variables.

In this embodiment, the sample characteristic variables include body weight, gestational age, 1-minute Apgar score, resuscitation history, neonatal asphyxia, NRDS, infection, PDA, PS use, probiotic use, blood transfusion, apnea, hyperthermia, abnormal interval between bowel movements, time of first feeding, and mechanical ventilation. In practice the sample characteristic variables may differ, and this embodiment is not limiting.

Specifically, the step of assigning a value to each of the premature infant case information based on whether gastrointestinal feeding is intolerant includes:

S21, if the premature infant is intolerant to feeding, assigning a first identification value;

S22, otherwise, assigning a second identification value.

In this embodiment, the case data acquired in step S1 are entered using the double-entry and automatic logic-check functions of EpiData 3.1 and then imported into SPSS 26.0 statistical software; all variables are coded starting from 0, so that categorical variables become numerical variables.

Specifically, the first identification value is 1, and the second identification value is 0. In other embodiments, the identification value may be selected according to the actual situation.

In this embodiment, when assigning a value to each premature infant, the value is also assigned to the corresponding characteristic variable, and the assignment mode is shown in table 1.

TABLE 1

Variable	Assigned value
Body weight	0 = ≤1.5 kg, 1 = <2.5 kg, 2 = ≥2.5 kg
Gestational age	0 = <34 weeks, 1 = ≥34 weeks
1-minute Apgar score	0 = ≤6 points, 1 = ≥7 points
Resuscitation history	0 = no, 1 = yes
Neonatal asphyxia	0 = no, 1 = yes
NRDS	0 = no, 1 = yes
Infection	0 = no, 1 = yes
PDA	0 = no, 1 = yes
PS use	0 = no, 1 = yes
Probiotics	0 = no, 1 = yes
Blood transfusion	0 = no, 1 = yes
Apnea	0 = no, 1 = yes
Hyperthermia	0 = no, 1 = yes
Abnormal interval between bowel movements	0 = no, 1 = yes
Time of first feeding	0 = <24 h, 1 = ≥24 h
Mechanical ventilation	0 = no, 1 = yes
Feeding intolerance	0 = no, 1 = yes
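The coding scheme of Table 1 can be illustrated with a minimal Python sketch. The record layout and the key names (`birth_weight_kg`, `gestational_age_weeks`, and so on) are hypothetical and chosen only for illustration; the embodiment itself performs this coding in EpiData and SPSS, and only a few of the table's variables are shown.

```python
def encode_case(case):
    """Map one raw case record to the integer codes of Table 1.

    `case` is a hypothetical dict; its keys are illustrative assumptions,
    not names used by the patent.
    """
    weight = case["birth_weight_kg"]
    if weight <= 1.5:
        weight_code = 0      # <= 1.5 kg
    elif weight < 2.5:
        weight_code = 1      # above 1.5 kg and below 2.5 kg
    else:
        weight_code = 2      # >= 2.5 kg

    return {
        "weight": weight_code,
        "gestational_age": 0 if case["gestational_age_weeks"] < 34 else 1,
        "apgar_1min": 0 if case["apgar_1min"] <= 6 else 1,
        "first_feeding": 0 if case["first_feeding_hours"] < 24 else 1,
        # Yes/no variables are coded 1/0, as in the table.
        "nrds": 1 if case["nrds"] else 0,
        "feeding_intolerance": 1 if case["feeding_intolerance"] else 0,
    }


example = encode_case({
    "birth_weight_kg": 1.8,
    "gestational_age_weeks": 32,
    "apgar_1min": 7,
    "first_feeding_hours": 30,
    "nrds": True,
    "feeding_intolerance": False,
})
```

The remaining yes/no variables of Table 1 would follow the same 0/1 pattern as `nrds`.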

Specifically, the step of performing correlation analysis between each group of characteristic variables and the intolerance variable to obtain the highly correlated characteristic variables, namely the sample characteristic variables, includes:

S23, inputting each group of characteristic variables and the intolerance variable into statistical software and performing Spearman correlation analysis to obtain a plurality of correlation coefficients ρi;

S24, acquiring the characteristic variables with ρi < 0.05 and identifying them as the highly correlated characteristic variables, namely the sample characteristic variables.

It should be noted that the sample characteristic variables can be understood as statistically significant factors, whose variation has a more pronounced influence than that of other factors.
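As an illustration of the Spearman analysis in step S23, the following pure-Python sketch computes the rank correlation coefficient between one coded characteristic variable and the intolerance variable. It is not the statistical software used in the embodiment; the function names and sample data are hypothetical, and the sketch computes only the coefficient, leaving the selection rule of step S24 to the statistical software.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical coded data: one binary feature and the intolerance variable.
feature = [1, 1, 0, 1, 0, 0]
intolerance = [1, 1, 0, 1, 0, 1]
rho = spearman_rho(feature, intolerance)
```

Because the variables in Table 1 are coded into a few categories, ties are frequent, which is why the sketch uses average ranks rather than the simplified formula that assumes distinct ranks.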

S3, calculating the Shapley value of each group of sample characteristic variables, and selecting the sample characteristic variables with large Shapley values as the input characteristic variables.

Specifically, the step of calculating the Shapley value of each group of sample characteristic variables and selecting the sample characteristic variables with large Shapley values as the input characteristic variables includes:

S31, calculating the Shapley value of each group of sample characteristic variables, and sorting all the Shapley values in descending order;

S32, acquiring the top n sample characteristic variables in the Shapley value ranking and taking them as the input characteristic variables, wherein n is a positive integer.

In this embodiment, n = 13; in other embodiments, n may be selected according to the actual situation.
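Steps S31 and S32 can be sketched as follows. The sketch assumes that the Shapley values are already available as one row per sample, and that features are ranked by their mean absolute Shapley value; that aggregation is a common convention and an assumption here, since the text only speaks of "large" Shapley values. Feature names and numbers are hypothetical.

```python
def top_n_features(shap_values, feature_names, n):
    """Return the n feature names with the largest mean |Shapley value|.

    shap_values: list of rows, one row per sample, one value per feature.
    """
    n_samples = len(shap_values)
    importance = {
        name: sum(abs(row[j]) for row in shap_values) / n_samples
        for j, name in enumerate(feature_names)
    }
    # Sort in descending order of importance and keep the first n names.
    ranked = sorted(importance, key=importance.get, reverse=True)
    return ranked[:n]


# Hypothetical Shapley values for three samples and three features.
names = ["gestational_age", "weight", "pda"]
values = [
    [0.30, -0.10, 0.02],
    [-0.25, 0.15, 0.01],
    [0.20, -0.05, -0.03],
]
selected = top_n_features(values, names, n=2)
```

With n = 13, as in this embodiment, the same call would keep the thirteen highest-ranked sample characteristic variables.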

Referring to fig. 4, the feature importance ranking implementation method of the present invention calculates Shapley values of each feature through SHAP, and then ranks the importance of the sample features. FIG. 5 shows a summary of features based on SHAP, which can reveal not only the importance of the effect, but also the general direction of the effect, providing great assistance in selecting features.

It can be understood that, in this step, the statistically significant factors from step S2 are imported into R, and the Shapley value of each sample feature is calculated with SHAP; the larger the Shapley value, the more significant the influence on the model, which measures the importance of each feature to the final prediction result.

Referring to FIG. 3, the main idea of SHAP comes from the fair-allocation problem in cooperative game theory. For a machine learning model, the model produces a predicted value for each sample, and that prediction is jointly determined by the contribution (Shapley value) of each feature. Let the i-th sample be $x_i$, the k-th feature of the i-th sample be $x_{ik}$, and the predicted value of the model to be explained for that sample be $y_i$. The SHAP explanation then obeys the following equation:

$$y_i = y_{base} + f(x_{i1}) + f(x_{i2}) + \cdots + f(x_{ik})$$

where $y_{base}$ is the baseline of the whole model output, i.e. the expected prediction of the original model over all training samples (the mean prediction over all samples), and $f(x_{ik})$ is the contribution of the k-th feature of the i-th sample to the final predicted value $y_i$ (i.e. its Shapley value). When $f(x_{ik}) > 0$, the feature raises the predicted value and has a positive effect on the output; when $f(x_{ik}) < 0$, the feature lowers the predicted value and has a negative effect. SHAP can therefore reveal not only the importance of an effect but also its overall direction, characterizing the overall positive or negative relationship between each characteristic variable and feeding intolerance in premature infants.
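The additivity property described above (baseline plus per-feature contributions equals the prediction) can be checked on a toy example. The sketch below computes exact Shapley values by enumerating all feature orderings. The toy model and the use of a single all-zero reference point as the baseline are simplifying assumptions for illustration; the SHAP package normally averages over a background data set and uses faster algorithms for tree models.

```python
from itertools import permutations


def shapley_values(model, x, background):
    """Exact Shapley values of `model` at point `x`.

    Features not yet "present" take the value from `background`
    (a single reference point; a simplifying assumption).
    """
    k = len(x)
    phi = [0.0] * k
    orders = list(permutations(range(k)))
    for order in orders:
        current = list(background)
        previous = model(current)
        for j in order:
            current[j] = x[j]              # feature j joins the coalition
            value = model(current)
            phi[j] += value - previous     # marginal contribution of j
            previous = value
    return [p / len(orders) for p in phi]


def toy_model(features):
    """Hypothetical risk score with an interaction between features a and c."""
    a, b, c = features
    return 0.1 + 0.4 * a + 0.2 * b + 0.3 * a * c


x = [1, 1, 1]
background = [0, 0, 0]
baseline = toy_model(background)           # plays the role of the baseline
phi = shapley_values(toy_model, x, background)
reconstructed = baseline + sum(phi)        # should equal toy_model(x)
```

In this toy case the interaction term is split evenly between the two interacting features, which is how Shapley values attribute joint effects.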

S4, dividing the input characteristic variables into a training set and a test set by 10-fold cross-validation, training a model on the training set based on the XGBoost algorithm, and constructing the XGBoost model function.

Referring to fig. 6, the "10-fold cross-validation method" is a common method for validating classifier performance. The original data set is divided into 10 parts on average, 9 different parts are selected as training sets each time, 1 part is left as a test set, then training, prediction and evaluation of the model are carried out, and the result of each time is recorded. The above procedure was repeated 10 times and the recorded 10 results were averaged as a final indicator of the quality of the assessment model. The 10-fold cross validation ensures that each sample is validated once, so that the influence of data set division can be reduced, and the stability and generalization capability of the model can be conveniently inspected.
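The partitioning described above can be sketched in pure Python. The shuffling with a fixed seed is an assumption added for reproducibility; the text does not state whether the data are shuffled or stratified, and the function name is hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # assumption: shuffle before splitting
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Each of the 25 hypothetical samples lands in exactly one test fold.
splits = list(k_fold_indices(25, k=10))
```

A model would be trained on `train` and evaluated on `test` for each pair, and the ten evaluation results averaged as described in the text.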

It should be noted that XGBoost is a boosted tree model that integrates multiple CART regression tree models into a strong classifier. The idea is to keep adding trees, growing each tree by repeated feature splitting; each added tree in effect learns a new function that fits the residual of the previous prediction.

Assuming a total of t trees, with F denoting the space of tree models, the predicted value $\hat{y}_i$ can be expressed as:

$$\hat{y}_i = \sum_{k=1}^{t} f_k(x_i), \quad f_k \in F$$

The objective function is:

$$Obj = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{t} \Omega(f_k)$$

where l is the loss function, representing the error between the predicted value and the actual value, and Ω is the regularization function that prevents model overfitting.

The regularization function in XGBoost is expressed as follows:

$$\Omega(f) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^2$$

where T is the number of leaf nodes of a tree and $w_j$ is the weight of its j-th leaf; γ and λ are introduced to restrain tree growth and prevent model overfitting. λ is the L2 regularization coefficient and γ is the splitting threshold. Solving the objective function yields the optimal scoring function; the smaller its value, the better the tree model:

$$Obj^* = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^2}{H_j + \lambda} + \gamma T$$

where $G_j$ and $H_j$ are the sums of the first-order and second-order gradient statistics of the loss over the samples in leaf j.

A tree model can be evaluated with the scoring function, but the number of candidate trees is unbounded and it is impossible to score them all. XGBoost solves this with a greedy algorithm: starting from the root node of the tree, it checks whether the objective function value decreases after a split compared with before. Assuming the node before splitting is j, its contribution to the objective function is:

$$Obj_j = -\frac{1}{2} \frac{G_j^2}{H_j + \lambda} + \gamma$$

After the node splits, the contribution of the two child nodes to the objective function is:

$$Obj_s = -\frac{1}{2} \left[ \frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda} \right] + 2\gamma$$

The objective function therefore changes by:

$$Obj_j - Obj_s = \frac{1}{2} \left[ \frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda} - \frac{G_j^2}{H_j + \lambda} \right] - \gamma$$

Finally, with $G_j = G_L + G_R$ and $H_j = H_L + H_R$, the information gain of the objective function for each split is:

$$Gain = \frac{1}{2} \left[ \frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda} - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda} \right] - \gamma$$

where $G_L$ and $G_R$ are the sums of the first-order gradient statistics of the left and right leaf nodes at the split, and $H_L$ and $H_R$ are the sums of the second-order gradient statistics of the left and right leaf nodes.
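A worked example may help make the split criterion concrete. The sketch below evaluates the widely documented XGBoost split gain, 1/2 [G_L²/(H_L+λ) + G_R²/(H_R+λ) − (G_L+G_R)²/(H_L+H_R+λ)] − γ, for one candidate split. It assumes logistic loss (first-order gradient p − y, second-order gradient p(1 − p)); the function names, the four-sample data, and the parameter values are hypothetical.

```python
def leaf_score(G, H, lam):
    """A leaf's contribution to the objective, excluding gamma: -G^2 / (2 (H + lambda))."""
    return -0.5 * G * G / (H + lam)


def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Objective before the split minus objective after the split."""
    GL, HL = sum(g_left), sum(h_left)
    GR, HR = sum(g_right), sum(h_right)
    before = leaf_score(GL + GR, HL + HR, lam) + gamma          # one leaf
    after = leaf_score(GL, HL, lam) + leaf_score(GR, HR, lam) + 2 * gamma  # two leaves
    return before - after


def logistic_grad_hess(p, y):
    """First- and second-order gradients of logistic loss at probability p."""
    return p - y, p * (1.0 - p)


# Four hypothetical samples with initial probability 0.5;
# the candidate split sends the two positives left and the two negatives right.
left = [logistic_grad_hess(0.5, 1), logistic_grad_hess(0.5, 1)]
right = [logistic_grad_hess(0.5, 0), logistic_grad_hess(0.5, 0)]
g_l, h_l = [g for g, _ in left], [h for _, h in left]
g_r, h_r = [g for g, _ in right], [h for _, h in right]

gain_no_penalty = split_gain(g_l, h_l, g_r, h_r, lam=1.0, gamma=0.0)
gain_with_penalty = split_gain(g_l, h_l, g_r, h_r, lam=1.0, gamma=1.0)
```

A positive gain means the split lowers the objective; with a large enough γ the same split yields a negative gain and would not be made, which is how γ acts as a splitting threshold.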

In addition, grid search is used to find the optimal parameters when training the model. Different parameter settings can greatly affect the prediction performance of the model; grid search builds a search space from the candidate parameter values and searches it exhaustively to obtain the best-performing combination.
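The exhaustive search can be sketched as follows. The parameter names `max_depth` and `learning_rate` are real XGBoost hyperparameters, but the candidate values are hypothetical, and the scoring function here is a stand-in: in practice it would train the model with the given parameters and return a cross-validated metric such as the mean AUC.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def stand_in_score(params):
    """Placeholder for a cross-validated metric (assumption, not a real model)."""
    return -((params["max_depth"] - 4) ** 2) - (params["learning_rate"] - 0.1) ** 2


grid = {"max_depth": [3, 4, 5], "learning_rate": [0.05, 0.1, 0.3]}
best_params, best_score = grid_search(grid, stand_in_score)
```

The cost grows with the product of the list lengths, so the grid is usually kept small or refined in stages.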

Referring to FIG. 7, the present application further verifies the constructed model with an ROC curve. The ROC curve takes the false positive rate (1 - specificity) as the horizontal axis and the true positive rate (sensitivity) as the vertical axis, and connects the points produced by different cutoff values; the area under the curve (AUC) reflects the accuracy of the diagnostic test. The AUC ranges between 0.5 and 1. It is generally considered that an AUC of 0.5 to 0.7 indicates fair diagnostic accuracy, 0.7 to 0.8 moderate accuracy, and >0.8 good accuracy. As can be seen from FIG. 7, the model constructed by the present invention has good diagnostic performance.
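For reference, the AUC can be computed without drawing the curve, using its equivalent interpretation as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below uses that pairwise formulation; the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = feeding intolerant) and predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
```

Here eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9.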

S5, presetting a classification threshold, calculating the prediction probability of the sample to be predicted by using a model function, and if the prediction probability is larger than the classification threshold, judging that the premature infant corresponding to the sample to be predicted is intolerant to gastrointestinal feeding.

Specifically, the step of presetting the classification threshold value includes:

S51, calculating the Youden index P of the prediction model on different data sets using Youden's rule;

S52, determining the cutoff point by maximizing the Youden index, taking the average of the 10 values of the Youden index P, and using that average as the optimal classification threshold Bestp of the model.

Further, the step of calculating the Youden index P of the prediction model on different data sets using Youden's rule includes:

S511, calculating the sensitivity and specificity of the prediction model using a binary confusion matrix;

S512, calculating the Youden index P as: P = sensitivity + specificity - 1.

The confusion matrix is a visualization tool used mainly in supervised learning and a standard format for reporting accuracy evaluation; it is expressed as a matrix of n rows and n columns, as shown in Table 2 below.

TABLE 2

(1) Sensitivity: the identified positive examples are the proportion of all positive examples, i.e. the patient is judged as the patient, and no missed diagnosis occurs.

(2) Specificity: the identified negative examples are the proportion of all negative examples, namely, normal people are judged as normal people, and misjudgment does not occur.

The Youden index P is calculated as sensitivity + specificity - 1; it combines the abilities of the diagnostic method under evaluation to identify true patients and true non-patients, minus the base value 1. The larger the value, the better the diagnostic method under evaluation.
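The calculation of sensitivity, specificity, and the Youden index from a binary confusion matrix can be sketched as follows, together with the choice of the cutoff that maximizes the index. The function names, data, and candidate cutoffs are hypothetical; a sample is counted as positive when its score is strictly greater than the cutoff, matching the "larger than the classification threshold" rule of step S5.

```python
def confusion_counts(labels, scores, threshold):
    """Return (tp, fn, fp, tn) for the rule: positive if score > threshold."""
    tp = fn = fp = tn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s > threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def youden_index(labels, scores, threshold):
    """P = sensitivity + specificity - 1."""
    tp, fn, fp, tn = confusion_counts(labels, scores, threshold)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def best_cutoff(labels, scores, candidates):
    """Candidate cutoff with the largest Youden index."""
    return max(candidates, key=lambda t: youden_index(labels, scores, t))


labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
cutoff = best_cutoff(labels, scores, candidates=[0.3, 0.5, 0.85])
```

At the cutoff 0.5 all three positives are detected (sensitivity 1) and two of the three negatives are rejected (specificity 2/3), giving a Youden index of 2/3, the largest among the three candidates.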

Bestp is calculated as:

$$Bestp = \frac{1}{10} \sum_{j=1}^{10} P_j$$

where $P_j$ is the value obtained in the j-th of the 10 cross-validation rounds of step S52.

It should be noted that, once the XGBoost model function y = f(x) has been constructed, the model outputs a prediction probability $P_i$ for each sample. The data set provided by the invention has K features, and the features of each sample can be expressed as:

$$X_i = (X_{i1}, X_{i2}, X_{i3}, \ldots, X_{ik})$$

where $X_i$ is the i-th sample and $X_{ik}$ is the k-th feature of the i-th sample; the prediction probability of each sample is $P_i = f(X_i)$. When $P_i$ is greater than Bestp, the sample is predicted to be feeding intolerant; otherwise it is predicted to be feeding tolerant. This yields the risk prediction of feeding intolerance in premature infants.
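The final decision rule can be sketched in a few lines: the ten per-fold values from step S52 are averaged into Bestp, and each predicted probability is compared against it. The fold values and probabilities below are hypothetical, and the probabilities are assumed to come from an already trained model.

```python
def mean_of_folds(fold_values):
    """Average of the per-fold values, used as the threshold Bestp."""
    return sum(fold_values) / len(fold_values)


def classify(probabilities, bestp):
    """1 = predicted feeding intolerant (P_i > Bestp), 0 = predicted tolerant."""
    return [1 if p > bestp else 0 for p in probabilities]


# Hypothetical values from the ten cross-validation rounds.
fold_values = [0.40, 0.45, 0.50, 0.55, 0.50, 0.45, 0.40, 0.50, 0.55, 0.50]
bestp = mean_of_folds(fold_values)

# Hypothetical model outputs for three samples to be predicted.
predictions = classify([0.30, 0.62, 0.91], bestp)
```

Only samples whose probability is strictly greater than the threshold are flagged, in line with the "larger than the classification threshold" wording of step S5.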

In summary, the data are preprocessed on the basis of SHAP feature selection: the features are ranked by importance according to the SHAP value of each feature and then selected for model training, which overcomes the curse of dimensionality and the noise caused by an excessive number of features, as well as the overfitting caused by increased model complexity. The model is trained and evaluated with 10-fold cross-validation, which reduces the influence of data partitioning on the prediction results and makes the model's accuracy more reliable and its generalization ability stronger. In addition, an XGBoost ensemble learning model is built, grid search is used for parameter tuning, and the optimal cutoff point is determined by maximizing the Youden index, which further improves the reliability and prediction accuracy of the model and is of great significance for preventing feeding intolerance in premature infants.

Referring to fig. 8, a second embodiment of the present invention provides a system for predicting risk of feeding intolerance of premature infants, comprising:

Specifically, the multiple groups of characteristic variables in the gastrointestinal feeding intolerance information include: (1) general condition of the infant (sex, gestational age, birth weight, gestational time, Apgar score one minute after birth, whether conceived by in vitro fertilization, whether there is a history of post-birth resuscitation, body temperature); (2) maternal conditions (pregnancy complications, amniotic fluid abnormality, placental abnormality, umbilical cord abnormality, fetal membrane abnormality, mode of delivery, assisted reproduction, multiple pregnancy, fetal position, maternal age); (3) postnatal diseases of the infant (neonatal asphyxia, neonatal respiratory distress syndrome, neonatal hypoxic-ischemic encephalopathy, neonatal infection, neonatal hyperbilirubinemia, patent ductus arteriosus); (4) drug use (antibiotics, probiotics, pulmonary surfactant (PS), caffeine); (5) others (time of first feeding, interval between bowel movements, ventilator use, blood transfusion, apnea).

In the sample characteristic variable acquisition module, the case data acquired by the characteristic variable acquisition module are entered using the double-entry and automatic logic-check functions of EpiData 3.1 and then imported into SPSS 26.0 statistical software; all variables are coded starting from 0, so that categorical variables become numerical variables.

The Youden index P of the prediction model on different data sets is calculated using Youden's rule, with the sensitivity and specificity of the prediction model calculated from a binary confusion matrix: P = sensitivity + specificity - 1;

$$X_i = (X_{i1}, X_{i2}, X_{i3}, \ldots, X_{ik})$$

In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

The foregoing examples illustrate only a few embodiments of the invention and are described in detail herein without thereby limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.

Claims

1. A method of predicting risk of feeding intolerance in premature infants, comprising the steps of:

2. The method of claim 1, wherein the characteristic variables comprise the following sets of:

3. The method of claim 1, wherein assigning each premature case information based on whether gastrointestinal feeding is intolerant comprises:

otherwise, the second identification value is assigned.

4. A method of predicting risk of feeding intolerance in premature infants according to claim 3 wherein the step of performing a correlation analysis on each set of characteristic variables and intolerance variables to obtain a characteristic variable of high correlation, i.e. a sample characteristic variable, comprises:

5. The method of claim 1, wherein the step of calculating Shapley values for each set of sample characteristic variables, selecting sample characteristic variables having large Shapley values, and obtaining input characteristic variables comprises:

6. The method of claim 1, wherein the step of pre-setting a classification threshold comprises:

7. The method of claim 6, wherein the step of calculating the Youden index P of the predictive model under different data sets using Youden's rule comprises:

8. A system for predicting risk of feeding intolerance in premature infants, comprising: