CN114283937A - Device for predicting the risk of renal progression in ANCA (anti-neutrophil cytoplasmic antibody)-associated small-vessel vasculitis, and model training method - Google Patents


Info

Publication number
CN114283937A
Authority
CN
China
Prior art keywords
risk assessment
risk
months
percentage
assessment model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111161003.XA
Other languages
Chinese (zh)
Inventor
赵明辉
陈旻
李志盈
王晋伟
王瑞雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University First Hospital
Original Assignee
Peking University First Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University First Hospital filed Critical Peking University First Hospital
Priority to CN202111161003.XA priority Critical patent/CN114283937A/en
Publication of CN114283937A publication Critical patent/CN114283937A/en
Pending legal-status Critical Current

Landscapes

  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The present disclosure provides a device and a model training method for predicting the risk of renal progression in anti-neutrophil cytoplasmic antibody (ANCA)-associated small-vessel vasculitis. The device comprises: an input unit configured to input, as input variables, values of three clinicopathological parameters of a target user having ANCA-associated small-vessel vasculitis, the three parameters including the percentage of normal glomeruli as a first continuous variable, the estimated glomerular filtration rate as a second continuous variable, and interstitial fibrosis/tubular atrophy as a third, categorical variable; a risk assessment unit configured to input the three input variables into a pre-trained risk assessment model based on a proportional hazards regression model, the risk assessment model calculating from the three input variables one or more percentage values indicating the likelihood that the target user progresses to end-stage renal disease after one or more predetermined time periods; and an output unit configured to output the one or more percentage values.

Description

Device for predicting the risk of renal progression in ANCA (anti-neutrophil cytoplasmic antibody)-associated small-vessel vasculitis, and model training method
Technical Field
The present disclosure relates to the field of disease prognosis models. In particular, the present disclosure relates to a device for predicting the risk of renal progression in anti-neutrophil cytoplasmic antibody (ANCA)-associated small-vessel vasculitis and to a method of training a risk assessment model for predicting that risk.
Background
Anti-neutrophil cytoplasmic antibody (ANCA)-associated small-vessel vasculitis (AAV) is a group of diseases characterized by necrotizing small-vessel vasculitis, including microscopic polyangiitis (MPA), granulomatosis with polyangiitis (GPA), and eosinophilic granulomatosis with polyangiitis (EGPA), and is often accompanied by elevated serum autoantibodies (anti-proteinase 3 antibody and anti-myeloperoxidase antibody). The kidney is one of the organs most commonly affected by AAV: 80-100% of MPA patients and 38-70% of GPA patients present with glomerulonephritis, the typical pathological manifestation being pauci-immune crescentic glomerulonephritis. Although current immunotherapy regimens have significantly improved the prognosis of AAV patients, some patients still progress to end-stage renal disease (ESRD), i.e., they require maintenance dialysis or kidney transplantation.
Factors found to be relevant to renal prognosis include age, AAV typing, initial renal function, renal recurrence, normal glomerular proportion, interstitial fibrosis/tubular atrophy (IF/TA).
However, the prior art either merely screens out independent risk factors related to renal prognosis, so that the risk cannot be evaluated intuitively, or uses simple categorical variables as the factors in a scoring model. In such a scoring model, the categorical variables are assigned values (0, 2, 4, 6, etc., according to grade) and summed. Patients are then classified by total score into a low-risk group (score 0), a medium-risk group (score 2-7), and a high-risk group (score 8-11).
Because such scoring models use categorical variables, they cannot mine the information in the data efficiently, and analysis shows that the agreement between their predicted values and the actually observed values is poor. Furthermore, such scoring models ultimately give only descriptive results such as "low risk" or "high risk", which do not allow more accurate and effective treatment decisions to be made for patients.
Therefore, it is desirable to provide a new prognostic assessment tool that can more accurately assess the ESRD risk of patients with ANCA-associated small-vessel vasculitis.
Disclosure of Invention
The present disclosure has been made in view of the above problems. An object of the present disclosure is to provide an apparatus for predicting the risk of renal progression in ANCA-associated small-vessel vasculitis, a method of training a risk assessment model for predicting that risk, an electronic device, and a computer-readable medium.
Embodiments of the present disclosure provide an apparatus for predicting the risk of renal progression in ANCA-associated small-vessel vasculitis, comprising:
an input unit configured to input, as input variables, values of three clinicopathological parameters of a target user having ANCA-associated small-vessel vasculitis, the three parameters including the percentage of normal glomeruli as a first continuous variable, the estimated glomerular filtration rate (eGFR) as a second continuous variable, and interstitial fibrosis/tubular atrophy (IF/TA) as a third, categorical variable;
a risk assessment unit configured to input the three input variables into a pre-trained risk assessment model based on a proportional hazards regression model, the risk assessment model calculating from the three input variables one or more percentage values indicating the likelihood that the target user progresses to end-stage renal disease (ESRD) after one or more predetermined time periods; and
an output unit configured to output the one or more percentage values.
Optionally, the apparatus further comprises:
an obtaining unit configured to obtain, as a training data set, prognosis-related data of a plurality of users having ANCA-associated small-vessel vasculitis, the prognosis-related data including at least the percentage of normal glomeruli, the estimated glomerular filtration rate, interstitial fibrosis/tubular atrophy (IF/TA), and data indicating the renal prognosis after a predetermined follow-up time,
wherein the risk assessment unit is further configured to train the risk assessment model using the training data set to determine a first weight factor for a first continuous variable, a second weight factor for a second continuous variable, and a third weight factor for a third categorical variable of the risk assessment model.
Optionally, the first continuous variable has any value within a first range of values, the second continuous variable has any value within a second range of values, and the third categorical variable has a value corresponding to one of three categories, the three categories being a first category of < 25%, a second category of 25%-50%, and a third category of > 50%.
Optionally, the plurality of time periods comprises 36 months, 60 months, and 120 months, and
the first weighting factor has a first value corresponding to 36 months, a second value corresponding to 60 months, and a third value corresponding to 120 months;
the second weighting factor has a fourth value corresponding to 36 months, a fifth value corresponding to 60 months, and a sixth value corresponding to 120 months;
the third weighting factor has a seventh value corresponding to 36 months, an eighth value corresponding to 60 months, and a ninth value corresponding to 120 months.
Optionally, the risk assessment unit is further configured to train the risk assessment model using the training data set to determine, for the risk assessment model, the average of the normal glomerular percentage, the average of the estimated glomerular filtration rate, the population fraction of each grade of interstitial fibrosis/tubular atrophy (IF/TA), and the value of the baseline survival function corresponding to each of a plurality of time periods.
Optionally, the risk assessment model calculates the one or more percentage values indicating the likelihood that the target user progresses to end-stage renal disease (ESRD) after the one or more predetermined time periods using the average of the normal glomerular percentage, the average of the estimated glomerular filtration rate, the population fraction of each grade of interstitial fibrosis/tubular atrophy (IF/TA), the value of the baseline survival function corresponding to each of the plurality of time periods, and the target user's normal glomerular percentage, estimated glomerular filtration rate, and IF/TA grade.
Optionally, the risk assessment unit is further configured to calculate a Harrell's C index indicative of a degree of discrimination of the risk assessment model, the Harrell's C index including a plurality of values corresponding to each of a plurality of time periods, and each of the plurality of values being greater than a first threshold.
Optionally, the risk assessment unit is further configured to calculate a value of a Hosmer-Lemeshow test indicating a degree of calibration of the risk assessment model, the value of the Hosmer-Lemeshow test having a plurality of values corresponding to each of a plurality of time periods, and each of the plurality of values being less than a second threshold.
Another embodiment of the present disclosure provides a method of training a risk assessment model for predicting the risk of renal progression in ANCA-associated small-vessel vasculitis, comprising:
obtaining, as a training data set, relevant data of a plurality of users having ANCA-associated small-vessel vasculitis, the relevant data including at least the percentage of normal glomeruli as a first continuous variable, the estimated glomerular filtration rate as a second continuous variable, interstitial fibrosis/tubular atrophy (IF/TA) as a third, categorical variable, and data indicating the renal prognosis after a predetermined follow-up time,
training the risk assessment model using the training data set to determine a first weight factor for a first continuous variable, a second weight factor for a second continuous variable, and a third weight factor for a third categorical variable of the risk assessment model, wherein the risk assessment model calculates one or more percentage values indicative of a likelihood of the target user to progress to End Stage Renal Disease (ESRD) after a predetermined one or more time periods based on three input variables, including a percentage of normal glomeruli as the first continuous variable, an estimated glomerular filtration rate as the second continuous variable, and interstitial fibrosis/tubular atrophy (IF/TA) as the third categorical variable.
Optionally, the method further comprises:
training the risk assessment model using the training data set to determine, for the risk assessment model, the average of the normal glomerular percentage, the average of the estimated glomerular filtration rate, the population fraction of each grade of interstitial fibrosis/tubular atrophy (IF/TA), and the value of the baseline survival function corresponding to each of a plurality of time periods.
Another embodiment of the present disclosure provides an electronic device including a memory and a processor, wherein the memory has stored thereon program code readable by the processor, which when executed by the processor performs the method of the above embodiment.
Another embodiment of the present disclosure provides a computer-readable storage medium having stored thereon computer-executable instructions for performing the method of the above embodiment.
Therefore, according to the apparatus for predicting the risk of renal progression in ANCA-associated small-vessel vasculitis and the method of training the risk assessment model according to the embodiments of the present disclosure, the complexity of the model is greatly reduced by selecting, from the many factors associated with renal prognosis, the percentage of normal glomeruli, the estimated glomerular filtration rate (eGFR), and interstitial fibrosis/tubular atrophy (IF/TA) as the variables of the risk assessment model. Furthermore, by entering the percentage of normal glomeruli and eGFR into the model as continuous variables and IF/TA as a three-level categorical variable, the information in the data can be mined more effectively, creating a more predictive model. In addition, the model of the embodiments of the present disclosure can calculate the risk of a patient progressing to ESRD after a given time and presents that risk as a percentage value; compared with descriptions such as "low risk" or "high risk", the result is more individualized and intuitive, and can conveniently guide treatment decisions for the patient.
Drawings
Fig. 1 is a flow chart illustrating a method of training a risk assessment model for predicting the risk of renal progression in ANCA-associated small-vessel vasculitis according to a first embodiment of the present disclosure.
Fig. 2 is a flow diagram illustrating the screening of patients for the training data set.
Fig. 3 is a block diagram illustrating an apparatus for predicting the risk of renal progression in ANCA-associated small-vessel vasculitis according to a second embodiment of the present disclosure.
Fig. 4 is a diagram showing the operation of the apparatus according to the second embodiment of the present disclosure.
Fig. 5 is a schematic view showing the operation results of the apparatus according to the second embodiment of the present disclosure.
Fig. 6 is a diagram showing the degree of calibration of the model.
Detailed Description
Technical solutions in embodiments of the present disclosure will be described clearly and completely with reference to the accompanying drawings, and it is to be understood that the described embodiments are only some embodiments, but not all embodiments, of the present disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
The terms used in the present specification are those general terms currently widely used in the art in consideration of functions related to the present disclosure, but they may be changed according to the intention of a person having ordinary skill in the art, precedent, or new technology in the art. Also, specific terms may be selected by the applicant, and in this case, their detailed meanings will be described in the detailed description of the present disclosure. Therefore, the terms used in the specification should not be construed as simple names but based on the meanings of the terms and the overall description of the present disclosure.
Although the present disclosure makes various references to certain modules in an apparatus according to embodiments of the present disclosure, any number of different modules may be used and run on a user terminal and/or server. The modules are merely illustrative and different aspects of the apparatus and methods may use different modules.
Flowcharts are used in this disclosure to illustrate the operations performed by an apparatus according to embodiments of the present disclosure. It should be understood that the operations are not necessarily performed in the exact order shown. Rather, the various steps may be processed in reverse order or simultaneously, as desired. Other operations may also be added to the processes, or one or more steps may be removed from them.
< first embodiment >
Fig. 1 shows a flow chart of a method of training a risk assessment model for predicting the risk of renal development of anti-neutrophil cytoplasmic antibody associated microangioitis according to a first embodiment of the present disclosure.
Before explaining the method according to the first embodiment of the present disclosure, the Cox proportional hazards regression model is first briefly introduced. The risk assessment model according to embodiments of the present disclosure is based on the Cox proportional hazards regression model, which is used to study the effect of certain factors on survival time. The main objective of survival analysis is to study the effect of variables X (such as proteinuria, creatinine, blood pressure, age, and the several variables used in this disclosure) on the observed outcome of a survival event (such as death or end-stage renal disease), which is generally expressed as a survival function.
The Cox proportional hazards regression model takes the hazard function h(t, X) as the dependent variable and establishes an exponential regression equation:
$$h(t, X) = h_0(t)\exp(\beta_1 X_1 + \beta_2 X_2 + \dots + \beta_m X_m) \tag{1}$$
Equation (1) can be rewritten as equation (2):
$$\ln\left[h(t, X)/h_0(t)\right] = \ln RR = \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_m X_m \tag{2}$$
In Cox regression, the effect of each risk factor (i.e., each variable X) does not change with time t; that is, h(t, X)/h0(t) does not change with time.
In the above formulas, β is called the regression coefficient, and the influence of the risk factor Xi on the hazard ratio HR can be calculated from β.
HR = exp(β), i.e., e raised to the power β, is called the relative risk or hazard ratio (denoted HR in Cox regression and RR in logistic regression).
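As a short worked step that is not part of the original text, dividing equation (1) evaluated at two covariate vectors (written here, purely for illustration, as X^a and X^b) shows why the ratio does not depend on time: the baseline hazard h0(t) cancels.

```latex
\frac{h(t, X^{a})}{h(t, X^{b})}
  = \frac{h_0(t)\exp\left(\sum_{i=1}^{m}\beta_i X^{a}_i\right)}
         {h_0(t)\exp\left(\sum_{i=1}^{m}\beta_i X^{b}_i\right)}
  = \exp\left(\sum_{i=1}^{m}\beta_i\,(X^{a}_i - X^{b}_i)\right)
```

If the two vectors differ by one unit in a single variable X_k and agree in all others, the ratio reduces to exp(β_k), which is the HR of that variable.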
As shown in fig. 1, a method 100 of training a risk assessment model according to a first embodiment of the present disclosure includes:
Step S101: obtaining, as a training data set, relevant data of a plurality of users having ANCA-associated small-vessel vasculitis, the relevant data including at least the percentage of normal glomeruli as a first continuous variable, the estimated glomerular filtration rate as a second continuous variable, interstitial fibrosis/tubular atrophy (IF/TA) as a third, categorical variable, and data indicating the renal prognosis after a predetermined follow-up time;
Step S102: training the risk assessment model using the training data set to determine a first weight factor for the first continuous variable, a second weight factor for the second continuous variable, and a third weight factor for the third categorical variable of the risk assessment model, wherein the risk assessment model calculates, from three input variables, one or more percentage values indicating the likelihood that the target user progresses to end-stage renal disease (ESRD) after one or more predetermined time periods, the three input variables being the percentage of normal glomeruli as the first continuous variable, the estimated glomerular filtration rate as the second continuous variable, and interstitial fibrosis/tubular atrophy (IF/TA) as the third categorical variable;
Step S103: training the risk assessment model using the training data set to determine, for the risk assessment model, the average of the normal glomerular percentage, the average of the estimated glomerular filtration rate, the population fraction of each grade of interstitial fibrosis/tubular atrophy (IF/TA), and the value of the baseline survival function corresponding to each of a plurality of time periods.
Specifically, in step S101, the relevant data of a plurality of users having ANCA-associated small-vessel vasculitis is first acquired as a training data set. The relevant data includes at least the percentage of normal glomeruli as a first continuous variable, the estimated glomerular filtration rate as a second continuous variable, and interstitial fibrosis/tubular atrophy (IF/TA) as a third, categorical variable. In addition, the relevant data includes the user's renal prognosis, i.e., whether the user has reached ESRD, after a predetermined follow-up time (e.g., at least 12 months).
It should be noted that not every user's data can be used in the training data set; only the relevant data of users who satisfy predetermined conditions is used.
Fig. 2 shows a flow of screening users.
First, the applicant collected clinical and pathological data of AAV patients who had undergone kidney biopsy while hospitalized at Peking University First Hospital and had been followed up for > 12 months. All patients met the Chapel Hill Consensus Conference (CHCC) classification criteria for ANCA-associated small-vessel vasculitis (Jennette JC, et al. 2012 revised International Chapel Hill Consensus Conference Nomenclature of Vasculitides. Arthritis Rheum, 2013. 65(1): p. 1-11.). Patients with other concomitant renal diseases, such as IgA nephropathy, diabetic nephropathy, or lupus nephritis, were excluded; patients with eosinophilic granulomatosis with polyangiitis (EGPA) were excluded because its disease characteristics differ greatly from those of MPA and GPA; and patients who had already entered ESRD at the time of diagnosis were excluded. To ensure the accuracy of the pathological results, patients whose biopsy specimens contained fewer than 10 glomeruli were also excluded. The study complied with the Declaration of Helsinki and was approved by the ethics committee of Peking University First Hospital.
Specifically, as shown in Fig. 2, starting from 1229 AAV patients diagnosed between 1998 and 2019, the first step excluded 129 patients who were ANCA-negative or had other concomitant renal diseases. The second step excluded 597 patients who had been followed up for less than 12 months or had entered ESRD at diagnosis. The third step excluded 6 patients with EGPA. The fourth step excluded 211 patients without pathological data. The fifth step excluded 14 patients whose biopsy specimens contained fewer than 10 glomeruli.
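The exclusion steps above can be checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid, not part of the disclosed method; the counts are copied from the text, and the function and variable names are illustrative.

```python
# Patient counts copied from the screening flow described in the text (Fig. 2).
initial_patients = 1229
exclusions = [
    ("ANCA-negative or other renal disease", 129),
    ("follow-up < 12 months or ESRD at diagnosis", 597),
    ("EGPA", 6),
    ("no pathological data", 211),
    ("fewer than 10 glomeruli in the biopsy", 14),
]


def apply_exclusions(start, steps):
    """Return (reason, remaining) pairs after each exclusion step."""
    remaining = start
    trail = []
    for reason, count in steps:
        remaining -= count
        trail.append((reason, remaining))
    return trail


trail = apply_exclusions(initial_patients, exclusions)
for reason, remaining in trail:
    print(f"after excluding {reason}: {remaining} patients remain")

# Size of the cohort left after the last step.
final_cohort = trail[-1][1]
```

The final count agrees with the 272 patients reported as included.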
As a result, 272 AAV patients were finally included, as detailed in Fig. 2. The median follow-up period was 54.5 months (IQR 32.0-89.0). There were 125 men and 147 women, and the median age was 61.0 years (IQR 51.0-68.0). There were 214 patients with MPA and 58 with GPA; 246 were MPO-ANCA positive and 26 were PR3-ANCA positive. The median creatinine was 288.5 μmol/L (IQR 175.2-556.0), and the median eGFR was 16.9 ml/min/1.73 m² (IQR 7.6-31.8). A total of 82 patients progressed to ESRD, after a median follow-up of 21.5 months (IQR 3.0-40.0).
See table 1 for baseline data for patients.
[Table 1: Baseline data (table image not reproduced)]
As shown in table 1, the relevant data of the user includes at least the percentage of normal glomeruli as a first continuous variable, the estimated glomerular filtration rate as a second continuous variable and interstitial fibrosis/tubular atrophy (IF/TA) as a third categorical variable.
Then, the applicant had two pathologists read the biopsies independently under double-blind conditions; if their results were inconsistent, the specimens were re-read until the two pathologists reached consensus. Normal glomeruli are glomeruli without vasculitic lesions or sclerosis; in addition, glomeruli with only mild ischemic changes or infiltration by a few inflammatory cells are classified as normal. A crescent is defined as cellular or fibrous components occupying more than 10% of Bowman's capsule. A cellular crescent is a crescent in which cellular components make up more than 10%, and a fibrous crescent is a crescent in which fibrous components make up more than 90%. Global sclerosis means sclerosis involving more than 80% of the glomerulus. IF/TA was assessed semi-quantitatively and divided into 3 grades: < 25%, 25%-50%, and > 50%.
Results: the median number of glomeruli was 26 (IQR 19-36); the median proportion of normal glomeruli was 25% (IQR 10.6%-46.6%); the median proportion of globally sclerotic glomeruli was 14.8% (IQR 4.4%-29.8%); the proportion of cellular crescents was 42.5% ± 22.8%; and the median proportion of fibrous crescents was 14.3% (IQR 4.4%-29.2%). In the semi-quantitative IF/TA scoring, 153, 85, and 34 patients had lesion extents of < 25%, 25%-50%, and > 50%, respectively. See Table 1 for details.
According to the training data set obtained in step S101, the average Ln eGFR is 2.80969392, the average percentage of normal glomeruli is 0.30635067, and the ratio of different levels of IF/TA population is: < 25%: 0.5625; 25% -50%: 0.3125; > 50%: 0.125.
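The IF/TA population fractions quoted above follow directly from the patient counts reported for each grade (153, 85, and 34 of 272). The following sketch simply reproduces that division; the dictionary keys and variable names are illustrative.

```python
# Patient counts per IF/TA grade, as reported for the 272-patient cohort.
if_ta_counts = {"<25%": 153, "25%-50%": 85, ">50%": 34}

total_patients = sum(if_ta_counts.values())

# Population fraction of each grade = count / cohort size.
if_ta_fractions = {
    grade: count / total_patients for grade, count in if_ta_counts.items()
}

for grade, fraction in if_ta_fractions.items():
    print(f"IF/TA {grade}: {fraction}")
```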
Then, in step S102, the risk assessment model is trained using the training data set to determine a first weight factor for the first continuous variable, a second weight factor for the second continuous variable, and a third weight factor for the third categorical variable of the risk assessment model, wherein the risk assessment model calculates, from three input variables, one or more percentage values indicating the likelihood that the target user progresses to end-stage renal disease (ESRD) after one or more predetermined time periods, the three input variables being the normal glomerular percentage as the first continuous variable, the estimated glomerular filtration rate as the second continuous variable, and the interstitial fibrosis/tubular atrophy (IF/TA) grade as the third categorical variable.
That is, in step S102, a Cox proportional hazards regression model is built using the normal glomerular percentage, IF/TA and eGFR obtained from the training data set.
[Table 2 (table image not reproduced)]
Fitting of the Cox regression and calculation of the β values, HR, and so on can be performed on the training data set using statistical software. In this example, all values were calculated by statistical software from the follow-up cohort of the 272 included patients, which is the necessary basis for the calculation.
Taking 36 months as an example, Table 3 below shows the calculation results for the variables.
[Table 3: Variable calculation results (table image not reproduced)]
With reference to Tables 2 and 3, β corresponds to B in Table 3, and HR = exp(β), i.e., e raised to the power β. Each variable has a corresponding HR (relative risk or hazard ratio), which reflects the degree to which the variable influences the outcome; for example, HR > 1 indicates that the variable is a contributing factor, or risk factor, for the endpoint event.
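To illustrate the relation HR = exp(β), the sketch below converts a set of β values into hazard ratios. The β values are the 36-month coefficients that appear in the 36-month risk formula later in this description; treating the < 25% IF/TA grade as the reference category is an assumption inferred from that formula, and the label "protective factor" for HR < 1 is standard usage rather than wording from the original.

```python
import math

# 36-month regression coefficients, copied from the 36-month risk formula
# in this description. The "(vs <25%)" reference category is an assumption.
beta_36_months = {
    "Ln eGFR": -0.989993,
    "normal glomerular percentage": -7.466535,
    "IF/TA 25%-50% (vs <25%)": 0.338392,
    "IF/TA >50% (vs <25%)": 0.849557,
}

# HR is e raised to the power beta.
hazard_ratios = {name: math.exp(beta) for name, beta in beta_36_months.items()}

for name, hr in hazard_ratios.items():
    role = "risk factor" if hr > 1 else "protective factor"
    print(f"{name}: HR = {hr:.4f} ({role})")
```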
95% CI: the 95% confidence interval; the 95% confidence intervals of the HR values are given in the table. The confidence interval is a basic statistical concept: it is an interval estimate of a population parameter constructed from sample statistics. A confidence level of 95% means that, over repeated sampling, 95% of the confidence intervals so constructed contain the true value of the unknown parameter and the other 5% do not. For example, at 36 months, the 95% CI of each HR in this study is shown in the last two columns (lower, upper) of Table 3.
The P value is also a basic statistical concept and is listed as "significance" in Table 3. The P value is the probability, when the null hypothesis is true, of obtaining a result more extreme than the observed sample result. Typically, P < 0.05 or P < 0.01 is considered statistically significant.
In step S103, the risk assessment model is trained using the training data set to determine, for the risk assessment model, the average of the normal glomerular percentage, the average of the estimated glomerular filtration rate, the population fraction of each grade of interstitial fibrosis/tubular atrophy (IF/TA), and the value of the baseline survival function corresponding to each of a plurality of time periods.
That is, in step S103, the model is converted into a specific ESRD risk.
Specifically, the ESRD risk is calculated using the following equation (3):
$$\text{ESRD risk}(t) = 1 - S_0(t)^{\exp\left(\sum_i \beta_i X_i - \sum_i \beta_i \bar{X}_i\right)} \tag{3}$$
In equation (3), βi is the regression coefficient, and HR in Table 2 is e raised to the power β, i.e., exp(β). S0(t) is the baseline survival function, calculated by statistical software from the follow-up cohort of the 272 included patients. Its value is calculated at 36 months, 60 months, and 120 months, respectively.
Xβ is the product of a regression coefficient and its variable, and Σ denotes the sum, over all variables, of each variable multiplied by its corresponding β value; the calculation process is demonstrated in detail below.
Xi are the independent variables, corresponding to the three variables included in Table 2. In the present disclosure, Xi takes the values of a specific patient: for a given patient (for example, Zhang San) whose Ln eGFR value, normal glomerular percentage, and IF/TA grade are known, that patient's values are substituted directly.
$\bar{X}_i$ denotes the average of the variable; specific values are shown below.
For the different time periods (i.e., 36 months, 60 months, and 120 months), the βi values are shown in Table 4:
[Table 4 (table image not reproduced)]
β in Table 4 corresponds to B in Table 3.
As described above, according to the training data set obtained in step S101, the average Ln eGFR is 2.80969392, the average percentage of normal glomeruli is 0.30635067, and the ratio of different levels of IF/TA population is: < 25%: 0.5625; 25% -50%: 0.3125; > 50%: 0.125.
Thus, the ESRD risks at 36 months, 60 months, and 120 months are calculated as follows:
① 36-month ESRD risk
S0(t) = 0.7472532093
$$\text{36-month ESRD risk} = 1 - 0.7472532093^{\exp(E + 4.856920208)}$$
$$E = \begin{cases}
-0.989993 \times \ln(\text{eGFR}) - 7.466535 \times \text{normal glomerular percentage}, & \text{if IF/TA} < 25\% \\
-0.989993 \times \ln(\text{eGFR}) - 7.466535 \times \text{normal glomerular percentage} + 0.338392 \times 1, & \text{if IF/TA } 25\%\text{-}50\% \\
-0.989993 \times \ln(\text{eGFR}) - 7.466535 \times \text{normal glomerular percentage} + 0.849557 \times 1, & \text{if IF/TA} > 50\%
\end{cases}$$
② 60-month ESRD risk
S0(t) = 0.6938088719
$$\text{60-month ESRD risk} = 1 - 0.6938088719^{\exp(E + 3.976369006)}$$
$$E = \begin{cases}
-0.805472 \times \ln(\text{eGFR}) - 6.470339 \times \text{normal glomerular percentage}, & \text{if IF/TA} < 25\% \\
-0.805472 \times \ln(\text{eGFR}) - 6.470339 \times \text{normal glomerular percentage} + 0.469069 \times 1, & \text{if IF/TA } 25\%\text{-}50\% \\
-0.805472 \times \ln(\text{eGFR}) - 6.470339 \times \text{normal glomerular percentage} + 0.978350 \times 1, & \text{if IF/TA} > 50\%
\end{cases}$$
③ 120-month ESRD risk
S0(t) = 0.5783374501
$$\text{120-month ESRD risk} = 1 - 0.5783374501^{\exp(E + 3.13006068)}$$
$$E = \begin{cases}
-0.590424 \times \ln(\text{eGFR}) - 5.353108 \times \text{normal glomerular percentage}, & \text{if IF/TA} < 25\% \\
-0.590424 \times \ln(\text{eGFR}) - 5.353108 \times \text{normal glomerular percentage} + 0.229645 \times 1, & \text{if IF/TA } 25\%\text{-}50\% \\
-0.590424 \times \ln(\text{eGFR}) - 5.353108 \times \text{normal glomerular percentage} + 0.775670 \times 1, & \text{if IF/TA} > 50\%
\end{cases}$$
Through the above method, a trained Cox proportional hazards regression model can be obtained. In subsequent uses, the target user's 36 month ESRD risk, 60 month ESRD risk, and 120 month ESRD risk may be calculated by simply entering the target user's eGFR, normal glomerular percentage, and IF/TA ratings.
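The constant added to E in each of the three time-specific formulas appears to be the negated mean linear predictor, i.e. minus the sum of each β multiplied by the training-set mean (or population proportion) of its variable. The following Python sketch checks that reading numerically using only the coefficients, means, and constants stated above. The names `MEANS`, `MODELS`, and `centering_constant` are illustrative assumptions and do not come from the disclosure.

```python
# Training-set means stated in the description. For the IF/TA dummy
# variables, the "mean" is the stated population proportion of that grade.
MEANS = {
    "ln_egfr": 2.80969392,
    "normal_glom": 0.30635067,
    "ifta_25_50": 0.3125,
    "ifta_gt_50": 0.125,
}

# Coefficients (beta) and the constant added to E, keyed by horizon in months.
# The IF/TA < 25% grade is the reference category, so it has no coefficient.
MODELS = {
    36: {"beta": {"ln_egfr": -0.989993, "normal_glom": -7.466535,
                  "ifta_25_50": 0.338392, "ifta_gt_50": 0.849557},
         "constant": 4.856920208},
    60: {"beta": {"ln_egfr": -0.805472, "normal_glom": -6.470339,
                  "ifta_25_50": 0.469069, "ifta_gt_50": 0.978350},
         "constant": 3.976369006},
    120: {"beta": {"ln_egfr": -0.590424, "normal_glom": -5.353108,
                   "ifta_25_50": 0.229645, "ifta_gt_50": 0.775670},
          "constant": 3.13006068},
}


def centering_constant(beta, means):
    """Return -sum(beta_i * mean_i), the assumed origin of the stated constant."""
    return -sum(beta[name] * means[name] for name in beta)


if __name__ == "__main__":
    for months, model in MODELS.items():
        computed = centering_constant(model["beta"], MEANS)
        print(f"{months:>3} months: computed {computed:.6f}, stated {model['constant']:.6f}")
```

If this interpretation holds, the computed and stated constants should agree closely, with any small residual attributable to rounding of the published coefficients.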
< second embodiment >
Fig. 3 is a block diagram illustrating an apparatus for predicting the risk of renal development of anti-neutrophil cytoplasmic antibody-associated microangioitis according to a second embodiment of the present disclosure.
The apparatus according to the second embodiment may be an electronic device such as a mobile electronic device, a notebook computer, a desktop computer, or a server, as long as the electronic device has computing capabilities.
As shown in fig. 3, the apparatus 300 according to the second embodiment includes:
an input unit 301 configured to input, as input variables, values of three clinical pathology parameters of a target user having anti-neutrophil cytoplasmic antibody-associated microangioitis, the three clinical pathology parameters including a percentage of normal glomeruli as a first continuous variable, an estimated glomerular filtration rate as a second continuous variable, and interstitial fibrosis/tubular atrophy (IF/TA) as a third categorical variable;
a risk assessment unit 302 configured to input three input variables into a pre-trained risk assessment model based on a proportional risk regression model that calculates one or more percentage values indicating a likelihood of a target user progressing to End Stage Renal Disease (ESRD) after a predetermined one or more time periods from the three input variables; and
an output unit 303 configured to output the one or more percentage values.
Specifically, the input unit 301 may be, for example, a mouse, a keyboard, or a touch input element, or the like. As shown in fig. 4, the user can enter the percentage of normal glomeruli, the estimated glomerular filtration rate, and the level of interstitial fibrosis/tubular atrophy (IF/TA) on a graphical interface displayed on the display screen of the device 300.
As previously described, the normal glomerular percentage is a first continuous variable that may take any value within a first interval of values. The estimated glomerular filtration rate (eGFR) is a second continuous variable that may take any value within a second interval of values. Interstitial fibrosis/tubular atrophy (IF/TA) is a third, categorical variable whose value corresponds to one of three categories: a first category (<25%), a second category (25%-50%), and a third category (>50%).
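As an illustration of how the three inputs might be prepared for the model, the sketch below validates the raw values and encodes the IF/TA grade as two indicator (dummy) variables, with <25% as the reference category, matching the "+ β × 1" terms in the time-specific formulas. The class and function names are hypothetical. The accepted ranges (eGFR > 0, percentage between 0 and 100) are assumptions, since the first and second intervals of values are not specified here.

```python
import math
from dataclasses import dataclass

# The three IF/TA categories named in the description.
IFTA_GRADES = ("<25%", "25%-50%", ">50%")


@dataclass(frozen=True)
class ModelInputs:
    ln_egfr: float               # natural log of eGFR
    normal_glom_fraction: float  # normal glomerular percentage as a fraction
    ifta_25_50: int              # 1 if IF/TA grade is 25%-50%, else 0
    ifta_gt_50: int              # 1 if IF/TA grade is >50%, else 0


def encode_inputs(egfr, normal_glom_percent, ifta_grade):
    """Validate raw inputs and encode them as model covariates (assumed ranges)."""
    if egfr <= 0:
        raise ValueError("eGFR must be positive")
    if not 0.0 <= normal_glom_percent <= 100.0:
        raise ValueError("normal glomerular percentage must be between 0 and 100")
    if ifta_grade not in IFTA_GRADES:
        raise ValueError(f"IF/TA grade must be one of {IFTA_GRADES}")
    return ModelInputs(
        ln_egfr=math.log(egfr),
        normal_glom_fraction=normal_glom_percent / 100.0,
        ifta_25_50=int(ifta_grade == "25%-50%"),
        ifta_gt_50=int(ifta_grade == ">50%"),
    )


if __name__ == "__main__":
    print(encode_inputs(34.0, 22.66, "<25%"))
```

Using the lowest grade as the reference category means a user with IF/TA <25% contributes no categorical term, which is how the first branch of each formula is written.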
As shown in FIG. 5, the eGFR value entered by the user is 34 mL/min/1.73 m². The normal glomerular percentage entered is 22.66%. The IF/TA grade entered is <25%.
The risk assessment unit 302 inputs the three input variables into a pre-trained risk assessment model based on a proportional hazards regression model. The pre-trained risk assessment model may be a Cox risk assessment model obtained by the model training method of the first embodiment.
The risk assessment unit 302 trains the risk assessment model using the training data set to determine a first weight factor for the first continuous variable, a second weight factor for the second continuous variable, and a third weight factor for the third categorical variable of the risk assessment model, i.e., the β_i values in Table 4.
The risk assessment model calculates one or more percentage values indicative of a likelihood of the target user progressing to End Stage Renal Disease (ESRD) after a predetermined one or more time periods based on three input variables.
As shown in fig. 5, the plurality of time periods include 36 months, 60 months, and 120 months. The risk assessment model calculates three percentage values, i.e., 18.97%, 23.11%, and 37.14%, indicating the likelihood of the target user progressing to End Stage Renal Disease (ESRD) at 36 months, 60 months, and 120 months, respectively, based on three input variables.
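The worked example can be checked against the three time-specific formulas of the first embodiment. The following Python sketch is a minimal illustration, not the patented implementation. The names `PARAMS`, `IFTA_GRADES`, and `esrd_risk` are hypothetical, and the sketch assumes that the normal glomerular percentage enters the formula as a fraction (22.66% as 0.2266), which is consistent with the stated training-set mean of 0.30635067.

```python
import math

# Baseline survival S0(t), coefficients, and centering constant for each
# horizon (months), copied from the 36-, 60-, and 120-month formulas.
# b_ifta holds the terms for the grades <25% (reference), 25%-50%, and >50%.
PARAMS = {
    36: {"s0": 0.7472532093, "b_egfr": -0.989993, "b_glom": -7.466535,
         "b_ifta": (0.0, 0.338392, 0.849557), "const": 4.856920208},
    60: {"s0": 0.6938088719, "b_egfr": -0.805472, "b_glom": -6.470339,
         "b_ifta": (0.0, 0.469069, 0.978350), "const": 3.976369006},
    120: {"s0": 0.5783374501, "b_egfr": -0.590424, "b_glom": -5.353108,
          "b_ifta": (0.0, 0.229645, 0.775670), "const": 3.13006068},
}

IFTA_GRADES = ("<25%", "25%-50%", ">50%")


def esrd_risk(egfr, normal_glom_fraction, ifta_grade, months):
    """Return the predicted probability of ESRD within `months` (36, 60, or 120)."""
    p = PARAMS[months]
    # Linear predictor E for this patient.
    e = (p["b_egfr"] * math.log(egfr)
         + p["b_glom"] * normal_glom_fraction
         + p["b_ifta"][IFTA_GRADES.index(ifta_grade)])
    # risk = 1 - S0(t) ** exp(E + constant)
    return 1.0 - p["s0"] ** math.exp(e + p["const"])


if __name__ == "__main__":
    for months in (36, 60, 120):
        risk = esrd_risk(34.0, 0.2266, "<25%", months)
        print(f"{months:>3}-month ESRD risk: {100 * risk:.2f}%")
```

Under these assumptions the printed values should land close to the 18.97%, 23.11%, and 37.14% reported for FIG. 5; small discrepancies in the last digit may arise from rounding.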
The output unit 303 may output the percentage value.
The first weighting factor has a first value corresponding to 36 months, a second value corresponding to 60 months, and a third value corresponding to 120 months;
the second weighting factor has a fourth value corresponding to 36 months, a fifth value corresponding to 60 months, and a sixth value corresponding to 120 months; and
the third weighting factor has a seventh value corresponding to 36 months, an eighth value corresponding to 60 months, and a ninth value corresponding to 120 months.
Furthermore, the apparatus 300 according to the second embodiment comprises an obtaining unit 304 configured to obtain, as a training dataset, prognosis-related data of a plurality of users having anti-neutrophil cytoplasmic antibody-associated microangioitis, the prognosis-related data comprising at least a normal glomerular percentage, an estimated glomerular filtration rate and interstitial fibrosis/tubular atrophy (IF/TA), and data indicative of a renal prognosis after a predetermined time of follow-up.
The risk assessment unit 302 is further configured to train the risk assessment model using the training dataset to determine, for the risk assessment model, the average of the normal glomerular percentage, the average of the estimated glomerular filtration rate, the population fractions of the different grades of interstitial fibrosis/tubular atrophy (IF/TA), and the value of the baseline survival function corresponding to each of the plurality of time periods.
Using the average of the normal glomerular percentage, the average of the estimated glomerular filtration rate, the population fractions of the different IF/TA grades, the value of the baseline survival function corresponding to each of the plurality of time periods, and the target user's normal glomerular percentage, estimated glomerular filtration rate, and IF/TA grade, the risk assessment model calculates one or more percentage values indicating the likelihood of the target user progressing to End Stage Renal Disease (ESRD) after one or more predetermined time periods. That is, the percentage value indicating the predicted ESRD risk is calculated by the above-described formula (4).
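Judging from the symbol definitions and the three time-specific formulas of the first embodiment, formula (4) presumably has the standard mean-centered Cox form below. This is a reconstruction under that assumption, in which $S_0(t)$ is the baseline survival at time $t$, $X_i$ are the target user's values, $\bar{X}_i$ are the training-set means (population fractions for the IF/TA indicator variables), and $\beta_i$ are the weight factors:

```latex
P(\text{ESRD by time } t) = 1 - S_0(t)^{\exp\left(\sum_i \beta_i X_i - \sum_i \beta_i \bar{X}_i\right)}
```

In the time-specific formulas, $E$ corresponds to $\sum_i \beta_i X_i$ and the constant added to $E$ corresponds to $-\sum_i \beta_i \bar{X}_i$.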
In addition, the risk assessment unit 302 may also evaluate the predictive performance of the ANCA-related small vasculitis ESRD risk assessment model. Predictive performance is evaluated in terms of discrimination and calibration.
For example, risk assessment unit 302 may calculate a Harrell's C index that indicates the degree of discrimination of the risk assessment model. The Harrell's C index includes a plurality of values corresponding to each of a plurality of time periods, and each of the plurality of values is greater than a first threshold.
Harrell's C index is the C statistic, which measures discrimination. A good disease risk prediction model correctly distinguishes populations at high and low risk of future disease. The prediction model sets a risk threshold, predicting that the event will occur when the predicted risk is above the threshold and that it will not occur when the predicted risk is below it, thereby distinguishing individuals who will experience the outcome event from those who will not; this ability is the discrimination of the prediction model. The most common index of discrimination is the C statistic (range 0.5-1): the larger the C statistic, the better the discrimination. Generally, discrimination is considered poor when the C statistic is below 0.6, the model is considered to have some discriminative ability between 0.6 and 0.75, and discrimination is considered good above 0.75. The closer Harrell's C is to 1, the better the discrimination. 0.75 is an example of the first threshold.
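To make the definition concrete, the following Python sketch computes Harrell's C for right-censored data by direct pairwise comparison. It is a simplified illustration rather than the statistical software used in this disclosure: the function name `harrell_c` is hypothetical, pairs with tied follow-up times are skipped for simplicity, and the toy data are invented purely for demonstration.

```python
def harrell_c(times, events, risks):
    """Harrell's C index for right-censored survival data.

    times:  follow-up time for each subject
    events: 1 if the outcome event was observed, 0 if censored
    risks:  predicted risk score (higher means earlier event expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is comparable only if the earlier subject had the event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier event has the higher predicted risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Invented toy data: four subjects, the third one censored.
    times = [5, 10, 15, 20]
    events = [1, 1, 0, 1]
    risks = [0.9, 0.6, 0.7, 0.2]
    print(harrell_c(times, events, risks))
```

In the toy data, five pairs are comparable and four of them are ordered correctly by the risk score, so the index is 4/5.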
In this example, the Harrell's C indices at 36, 60, and 120 months were 0.8936, 0.8786, and 0.8655, respectively, indicating that the model discriminates well and can separate populations at different levels of risk. 0.85 is a second example of the first threshold.
Further, the risk assessment unit 302 may further calculate a value of a Hosmer-Lemeshow test indicating a degree of calibration of the risk assessment model, the value of the Hosmer-Lemeshow test having a plurality of values corresponding to each of a plurality of time periods, and each of the plurality of values being smaller than the second threshold.
The Hosmer-Lemeshow test is a common statistical method for assessing model calibration. The calibration of a prediction model is an important index of how accurately a disease risk model predicts the probability that an individual will experience the outcome event in the future; it reflects the degree of agreement between the risk predicted by the model and the risk actually observed, and may therefore also be called consistency. Good calibration means that the prediction model is accurate; poor calibration means that the model may overestimate or underestimate the risk of disease. A Hosmer-Lemeshow P value greater than 0.05 indicates no significant difference between the predicted and actual results, i.e., the prediction model has good accuracy. 0.05 is an example of the second threshold.
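The idea behind the test can be sketched in Python as follows. This is the classic binary-outcome form of the Hosmer-Lemeshow statistic, assuming each subject has a predicted risk and a 0/1 indicator of whether the event occurred by the horizon; it ignores censoring and is not necessarily the variant computed by the statistical software used here. The function name `hosmer_lemeshow` is hypothetical. To stay within the standard library, the P value uses the closed-form chi-square tail that holds only for even degrees of freedom, so the sketch requires an even number of groups.

```python
import math


def hosmer_lemeshow(predicted, observed, groups=10):
    """Return (chi-square statistic, P value) for the Hosmer-Lemeshow test.

    predicted: predicted event probabilities, each strictly between 0 and 1
    observed:  0/1 outcome indicators, same length as `predicted`
    groups:    number of risk groups (even, at least 4, in this sketch)
    """
    if groups < 4 or groups % 2:
        raise ValueError("this sketch needs an even number of groups, at least 4")
    # Sort subjects by predicted risk (stable for ties) and split into groups.
    pairs = sorted(zip(predicted, observed), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            raise ValueError("too few subjects for the requested number of groups")
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of events
        events = sum(y for _, y in chunk)     # observed number of events
        variance = expected * (1.0 - expected / size)
        if variance <= 0.0:
            raise ValueError("group has degenerate predicted risk")
        statistic += (events - expected) ** 2 / variance
    # Chi-square upper tail with df = groups - 2 (even):
    # exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
    half = statistic / 2.0
    k = (groups - 2) // 2
    p_value = math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(k))
    return statistic, p_value


if __name__ == "__main__":
    # Invented toy data: predictions that match the observed event rate exactly.
    stat, p = hosmer_lemeshow([0.5] * 8, [0, 1, 0, 1, 0, 1, 0, 1], groups=4)
    print(stat, p)
```

A large P value means the observed and expected event counts agree across risk groups, which is the sense in which P > 0.05 is read as good calibration.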
In this example, the Hosmer-Lemeshow tests at 36, 60, and 120 months gave P = 0.2070, P = 0.0092, and P < 0.0001, respectively, indicating that the 36-month ESRD model is well calibrated, as shown in Fig. 6.
The Harrell's C index and the Hosmer-Lemeshow test were calculated by statistical software.
Therefore, the apparatus for predicting the risk of renal development of anti-neutrophil cytoplasmic antibody-associated microangioitis according to the embodiments of the present disclosure greatly reduces the complexity of the model by selecting, from the many factors related to renal prognosis, the normal glomerular percentage, the estimated glomerular filtration rate, and interstitial fibrosis/tubular atrophy (IF/TA) as the variables of the risk assessment model. Furthermore, by entering the normal glomerular percentage and eGFR into the model as continuous variables and IF/TA as a three-category variable, the information in the data can be mined more effectively and a model with greater predictive power can be built. In addition, the scoring model of the embodiments of the present disclosure calculates the risk of a patient progressing to ESRD after a given time and presents that risk as a percentage value. Compared with descriptions such as "low risk", the result is more individualized and more intuitive, and can conveniently guide treatment decisions for the patient.
Based on the above embodiments, the embodiments of the present disclosure also provide electronic devices of another exemplary implementation. In some possible embodiments, an electronic device in the embodiments of the present disclosure may include a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program may implement the method for training a risk assessment model for predicting the risk of renal development of anti-neutrophil cytoplasmic antibody-associated microangioitis in the embodiments described above.
Embodiments of the present disclosure also provide a computer-readable storage medium. The computer-readable storage medium has stored thereon computer-executable instructions. When executed by a processor, the computer-executable instructions may perform a method of training a risk assessment model for predicting the risk of renal development of anti-neutrophil cytoplasmic antibody associated microangioitis according to embodiments of the present disclosure described with reference to the above figures.
Embodiments of the present disclosure also provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform a method of training a risk assessment model for predicting risk of renal development of anti-neutrophil cytoplasmic antibody-associated microangioitis according to an embodiment of the present disclosure.
Those skilled in the art will appreciate that the present disclosure is susceptible to numerous variations and modifications. For example, the various devices or components described above may be implemented in hardware, software, or firmware, or in a combination of some or all of the three.
Further, while the present disclosure makes various references to certain elements of a system according to embodiments of the present disclosure, any number of different elements may be used and run on a client and/or server. The units are illustrative only, and different aspects of the systems and methods may use different units.
It will be understood by those skilled in the art that all or part of the steps of the above methods may be implemented by instructing the relevant hardware through a program, and the program may be stored in a computer readable storage medium, such as a read only memory, a magnetic or optical disk, and the like. Alternatively, all or part of the steps of the above embodiments may be implemented using one or more integrated circuits. Accordingly, each module/unit in the above embodiments may be implemented in the form of hardware, and may also be implemented in the form of a software functional module. The present disclosure is not limited to any specific form of combination of hardware and software.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The foregoing is illustrative of the present disclosure and is not to be construed as limiting thereof. Although illustrative embodiments of the present disclosure have been described, those skilled in the art will readily appreciate that many modifications are possible in the illustrative embodiments without materially departing from the novel teachings and advantages of the present disclosure. Accordingly, all such modifications are intended to be included within the scope of this disclosure as defined in the claims. It is to be understood that the foregoing is illustrative of the present disclosure and is not to be construed as limited to the specific embodiments disclosed, and that modifications to the disclosed embodiments, as well as other embodiments, are intended to be included within the scope of the appended claims. The present disclosure is defined by the claims and their equivalents.

Claims (12)

1. An apparatus for predicting the risk of renal development of anti-neutrophil cytoplasmic antibody associated microangioitis, comprising:
an input unit configured to input, as input variables, values of three clinical pathology parameters of a target user having anti-neutrophil cytoplasmic antibody-associated microangioitis, the three clinical pathology parameters including a percentage of normal glomeruli as a first continuous variable, an estimated glomerular filtration rate as a second continuous variable, and interstitial fibrosis/tubular atrophy (IF/TA) as a third categorical variable;
a risk assessment unit configured to input three input variables into a pre-trained risk assessment model based on a proportional risk regression model, the risk assessment model calculating one or more percentage values indicating a likelihood of a target user progressing to End Stage Renal Disease (ESRD) after a predetermined one or more time periods from the three input variables; and
an output unit configured to output the one or more percentage values.
2. The apparatus of claim 1, further comprising:
an obtaining unit configured to obtain, as a training dataset, prognosis-related data of a plurality of users having anti-neutrophil cytoplasmic antibody-associated small vessel inflammation, the prognosis-related data including at least a normal glomerular percentage, an estimated glomerular filtration rate and interstitial fibrosis/tubular atrophy (IF/TA), data indicative of a renal prognosis after a predetermined time of follow-up,
wherein the risk assessment unit is further configured to train the risk assessment model using the training data set to determine a first weight factor for a first continuous variable, a second weight factor for a second continuous variable, and a third weight factor for a third categorical variable of the risk assessment model.
3. The apparatus of claim 2, wherein the first continuous variable has any value within a first range of values, the second continuous variable has any value within a second range of values, and the third categorical variable has a value corresponding to one of three categories, the three categories including a first category < 25%, a second category 25% -50%, and a third category > 50%.
4. The apparatus of claim 3, wherein the plurality of time periods comprise 36 months, 60 months, and 120 months, and
the first weighting factor has a first value corresponding to 36 months, a second value corresponding to 60 months, and a third value corresponding to 120 months,
the second weighting factor has a fourth value corresponding to 36 months, a fifth value corresponding to 60 months, and a sixth value corresponding to 120 months,
the third weighting factor has a seventh value corresponding to 36 months, an eighth value corresponding to 60 months, and a ninth value corresponding to 120 months.
5. The apparatus of claim 4, wherein the risk assessment unit is further configured to train the risk assessment model using the training dataset to determine an average of normal glomerular percentages, an average of estimated glomerular filtration rates and population fractions of different grades of interstitial fibrosis/tubular atrophy (IF/TA) for the risk assessment model, and a value of a baseline survival function corresponding to each of a plurality of time periods.
6. The apparatus of claim 5, wherein the risk assessment model calculates one or more percentage values indicative of a likelihood of the target user progressing to End Stage Renal Disease (ESRD) after a predetermined one or more time periods using the average of normal glomerular percentages, the average of estimated glomerular filtration rates and the population fraction of different grades of interstitial fibrosis/tubular atrophy (IF/TA), the value of the baseline survival function corresponding to each of the plurality of time periods, and the obtained value of normal glomerular percentage, the value of estimated glomerular filtration rate and the grade of interstitial fibrosis/tubular atrophy (IF/TA) of the target user.
7. The apparatus of claim 6, wherein the risk assessment unit is further configured to calculate a Harrell's C index indicative of a degree of discrimination of the risk assessment model, the Harrell's C index including a plurality of values corresponding to each of a plurality of time periods, and each of the plurality of values being greater than a first threshold.
8. The apparatus of claim 7, wherein the risk assessment unit is further configured to calculate a value of a Hosmer-Lemeshow test indicative of a degree of calibration of the risk assessment model, the value of the Hosmer-Lemeshow test having a plurality of values corresponding to each of a plurality of time periods, and each of the plurality of values being less than a second threshold.
9. A method of training a risk assessment model for predicting the risk of renal development of anti-neutrophil cytoplasmic antibody associated microangioitis, comprising:
obtaining, as a training dataset, relevant data for a plurality of users having anti-neutrophil cytoplasmic antibody associated microangioitis, the relevant data comprising at least a percentage of normal glomeruli as a first continuous variable, an estimated glomerular filtration rate as a second continuous variable, interstitial fibrosis/tubular atrophy (IF/TA) as a third categorical variable, and data indicative of the prognosis of the kidney after a predetermined time of follow-up,
training the risk assessment model using the training data set to determine a first weight factor for a first continuous variable, a second weight factor for a second continuous variable, and a third weight factor for a third categorical variable of the risk assessment model, wherein the risk assessment model calculates one or more percentage values indicative of a likelihood of the target user to progress to End Stage Renal Disease (ESRD) after a predetermined one or more time periods based on three input variables, including a percentage of normal glomeruli as the first continuous variable, an estimated glomerular filtration rate as the second continuous variable, and interstitial fibrosis/tubular atrophy (IF/TA) as the third categorical variable.
10. The method of claim 9, further comprising:
training the risk assessment model using the training dataset to determine an average of normal glomerular percentage, an average of estimated glomerular filtration rate, and a population fraction of different grades of interstitial fibrosis/tubular atrophy (IF/TA) for the risk assessment model, and a value of a baseline survival function corresponding to each of a plurality of time periods.
11. An electronic device comprising a memory and a processor, wherein the memory has stored thereon program code readable by the processor, which when executed by the processor, performs the method of any of claims 9-10.
12. A computer-readable storage medium having stored thereon computer-executable instructions for performing the method of any one of claims 9-10.
CN202111161003.XA 2021-09-30 2021-09-30 Device for predicting kidney development risk of ANCA (acute coronary intervention) related small vasculitis and model training method Pending CN114283937A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111161003.XA CN114283937A (en) 2021-09-30 2021-09-30 Device for predicting kidney development risk of ANCA (acute coronary intervention) related small vasculitis and model training method

Publications (1)

Publication Number Publication Date
CN114283937A true CN114283937A (en) 2022-04-05

Family

ID=80868657

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111161003.XA Pending CN114283937A (en) 2021-09-30 2021-09-30 Device for predicting kidney development risk of ANCA (acute coronary intervention) related small vasculitis and model training method

Country Status (1)

Country Link
CN (1) CN114283937A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114974598A (en) * 2022-06-29 2022-08-30 山东大学 Lung cancer prognosis prediction model construction method and lung cancer prognosis prediction system
CN114974598B (en) * 2022-06-29 2024-04-16 山东大学 Method for constructing lung cancer prognosis prediction model and lung cancer prognosis prediction system

Similar Documents

Publication Publication Date Title
US20090226916A1 (en) Automated Analysis of DNA Samples
JP6246889B1 (en) Device, method and program for selecting explanatory variables
TWI803765B (en) Detecting, evaluating and predicting system for cancer risk
CN105229471A (en) For determining the system and method for preeclampsia risk based on biochemical biomarker analysis
US20110137897A1 (en) Systems and methods for data analysis
TW201248425A (en) Comprehensive glaucoma determination method utilizing glaucoma diagnosis chip and deformed proteomics cluster analysis
CN114283937A (en) Device for predicting kidney development risk of ANCA (acute coronary intervention) related small vasculitis and model training method
Li et al. Dynamic prediction of motor diagnosis in Huntington’s disease using a joint modeling approach
Fisher et al. Dementia Population Risk Tool (DemPoRT): study protocol for a predictive algorithm assessing dementia risk in the community
Cournane et al. Predicting outcomes in emergency medical admissions using a laboratory only nomogram
CN107169264A (en) A kind of complex disease diagnostic method and system
Thuluvath et al. Acute liver failure in Budd–Chiari syndrome and a model to predict mortality
US20230386665A1 (en) Method and device for constructing autism spectrum disorder (asd) risk prediction model
Ling et al. A prediction model for length of stay in the ICU among septic patients: A machine learning approach
Liu et al. High-dimensional variable selection in meta-analysis for censored data
CN114936204A (en) Feature screening method and device, storage medium and electronic equipment
CN115346674A (en) Constipation risk prediction model for amyotrophic lateral sclerosis patient and application of constipation risk prediction model
Lee et al. Immigration and adherence to cervical cancer screening: a provincewide longitudinal matched cohort study using multistate transitional models
CN115035974A (en) Psychological assessment data management system and method
CN112184415A (en) Data processing method and device, electronic equipment and storage medium
Ji et al. Predicting post-stroke cognitive impairment using machine learning: A prospective cohort study
CN112614595A (en) Survival analysis model construction method and device, electronic terminal and storage medium
Ke et al. Influence analysis for the area under the receiver operating characteristic curve
Yördan et al. Hybrid AI-Based Chronic Kidney Disease Risk Prediction
Pesta Does IQ Cause Race Differences in Well-being?

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination