CN115206458A - Method and device for predicting plasma concentration of cyclosporin a - Google Patents

Method and device for predicting plasma concentration of cyclosporin a

Info

Publication number
CN115206458A
Authority
CN
China
Prior art keywords
cyclosporin
concentration
plasma
variable
prediction model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210707457.0A
Other languages
Chinese (zh)
Inventor
张津源
于泽
高飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Medicinovo Technology Co ltd
Original Assignee
Beijing Medicinovo Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Medicinovo Technology Co ltd
Priority to CN202210707457.0A
Publication of CN115206458A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/40 ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Pathology (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Toxicology (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention provides a method and a device for predicting the plasma concentration of cyclosporin a. The method comprises: acquiring clinical data of an HSCT patient to be predicted; and inputting the clinical data into a cyclosporin a plasma concentration prediction model to obtain the predicted cyclosporin a plasma concentration of that patient, as output by the model. The prediction model is trained with clinical data of sample HSCT patients as samples and the actual cyclosporin a plasma concentrations of those patients as labels. The method offers high prediction efficiency, high accuracy and broad population coverage, so that it can predict whether the plasma concentration of cyclosporin a is normal, support medication adjustment with reference to that concentration, and reduce the patient's risk of adverse reactions.

Description

Method and device for predicting plasma concentration of cyclosporin a
Technical Field
The invention relates to the technical field of big data processing, and in particular to a method and a device for predicting the plasma concentration of cyclosporin a.
Background
Hematopoietic stem cell transplantation (HSCT) is an effective treatment for most malignant and non-malignant blood diseases, including acute lymphocytic leukemia, acute myelogenous leukemia, severe aplastic anemia, and the like. However, graft-versus-host disease (GVHD) is a common complication after transplantation; it significantly affects patient survival and prognosis and limits the clinical application of HSCT.
Cyclosporin a (CsA) is a calcineurin inhibitor and an important component of the immunosuppressive therapy of aGVHD (allo-GVHD, acute graft-versus-host disease), and it has been widely used in hematology. Low concentrations of CsA (especially early after HSCT) are associated with the occurrence of GVHD, while CsA overexposure is one of the risk factors for nephrotoxicity and even relapse. In addition to having a narrow therapeutic window, CsA shows high inter-individual variability. The concentration of CsA can be influenced by many factors, such as age, sex, hematocrit (HCT), triazole antifungal agents, and pharmacogenomics. It is therefore necessary to monitor CsA concentrations during treatment.
Therapeutic drug monitoring (TDM) is the primary method for clinical monitoring of CsA concentration. However, concentration monitoring lags behind the actual exposure. By establishing a prediction model, the CsA concentration can be predicted in advance, which provides forward-looking guidance for dose adjustment and reduces the incidence of adverse drug reactions.
In studies on the clinical prediction of post-transplant CsA concentration in HSCT patients, popPK (population pharmacokinetics) models have been established for various populations and diseases. However, building or using a popPK model requires specialized software and trained personnel, and such models have limitations in adaptability, functionality and ease of use.
Disclosure of Invention
The invention provides a method and a device for predicting the plasma concentration of cyclosporin a. It addresses the drawbacks of the prior art, in which predicting CsA concentration with a popPK model is demanding and limited in adaptability, functionality and usability, and it achieves rapid and accurate prediction of the plasma concentration of cyclosporin a based on deep learning.
The invention provides a method for predicting the plasma concentration of cyclosporin a, which comprises the following steps:
acquiring clinical data of an HSCT patient to be predicted;
inputting the clinical data into a plasma concentration prediction model of cyclosporine a to obtain a plasma concentration prediction value of cyclosporine a of the HSCT patient to be predicted, which is output by the plasma concentration prediction model of cyclosporine a;
the plasma cyclosporin a concentration prediction model is obtained by taking clinical data of a sample HSCT patient as a sample and taking an actual value of the plasma cyclosporin a concentration of the sample HSCT patient as a label.
According to the method for predicting the plasma concentration of cyclosporin a provided by the invention, the clinical data comprise demographic information, medication information, test information, diagnosis information, surgical information, treatment regimen and adverse reactions.
According to the method for predicting the plasma concentration of cyclosporin a provided by the invention, before the step of inputting the clinical data into a plasma concentration prediction model of cyclosporin a and obtaining the predicted value of the plasma concentration of cyclosporin a of the patient to be predicted, which is output by the plasma concentration prediction model of cyclosporin a, the method further comprises the following steps:
acquiring the missing rate of each feature variable in the clinical data of the HSCT patients, and deleting the feature variables whose missing rate is greater than a first preset threshold;
performing a primary screening of the feature variables remaining after the deletion, based on statistical methods;
re-screening the primarily screened feature variables based on feature engineering;
and constructing the cyclosporin a plasma concentration prediction model based on a Wide & Deep model, taking the actual cyclosporin a plasma concentration of the HSCT patients as the target variable and the re-screened feature variables as independent variables.
According to the method for predicting the plasma concentration of cyclosporin a provided by the invention, the step of performing a primary screening of the feature variables remaining after the deletion based on statistical methods comprises:
when a remaining feature variable is a continuous variable, determining whether the relationship between the target variable and the continuous variable is significant based on the Pearson correlation test;
if the relationship is significant, retaining the continuous variable, and otherwise deleting it;
obtaining the dispersion of each remaining feature variable, and taking the logarithm of the feature variables whose dispersion is greater than a second preset threshold;
when a remaining feature variable is a categorical variable, determining whether the relationship between the target variable and the categorical variable is significant based on the Mann-Whitney U test;
and if the relationship is significant, retaining the categorical variable, and otherwise deleting it.
According to the method for predicting the plasma concentration of cyclosporin a provided by the invention, the step of re-screening the primarily screened feature variables based on feature engineering comprises:
constructing a feature importance prediction model based on the CatBoost algorithm, taking the actual cyclosporin a plasma concentration of the HSCT patients as the target variable and the primarily screened feature variables as covariates;
tuning the parameters of the feature importance prediction model based on six-fold cross-validation so that the model is optimal;
obtaining the importance of each primarily screened feature variable from the optimal feature importance prediction model, and selecting a preset number of feature variables with the highest importance;
and selecting the optimal feature variables from the selected feature variables using an XGBoost-based sequential forward feature selection algorithm.
According to the method for predicting the plasma concentration of the cyclosporine a, the step of constructing the plasma concentration prediction model of the cyclosporine a based on the Wide & Deep model comprises the following steps:
jointly training the Wide module and the Deep module of the Wide & Deep model according to the evaluation indices R², RMSE and MAE of the Wide & Deep model, wherein the Wide module is learned using the FTRL algorithm with L1 regularization, and the Deep module is learned using the AdaGrad algorithm;
and taking the trained Wide & Deep model as the plasma cyclosporine a concentration prediction model.
The invention also provides a device for predicting the plasma concentration of cyclosporine a, which comprises:
the acquisition module is used for acquiring clinical data of the HSCT patient to be predicted;
the prediction module is used for inputting the clinical data into a cyclosporin a plasma concentration prediction model and obtaining the predicted cyclosporin a plasma concentration of the HSCT patient to be predicted, as output by the model;
the plasma cyclosporin a concentration prediction model is obtained by taking clinical data of a sample HSCT patient as a sample and taking an actual value of the plasma cyclosporin a concentration of the sample HSCT patient as a label.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the plasma concentration prediction method of the cyclosporine a.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method for predicting plasma cyclosporin a concentration as described in any one of the above.
The present invention also provides a computer program product comprising a computer program, which when executed by a processor, implements a method for predicting plasma cyclosporin a concentration as described in any one of the above.
According to the method and the device for predicting the plasma concentration of cyclosporin a provided by the invention, a dedicated model for predicting the plasma concentration of cyclosporin a is established from real-world clinical data of HSCT patients. Once a patient's clinical data are input into the trained model, the predicted cyclosporin a plasma concentration of that patient is obtained automatically. The prediction is efficient and accurate and covers a broad population, so that it is possible to predict whether the plasma concentration of cyclosporin a is normal, adjust medication with reference to that concentration, and reduce the patient's risk of adverse reactions.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a method for predicting plasma cyclosporin a concentration according to the present invention;
FIG. 2 is a second schematic flow chart of the plasma cyclosporin a concentration prediction method of the present invention;
FIG. 3 is a third schematic flow chart of a plasma cyclosporin a concentration prediction method according to the present invention;
FIG. 4 is a schematic structural view of a plasma cyclosporin a concentration prediction apparatus according to the present invention;
FIG. 5 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.
The method for predicting the plasma concentration of cyclosporin a according to the present invention is described below with reference to FIG. 1. The method comprises:
Step 101, acquiring clinical data of an HSCT patient to be predicted;
Here, the HSCT patient to be predicted is an HSCT patient whose plasma concentration of cyclosporin a needs to be predicted.
Real-world clinical data of the HSCT patient to be predicted are acquired, such as demographic information, medication information, diagnosis information, surgical information, routine blood tests, routine urine tests and the like.
Step 102, inputting the clinical data into a cyclosporin a plasma concentration prediction model, and obtaining the predicted cyclosporin a plasma concentration of the HSCT patient to be predicted, as output by the model;
the plasma cyclosporin a concentration prediction model is obtained by taking clinical data of a sample HSCT patient as a sample and taking an actual value of the plasma cyclosporin a concentration of the sample HSCT patient as a label.
Clinical data of sample HSCT patients are used to construct the cyclosporin a plasma concentration prediction model; this embodiment does not limit the type of the prediction model.
Clinical data of the sample HSCT patients are collected, such as the actual cyclosporin a plasma concentration, demographic information, medication information, surgical information, diagnosis information, routine blood tests, routine urine tests and the like. According to the needs of patients in different treatment periods, each patient's clinical data are joined using that patient's cyclosporin a plasma concentration monitoring records as the reference, and a real-world database of hematopoietic stem cell transplantation patients is established.
Optionally, the cyclosporin a plasma concentration prediction model is constructed after the raw data in the real-world database of HSCT patients have been cleaned.
As shown in FIG. 2, the clinical data of the HSCT patient to be predicted are input into the cyclosporin a plasma concentration prediction model, and the model outputs a predicted cyclosporin a plasma concentration. This improves the predictability and controllability of the effect of cyclosporin a administration and reduces adverse drug reactions.
In this embodiment, a dedicated model for predicting the plasma concentration of cyclosporin a is established from real-world clinical data of HSCT patients. Once a patient's clinical data are input into the trained model, the predicted cyclosporin a plasma concentration of that patient is obtained automatically. The prediction is efficient and accurate and covers a broad population, so that it is possible to predict whether the plasma concentration of cyclosporin a is normal, adjust medication with reference to that concentration, and reduce the patient's risk of adverse reactions.
On the basis of the above embodiments, the clinical data in this embodiment includes demographic information, medication information, test information, diagnosis information, surgical information, treatment plan, and adverse reaction.
The demographic information includes, among other things, the patient's name, identification number, and age. The medication information includes drug names and dosages. The test information includes test items and test results. The diagnosis information includes diagnostic results.
On the basis of the foregoing embodiments, as shown in fig. 3, in this embodiment, before the step of inputting the clinical data into a plasma concentration prediction model of cyclosporin a and obtaining a predicted value of the plasma concentration of cyclosporin a output by the plasma concentration prediction model of cyclosporin a of the HSCT patient to be predicted, the method further includes:
obtaining the missing rate of each feature variable in the clinical data of the HSCT patients, and deleting the feature variables whose missing rate is greater than a first preset threshold;
Clinical data of HSCT patients are characterized by high dimensionality and a high proportion of missing values. This embodiment therefore both handles the missing data and quickly and effectively screens feature variables out of the high-dimensional data, so that the best model prediction is achieved with the fewest feature variables.
For example, if the first preset threshold is set to 50%, the feature variables with the missing rate greater than 50% are deleted.
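The missing-rate filter described above can be sketched as follows. This is a minimal illustration only: the column names, the toy records and the helper names `missing_rate` and `drop_sparse_features` are hypothetical, and missing values are assumed to be encoded as `None`.

```python
from typing import Dict, List, Optional


def missing_rate(values: List[Optional[float]]) -> float:
    """Fraction of entries that are missing (None) in one feature column."""
    if not values:
        return 1.0
    return sum(v is None for v in values) / len(values)


def drop_sparse_features(
    table: Dict[str, List[Optional[float]]], threshold: float = 0.5
) -> Dict[str, List[Optional[float]]]:
    """Delete columns whose missing rate is greater than the threshold."""
    return {name: col for name, col in table.items()
            if missing_rate(col) <= threshold}


# Hypothetical toy records; the column names are illustrative only.
records = {
    "age": [34, 51, None, 47],
    "hematocrit": [None, None, None, 0.31],
    "csa_dose_mg": [150, 200, 175, 150],
}
kept = drop_sparse_features(records, threshold=0.5)
print(sorted(kept))
```

A column whose missing rate equals the threshold exactly is kept, matching the "greater than" wording of the text.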
A primary screening of the feature variables remaining after the deletion is carried out based on statistical methods, and the primarily screened feature variables are then re-screened based on feature engineering;
By combining statistical methods with feature engineering, the feature variables are screened several times to obtain the important patient characteristics and laboratory test results. The finally selected feature variables are used as the final modeling variables to construct a new modeling data set.
The cyclosporin a plasma concentration prediction model is then constructed based on a Wide & Deep model, taking the actual cyclosporin a plasma concentration of the HSCT patients as the target variable and the re-screened feature variables as independent variables.
The Wide & Deep model has both strong memorization capacity and good generalization capacity: it can learn co-occurring features and exploit the correlations present in historical data, and it can explore new feature combinations through the transitivity of correlations. The model is therefore well suited to analyzing and modeling tabular data.
The advantages of the Wide & Deep algorithm are as follows:
1. It can process large-scale medical data, uses little memory, and trains quickly.
2. It learns low-order and high-order feature combinations simultaneously by mixing a linear model with a deep model.
3. It can efficiently learn specific feature combinations and can also learn feature combinations that do not appear in the training set, mining the data patterns hidden behind the features.
4. It is capable of autonomous learning and incremental learning, and achieves higher model accuracy.
5. It models memorization and generalization in a unified way.
6. It responds quickly, and its serving performance is optimized through multithreading.
In this embodiment, the tabular data built from real-world clinical data are cast as a regression learning task, and the Wide & Deep technique is applied to predicting CsA concentration in HSCT patients after transplantation. Important variables that influence the CsA concentration model are mined with statistical and machine learning techniques, and an individualized intelligent prediction model is established. This reduces model prediction time; the model can efficiently learn specific feature combinations, learn feature combinations that do not appear in the training set, and mine the data patterns hidden behind the features. The model can also be updated through online learning. In addition, by effectively screening the feature variables and handling missing data, the model makes the most of the feature data, that is, it obtains the best prediction with a small number of features.
On the basis of the foregoing embodiment, the step of performing a primary screening of the feature variables remaining after the deletion based on statistical methods in this embodiment includes:
when a remaining feature variable is a continuous variable, determining whether the relationship between the target variable and the continuous variable is significant based on the Pearson correlation test;
if the relationship is significant, retaining the continuous variable, and otherwise deleting it;
For example, blood pressure is a continuous variable. A Pearson correlation test is carried out between it and the target variable to determine whether the relationship between the two is significant. If it is significant, the variable is retained; otherwise it is deleted.
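A minimal sketch of this retain-or-delete decision for a continuous variable is given below. The source does not state how the p-value is computed or which significance level is used; the sketch assumes a two-sided test at `alpha = 0.05` and, to stay within the standard library, approximates the p-value with the Fisher z-transformation (an exact test would use the t distribution). The function names and toy values are hypothetical.

```python
import math
from statistics import NormalDist, mean
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient (assumes neither input is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pearson_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value via the Fisher z-transformation (n > 3)."""
    r = max(-1.0, min(1.0, r))
    if abs(r) >= 1.0 - 1e-12:
        return 0.0
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def keep_continuous(x: Sequence[float], y: Sequence[float],
                    alpha: float = 0.05) -> bool:
    """Retain the variable only if its correlation with the target is significant."""
    return pearson_p_value(pearson_r(x, y), len(x)) < alpha


# Hypothetical toy values: a feature and the measured CsA concentration.
feature = [1, 2, 3, 4, 5, 6, 7, 8]
target_related = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.0, 16.2]
target_unrelated = [5, 3, 6, 4, 5, 3, 6, 4]
print(keep_continuous(feature, target_related))
print(keep_continuous(feature, target_unrelated))
```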
The dispersion of each remaining feature variable is obtained, and the logarithm is taken of the feature variables whose dispersion is greater than a second preset threshold;
For example, the maximum value minus the minimum value of each feature variable is used as its dispersion; this embodiment does not limit the method used to calculate the dispersion.
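The range-based dispersion and the conditional log transform can be sketched as follows. The threshold value is hypothetical, and the sketch assumes strictly positive values so that the natural logarithm is defined.

```python
import math
from typing import List


def dispersion(values: List[float]) -> float:
    """Dispersion measured as maximum minus minimum, as in the example above."""
    return max(values) - min(values)


def log_if_dispersed(values: List[float], threshold: float) -> List[float]:
    """Log-transform a feature only when its dispersion exceeds the threshold.

    Assumes all values are strictly positive.
    """
    if dispersion(values) > threshold:
        return [math.log(v) for v in values]
    return list(values)


wide_range = log_if_dispersed([1.0, 10.0, 1000.0], threshold=100.0)
narrow_range = log_if_dispersed([1.0, 2.0, 3.0], threshold=100.0)
print(wide_range, narrow_range)
```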
When a remaining feature variable is a categorical variable, whether the relationship between the target variable and the categorical variable is significant is determined based on the Mann-Whitney U test;
if the relationship is significant, the categorical variable is retained, and otherwise it is deleted.
For example, a diagnostic result may be classified as mild, moderate or severe schizophrenia. A Mann-Whitney U test is carried out between the categorical variable and the target variable to determine whether their relationship is significant; if so, the variable is retained, and if not, it is deleted.
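As a rough illustration of the Mann-Whitney U step, the sketch below assumes a two-level categorical variable that splits the target values into two groups. It uses the normal approximation without tie or continuity correction, which is only reasonable for moderately large groups; the group values are hypothetical.

```python
import math
from statistics import NormalDist
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks of the values, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (U, approximate two-sided p-value) for two independent groups."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    p = min(1.0, 2.0 * NormalDist().cdf((u - mu) / sigma))
    return u, p


# Hypothetical target values grouped by a two-level categorical variable.
group_a = [100, 101, 102, 103, 104, 105, 106, 107, 108, 109]
group_b = [200, 201, 202, 203, 204, 205, 206, 207, 208, 209]
u_sep, p_sep = mann_whitney_u(group_a, group_b)
u_mix, p_mix = mann_whitney_u([1, 3, 5, 7], [2, 4, 6, 8])
print(p_sep < 0.05, p_mix < 0.05)
```

A variable with more than two levels would need a different test or pairwise comparisons; the source does not say how that case is handled.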
On the basis of the foregoing embodiment, in this embodiment, the step of re-screening the primarily screened feature variables based on feature engineering includes:
constructing a feature importance prediction model based on the CatBoost algorithm, taking the actual cyclosporin a plasma concentration of the HSCT patients as the target variable and the primarily screened feature variables as covariates;
tuning the parameters of the feature importance prediction model based on six-fold cross-validation so that the model is optimal;
Because the patient data are high-dimensional, the dimensionality remaining after the primary screening is still high, and many useless variables and missing values remain. To reduce the dimensionality of the data while preserving its original information, the CatBoost algorithm is selected for modeling and analyzing the HSCT patient data, specifically as follows:
1. Take the plasma concentration of cyclosporin a as the target variable and the feature variables obtained by the primary screening as covariates, and construct a feature importance prediction model based on the CatBoost algorithm.
2. Perform 6-fold cross-validation on the model and tune its parameters so that R² (coefficient of determination), RMSE (root mean square error) and MAE (mean absolute error) are optimal.
3. Calculate the importance scores of the feature variables and sort them in descending order; the higher the importance score, the greater the influence of the feature on the model.
Based on the optimal feature importance prediction model, the importance of each primarily screened feature variable is obtained, and a preset number of feature variables with the highest importance are selected;
From the feature variables sorted by importance score, a preset number of top-ranked feature variables, for example 30, are selected, which completes the second round of feature screening.
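Two mechanical pieces of this step, splitting the samples into six folds and keeping the top-ranked features, can be sketched without any modeling library. The importance scores below are made up for illustration; in the described method they would come from the tuned CatBoost model. The fold splitter is deliberately simple (contiguous, no shuffling), which is an assumption of this sketch.

```python
from typing import Dict, List, Tuple


def k_fold_indices(n_samples: int, k: int = 6) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of (train, validation) indices."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, validation))
        start = stop
    return folds


def top_k_features(importance: Dict[str, float], k: int) -> List[str]:
    """Names of the k features with the highest importance, in descending order."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


folds = k_fold_indices(20, k=6)
# Hypothetical importance scores, for illustration only.
scores = {"csa_dose_mg": 0.5, "hematocrit": 0.3, "age": 0.1, "sex": 0.05}
selected = top_k_features(scores, k=2)
print(len(folds), selected)
```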
The optimal feature variables are then selected from the selected feature variables using an XGBoost-based sequential forward feature selection algorithm.
The third round of feature screening uses a sequential forward selection algorithm based on XGBoost. The algorithm starts its search from an empty set, adds one feature variable at a time to the feature subset, and fits a model so that an evaluation index, such as R², is optimized, finally yielding the optimal feature combination. If the evaluation index of every candidate feature subset in a round is worse than that of the previous round's subset, iteration stops and the previous round's subset is taken as the final feature selection result.
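The greedy search loop can be sketched independently of the underlying model. In the sketch below, `score` is a placeholder for whatever evaluation is used (in the described method, presumably an index such as R² from an XGBoost model fitted on the candidate subset); the toy scoring function and feature names are hypothetical. The sketch stops when the best candidate does not improve on the previous round, which is slightly stricter than stopping only when it is worse.

```python
from typing import Callable, List, Sequence


def forward_select(candidates: Sequence[str],
                   score: Callable[[List[str]], float]) -> List[str]:
    """Greedy sequential forward selection, starting from the empty set."""
    selected: List[str] = []
    remaining = list(candidates)
    best_score = float("-inf")
    while remaining:
        # Score every subset formed by adding one remaining feature.
        trials = [(score(selected + [name]), name) for name in remaining]
        round_score, round_name = max(trials)
        if round_score <= best_score:
            break  # no improvement: keep the previous round's subset
        selected.append(round_name)
        remaining.remove(round_name)
        best_score = round_score
    return selected


# Toy score: hypothetical per-feature gains minus a small cost per feature.
GAINS = {"dose": 0.5, "hct": 0.2, "age": 0.03}


def toy_score(subset: List[str]) -> float:
    return sum(GAINS[name] for name in subset) - 0.05 * len(subset)


chosen = forward_select(list(GAINS), toy_score)
print(chosen)
```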
On the basis of the above embodiment, the step of constructing the plasma cyclosporin a concentration prediction model based on the Wide & Deep model in this embodiment includes:
jointly training the Wide module and the Deep module of the Wide & Deep model according to the evaluation indices R², RMSE and MAE of the Wide & Deep model, wherein the Wide module is learned using the FTRL algorithm with L1 regularization, and the Deep module is learned using the AdaGrad algorithm;
and taking the trained Wide & Deep model as the plasma cyclosporine a concentration prediction model.
The Wide module is a generalized linear model of the form $y = w^{T}x + b$, where $y$ is the predicted value of the model, $w = (w_1, w_2, \ldots, w_j)$ are the model parameters, $x = (x_1, x_2, \ldots, x_j)$ are the variables corresponding to the $j$ features, and $b$ is the bias. A sigmoid function is applied to $y$ as the output. The input features also include cross-product transformations, defined as:
$$\phi_k(x) = \prod_{i=1}^{j} x_i^{c_{ki}}, \quad c_{ki} \in \{0, 1\}$$
where $c_{ki}$ is a Boolean variable.
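A small sketch of the cross-product transformation follows, assuming the form used in the Wide & Deep literature, in which the k-th cross feature is the product of the binary features x_i whose Boolean exponent c_ki is 1. The three example features are hypothetical.

```python
from typing import Sequence


def cross_product(x: Sequence[int], c_k: Sequence[int]) -> int:
    """phi_k(x): product of x_i ** c_ki over binary features x_i and Boolean c_ki."""
    result = 1
    for x_i, c_ki in zip(x, c_k):
        if c_ki:  # x_i ** 0 == 1, so features with c_ki == 0 are skipped
            result *= x_i
    return result


# Hypothetical binary features: [on azole antifungal, female, age over 50].
c_k = [1, 0, 1]  # the cross feature "on azole antifungal AND age over 50"
both_present = cross_product([1, 0, 1], c_k)
one_absent = cross_product([1, 1, 0], c_k)
print(both_present, one_absent)
```

The cross feature is 1 only when every participating feature is 1, which is how the Wide part memorizes specific feature co-occurrences.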
The Deep module is a feed-forward neural network. It first converts sparse, high-dimensional categorical features into low-dimensional dense real-valued vectors, generally called embedding vectors. The dimension of an embedding vector is typically on the order of O(10) to O(100). During model training, the embedding vectors are randomly initialized and their parameters are learned by minimizing the final loss function. These low-dimensional dense embedding vectors are then fed into the hidden layers in the forward pass of the neural network. Each hidden layer performs the following computation:
$$a^{(l+1)} = f\left(W^{(l)} a^{(l)} + b^{(l)}\right)$$
where $f$ is the ReLU activation function, $l$ is the layer index, and $a^{(l)}$, $b^{(l)}$ and $W^{(l)}$ are the activations, bias and weights of layer $l$.
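The hidden-layer computation can be illustrated with a plain-Python forward pass for a single layer, reading the formula as mapping the activations of one layer to those of the next. The weights, bias and input below are arbitrary numbers chosen for illustration.

```python
from typing import List


def relu(v: List[float]) -> List[float]:
    """Element-wise ReLU activation."""
    return [max(0.0, t) for t in v]


def dense_layer(W: List[List[float]], a: List[float], b: List[float]) -> List[float]:
    """One hidden layer: f(W a + b) with f = ReLU."""
    pre_activation = [
        sum(w_ij * a_j for w_ij, a_j in zip(row, a)) + b_i
        for row, b_i in zip(W, b)
    ]
    return relu(pre_activation)


W = [[1.0, -1.0], [0.5, 0.5]]
a = [2.0, 3.0]   # e.g. a concatenated embedding vector
b = [0.0, -1.0]
next_a = dense_layer(W, a, b)
print(next_a)
```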
The Wide and Deep modules are combined by taking a weighted sum of their output log odds as the prediction, which is then fed into a logistic loss function for joint training. Both components use the original sparse features and are trained jointly. In solving the model, the Wide component is learned with the online learning algorithm FTRL (Follow-the-Regularized-Leader) together with L1 regularization, and the Deep component is learned with the AdaGrad algorithm.
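To make the two optimizers concrete, the sketch below shows a per-coordinate FTRL-Proximal update with L1 regularization and a plain AdaGrad step. The source gives no hyperparameters or update equations, so the formulas follow the commonly published forms of these algorithms, and the names `alpha`, `beta`, `l1`, `l2`, `lr` and `eps` and all values are assumptions of this sketch.

```python
import math
from typing import List


class FTRLProximal:
    """Per-coordinate FTRL-Proximal with L1 (and optional L2) regularization."""

    def __init__(self, dim: int, alpha: float = 0.1, beta: float = 1.0,
                 l1: float = 1.0, l2: float = 0.0) -> None:
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = [0.0] * dim  # adjusted gradient sums
        self.n = [0.0] * dim  # squared gradient sums

    def weights(self) -> List[float]:
        w = []
        for z_i, n_i in zip(self.z, self.n):
            if abs(z_i) <= self.l1:
                w.append(0.0)  # L1 keeps weakly supported weights at exactly zero
            else:
                sign = 1.0 if z_i > 0 else -1.0
                denom = (self.beta + math.sqrt(n_i)) / self.alpha + self.l2
                w.append(-(z_i - sign * self.l1) / denom)
        return w

    def update(self, grad: List[float]) -> None:
        w = self.weights()
        for i, g in enumerate(grad):
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w[i]
            self.n[i] += g * g


def adagrad_step(w: List[float], grad: List[float], accum: List[float],
                 lr: float = 0.1, eps: float = 1e-8) -> None:
    """In-place AdaGrad update: each step is scaled by the gradient history."""
    for i, g in enumerate(grad):
        accum[i] += g * g
        w[i] -= lr * g / (math.sqrt(accum[i]) + eps)


ftrl = FTRLProximal(dim=2, l1=1.0)
ftrl.update([0.5, -0.3])
w_small = ftrl.weights()   # gradient sums still inside the L1 threshold
ftrl.update([2.0, 0.0])
w_after = ftrl.weights()   # only the first coordinate leaves zero

w, accum = [0.0], [0.0]
for _ in range(200):       # minimize (w - 3)^2 with AdaGrad
    adagrad_step(w, [2.0 * (w[0] - 3.0)], accum, lr=1.0)
print(w_small, w_after, w)
```

The exact zeros produced by the L1 threshold are what make FTRL attractive for the sparse Wide part, while AdaGrad's per-parameter step sizes suit the dense Deep part.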
In joint training, the two modules are trained simultaneously, so the Wide module only needs to supplement the Deep module's memorization capability with a small number of cross features. During training, the gradient of the loss function is back-propagated to both the Wide module and the Deep module, and the parameters of each are updated using mini-batch stochastic optimization. The Deep part uses continuous features, embedded discrete features and categorical features, and generalizes through the embeddings; the Wide part uses combined features generated by the cross-product transformation and memorizes sparse, specific rules.
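The two optimizers named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patent's implementation: the FTRL class follows the commonly published per-coordinate FTRL-Proximal update, the hyperparameter names (`alpha`, `beta`, `l1`, `l2`, `lr`, `eps`) are conventional choices unrelated to any symbol in the patent, the Deep part is reduced to a single linear layer for brevity, and a squared-error loss on one made-up sample stands in for the real loss.

```python
import math
from typing import List


class FTRLProximal:
    """Per-coordinate FTRL-Proximal with L1 (and optional L2) regularization."""

    def __init__(self, dim: int, alpha: float = 0.1, beta: float = 1.0,
                 l1: float = 1.0, l2: float = 0.0) -> None:
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = [0.0] * dim  # accumulated adjusted gradients
        self.n = [0.0] * dim  # accumulated squared gradients

    def weights(self) -> List[float]:
        w = []
        for z_i, n_i in zip(self.z, self.n):
            if abs(z_i) <= self.l1:
                w.append(0.0)  # L1 keeps weakly supported weights exactly at zero
            else:
                sign = 1.0 if z_i > 0 else -1.0
                denom = (self.beta + math.sqrt(n_i)) / self.alpha + self.l2
                w.append(-(z_i - sign * self.l1) / denom)
        return w

    def update(self, grad: List[float]) -> None:
        w = self.weights()
        for i, g in enumerate(grad):
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w[i]
            self.n[i] += g * g


class AdaGrad:
    """AdaGrad: per-parameter step size shrinks with accumulated squared gradients."""

    def __init__(self, dim: int, lr: float = 0.1, eps: float = 1e-8) -> None:
        self.lr, self.eps = lr, eps
        self.g2 = [0.0] * dim

    def step(self, params: List[float], grad: List[float]) -> List[float]:
        for i, g in enumerate(grad):
            self.g2[i] += g * g
            params[i] -= self.lr * g / (math.sqrt(self.g2[i]) + self.eps)
        return params


# One joint step on a single hypothetical sample: both parts receive the
# gradient of the same squared-error loss.
wide_opt, deep_opt = FTRLProximal(dim=2), AdaGrad(dim=2)
x_wide, x_deep, y = [1.0, 0.0], [0.5, -1.0], 2.0
deep_w = [0.1, 0.1]
pred = (sum(w * x for w, x in zip(wide_opt.weights(), x_wide))
        + sum(w * x for w, x in zip(deep_w, x_deep)))
err = pred - y
wide_opt.update([err * x for x in x_wide])
deep_w = deep_opt.step(deep_w, [err * x for x in x_deep])
wide_w = wide_opt.weights()
```

After this step the Wide weight for the inactive cross feature is still exactly zero, which is the sparsity effect L1 regularization is chosen for, while both Deep weights have moved by one learning-rate-sized step.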
In the joint model, the outputs of the Wide and Deep parts are combined by weighting and tuned through the loss function to give the final output, namely:
Figure BDA0003705913820000121
The evaluation indices R², RMSE and MAE of the Wide & Deep model are obtained when the loss function is calculated.
R² is given by:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
RMSE is given by:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
MAE is given by:
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
where $n$ denotes the number of samples, $y_i$ the true value of the i-th sample, $\hat{y}_i$ the predicted value of the i-th sample, and $\bar{y}$ the mean of the true values. The larger R² is, the better; the smaller RMSE and MAE are, the better.
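A short sketch of the three evaluation indices, using their standard definitions (coefficient of determination, root mean squared error, mean absolute error). The function names and the sample values are assumptions for illustration; the numbers are made up and are not clinical data.

```python
import math
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


# Made-up true and predicted concentrations for four samples.
y_true = [100.0, 150.0, 200.0, 250.0]
y_pred = [110.0, 140.0, 210.0, 240.0]
scores = (r2_score(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
```

Every prediction here is off by exactly 10, so RMSE and MAE both equal 10, and R² is 1 - 400/12500 = 0.968.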
To avoid overestimating the goodness of fit because of uninformative independent variables, the formula for R² is corrected as follows:
Figure BDA0003705913820000125
where β is a preset adjustment coefficient, k is the number of independent variables, and n is the number of HSCT patients in the sample. With all other values held constant, the larger k is, the smaller the corrected R²; k thus acts as a penalty term.
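The penalty effect of k can be seen with the conventional textbook adjusted R², shown here only for orientation. This is an assumption: it is not the patent's corrected formula, which additionally involves the preset coefficient β and is not reproduced in this sketch. The function name and the sample numbers are likewise illustrative.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Textbook adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - k - 1).

    n is the number of samples and k the number of independent variables.
    """
    if n - k - 1 <= 0:
        raise ValueError("adjusted R^2 requires n > k + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Same raw R^2 and sample size, different numbers of independent variables.
few_variables = adjusted_r2(0.80, n=101, k=5)
many_variables = adjusted_r2(0.80, n=101, k=20)
```

With the raw R² fixed at 0.80, using 20 variables instead of 5 lowers the adjusted value from about 0.789 to 0.75, so adding variables that do not improve the fit is penalized.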
Parameter optimization is performed on the loss function of the cyclosporin a blood concentration prediction model according to RMSE, MAE and the corrected R², such that the difference obtained by subtracting RMSE and MAE from the corrected R² is minimized. This yields an optimized Wide & Deep model with improved prediction accuracy and efficiency, and the model is updated by online learning.
The plasma cyclosporin a concentration prediction apparatus provided by the present invention is described below. The apparatus described below and the plasma cyclosporin a concentration prediction method described above may be read in conjunction with each other.
As shown in fig. 4, the apparatus comprises an acquisition module 401 and a prediction module 402, wherein:
the acquisition module 401 is used for acquiring clinical data of an HSCT patient to be predicted;
the prediction module 402 is configured to input the clinical data into a cyclosporin a blood concentration prediction model, and obtain a cyclosporin a blood concentration prediction value of the to-be-predicted HSCT patient output by the cyclosporin a blood concentration prediction model;
the plasma cyclosporin a concentration prediction model is obtained by taking clinical data of a sample HSCT patient as a sample and taking an actual value of the plasma cyclosporin a concentration of the sample HSCT patient as a label.
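The two-module structure of the apparatus can be sketched as two small Python classes. Everything named here is hypothetical: the class names, the in-memory record store, the feature keys, and above all `placeholder_model`, which is a stand-in arithmetic function with no clinical meaning and is not a trained Wide & Deep model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping

ClinicalData = Dict[str, float]


@dataclass
class AcquisitionModule:
    """Acquires clinical data for the patient to be predicted."""
    records: Mapping[str, ClinicalData]  # hypothetical store keyed by patient ID

    def acquire(self, patient_id: str) -> ClinicalData:
        return dict(self.records[patient_id])


@dataclass
class PredictionModule:
    """Feeds clinical data to a prediction model and returns its output."""
    model: Callable[[ClinicalData], float]

    def predict(self, clinical_data: ClinicalData) -> float:
        return self.model(clinical_data)


def placeholder_model(data: ClinicalData) -> float:
    # Stand-in for the trained model; the coefficients are arbitrary.
    return 10.0 * data["feature_a"] + 0.5 * data["feature_b"]


acquisition = AcquisitionModule(records={"patient-001": {"feature_a": 3.0, "feature_b": 60.0}})
prediction = PredictionModule(model=placeholder_model)
predicted_value = prediction.predict(acquisition.acquire("patient-001"))
```

Keeping the model behind a plain callable means the trained predictor can be swapped in without changing either module.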
In this embodiment, real-world clinical data of HSCT patients are used to build a dedicated model for predicting the plasma concentration of cyclosporin a. Once a patient's clinical data are input into the trained model, the model automatically outputs a predicted plasma concentration of cyclosporin a for that patient, with high prediction efficiency, high accuracy and broad population coverage. The prediction indicates whether the plasma concentration of cyclosporin a is within the normal range, so that dosing can be adjusted with reference to it and the patient's risk of adverse reactions reduced.
Fig. 5 illustrates the physical structure of an electronic device. As shown in Fig. 5, the electronic device may include a processor 510, a communications interface 520, a memory 530 and a communication bus 540, wherein the processor 510, the communications interface 520 and the memory 530 communicate with one another via the communication bus 540. The processor 510 may invoke logic instructions in the memory 530 to perform a cyclosporin a blood concentration prediction method comprising: acquiring clinical data of an HSCT patient to be predicted; inputting the clinical data into a cyclosporin a blood concentration prediction model to obtain a predicted cyclosporin a blood concentration of the HSCT patient to be predicted, output by the cyclosporin a blood concentration prediction model; wherein the plasma cyclosporin a concentration prediction model is obtained by taking clinical data of sample HSCT patients as samples and the actual plasma cyclosporin a concentrations of the sample HSCT patients as labels.
Furthermore, the logic instructions in the memory 530 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention or a part thereof which substantially contributes to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, wherein, when the computer program is executed by a processor, the computer is able to perform the plasma cyclosporin a concentration prediction method provided above, the method comprising: acquiring clinical data of an HSCT patient to be predicted; inputting the clinical data into a cyclosporin a blood concentration prediction model to obtain a predicted cyclosporin a blood concentration of the HSCT patient to be predicted, output by the cyclosporin a blood concentration prediction model; wherein the plasma cyclosporin a concentration prediction model is obtained by taking clinical data of sample HSCT patients as samples and the actual plasma cyclosporin a concentrations of the sample HSCT patients as labels.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the plasma cyclosporin a concentration prediction method provided above, the method comprising: acquiring clinical data of an HSCT patient to be predicted; inputting the clinical data into a cyclosporin a blood concentration prediction model to obtain a predicted cyclosporin a blood concentration of the HSCT patient to be predicted, output by the cyclosporin a blood concentration prediction model; wherein the plasma cyclosporin a concentration prediction model is obtained by taking clinical data of sample HSCT patients as samples and the actual plasma cyclosporin a concentrations of the sample HSCT patients as labels.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for predicting plasma cyclosporin a concentration, comprising:
acquiring clinical data of an HSCT patient to be predicted;
inputting the clinical data into a plasma concentration prediction model of cyclosporine a to obtain a plasma concentration prediction value of cyclosporine a of the HSCT patient to be predicted, which is output by the plasma concentration prediction model of cyclosporine a;
the plasma cyclosporin a concentration prediction model is obtained by taking clinical data of a sample HSCT patient as a sample and taking an actual value of the plasma cyclosporin a concentration of the sample HSCT patient as a label.
2. The method of predicting cyclosporin a blood concentration according to claim 1, wherein the clinical data include demographic information, medication information, test information, diagnostic information, surgical information, treatment regimen, and adverse reaction.
3. The method for predicting plasma cyclosporin a concentration according to claim 1 or 2, wherein, before the step of inputting the clinical data into the plasma cyclosporin a concentration prediction model and obtaining the predicted plasma cyclosporin a concentration of the HSCT patient to be predicted output by the plasma cyclosporin a concentration prediction model, the method further comprises:
obtaining the missing rate of each feature variable in the clinical data of the HSCT patients, and deleting the feature variables whose missing rate is greater than a first preset threshold;
performing primary screening, based on a statistical method, on the feature variables remaining after the deletion;
re-screening the primarily screened feature variables based on feature engineering;
and constructing the plasma cyclosporin a concentration prediction model based on a Wide & Deep model, with the actual plasma cyclosporin a concentration of the HSCT patients as the target variable and the re-screened feature variables as independent variables.
4. The method for predicting plasma cyclosporin a concentration according to claim 3, wherein the step of performing primary screening, based on a statistical method, on the feature variables remaining after the deletion comprises:
when a remaining feature variable is a continuous variable, determining, based on the Pearson test, whether the relationship between the target variable and the continuous variable is significant;
if the relationship is significant, retaining the continuous variable; otherwise, deleting the continuous variable;
obtaining the dispersion of each remaining feature variable, and taking the logarithm of the feature variables whose dispersion is greater than a second preset threshold;
when a remaining feature variable is a categorical variable, determining, based on the Mann-Whitney U test, whether the relationship between the target variable and the categorical variable is significant;
and if the relationship is significant, retaining the categorical variable; otherwise, deleting the categorical variable.
5. The method for predicting plasma cyclosporin a concentration according to claim 3, wherein the step of re-screening the primarily screened feature variables based on feature engineering comprises:
taking the actual plasma cyclosporin a concentration of the HSCT patients as the target variable and the primarily screened feature variables as covariates, and constructing a feature importance prediction model based on the CatBoost algorithm;
tuning the parameters of the feature importance prediction model based on six-fold cross-validation so that the feature importance prediction model is optimal;
obtaining, based on the optimal feature importance prediction model, the importance of each primarily screened feature variable, and selecting a preset number of feature variables with the highest importance;
and selecting the optimal feature variables from the selected feature variables using an XGBoost-based sequential forward feature selection algorithm.
6. The method for predicting blood drug concentration of cyclosporin a of claim 3, wherein said step of constructing said blood drug concentration prediction model of cyclosporin a based on Wide & Deep model comprises:
performing joint training on the Wide module and the Deep module of said Wide & Deep model according to the evaluation indices R², RMSE and MAE of the Wide & Deep model, wherein the Wide module is learned using the FTRL algorithm with L1 regularization and the Deep module is learned using the AdaGrad algorithm;
and taking the trained Wide & Deep model as the plasma cyclosporine a concentration prediction model.
7. A plasma cyclosporin a concentration prediction apparatus comprising:
the acquisition module is used for acquiring clinical data of the HSCT patient to be predicted;
the prediction module is used for inputting the clinical data into a cyclosporin a blood concentration prediction model and acquiring the cyclosporin a blood concentration prediction value of the HSCT patient to be predicted, which is output by the cyclosporin a blood concentration prediction model;
the plasma cyclosporin a concentration prediction model is obtained by taking clinical data of a sample HSCT patient as a sample and taking an actual value of the plasma cyclosporin a concentration of the sample HSCT patient as a label.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the method for predicting plasma cyclosporin a concentration of any one of claims 1 to 6.
9. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the method for predicting plasma cyclosporin a concentration according to any one of claims 1 to 6.
10. A computer program product comprising a computer program, wherein the computer program when executed by a processor implements a method for predicting plasma cyclosporin a concentration according to any one of claims 1 to 6.
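The preprocessing steps recited in claims 3 and 4 (deleting variables with a high missing rate, then log-transforming highly dispersed variables) can be sketched as follows. This is an illustration under assumptions: the coefficient of variation is used as the dispersion measure because the claims do not name one, the thresholds and column names are invented, and the log transform assumes strictly positive values.

```python
import math
import statistics
from typing import Dict, List, Optional

Column = List[Optional[float]]  # None marks a missing value


def missing_rate(col: Column) -> float:
    """Fraction of entries that are missing."""
    return sum(v is None for v in col) / len(col)


def drop_sparse_columns(table: Dict[str, Column], max_missing: float) -> Dict[str, Column]:
    """Delete variables whose missing rate is greater than the threshold."""
    return {name: col for name, col in table.items() if missing_rate(col) <= max_missing}


def coefficient_of_variation(col: Column) -> float:
    """Assumed dispersion measure: population std dev divided by the mean."""
    values = [v for v in col if v is not None]
    return statistics.pstdev(values) / statistics.fmean(values)


def log_transform_dispersed(table: Dict[str, Column], max_cv: float) -> Dict[str, Column]:
    """Take the logarithm of variables whose dispersion exceeds the threshold."""
    out: Dict[str, Column] = {}
    for name, col in table.items():
        if coefficient_of_variation(col) > max_cv:
            out[name] = [None if v is None else math.log(v) for v in col]
        else:
            out[name] = list(col)
    return out


# Invented columns: one mostly missing, one with a large outlier.
table = {
    "var_a": [30.0, 45.0, None, 52.0],
    "var_b": [20.0, 400.0, 35.0, 28.0],
    "var_c": [None, None, None, 1.2],
}
kept = drop_sparse_columns(table, max_missing=0.5)
transformed = log_transform_dispersed(kept, max_cv=1.0)
```

Here `var_c` is dropped because three of its four values are missing, `var_b` is log-transformed because its outlier pushes the coefficient of variation above 1, and `var_a` is left unchanged.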

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210707457.0A CN115206458A (en) 2022-06-21 2022-06-21 Method and device for predicting plasma concentration of cyclosporin a

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210707457.0A CN115206458A (en) 2022-06-21 2022-06-21 Method and device for predicting plasma concentration of cyclosporin a

Publications (1)

Publication Number Publication Date
CN115206458A true CN115206458A (en) 2022-10-18

Family

ID=83575457

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210707457.0A Pending CN115206458A (en) 2022-06-21 2022-06-21 Method and device for predicting plasma concentration of cyclosporin a

Country Status (1)

Country Link
CN (1) CN115206458A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018212711A1 (en) * 2017-05-19 2018-11-22 National University Of Singapore Predictive analysis methods and systems
CN110363427A (en) * 2019-07-15 2019-10-22 腾讯科技(深圳)有限公司 Model quality evaluation method and apparatus
CN111430029A (en) * 2020-03-24 2020-07-17 浙江达美生物技术有限公司 Multi-dimensional stroke prevention screening method based on artificial intelligence
CN111613289A (en) * 2020-05-07 2020-09-01 浙江大学医学院附属第一医院 Individualized drug dose prediction method, individualized drug dose prediction device, electronic equipment and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
XIAOHUI HUANG ET.AL: "An Ensemble Model for Prediction of Vancomycin Trough Concentrations in Pediatric Patients", 《DOVEPRESS》 *
ZE YU ET.AL: "Predicting Lapatinib Dose Regimen Using Machine Learning and Deep Learning Techniques Based on a Real-World Study", 《FRONT. ONCOL.》 *
余俊先等: "人工神经网络建立的环孢素A血药浓度预测模型", 《中国药物应用与监测》 *
李珊等: "人工神经网络在环孢素血药浓度预测中的研究", 《科技信息(学术版)》 *
齐巧娜等: "机器学习XGBoost算法在医学领域的应用研究进展", 《分子影像学杂志》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20221018