CN113270146A - Bronchopulmonary dysplasia data processing method and device and related equipment - Google Patents

Info

Publication number
CN113270146A
Authority
CN
China
Prior art keywords
data
gene
bronchopulmonary dysplasia
risk
infant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110672468.5A
Other languages
Chinese (zh)
Other versions
CN113270146B (en
Inventor
钱莉玲
代丹
董欣然
陈辉耀
周文浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Childrens Hospital of Fudan University
Original Assignee
Childrens Hospital of Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Childrens Hospital of Fudan University filed Critical Childrens Hospital of Fudan University
Priority to CN202110672468.5A priority Critical patent/CN113270146B/en
Publication of CN113270146A publication Critical patent/CN113270146A/en
Application granted granted Critical
Publication of CN113270146B publication Critical patent/CN113270146B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Pathology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Bioethics (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Primary Health Care (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to a data processing method and device for bronchopulmonary dysplasia and related equipment, and belongs to the technical field of medicine. After the data processing result for bronchopulmonary dysplasia of a target infant is obtained, medical staff can judge from that result how likely the target infant is to develop bronchopulmonary dysplasia, so that bronchopulmonary dysplasia is predicted. The method is the first to obtain this data processing result by combining risk gene data with basic clinical characteristic data, which achieves the required accuracy, saves time, and makes the method easier to adopt widely.

Description

Bronchopulmonary dysplasia data processing method and device and related equipment
Technical Field
The invention belongs to the technical field of medicine, and particularly relates to a data processing method and device for bronchopulmonary dysplasia and related equipment.
Background
Bronchopulmonary dysplasia (BPD) is a complex disease caused by genetic and environmental factors. It is one of the most serious complications of premature birth and places a heavy economic and medical burden on affected families. Mortality and complication rates are significantly higher in premature infants with BPD than in premature infants in general, so early prediction of BPD is important for guiding clinical prevention and precise management of the disease.
In the related art, BPD is generally predicted with a BPD prediction model. Most existing models either predict from a single clinical feature or score a combination of multiple clinical features. In the former case, because BPD is a complex disease with large individual differences in genetic background and clinical presentation, prediction from a single clinical feature generally performs poorly, with an AUC (area under the ROC curve) usually below 0.8. In the latter case, the scoring system usually has to include a large number of clinical features to reach an AUC above 0.9; many clinical indicators must be evaluated at several different time points, which consumes substantial manpower and material resources, so clinical practicability and operability are poor.
Therefore, how to predict bronchopulmonary dysplasia quickly and accurately has become a technical problem that urgently needs to be solved in the art.
Disclosure of Invention
The invention provides a data processing method and device for bronchopulmonary dysplasia and related equipment, and aims to solve the technical problems in the prior art that BPD data processing is either inaccurate or consumes manpower and material resources, resulting in poor practicability and operability.
The technical scheme provided by the invention is as follows:
in one aspect, a data processing method for bronchopulmonary dysplasia includes:
acquiring basic clinical characteristic data and risk gene data of a target infant;
and inputting the basic clinical characteristic data and the risk gene data into a pre-constructed data processing model of bronchopulmonary dysplasia to obtain a data processing result for bronchopulmonary dysplasia of the target infant.
Optionally, the pre-constructed data processing model of bronchopulmonary dysplasia is constructed by:
screening basic clinical characteristic data of infants with bronchopulmonary dysplasia from a data set of infants with bronchopulmonary dysplasia according to a preset basic clinical characteristic screening criterion; and
screening risk gene data of the infants with bronchopulmonary dysplasia from the data set of infants with bronchopulmonary dysplasia according to a preset risk gene screening criterion;
and constructing the data processing model of bronchopulmonary dysplasia according to the basic clinical characteristic data and risk gene data of the infants with bronchopulmonary dysplasia and a preset construction criterion.
Optionally, constructing the data processing model of bronchopulmonary dysplasia according to the basic clinical characteristic data and risk gene data of the infants with bronchopulmonary dysplasia and a preset construction criterion includes:
constructing a first bronchopulmonary dysplasia data processing model according to the basic clinical characteristic data of the infants with bronchopulmonary dysplasia and the preset construction criterion; and
constructing a second bronchopulmonary dysplasia data processing model according to the basic clinical characteristic data and risk gene data of the infants with bronchopulmonary dysplasia and the preset construction criterion; and
constructing a third bronchopulmonary dysplasia data processing model according to all clinical characteristic data of the infants with bronchopulmonary dysplasia and the preset construction criterion;
and evaluating the performance of the first, second, and third bronchopulmonary dysplasia data processing models to determine the bronchopulmonary dysplasia data processing model.
Optionally, the preset risk gene screening criterion includes a preset risk gene screening calculation rule; the data set of infants with bronchopulmonary dysplasia includes affected-infant data; the method further includes collecting a data set of unaffected infants as a control group; the affected-infant data include gene detection data of affected infants, and the unaffected-infant data include gene detection data of unaffected infants; and screening the risk gene data of the infants with bronchopulmonary dysplasia from the data set according to the preset risk gene screening criterion includes:
acquiring the gene detection data of the affected infants and the gene detection data of the unaffected infants, respectively;
analyzing both sets of gene detection data, screening, for each gene, the mutations that are present in the gene detection data of the affected infants and absent from the gene detection data of the unaffected infants, and counting the number of such mutations carried by each gene; and
performing gene burden testing to obtain, for each gene, the probability value of the loss-of-function mutations it carries and the probability value of the missense mutations it carries;
and calculating the risk value of each gene from the number of mutations it carries, the probability value of the loss-of-function mutations it carries, the probability value of the missense mutations it carries, and the preset risk gene screening calculation rule, to obtain the risk gene data.
Optionally, the preset risk gene screening calculation rule includes:
$\mathrm{Score} = \mathrm{NSVn} + 2 \times \left(-\log_{10} P_{\mathrm{LOF}}\right) + \left(-\log_{10} P_{\mathrm{MIS}}\right)$;
where Score is the risk value of the gene as a risk gene, NSVn is the number of mutations carried by the gene, $P_{\mathrm{LOF}}$ is the probability value of the loss-of-function mutations carried by the gene, and $P_{\mathrm{MIS}}$ is the probability value of the missense mutations carried by the gene.
Optionally, screening the risk gene data of the infants with bronchopulmonary dysplasia from the data set of infants with bronchopulmonary dysplasia according to the preset risk gene screening criterion includes:
if the risk value of a gene is greater than 2, determining that the gene is a risk gene.
Optionally, the affected-infant data are data of affected infants from twin or multiple births, and the unaffected-infant data are data of unaffected infants from the same twin or multiple births; and/or
the risk gene data include: bronchopulmonary dysplasia risk gene data and severe bronchopulmonary dysplasia risk gene data.
In yet another aspect, a data processing device for bronchopulmonary dysplasia includes: an acquisition module and a processing module;
the acquisition module is used for acquiring basic clinical characteristic data and risk gene data of a target infant;
and the processing module is used for inputting the basic clinical characteristic data and the risk gene data into a pre-constructed data processing model of bronchopulmonary dysplasia to obtain a data processing result for bronchopulmonary dysplasia of the target infant.
In yet another aspect, a storage medium stores a computer program, and when the computer program is executed by a processor, the steps of any one of the above data processing methods for bronchopulmonary dysplasia are implemented.
In yet another aspect, data processing equipment for bronchopulmonary dysplasia includes: a processor, and a memory coupled to the processor;
the memory is used for storing a computer program, the computer program being used at least for executing any one of the above data processing methods for bronchopulmonary dysplasia;
the processor is used for calling and executing the computer program in the memory.
The invention has the beneficial effects that:
according to the data processing method and device for bronchopulmonary dysplasia and the related equipment provided by the embodiments of the invention, when bronchopulmonary dysplasia data processing is performed for a target infant, basic clinical characteristic data and risk gene data of the target infant are acquired and input into a pre-constructed data processing model of bronchopulmonary dysplasia to obtain a data processing result for bronchopulmonary dysplasia of the target infant. With this result, medical staff can judge how likely the target infant is to develop bronchopulmonary dysplasia, so that bronchopulmonary dysplasia is predicted. The method is the first to obtain this data processing result by combining risk gene data with basic clinical characteristic data, which achieves the required accuracy, saves time, and makes the method easier to adopt widely.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a data processing method for bronchopulmonary dysplasia according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a case collection process according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the time points of clinical characteristic collection and gene detection according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the ROC curves of the bronchopulmonary dysplasia data processing results obtained with 3 basic clinical features, with 3 basic clinical features plus BPD risk genes, and with all clinical features, according to a verification embodiment of the present invention;
Fig. 5 is a schematic diagram of the ROC curves of the severe bronchopulmonary dysplasia data processing results obtained with 3 basic clinical features, with 3 basic clinical features plus sBPD risk genes, and with all clinical features, according to the verification embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a data processing device for bronchopulmonary dysplasia according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of data processing equipment for bronchopulmonary dysplasia according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be described in detail below. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the examples given herein without any inventive step, are within the scope of the present invention.
In the related art, BPD is generally predicted with a BPD prediction model. Most existing models either predict from a single clinical feature or score a combination of multiple clinical features. In the former case, because BPD is a complex disease with large individual differences in genetic background and clinical presentation, prediction from a single clinical feature generally performs poorly, with an AUC (area under the ROC curve) usually below 0.8. In the latter case, the scoring system usually has to include a large number of clinical features to reach an AUC above 0.9; many clinical indicators must be evaluated at several different time points, which consumes substantial manpower and material resources, so clinical practicability and operability are poor.
Based on this, the embodiment of the present invention provides a data processing method for bronchopulmonary dysplasia.
Fig. 1 is a schematic flowchart of a data processing method for bronchopulmonary dysplasia according to an embodiment of the present invention. As shown in Fig. 1, the method according to the embodiment of the present invention may include the following steps:
S11: acquiring basic clinical characteristic data and risk gene data of the target infant.
In a specific implementation, any infant for whom bronchopulmonary dysplasia needs to be predicted can be defined as the target infant. The data processing method for bronchopulmonary dysplasia provided by this application is applied to the data of the target infant, so that the probability that the target infant has the disease can be predicted from the data processing result.
For example, the basic clinical characteristic data and the risk gene data of the target infant may be acquired, where the types of both may be preset. For example, the basic clinical characteristic data may be gestational age at birth, birth weight, and invasive mechanical ventilation; the risk gene data may be loss-of-function mutations (LOF variants, including frameshift mutations, stop codon mutations, and classical splice site mutations), missense mutations, and so on. These basic clinical characteristic data and risk gene data are merely examples and are not limiting.
S12: inputting the basic clinical characteristic data and the risk gene data into a pre-constructed data processing model of bronchopulmonary dysplasia to obtain a data processing result for bronchopulmonary dysplasia of the target infant.
After the basic clinical characteristic data and the risk gene data are acquired, they are input into the pre-constructed data processing model of bronchopulmonary dysplasia to obtain the data processing result for the target infant, from which medical staff can predict whether the target infant has bronchopulmonary dysplasia.
In some embodiments, optionally, the pre-constructed data processing model of bronchopulmonary dysplasia is constructed by:
screening basic clinical characteristic data of infants with bronchopulmonary dysplasia from a data set of infants with bronchopulmonary dysplasia according to a preset basic clinical characteristic screening criterion; and
screening risk gene data of the infants with bronchopulmonary dysplasia from the data set of infants with bronchopulmonary dysplasia according to a preset risk gene screening criterion;
and constructing the data processing model of bronchopulmonary dysplasia according to the basic clinical characteristic data and risk gene data of the infants with bronchopulmonary dysplasia and a preset construction criterion.
For example, patient information may be collected to build the data set of infants with bronchopulmonary dysplasia. The number of collected patient samples may be 370, 1000, 2000, and so on, provided the data set is large enough; this application uses 370 cases as an example.
Fig. 2 is a schematic diagram of a case collection process according to an embodiment of the present invention. Referring to Fig. 2, 370 hospitalized premature infants (N = 370) with a gestational age of less than 32 weeks who required respiratory support after admission can be collected. To ensure the accuracy of the data, cases can be excluded according to known conditions. For example, the following can be excluded from the 370 cases: fatal illness outside the respiratory system (N = 55), congenital malformation or syndrome (N = 4), death within 7 days after admission (N = 7), unexplained exon sequencing (N = 33), and refusal to participate in the test (N = 17), leaving 254 cases for testing. Gene testing is performed on the 254 samples and, based on the results, cases of inherited metabolic disease (N = 3), visits during the study (N = 3), death from non-respiratory causes (N = 2), and primary immunodeficiency (N = 1) are further excluded, leaving 245 cases. Of the 245 samples, 114 infants were not diagnosed with BPD (N = 114) and 131 infants were diagnosed with BPD (N = 131): 44 mild, 20 moderate, and 67 severe.
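The case counts in this paragraph can be checked with a few lines of arithmetic. The sketch below only re-adds the numbers stated above; the dictionary keys are shorthand labels chosen for illustration, not terms defined by the application.

```python
# Counts copied from the case collection description; labels are illustrative shorthand.
enrolled = 370

first_round_exclusions = {
    "fatal illness outside the respiratory system": 55,
    "congenital malformation or syndrome": 4,
    "death within 7 days": 7,
    "exon sequencing exclusion": 33,
    "refusal to participate": 17,
}
second_round_exclusions = {
    "inherited metabolic disease": 3,
    "visits during the study": 3,
    "death from non-respiratory causes": 2,
    "primary immunodeficiency": 1,
}
bpd_by_severity = {"mild": 44, "moderate": 20, "severe": 67}

# Cases remaining after each exclusion round.
after_first_round = enrolled - sum(first_round_exclusions.values())
after_second_round = after_first_round - sum(second_round_exclusions.values())

# Split of the final cohort into BPD and non-BPD infants.
bpd_total = sum(bpd_by_severity.values())
non_bpd_total = after_second_round - bpd_total

print(after_first_round, after_second_round, bpd_total, non_bpd_total)
```

The four values reproduce the 254, 245, 131, and 114 cases given in the text.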
To ensure the accuracy of the data, a preset collection time rule is followed when data are collected. Fig. 3 is a schematic diagram of the time points of clinical characteristic collection and gene detection according to an embodiment of the present invention. Referring to Fig. 3, gene data can be collected after a case is admitted: a peripheral blood sample is collected for gene detection, and the exon sequencing result is obtained 14 days later. Clinical characteristics can be collected in three windows: within 7 days after birth, from 7 to 28 days after birth, and from 28 days after birth to 36 weeks corrected gestational age. After collection, the basic clinical characteristics are screened from these clinical characteristics.
For example, the clinical characteristics collected within 7 days after birth may be:
maternal and birth history (N = 17): in vitro fertilization (test-tube baby), prenatal dexamethasone therapy, fetal distress, intrauterine growth restriction, gestational hypertension, gestational diabetes mellitus, hypothyroidism in pregnancy, eclampsia or preeclampsia, amniotic fluid abnormality, placental premolarity, premature rupture of membranes, multiple pregnancy, sex, gestational age, birth weight, asphyxia, hypotonia;
complications occurring within 1 week after birth (N = 6): neonatal respiratory distress syndrome, patent ductus arteriosus, atrial/ventricular septal defects, respiratory failure, hyperbilirubinemia, early-onset infection (infection occurring within 72 hours prenatal, neonatal pneumonia and sepsis occurring within 7 days postnatal);
early postnatal therapeutic measures (N = 2): neonatal resuscitation, exogenous pulmonary surfactant.
The clinical characteristics collected from 7 to 28 days after birth may be:
complications during hospitalization (N = 4): airway abnormalities (tracheobronchomalacia, laryngomalacia, or subglottic stenosis), hypoglycemia, thrombocytopenia, hypothyroidism;
therapeutic measures (N = 2): invasive mechanical ventilation for ≥ 7 days.
The clinical characteristics collected from 28 days after birth to 36 weeks corrected gestational age may be:
complications during hospitalization (N = 3): late-onset infection (infection occurring 7 days after birth), ventricular hemorrhage, coagulopathy;
therapeutic measures (N = 3): caffeine, intravenous glucocorticoids, diuretics.
In some embodiments, optionally, constructing the data processing model of bronchopulmonary dysplasia according to the basic clinical characteristic data and risk gene data of the infants with bronchopulmonary dysplasia and a preset construction criterion includes:
constructing a first bronchopulmonary dysplasia data processing model according to the basic clinical characteristic data of the infants with bronchopulmonary dysplasia and the preset construction criterion; and
constructing a second bronchopulmonary dysplasia data processing model according to the basic clinical characteristic data and risk gene data of the infants with bronchopulmonary dysplasia and the preset construction criterion; and
constructing a third bronchopulmonary dysplasia data processing model according to all clinical characteristic data of the infants with bronchopulmonary dysplasia and the preset construction criterion;
and evaluating the performance of the first, second, and third bronchopulmonary dysplasia data processing models to determine the bronchopulmonary dysplasia data processing model.
For example, in the present application, 3 data processing models can be constructed based on LASSO regression. The data used by the 3 models are, respectively: the 3 selected basic clinical characteristics; the 3 selected basic clinical characteristics plus the selected risk gene data; and all clinical characteristic data. After the 3 data processing models are constructed, 10-fold cross-validation is performed to select the optimal penalty parameter. The prediction performance of the 3 LASSO models is then evaluated on an independent test data set: the area under the ROC curve (AUC) is used for model evaluation, and DeLong's test is used to assess statistical differences among the models. The second model is selected as the final model.
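As an illustration of the evaluation metric only, the following minimal sketch computes the AUC as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The function name `roc_auc` and the example labels and scores are assumptions made for illustration; they are not data or code from the application, and the LASSO fitting and DeLong's test are not shown.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels: iterable of 0/1 ground-truth values (1 = BPD, 0 = non-BPD).
    scores: iterable of model outputs, higher meaning more likely BPD.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four test cases (illustration only).
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.1]
example_auc = roc_auc(example_labels, example_scores)
print(example_auc)  # 3 of the 4 positive/negative pairs are ranked correctly -> 0.75
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of a few hundred samples; a rank-based implementation would be preferable for larger data sets.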
Optionally, the preset risk gene screening criterion includes a preset risk gene screening calculation rule; the data set of infants with bronchopulmonary dysplasia includes affected-infant data; the method further includes collecting a data set of unaffected infants as a control group; the affected-infant data include gene detection data of affected infants, and the unaffected-infant data include gene detection data of unaffected infants; and screening the risk gene data of the infants with bronchopulmonary dysplasia from the data set according to the preset risk gene screening criterion includes:
acquiring the gene detection data of the affected infants and the gene detection data of the unaffected infants, respectively;
analyzing both sets of gene detection data, screening, for each gene, the mutations that are present in the gene detection data of the affected infants and absent from the gene detection data of the unaffected infants, and counting the number of such mutations carried by each gene; and
performing gene burden testing to obtain, for each gene, the probability value of the loss-of-function mutations it carries and the probability value of the missense mutations it carries;
and calculating the risk value of each gene from the number of mutations it carries, the probability value of the loss-of-function mutations it carries, the probability value of the missense mutations it carries, and the preset risk gene screening calculation rule, to obtain the risk gene data.
Optionally, the preset risk gene screening calculation rule includes:
$\mathrm{Score} = \mathrm{NSVn} + 2 \times \left(-\log_{10} P_{\mathrm{LOF}}\right) + \left(-\log_{10} P_{\mathrm{MIS}}\right)$;
where Score is the risk value of the gene as a risk gene, NSVn is the number of mutations carried by the gene, $P_{\mathrm{LOF}}$ is the probability value of the loss-of-function mutations carried by the gene, and $P_{\mathrm{MIS}}$ is the probability value of the missense mutations carried by the gene.
Optionally, the affected-infant data are data of affected infants from twin or multiple births, and the unaffected-infant data are data of unaffected infants from the same twin or multiple births; and/or
the risk gene data include: bronchopulmonary dysplasia risk gene data and severe bronchopulmonary dysplasia risk gene data.
For example, twin or multiple-birth samples are described here. Twin or multiple births in which one sibling has BPD and another does not are split into a BPD group and a non-BPD group (the non-BPD group is the control group), and their sequencing data are analyzed. For each gene, the mutations that are present in the BPD group samples and absent from the non-BPD group samples are screened, and the number of such mutations carried by the gene is counted as NSVn. Further, the BPD group may be divided into mild, moderate, and severe BPD groups; for each gene, the mutations that are present in the samples of one severity group and absent from the other samples are screened, and the number of mutations carried by the gene is counted.
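The screening step can be sketched as a set difference followed by a per-gene count. The sketch below assumes that each sample is represented as a set of (gene, variant) pairs and that NSVn counts distinct case-only variants per gene; the text does not specify either point, and the function name, sample identifiers, gene names, and variants are all hypothetical.

```python
from collections import Counter


def count_case_only_mutations(case_samples, control_samples):
    """Return a Counter mapping gene -> number of distinct case-only variants.

    case_samples / control_samples: dict of sample id -> set of (gene, variant) pairs.
    A variant is "case-only" if it appears in at least one BPD sample and in no
    non-BPD sample.
    """
    control_variants = set()
    for variants in control_samples.values():
        control_variants.update(variants)

    case_only = set()
    for variants in case_samples.values():
        case_only.update(v for v in variants if v not in control_variants)

    return Counter(gene for gene, _variant in case_only)


# Hypothetical twin pairs: one sibling in the BPD group, the other in the control group.
bpd_group = {
    "pair1_affected": {("GENE_A", "c.100A>T"), ("GENE_B", "c.5G>C")},
    "pair2_affected": {("GENE_A", "c.200del"), ("GENE_C", "c.7C>T")},
}
non_bpd_group = {
    "pair1_unaffected": {("GENE_B", "c.5G>C")},
    "pair2_unaffected": {("GENE_C", "c.9A>G")},
}

nsvn = count_case_only_mutations(bpd_group, non_bpd_group)
print(dict(nsvn))
```

In this toy input the GENE_B variant is shared with a control sample and is therefore dropped, while GENE_A keeps two case-only variants and GENE_C keeps one.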
For example, in this embodiment, Fisher's exact test can be used to compare, between the BPD and non-BPD groups, the numbers of loss-of-function mutations (LOF mutations, including frameshift mutations, stop codon mutations, and classical splice site mutations) and missense mutations (MIS mutations) carried by each gene. When the BPD group is divided into mild, moderate, and severe groups, each group is compared with all other cases; the details are not repeated here. The significance threshold of Fisher's exact test is set to P < 0.05. All tests are one-sided and select only genes that carry an excess of mutations in the case group, so each gene yields one P value for LOF mutations and one P value for MIS mutations.
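A one-sided Fisher's exact test on a 2 × 2 table can be written with the standard library alone. The sketch below assumes the table is laid out as carriers versus non-carriers in the case and control groups, which is one plausible reading of the comparison described above; the function name and the example counts are illustrative only.

```python
from math import comb


def fisher_exact_greater(case_carriers, case_noncarriers,
                         control_carriers, control_noncarriers):
    """One-sided Fisher's exact test for enrichment of carriers among cases.

    Returns P(X >= case_carriers), where X follows the hypergeometric
    distribution fixed by the table margins.
    """
    n_cases = case_carriers + case_noncarriers
    n_carriers = case_carriers + control_carriers
    n_total = n_cases + control_carriers + control_noncarriers

    denominator = comb(n_total, n_carriers)
    upper = min(n_cases, n_carriers)
    numerator = sum(
        comb(n_cases, k) * comb(n_total - n_cases, n_carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / denominator


# Hypothetical gene: 3 of 3 cases carry an LOF mutation, 0 of 3 controls do.
p_lof_example = fisher_exact_greater(3, 0, 0, 3)
print(p_lof_example)  # 1 / C(6, 3) = 0.05
```

Running the same function once on LOF counts and once on MIS counts gives the two P values per gene that enter the risk score.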
After the number NSVn of mutations carried by each gene, the probability value of the loss-of-function mutations carried by each gene, and the probability value of the missense mutations carried by each gene are obtained, the risk value of the gene as a risk gene is calculated according to the formula.
Optionally, screening the risk gene data of the infants with bronchopulmonary dysplasia from the data set of infants with bronchopulmonary dysplasia according to the preset risk gene screening criterion includes:
if the risk value of a gene is greater than 2, determining that the gene is a risk gene.
For example, when the calculated risk value of a gene is greater than 2, the gene can be determined to be a risk gene. When the BPD group is divided into mild, moderate, and severe BPD groups, mild, moderate, and severe BPD risk genes can be obtained correspondingly; the process is not repeated here.
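The score formula and the threshold of 2 translate directly into code. This is a minimal sketch; the function names `risk_score` and `is_risk_gene` and the example inputs are hypothetical, and the check that P values are positive is an added safeguard not stated in the text.

```python
import math


def risk_score(nsvn, p_lof, p_mis):
    """Score = NSVn + 2 * (-log10(P_LOF)) + (-log10(P_MIS))."""
    if not (0.0 < p_lof <= 1.0 and 0.0 < p_mis <= 1.0):
        # Added safeguard (assumption): log10 is undefined for non-positive values.
        raise ValueError("P values must lie in (0, 1]")
    return nsvn + 2.0 * (-math.log10(p_lof)) + (-math.log10(p_mis))


def is_risk_gene(score, threshold=2.0):
    """A gene is treated as a risk gene when its score is greater than 2."""
    return score > threshold


# Hypothetical gene: 1 case-only mutation, P_LOF = 0.1, P_MIS = 1.0.
example_score = risk_score(nsvn=1, p_lof=0.1, p_mis=1.0)
print(example_score, is_risk_gene(example_score))
```

Because the LOF term is weighted twice, a gene with a small LOF P value can pass the threshold even when it carries few case-only mutations.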
For example, in constructing a model, a score of 1 may be recorded for each sample carrying an LOF in the BPD risk gene, and 0 otherwise. When the three groups of BPD risk genes are used for distinguishing, namely the risk genes are a mild BPD risk gene, a moderate BPD risk gene and a severe BPD risk gene, the record of the sample carrying MIS in the mild BPD risk gene, the moderate BPD risk gene and the severe BPD risk gene is 1, and otherwise, the record is 0.
When basic clinical characteristics are selected for constructing the model, gestational age at birth, birth weight, and invasive mechanical ventilation can be chosen. The data processing model of bronchopulmonary dysplasia is then constructed from the basic clinical characteristic data and the risk gene data.
According to the data processing method for bronchopulmonary dysplasia provided by the embodiment of the invention, when bronchopulmonary dysplasia data processing is performed for a target infant, basic clinical characteristic data and risk gene data of the target infant are acquired and input into a pre-constructed data processing model of bronchopulmonary dysplasia to obtain a data processing result for the target infant. Medical staff can then judge, from the data processing result, the likelihood that the target infant has bronchopulmonary dysplasia, so that bronchopulmonary dysplasia is predicted. The method combines risk gene data with basic clinical characteristic data for the first time to obtain the data processing result, which maintains accuracy, saves time, and makes the method easier to popularize.
To verify the data processing method for bronchopulmonary dysplasia provided by the embodiment of the present invention, the embodiment of the present invention further provides a verification embodiment.
In this verification example, a BPD risk gene set (BPD-RGS) and a severe BPD risk gene set (sBPD-RGS) are used as examples:
Table 1: BPD-RGS-30
OBSL1 NTRK1 CHRNA4 PDE11A FRG1 SPTAN1 DCC BDP1 C5 AGRN
AGXT TSHZ1 COL1A2 TERT DDHD1 PTPRQ PEX12 SACS CLCN1 GAA
VPS13A NPHS1 GIGYF2 VSX2 LAMB2 DMPK NDUFS7 CEBPJ DHTKD1 TSC2
Table 2: sBPD-RGS-20
ACADSB TCIRG1 OBSL1 FGFR3 BDP1 RBBP8 SPG7 GNAS ELP2 POMT1
MKKS CLCN1 DDHD1 HPS4 EFHC1 IL4R PTF1A CCT5 CFD NDUFS7 PALB2
BPD prediction and severe BPD (sBPD) prediction were carried out according to the data processing method for bronchopulmonary dysplasia provided by the embodiment of the application:
Fig. 4 is a schematic diagram of the ROC curves of the bronchopulmonary dysplasia data processing results obtained, in the verification embodiment of the present invention, using (i) the 3 basic clinical features alone, (ii) the 3 basic clinical features together with the BPD risk genes, and (iii) all clinical features; Fig. 5 is the corresponding schematic diagram obtained using (i) the 3 basic clinical features alone, (ii) the 3 basic clinical features together with the sBPD risk genes, and (iii) all clinical features.
The 3 basic clinical features are: gestational age at birth, birth weight, and invasive mechanical ventilation.
Figs. 4 and 5 show that predicting BPD with the BPD risk gene set (BPD-RGS) combined with the 3 basic clinical features (gestational age at birth, birth weight, invasive mechanical ventilation) is significantly better than predicting with the 3 basic clinical features alone, and that the accuracy of the combined prediction is close to that of prediction using all clinical features. Therefore, combining risk genes with basic clinical characteristics for data processing to predict the probability of BPD yields results quickly while maintaining prediction accuracy, saving time and labor.
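ROC curves such as those in Figs. 4 and 5 are commonly summarized by the area under the curve (AUC). The patent does not say how the curves were computed, so the following is only a sketch of one standard, dependency-free way to obtain an AUC from model outputs; the function name and the toy labels and scores are assumptions for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample receives
    a higher score than a randomly chosen negative one (ties count 0.5).
    labels: iterable of 0/1 outcomes (1 = BPD in this illustration).
    scores: iterable of predicted risk scores, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values only; these are not results from the verification embodiment.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

Computing this value for each of the three feature configurations would allow the kind of side-by-side comparison that the figures present graphically.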
Based on a general inventive concept, the embodiments of the present invention further provide a data processing apparatus for bronchopulmonary dysplasia.
Fig. 6 is a schematic structural diagram of a data processing apparatus for bronchopulmonary dysplasia according to an embodiment of the present invention, and referring to fig. 6, the apparatus according to an embodiment of the present invention may include the following structures: an acquisition module 61 and a processing module 62.
The acquisition module 61 is used for acquiring basic clinical characteristic data and risk gene data of the target infant;
the processing module 62 is configured to input the basic clinical characteristic data and the risk gene data into a pre-constructed data processing model of bronchopulmonary dysplasia, so as to obtain a data processing result of bronchopulmonary dysplasia for the target infant.
Optionally, in some embodiments, the apparatus further includes a construction module. The construction module is used for screening the basic clinical characteristic data of the bronchopulmonary dysplasia patient from the bronchopulmonary dysplasia patient data group according to a preset basic clinical characteristic screening benchmark; screening the risk gene data of the children with bronchopulmonary dysplasia from the data group of the children with bronchopulmonary dysplasia according to a preset risk gene screening benchmark; and constructing a data processing model of the bronchopulmonary dysplasia according to the basic clinical characteristic data and risk gene data of the bronchopulmonary dysplasia infant and a preset construction standard.
The construction module is used for constructing a first bronchopulmonary dysplasia data processing model according to the basic clinical characteristic data of the children with bronchopulmonary dysplasia and a preset construction benchmark; constructing a second bronchopulmonary dysplasia data processing model according to the basic clinical characteristic data and risk gene data of the bronchopulmonary dysplasia infant and a preset construction standard; and constructing a third bronchopulmonary dysplasia data processing model according to all clinical characteristic data of the children with bronchopulmonary dysplasia and preset construction standards; and performing performance evaluation on the first bronchopulmonary dysplasia data processing model, the second bronchopulmonary dysplasia data processing model and the third bronchopulmonary dysplasia data processing model to determine the bronchopulmonary dysplasia data processing model.
Optionally, the construction module is configured to: respectively acquire gene detection data of the affected infants and gene detection data of the unaffected (control) infants; analyze both sets of gene detection data, screen, for each gene, the mutations that are present in the affected-infant gene detection data and absent from the unaffected-infant gene detection data, and count the number of mutations carried by each gene; perform a gene burden test to obtain the probability value of the loss-of-function mutations carried by each gene and the probability value of the missense mutations carried by each gene; and calculate the risk value of the corresponding gene from the number of mutations carried by each gene, the probability value of the loss-of-function mutations carried by each gene, the probability value of the missense mutations carried by each gene, and the preset risk gene screening calculation rule, so as to obtain the risk gene data.
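The screening and counting step can be sketched as a set difference followed by a per-gene count. This is an illustrative reading rather than the patented implementation: the function name, the (gene, variant identifier) representation, and the choice to count distinct variants are all assumptions.

```python
from collections import Counter


def count_case_only_variants(case_variants, control_variants):
    """Count, per gene, the variants seen in affected infants but not in
    unaffected (control) infants.

    Both arguments are iterables of (gene_symbol, variant_id) pairs
    (hypothetical representation). Each distinct variant counts once.
    """
    case_only = set(case_variants) - set(control_variants)
    return Counter(gene for gene, _variant_id in case_only)


# Invented example data; gene symbols are from Table 1, variant ids are made up.
case = [("AGRN", "v1"), ("AGRN", "v2"), ("TSC2", "v3")]
control = [("AGRN", "v2")]
per_gene_counts = count_case_only_variants(case, control)
```

In this toy input, AGRN keeps one variant (v1) because v2 also appears in the control data, and TSC2 keeps one variant.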
Optionally, the risk gene screening calculation rule used in the building module includes:
$$\mathrm{Score} = \mathrm{NSVn} + 2 \times \left(-\log_{10} P_{\mathrm{LOF}}\right) + \left(-\log_{10} P_{\mathrm{MIS}}\right)$$
where Score is the risk value of the gene as a risk gene, NSVn is the number of mutations carried by the gene, $P_{\mathrm{LOF}}$ is the probability value of the loss-of-function mutations carried by the gene, and $P_{\mathrm{MIS}}$ is the probability value of the missense mutations carried by the gene.
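The calculation rule and the threshold of 2 stated in this description translate directly into code. The sketch below is a minimal illustration; the function names and the example inputs are assumptions, and P-values of exactly 0 are not handled (the logarithm is undefined there).

```python
import math


def risk_score(nsv_n, p_lof, p_mis):
    """Score = NSVn + 2 * (-log10(P_LOF)) + (-log10(P_MIS)).

    nsv_n: number of mutations carried by the gene.
    p_lof: P-value for loss-of-function mutations (must be > 0).
    p_mis: P-value for missense mutations (must be > 0).
    """
    return nsv_n + 2 * (-math.log10(p_lof)) + (-math.log10(p_mis))


def is_risk_gene(score, threshold=2.0):
    """A gene is treated as a risk gene when its score exceeds the threshold."""
    return score > threshold


# Illustrative inputs only: 1 mutation, P_LOF = 0.01, P_MIS = 0.1
# gives 1 + 2 * 2 + 1 = 6.
example_score = risk_score(1, 0.01, 0.1)
example_flag = is_risk_gene(example_score)
```

Because the LOF term is weighted by 2, a small LOF P-value raises the score twice as much as an equally small MIS P-value.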
Optionally, the construction module is configured to determine that the corresponding gene is a risk gene when the risk value of the risk gene is greater than 2.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
According to the data processing device for bronchopulmonary dysplasia provided by the embodiment of the invention, when bronchopulmonary dysplasia data processing is performed for a target infant, basic clinical characteristic data and risk gene data of the target infant are acquired and input into a pre-constructed data processing model of bronchopulmonary dysplasia to obtain a data processing result for the target infant. Medical staff can then judge, from the data processing result, the likelihood that the target infant has bronchopulmonary dysplasia, so that bronchopulmonary dysplasia is predicted. The method combines risk gene data with basic clinical characteristic data for the first time to obtain the data processing result, which maintains accuracy, saves time, and makes the method easier to popularize.
Embodiments of the present invention also provide a storage medium based on one general inventive concept.
The storage medium provided by the embodiment of the invention stores a computer program which, when executed by a processor, implements each step of any one of the above data processing methods for bronchopulmonary dysplasia.
Based on one general inventive concept, embodiments of the present invention also provide a data processing apparatus for bronchopulmonary dysplasia.
Fig. 7 is a schematic structural diagram of a data processing apparatus for bronchopulmonary dysplasia according to an embodiment of the present invention, and referring to fig. 7, the data processing apparatus for bronchopulmonary dysplasia according to an embodiment of the present invention includes: a processor 71 and a memory 72 connected to the processor.
The memory 72 is used for storing a computer program at least for the data processing method of bronchopulmonary dysplasia described in any of the above embodiments;
the processor 71 is used to call and execute computer programs in the memory.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.
It is understood that the same or similar parts in the above embodiments may be mutually referred to, and the same or similar parts in other embodiments may be referred to for the content which is not described in detail in some embodiments.
It should be noted that the terms "first," "second," and the like in the description of the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present invention, the meaning of "a plurality" means at least two unless otherwise specified.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (10)

1. A method of processing bronchopulmonary dysplasia data, comprising:
acquiring basic clinical characteristic data and risk gene data of a target infant patient;
and inputting the basic clinical characteristic data and the risk gene data into a pre-constructed data processing model of the bronchopulmonary dysplasia to obtain a data processing result of the target infant suffering from the bronchopulmonary dysplasia.
2. The method of claim 1, wherein the method of constructing the pre-constructed bronchopulmonary dysplasia data processing model comprises:
screening basic clinical characteristic data of the bronchopulmonary dysplasia patient from a bronchopulmonary dysplasia patient data group according to a preset basic clinical characteristic screening benchmark; and
screening the risk gene data of the bronchopulmonary dysplasia patient from the bronchopulmonary dysplasia patient data group according to a preset risk gene screening benchmark;
and constructing a data processing model of the bronchopulmonary dysplasia according to the basic clinical characteristic data and risk gene data of the bronchopulmonary dysplasia infant and a preset construction standard.
3. The method according to claim 2, wherein the constructing the data processing model of bronchopulmonary dysplasia according to the basic clinical characteristic data and risk gene data of the bronchopulmonary dysplasia infant and the preset construction benchmark comprises:
constructing a first bronchopulmonary dysplasia data processing model according to the basic clinical characteristic data of the children with bronchopulmonary dysplasia and a preset construction benchmark; and
constructing a second bronchopulmonary dysplasia data processing model according to the basic clinical characteristic data and risk gene data of the bronchopulmonary dysplasia infant and a preset construction standard; and
constructing a third bronchopulmonary dysplasia data processing model according to all clinical characteristic data of the children with bronchopulmonary dysplasia and preset construction standards;
and performing performance evaluation on the first bronchopulmonary dysplasia data processing model, the second bronchopulmonary dysplasia data processing model and the third bronchopulmonary dysplasia data processing model to determine the bronchopulmonary dysplasia data processing model.
4. The method of claim 2, wherein the preset risk gene screening benchmark comprises a preset risk gene screening calculation rule; the bronchopulmonary dysplasia infant data group comprises affected-infant data; the method further comprises collecting a control-group data group of unaffected infants; the affected-infant data comprises affected-infant gene detection data, and the unaffected-infant data comprises unaffected-infant gene detection data; and screening the risk gene data of the bronchopulmonary dysplasia infant from the bronchopulmonary dysplasia infant data group according to the preset risk gene screening benchmark comprises:
respectively acquiring the affected-infant gene detection data and the unaffected-infant gene detection data;
analyzing the affected-infant gene detection data and the unaffected-infant gene detection data, screening, for each gene, the mutations that are present in the affected-infant gene detection data and absent from the unaffected-infant gene detection data, and counting the number of mutations carried by each gene; and
performing a gene burden test to obtain the probability value of the loss-of-function mutations carried by each gene and the probability value of the missense mutations carried by each gene;
and calculating the risk value of the corresponding gene according to the number of mutations carried by each gene, the probability value of the loss-of-function mutations carried by each gene, the probability value of the missense mutations carried by each gene, and the preset risk gene screening calculation rule, to obtain the risk gene data.
5. The method of claim 4, wherein the pre-set risk gene screening calculation rule comprises:
$$\mathrm{Score} = \mathrm{NSVn} + 2 \times \left(-\log_{10} P_{\mathrm{LOF}}\right) + \left(-\log_{10} P_{\mathrm{MIS}}\right)$$
where Score is the risk value of the gene as a risk gene, NSVn is the number of mutations carried by the gene, $P_{\mathrm{LOF}}$ is the probability value of the loss-of-function mutations carried by the gene, and $P_{\mathrm{MIS}}$ is the probability value of the missense mutations carried by the gene.
6. The method of claim 5, wherein screening the risk gene data of the bronchopulmonary dysplasia infant from the bronchopulmonary dysplasia infant data group according to the preset risk gene screening benchmark comprises:
if the risk value of a gene is greater than 2, determining that the corresponding gene is a risk gene.
7. The method of claim 4, wherein the affected-infant data is data of affected infants from twin or multiple births; and/or
the risk gene data comprises: bronchopulmonary dysplasia risk gene data and severe bronchopulmonary dysplasia risk gene data.
8. A bronchopulmonary dysplasia data processing apparatus, comprising: an acquisition module and a processing module;
the acquisition module is used for acquiring basic clinical characteristic data and risk gene data of the target infant;
and the processing module is used for inputting the basic clinical characteristic data and the risk gene data into a pre-constructed data processing model of the bronchopulmonary dysplasia to obtain a data processing result of the bronchopulmonary dysplasia of the target infant patient.
9. A storage medium storing a computer program which, when executed by a processor, performs the steps of the method for processing bronchopulmonary dysplasia data according to any one of claims 1-7.
10. A bronchopulmonary dysplasia data processing apparatus, comprising: a processor, and a memory coupled to the processor;
the memory is used for storing a computer program for executing at least the bronchopulmonary dysplasia data processing method according to any one of claims 1-7;
the processor is used for calling and executing the computer program in the memory.
CN202110672468.5A 2021-06-17 2021-06-17 Bronchopulmonary dysplasia data processing method and device and related equipment Active CN113270146B (en)

Publications (2)

Publication Number Publication Date
CN113270146A true CN113270146A (en) 2021-08-17
CN113270146B CN113270146B (en) 2022-09-13

Family

ID=77235212

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant