CN117637160A - Amyotrophic lateral sclerosis prediction and prognosis evaluation system - Google Patents

Amyotrophic lateral sclerosis prediction and prognosis evaluation system

Publication number
CN117637160A
Authority
CN
China
Prior art keywords
prediction
sals
lateral sclerosis
amyotrophic lateral
gene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311509828.5A
Other languages
Chinese (zh)
Inventor
何璐
周勤明
陈晟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ruinjin Hospital Affiliated to Shanghai Jiaotong University School of Medicine Co Ltd
Original Assignee
Ruinjin Hospital Affiliated to Shanghai Jiaotong University School of Medicine Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ruinjin Hospital Affiliated to Shanghai Jiaotong University School of Medicine Co Ltd
Priority to CN202311509828.5A
Publication of CN117637160A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/40 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for data related to laboratory analysis, e.g. patient specimen analysis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Data Mining & Analysis (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention provides an amyotrophic lateral sclerosis prediction and prognosis evaluation system and a method for constructing a sALS prediction and prognosis evaluation model based on multiple proteins. The system comprises an input module, a prediction evaluation module and an output module. The input module obtains the content of a plurality of candidate ALS serum biomarkers. The prediction evaluation module calculates a score according to the sALS prediction and prognosis evaluation model based on multiple proteins, determines that amyotrophic lateral sclerosis is present when the score is greater than or equal to a specific judgment threshold, and determines that amyotrophic lateral sclerosis is absent when the score is smaller than the specific judgment threshold. The output module outputs the prediction evaluation result. With this technical scheme, amyotrophic lateral sclerosis can be predicted and evaluated simply, quickly and accurately.

Description

Amyotrophic lateral sclerosis prediction and prognosis evaluation system
Technical Field
The invention relates to the technical field of disease prediction, in particular to an amyotrophic lateral sclerosis prediction and prognosis evaluation system.
Background
Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease that can lead to severe disability and death. At present, the pathogenic mechanism of ALS is unclear, and neither an effective therapeutic target nor a treatment with satisfactory efficacy is available. ALS has a prevalence of 5/100,000, an incidence of 1.7/100,000, and a short survival (median survival of 2-5 years). Although ALS was first described in 1869, its diagnosis still depends primarily on medical history and neurological examination. However, by the time characteristic clinical symptoms or electrophysiological changes appear, ALS has already progressed to the middle stage of neurodegeneration, a large number of motor neurons have been lost, and the course of the disease is difficult to reverse. Early diagnosis and early intervention are therefore particularly important. Because diagnosis currently rests mainly on the patient's clinical manifestations and the physician's clinical experience, a number of patients are missed or misdiagnosed. There is therefore a need, on the one hand, for sensitive and reliable biomarkers and, on the other hand, for auxiliary examinations that are easy to perform and economical. In addition, because ALS is progressive, its clinical diagnosis and treatment would benefit greatly if ALS biomarkers could also be used to monitor disease progression and predict patient prognosis.
ALS has two forms: familial and sporadic ALS (fALS and sALS). sALS accounts for 90% of ALS patients, while fALS accounts for the remaining 10%. fALS is diagnosed by gene mutation detection and family history, whereas sALS is difficult to diagnose early. The etiology of sALS is not clear and a number of different pathogenic mechanisms have been proposed. There is an urgent need to identify biomarkers of sALS.
Currently available studies on ALS biomarkers have focused mainly on clinical symptoms, electrophysiology, neuroimaging, and biochemistry:
1. Clinical symptoms: motor symptoms are currently the main basis for ALS diagnosis. However, upper and lower motor neuron symptoms appear only after the neurons are substantially damaged, so they cannot serve as markers for early diagnosis. Non-motor symptoms such as behavioral changes and executive dysfunction may occur in some patients and precede motor symptoms, but their sensitivity and specificity for ALS diagnosis are low.
2. Neurophysiology: concentric needle electromyography is an important auxiliary examination for the diagnosis and differential diagnosis of ALS. By revealing widespread active denervation and chronic reinnervation, it can detect lower motor neuron lesions earlier than physical examination. However, as with clinical symptoms, by the time a patient shows obvious electromyographic changes the disease is often in its middle or late stage, and the specificity and sensitivity of electromyography in early patients are not ideal. Motor evoked potentials elicited by magnetic stimulation help detect upper motor neuron lesions in ALS, but their sensitivity is low.
3. Neuroimaging: conventional imaging such as magnetic resonance imaging rarely provides a basis for confirming ALS and mainly assists differential diagnosis by excluding structural lesions. Techniques such as functional magnetic resonance imaging, cerebral motor cortex thickness analysis, magnetic resonance spectroscopy, diffusion tensor imaging of the pyramidal tract, positron emission tomography and single photon emission computed tomography can serve as biomarkers reflecting upper motor neuron involvement, but they are still at the research stage and are difficult to popularize because of high cost, low practicality and poor specificity.
4. Biochemistry: biochemical marker studies on the early diagnosis of ALS have covered immunity, inflammation, oxidative stress, apoptosis and other areas, with neurofilament light chain protein being the most promising diagnostic aid. Elevated neurofilament light chain protein in cerebrospinal fluid and serum can indicate upper motor neuron pathology in ALS, but the marker lacks specificity and is difficult to apply directly to the early diagnosis and differential diagnosis of ALS.
The diagnosis of sALS is still based on clinical symptoms and on neurological and electrophysiological examinations. Because of the large clinical heterogeneity and the lack of effective biomarkers, most motor neurons have generally been lost by the time patients meet the diagnostic criteria for sALS, which limits the potential of candidate therapies. In addition, disease progression and therapeutic efficacy in sALS are currently difficult to assess. Quantitative or semi-quantitative functional scales such as the ALSFRS-R have been used to assess disease progression and treatment effectiveness, but their utility remains controversial because of low sensitivity. A biomarker for ALS diagnosis that has high sensitivity and specificity and can be widely applied in the clinic has therefore not yet been found, and identifying candidate biomarkers can promote early diagnosis, early treatment, and accurate assessment of disease progression and treatment effect.
Cerebrospinal fluid is the specimen that most directly reflects pathological changes in the central nervous system. However, cerebrospinal fluid collection is relatively invasive and carries a higher risk, and patients rarely consent to the repeated sampling needed for disease monitoring. In contrast, blood tests are minimally invasive, simple to perform and inexpensive, making them more suitable for disease screening. Blood markers previously reported as potentially aiding the diagnosis of ALS include inflammatory factors, plasma neurofilament light chain protein, ubiquitin C-terminal hydrolase L1, and the like. This suggests that blood testing can reflect the disease characteristics of ALS to some extent, and that blood biochemical markers have the potential to aid ALS diagnosis and prognosis. However, these potential ALS biomarkers remain controversial because the studies reporting them lacked validation, used small samples, and so on.
In recent years, proteomics has been increasingly applied to explore biomarkers and to describe the overall picture of a disease, helping to better understand disease mechanisms. Proteomic analysis is high-throughput, objective and quantitative; it helps in understanding multifactorial pathophysiological processes and in developing effective therapeutic interventions. Some studies have used high-throughput proteomic methods to explore new ALS biomarkers. However, the reproducibility of these studies is limited by differences in methods and by small sample sizes. In addition, focusing on a single pathway or on a specific type of functional protein limits the ability to describe the overall picture of the disease. Previous reports on ALS biomarkers have focused primarily on a single protein or pathway. However, the diagnostic efficiency of a single protein is limited, probably because sALS is a complex disease with multiple pathological mechanisms that a single protein or pathway cannot adequately reflect.
Disclosure of Invention
In order to overcome the above technical drawbacks, a first aspect of the present invention provides a method for constructing an amyotrophic lateral sclerosis prediction and prognosis evaluation model, which includes:
step S1: preliminary screening a number of potential ALS serum biomarkers by label-free quantitative proteomics based on a test set consisting of ALS patient data and healthy control group data;
step S2: detecting the expression levels of the several potential ALS serum biomarkers with ELISA detection kits based on a validation set consisting of ALS patient data and healthy control group data, then sequentially performing principal component dimension reduction analysis and interaction analysis on the ALS serum biomarkers whose expression levels are significantly up-regulated, thereby finally screening out several ALS serum biomarkers, and performing receiver operating characteristic (ROC) curve analysis to evaluate the sensitivity and specificity of each biomarker in distinguishing sALS patients from healthy controls;
step S3: constructing a logistic regression model by a likelihood ratio method based on two or more of the ALS serum biomarkers to generate a sALS prediction and prognosis evaluation model based on multiple proteins, wherein the sALS prediction and prognosis evaluation model based on multiple proteins is used for calculating a scoring result from the combination of the concentrations of the multiple proteins, and for determining that amyotrophic lateral sclerosis is present if the scoring result is greater than or equal to a specific judgment threshold.
Further, in step S1, differentially expressed proteins are identified using false discovery rate (FDR) analysis; principal component analysis, gene ontology enrichment analysis, pathway enrichment analysis and protein interaction network analysis using the STRING database are performed to determine the functional clusters of the differentially expressed proteins; and several potential ALS serum biomarkers are further screened based on the degree of differential expression, molecular function, and organ expression specificity.
Further, the several candidate ALS serum biomarkers include the proteins encoded by the FKBP1A, CD, CAMP, ZYX, HBA1, HBB, TLN1 and TPT1 genes, respectively.
Further, the sALS prediction and prognosis evaluation model based on the multiple proteins is as follows:
Log(P)=0.002HBB+0.572CAMP+0.196TLN1+0.900ZYX+0.201TPT1-6.981
wherein HBB in the formula represents the protein concentration encoded by the HBB gene, CAMP represents the protein concentration encoded by the CAMP gene, TLN1 represents the protein concentration encoded by the TLN1 gene, ZYX represents the protein concentration encoded by the ZYX gene, and TPT1 represents the protein concentration encoded by the TPT1 gene;
the judging threshold value corresponding to the sALS prediction and prognosis evaluation model based on the multiple proteins is 0.426.
Further, the sALS prediction and prognosis evaluation model based on the multiple proteins is as follows:
Log(P)=0.403TLN1+0.177TPT1 -4.719
wherein TLN1 in the formula represents the protein concentration encoded by the TLN1 gene, and TPT1 represents the protein concentration encoded by the TPT1 gene;
the judging threshold value corresponding to the sALS prediction and prognosis evaluation model based on the multiple proteins is 0.207;
the model is used for prediction of early amyotrophic lateral sclerosis.
A second aspect of the present application provides an amyotrophic lateral sclerosis prediction and prognosis evaluation system, comprising:
an input module for acquiring the contents of a plurality of candidate ALS serum biomarkers;
a prediction evaluation module for calculating a score according to the sALS prediction and prognosis evaluation model based on multiple proteins, determining that amyotrophic lateral sclerosis is present when the score is greater than or equal to a specific judgment threshold, and determining that amyotrophic lateral sclerosis is absent when the score is smaller than the specific judgment threshold;
and an output module for outputting the prediction evaluation result.
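The three modules can be sketched as three small functions chained together. The following Python sketch is illustrative only: the function names, the `Evaluation` type and the input dictionary layout are hypothetical, and the coefficients are those of the two-protein model quoted in the first aspect. It also assumes that Log(P) denotes the log-odds ln(P/(1-P)) of a logistic regression and that the judgment threshold is applied to P; the text does not state either point explicitly, nor the concentration units.

```python
import math
from dataclasses import dataclass

# Coefficients and threshold of the two-protein model quoted in the text.
COEFFS = {"TLN1": 0.403, "TPT1": 0.177}
INTERCEPT = -4.719
THRESHOLD = 0.207


@dataclass
class Evaluation:
    score: float     # probability-like score in (0, 1)
    positive: bool   # True when score >= THRESHOLD


def input_module(raw):
    """Input module: pick out the required markers and check the values."""
    values = {}
    for name in COEFFS:
        if name not in raw:
            raise ValueError(f"missing biomarker: {name}")
        value = float(raw[name])
        if value < 0:
            raise ValueError(f"negative concentration for {name}")
        values[name] = value
    return values


def prediction_module(values):
    """Prediction evaluation module: score the sample and apply the threshold.

    ASSUMPTION: Log(P) is read as the log-odds ln(P / (1 - P)).
    """
    z = INTERCEPT + sum(COEFFS[name] * values[name] for name in COEFFS)
    score = 1.0 / (1.0 + math.exp(-z))
    return Evaluation(score=score, positive=score >= THRESHOLD)


def output_module(result):
    """Output module: render the evaluation as a one-line report."""
    verdict = "ALS predicted" if result.positive else "ALS not predicted"
    return f"score={result.score:.3f} threshold={THRESHOLD} -> {verdict}"


if __name__ == "__main__":
    # Hypothetical concentrations, in whatever units the model was fitted with.
    sample = {"TLN1": 10.0, "TPT1": 10.0}
    print(output_module(prediction_module(input_module(sample))))
```

Keeping validation, scoring and reporting separate mirrors the module split in the text and lets a different model (for example the five-protein one) be substituted by changing only the constants.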
Further, the plurality of candidate ALS serum biomarkers is selected from two or more of the proteins encoded by FKBP1A, CD, CAMP, ZYX, HBA1, HBB, TLN1, TPT1 genes.
Further, the sALS prediction and prognosis evaluation model based on the multiple proteins is as follows:
Log(P)=0.002HBB+0.572CAMP+0.196TLN1+0.900ZYX+0.201TPT1-6.981
wherein HBB in the formula represents the protein concentration encoded by the HBB gene, CAMP represents the protein concentration encoded by the CAMP gene, TLN1 represents the protein concentration encoded by the TLN1 gene, ZYX represents the protein concentration encoded by the ZYX gene, and TPT1 represents the protein concentration encoded by the TPT1 gene;
the judging threshold value corresponding to the sALS prediction and prognosis evaluation model based on the multiple proteins is 0.426.
Further, the sALS prediction and prognosis evaluation model based on the multiple proteins is as follows:
Log(P)=0.403TLN1+0.177TPT1 -4.719
wherein TLN1 in the formula represents the protein concentration encoded by the TLN1 gene, and TPT1 represents the protein concentration encoded by the TPT1 gene;
the judging threshold value corresponding to the sALS prediction and prognosis evaluation model based on the multiple proteins is 0.207;
the model is used for prediction of early amyotrophic lateral sclerosis.
Compared with the prior art, this technical scheme has the following beneficial effects:
the method uses proteomics for the initial screening of potential sALS biomarkers, verifies them in a larger patient sample, finally identifies more reliable multi-biomarker combinations, and on that basis establishes sALS prediction and prognosis evaluation models based on multiple proteins, so that rapid quantitative prediction and evaluation are achieved for the early diagnosis and prognosis of sALS. For example: (1) a logistic model based on five proteins (HBB, CAMP, TLN1, ZYX and TPT1) was significantly effective in distinguishing sALS from controls (AUC: 0.811, P < 0.0001). (2) Because clinical symptoms are not apparent early in sALS, neurological and electrophysiological examinations may fail to identify the disease. To diagnose early sALS (disease course <6 months from onset), we developed a logistic model based on two proteins (TLN1 and TPT1), which is more helpful for early diagnosis. (3) sALS patients with lower ALSFRS-R scores showed higher levels of three proteins (FKBP1A, CAMP and HBA1) than patients with higher ALSFRS-R scores; combined with the assessment of clinical parameters, these proteins have significant potential for monitoring disease prognosis.
In addition, our proteomic analysis provides new insight into the pathogenesis of ALS, which on the one hand helps in the diagnosis of ALS and the detection of disease progression, and on the other hand helps in improving risk prediction techniques and screening for intervention targets. Because sALS is a complex disease with multiple pathological mechanisms, a single protein or pathway cannot adequately reflect it. We therefore propose a combination of proteins covering multiple pathways to help distinguish sALS patients from controls.
Drawings
Fig. 1 shows that the eight candidate biomarkers are differentially expressed between sALS patients and normal controls (significance levels: P < 0.05, P < 0.01, P < 0.001, P < 0.0001; one-way analysis of variance followed by a least significant difference post hoc test);
FIG. 2 is a multivariate logistic regression analysis of eight candidate biomarkers suggesting that CAMP has optimal sALS diagnostic efficacy;
FIG. 3 is a graph of sALS diagnostic efficacy based on five biomarkers;
FIG. 4 shows the marker combinations for early diagnosis and prognosis of sALS. Panels a and b are multivariate logistic regression analyses based on the combination of two biomarkers, suggesting better diagnostic efficacy in early sALS. Panel c shows the differences in protein expression levels determined by ELISA and their correlation with different ALSFRS-R scores (significance levels: P < 0.05, P < 0.01, P < 0.001, P < 0.0001; one-way analysis of variance followed by a least significant difference post hoc test).
Detailed Description
Advantages of the invention are further illustrated in the following description, taken in conjunction with the accompanying drawings and detailed description. It is to be understood by persons skilled in the art that the following detailed description is illustrative and not restrictive, and that this invention is not limited to the details given herein.
The present example provides the construction process of a sALS prediction and prognosis evaluation model based on multiple proteins, the construction process of an amyotrophic lateral sclerosis prediction and prognosis evaluation system, and applications thereof.
1. Construction of sALS prediction and prognosis evaluation model based on various proteins
To better develop diagnostic and prognostic systems for ALS, we used proteomics to perform initial screening of potential biomarkers and further validated in larger sample size patients, ultimately identifying more reliable biomarker combinations that aid in ALS diagnosis and prognosis.
The construction method for the amyotrophic lateral sclerosis prediction and prognosis evaluation model comprises the following steps of S1-S3:
step S1: several potential ALS serum biomarkers were initially screened by label-free quantitative proteomics based on a test set consisting of ALS patient data and healthy control data.
We recruited 10 sporadic ALS patients and 5 healthy controls and screened potential ALS serum biomarkers by label-free quantitative proteomics. Differentially expressed proteins (DEPs) were identified using false discovery rate (FDR) analysis; principal component analysis (PCA), Gene Ontology (GO) enrichment analysis, KEGG pathway enrichment analysis and protein-protein interaction (PPI) network analysis using the STRING database were performed to determine the functional clustering of the DEPs. Candidate biomarkers were further selected based on the degree of differential expression, molecular function, and organ expression specificity.
Step S2: the expression levels of the several potential ALS serum biomarkers were detected with ELISA detection kits based on a validation set consisting of ALS patient data and healthy control group data; principal component dimension reduction analysis and interaction analysis were then performed sequentially on the ALS serum biomarkers whose expression levels were significantly up-regulated, thereby finally screening out several ALS serum biomarkers; and receiver operating characteristic (ROC) curve analysis was performed to evaluate the sensitivity and specificity of each biomarker in distinguishing sALS patients from healthy controls. These biomarkers have potential for the early diagnosis of sALS, the monitoring of disease progression and the quantitative assessment of therapeutic effects, and are of great importance in clinical practice.
The diagnostic value of the candidate serum biomarkers was verified by ELISA in a validation cohort of 100 sporadic ALS patients and 100 controls, and a multi-protein combination panel was further established to improve diagnostic efficacy, achieve early diagnosis and evaluate disease progression.
A total of 1682 proteins were detected by label-free quantitative proteomic analysis, of which 387 were identified as DEPs associated with sALS. Of these 387 proteins, 259 were up-regulated and 128 were down-regulated in sALS patients compared with the control group. Principal component analysis showed that the DEPs separated sALS patients from the control group. Furthermore, a clustered heat map generated from the overall DEP expression trends could distinguish sALS patients. According to GO analysis, the molecular functions of the DEPs are mainly related to antioxidant, protein disulfide reductase, carbon-oxygen lyase, hydrolase and peroxidase activities. These results indicate that antioxidant stress responses play an important role in the pathogenesis of ALS, consistent with previous findings that oxidative stress injury is involved in the pathogenesis of ALS. KEGG pathway analysis showed that the DEPs are enriched in platelet activation, energy metabolism (glycolysis, tricarboxylic acid cycle, carbon metabolism, amino acid biosynthesis) and peroxisomes. Principal component dimension reduction analysis of the DEPs significantly up-regulated in ALS, with FDR < 0.05 and a more than 5-fold difference in expression level, yielded 42 candidate biomarker proteins. These 42 proteins were subjected to interaction analysis to screen for biomarkers. Proteins that represent different pathways as far as possible and are expressed in the central nervous system were selected, narrowing the range to 8 DEPs (encoded by the genes FKBP1A, CD, CAMP, ZYX, HBA1, HBB, TLN1 and TPT1, respectively), which were considered strong candidate biomarkers and were further validated in a validation cohort.
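The cut-offs used to reach the candidate list (FDR < 0.05 and a more than 5-fold up-regulation) amount to a simple filter over the differential expression results. A minimal Python sketch follows; the `ProteinResult` record, the function name and the example rows are hypothetical, and the sketch covers only the threshold filter, not the principal component or interaction analyses.

```python
from dataclasses import dataclass


@dataclass
class ProteinResult:
    gene: str
    fold_change: float  # sALS / control expression ratio
    fdr: float          # FDR-adjusted p-value


def select_candidates(results, max_fdr=0.05, min_fold=5.0):
    """Keep up-regulated proteins with FDR < max_fdr and fold change > min_fold."""
    kept = [r for r in results if r.fdr < max_fdr and r.fold_change > min_fold]
    # Largest expression difference first, so the strongest hits head the list.
    return sorted(kept, key=lambda r: r.fold_change, reverse=True)


if __name__ == "__main__":
    # Made-up rows for illustration only; these are not data from the study.
    rows = [
        ProteinResult("P1", 6.2, 0.01),
        ProteinResult("P2", 8.0, 0.20),   # fails the FDR cut-off
        ProteinResult("P3", 3.0, 0.001),  # fails the fold-change cut-off
        ProteinResult("P4", 12.5, 0.04),
    ]
    for r in select_candidates(rows):
        print(r.gene, r.fold_change, r.fdr)
```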
The performance of the 8 potential diagnostic biomarkers was verified in a cohort of 100 sALS patients and 100 healthy controls. All eight potential biomarkers were expressed at significantly higher levels in sALS patients than in the control group (see Fig. 1), were up-regulated by more than 5-fold in the central nervous system of sALS patients, and are involved in multiple functional pathways. Receiver operating characteristic (ROC) curve analysis was performed to evaluate the sensitivity and specificity of each biomarker in differentiating sALS patients from controls, and cathelicidin antimicrobial peptide (CAMP) was found to discriminate best between the sALS and control groups (AUC: 0.713, P < 0.0001) (see Fig. 2).
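The AUC reported for a single marker can be read as the probability that a randomly chosen patient has a higher marker level than a randomly chosen control. The sketch below computes it in that pairwise (Mann-Whitney) form in plain Python; the function name and the example values are hypothetical, and this is not necessarily the software used in the study.

```python
def roc_auc(patient_values, control_values):
    """Area under the ROC curve for one marker.

    Computed as the fraction of patient/control pairs in which the patient
    value is higher, counting ties as one half.
    """
    if not patient_values or not control_values:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in patient_values:
        for c in control_values:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_values) * len(control_values))


if __name__ == "__main__":
    # Made-up marker levels for illustration only.
    print(roc_auc([2.0, 4.0], [1.0, 3.0]))
```

An AUC of 0.5 means the marker does not separate the groups at all, and 1.0 means every patient value exceeds every control value.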
Step S3: a logistic regression model was constructed by a likelihood ratio method based on two or more of the ALS serum biomarkers to generate a sALS prediction and prognosis evaluation model based on multiple proteins, wherein the model is used for calculating a scoring result from the combination of the concentrations of the multiple proteins, and for determining that amyotrophic lateral sclerosis is present if the scoring result is greater than or equal to a specific judgment threshold.
Because of the limited diagnostic efficiency of individual proteins, we next developed a multi-protein diagnostic panel by constructing a logistic regression model using the likelihood ratio method. Finally, a protein combination (AUC: 0.811, P < 0.0001) comprising five markers with high discrimination (HBB, CAMP, TLN1, ZYX and TPT1) was obtained. A binary score that can be used to characterize each sample was generated from the logistic model of the five proteins. The probability score P that a sample is positive for sALS, given its protein marker values, is defined as follows:
(1) Illustratively, in a preferred embodiment, the multiple protein-based sALS prediction and prognosis evaluation model is:
Log(P)=0.002HBB+0.572CAMP+0.196TLN1+0.900ZYX+0.201TPT1-6.981
wherein, HBB in the formula represents the protein concentration encoded by the HBB gene, CAMP represents the protein concentration encoded by the CAMP gene, TLN1 represents the protein concentration encoded by the TLN1 gene, ZYX represents the protein concentration encoded by the ZYX gene, and TPT1 represents the protein concentration encoded by the TPT1 gene.
The probability score for the sALS group was significantly higher than that for the control group. The judgment threshold of the sALS prediction and prognosis evaluation model based on the five proteins is 0.426. At the threshold of 0.426, the sensitivity and specificity for the diagnosis of sALS were 79% and 71%, respectively, with an AUC of 0.811 (P < 0.0001) (see Fig. 3).
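To see how the five-protein formula turns concentrations into a decision, the sketch below evaluates each term of the linear predictor separately before converting the sum to a score. It is an illustration under stated assumptions rather than the authors' implementation: Log(P) is read as the log-odds ln(P/(1-P)), the threshold 0.426 is applied to P, the function names are hypothetical, and the sample concentrations are made up (the text does not state the units).

```python
import math

# Coefficients, intercept and threshold as quoted for the five-protein model.
COEFFS = {"HBB": 0.002, "CAMP": 0.572, "TLN1": 0.196, "ZYX": 0.900, "TPT1": 0.201}
INTERCEPT = -6.981
THRESHOLD = 0.426


def contributions(conc):
    """Per-marker terms (coefficient * concentration) of the linear predictor."""
    return {name: COEFFS[name] * conc[name] for name in COEFFS}


def linear_predictor(conc):
    """Right-hand side of the formula: sum of the marker terms plus intercept."""
    return INTERCEPT + sum(contributions(conc).values())


def probability_score(conc):
    """ASSUMPTION: Log(P) is the log-odds, so P is the logistic of the sum."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(conc)))


def is_positive(conc):
    """Positive when the score reaches the judgment threshold."""
    return probability_score(conc) >= THRESHOLD


if __name__ == "__main__":
    # Hypothetical concentrations for illustration only.
    sample = {"HBB": 100.0, "CAMP": 5.0, "TLN1": 5.0, "ZYX": 3.0, "TPT1": 5.0}
    for name, term in contributions(sample).items():
        print(f"{name}: {term:.3f}")
    print(f"score={probability_score(sample):.3f} positive={is_positive(sample)}")
```

Listing the individual terms makes it easy to see which marker drives a given score; the small HBB coefficient, for instance, only matters when the HBB concentration is numerically large.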
The predictive model based on five proteins (FKBP1A, TLN1, ZYX, HBA1 and TPT1) was applicable not only to the early prediction of sALS (disease course <6 months from onset), but also to prognostic evaluation.
(2) Illustratively, in another preferred embodiment, the multiple protein-based sALS prediction and prognosis evaluation model is:
Log(P)=0.403TLN1+0.177TPT1 -4.719
wherein TLN1 in the formula represents the protein concentration encoded by the TLN1 gene, and TPT1 represents the protein concentration encoded by the TPT1 gene. The model is particularly suitable for predicting early amyotrophic lateral sclerosis.
The logistic regression model based on the two-protein combination (TLN1 and TPT1) gave an AUC of 0.766 in early sALS (disease course <6 months from onset), contributing to the early diagnosis of sALS (see Fig. 4). The judgment threshold of the sALS prediction and prognosis evaluation model based on the two proteins is 0.207. The probability score for early sALS was significantly higher than that for the control group, and at the threshold of 0.207 the sensitivity and specificity for diagnosing early sALS were 80% and 68%, respectively.
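Sensitivity and specificity at a given threshold follow directly from counting how scored samples fall on either side of it. A minimal Python sketch, assuming a list of model scores and a parallel list of true labels; the function name and the example data are hypothetical and do not reproduce the study's samples.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when score >= threshold means positive.

    labels[i] is True for a patient and False for a control.
    """
    tp = fn = tn = fp = 0
    for score, is_patient in zip(scores, labels):
        predicted_positive = score >= threshold
        if is_patient and predicted_positive:
            tp += 1
        elif is_patient:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient and one control")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Made-up scores for illustration; 0.207 is the two-protein model threshold.
    scores = [0.10, 0.30, 0.25, 0.15]
    labels = [False, True, False, True]
    print(sensitivity_specificity(scores, labels, 0.207))
```

Lowering the threshold raises sensitivity at the cost of specificity, which is the trade-off behind the comparatively low threshold chosen for the early-stage model.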
(3) Furthermore, the expression levels of three proteins (FKBP1A, CAMP and HBA1) differed significantly between patients with different ALSFRS-R levels (low, medium and high), with higher protein expression associated with lower ALSFRS-R scores (see Fig. 4), which aids prognostic judgment. sALS patients with lower ALSFRS-R scores showed higher expression of the three proteins than patients with higher ALSFRS-R scores, suggesting that these proteins may help in monitoring disease progression. Combined with the assessment of clinical parameters, FKBP1A, CAMP and HBA1 have significant potential for monitoring disease progression.
The details of the implementation of the above construction steps are as follows:
1. Patient enrollment
110 sALS patients were enrolled from the Department of Neurology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, from January to December 2021. All patients were diagnosed with ALS according to the revised El Escorial diagnostic criteria. Patients with comorbid diseases affecting the peripheral nerves, such as diabetes, were excluded. During recruitment, history taking, physical examination and biochemical analysis were performed. The revised ALS Functional Rating Scale (ALSFRS-R) was used to assess disease severity. The ALSFRS-R has 12 items representing four functional domains: bulbar, fine motor, gross motor and respiratory. Each item is scored from 0 (complete loss of function) to 4 (no loss of function). The total score ranges from 0 to 48 points; higher scores indicate better function. At the same time, we recruited 105 age- and sex-matched healthy controls from the Ruijin Hospital physical examination center.
Finally, 10 sALS patients and 5 controls were randomly selected as the discovery cohort for quantitative label-free proteomic analysis. In addition, 100 sALS patients and 100 controls were randomly assigned to the validation cohort, in which the target biomarkers identified from the discovery cohort were measured by ELISA. All 100 sALS patients were divided into early (disease course <6 months, n=35) and late symptomatic sALS (disease course >6 months, n=65). In addition, patients were divided into three groups according to their ALSFRS-R scores at enrollment: low scoring group (≤25, n=24), medium scoring group (26-32, n=51) and high scoring group (≥33, n=25). The ethics committee of Ruijin Hospital approved the study (approval number 2020-No. 50), and all participants signed informed consent.
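The scoring and grouping rules described above can be sketched as follows. The function names are hypothetical. The text does not say how a disease course of exactly 6 months is assigned, so treating it as late is an assumption of this sketch.

```python
def alsfrs_r_total(item_scores):
    """Sum the 12 ALSFRS-R items, each scored 0 (complete loss) to 4 (no loss)."""
    if len(item_scores) != 12:
        raise ValueError("ALSFRS-R has exactly 12 items")
    if any(score not in (0, 1, 2, 3, 4) for score in item_scores):
        raise ValueError("each item must be an integer from 0 to 4")
    return sum(item_scores)


def alsfrs_r_group(total):
    """Map a total score to the low (<=25), medium (26-32) or high (>=33) group."""
    if not 0 <= total <= 48:
        raise ValueError("total score must be between 0 and 48")
    if total <= 25:
        return "low"
    if total <= 32:
        return "medium"
    return "high"


def disease_stage(course_months):
    """Early if the disease course is under 6 months.

    Assumption: a course of exactly 6 months is counted as late.
    """
    return "early" if course_months < 6 else "late"
```

For example, a patient scoring 3 on every item has a total of 36 and falls into the high scoring group.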
2. Collection of serum samples
Blood samples were collected by venipuncture into serum separation vacuum tubes. After standing at room temperature for 1-2 hours, the samples were centrifuged (3000 rpm, 10 min, 4 ℃), and the serum was aliquoted and stored at -80 ℃ for proteomic analysis.
3. Proteomic analysis
3.1 protein extraction
The serum samples were centrifuged at 11,000 × g for 10 min at 4 ℃ to remove cell debris. The supernatant was transferred to a new centrifuge tube. The Pierce™ Top 12 Abundant Protein Depletion Spin Columns kit (Thermo Fisher Scientific, Hanover Park, IL, USA) was used to remove high-abundance serum proteins and eliminate their interference. Protein concentration was determined using the bicinchoninic acid (BCA) assay (Thermo Fisher Scientific) according to the manufacturer's instructions.
3.2 trypsin digestion
After removal of the high-abundance serum proteins, the samples were dried using a freeze vacuum concentrator (Thermo Fisher Scientific). 8 M urea was added to redissolve the remaining protein, dithiothreitol was added to a final concentration of 5 mM, and the protein was reduced at 56 ℃ for 30 minutes. Iodoacetamide was added to a final concentration of 11 mM, and the mixture was incubated at room temperature in the dark for 15 minutes. The alkylated sample was then transferred to an ultrafiltration membrane (molecular weight cut-off 10 kDa; Millipore, Darmstadt, Germany) and centrifuged at 12,000 × g for 20 minutes at room temperature. The samples were resuspended three times in 8 M urea and then three times in ammonium bicarbonate, with centrifugation after each resuspension. Trypsin was then added at a mass ratio of 1:50 (protease:protein, m/m), and the samples were digested overnight at 37 ℃. The supernatant was collected by centrifugation at 12,000 × g for 10 minutes at room temperature, and pure water was added to increase the solubility. The digested peptide solution was acidified to pH 2-3 using 10% trifluoroacetic acid and then centrifuged at 12,000 × g for 10 min at room temperature, and the supernatant was transferred to a new centrifuge tube for StageTip (Pierce, Thermo Fisher Scientific) desalting.
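The reagent amounts in this step follow from simple dilution arithmetic. The sketch below shows how the volume of a stock solution needed to reach a target final concentration, and the trypsin mass for a 1:50 protease-to-protein ratio, could be computed. The function names, the stock concentration and the sample volume are hypothetical and are not given in the text.

```python
def stock_volume_for_final_conc(sample_volume_ul, stock_mm, final_mm):
    """Volume of stock (in uL) to add so the mixture reaches final_mm.

    Solves stock_mm * v = final_mm * (sample_volume_ul + v) for v.
    """
    if stock_mm <= final_mm:
        raise ValueError("stock must be more concentrated than the target")
    return final_mm * sample_volume_ul / (stock_mm - final_mm)


def trypsin_mass_ug(protein_mass_ug, ratio=50):
    """Trypsin mass for a 1:ratio protease-to-protein mass ratio."""
    return protein_mass_ug / ratio


# Hypothetical example: 100 uL sample, 1 M (1000 mM) dithiothreitol stock.
dtt_ul = stock_volume_for_final_conc(100.0, stock_mm=1000.0, final_mm=5.0)
trypsin_ug = trypsin_mass_ug(100.0)  # 100 ug protein at 1:50
```

With these assumed values, about 0.5 uL of stock gives a 5 mM final concentration, and 2 ug of trypsin is needed for 100 ug of protein.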
3.3 liquid chromatography and Mass Spectrometry
Trypsin-digested peptides were analyzed using an EASY-nLC 1200 ultra-high-performance liquid chromatography (UHPLC) system (Thermo Fisher Scientific), in which mobile phase A was an aqueous solution containing 0.1% formic acid and 2% acetonitrile and mobile phase B was an aqueous solution containing 0.1% formic acid and 90% acetonitrile. The liquid-phase gradient was set as follows: 0-96 minutes, 4%-20% B; 96-114 minutes, 20%-32% B; 114-117 minutes, 32%-80% B; 117-120 minutes, 80% B; the flow rate was maintained at 500 nL/min. Peptides were separated by the UHPLC system and then injected into a Nanospray Flex™ ion source (Thermo Fisher Scientific) for ionization. The separated peptides were detected using an Orbitrap Exploris 480 mass spectrometer (Thermo Fisher Scientific). The ion source voltage was set to 2.2 kV, and peptide parent ions and their secondary fragments were detected and analyzed using the high-resolution Orbitrap. The primary mass spectrum scan range was 400-1200 m/z at a resolution of 60,000; the secondary mass spectrum scan range had a fixed starting point of 100 m/z at a resolution of 30,000. Data were acquired in data-dependent acquisition (DDA) mode: after each primary scan, the 15 peptide precursor ions with the highest signal intensities were selected, fragmented sequentially in the higher-energy collisional dissociation (HCD) cell at 27% fragmentation energy, and analyzed sequentially by secondary mass spectrometry. To improve the effective utilization of the mass spectrometer, the automatic gain control (AGC) was set to 7.5E4, the signal threshold to 1E4 ions/s, the maximum injection time to 100 ms, and the dynamic exclusion time of tandem mass spectrometry scanning to 30 seconds to avoid repeated scanning of parent ions.
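The gradient program above can be expressed as a table of breakpoints. The sketch below returns the percentage of mobile phase B at a given time, assuming that the composition changes linearly within each segment; the text gives only the segment endpoints, so linear ramps are an assumption, and the names are hypothetical.

```python
# (time in minutes, % mobile phase B) breakpoints taken from the gradient text.
GRADIENT = [(0, 4), (96, 20), (114, 32), (117, 80), (120, 80)]


def percent_b(t_min):
    """Percentage of mobile phase B at time t_min, interpolating linearly."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-120 minute gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

For instance, halfway through the first segment (48 minutes) the sketch gives 12% B, and anywhere in the final hold it gives 80% B.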
3.4 database search
Secondary mass spectrometry data were searched using Proteome Discoverer™ 2.4.2.4 (Thermo Fisher Scientific). The search parameters were set as follows: the database was Homo_sapiens_9606 (20,366 sequences); a reverse (decoy) database was added to calculate the false discovery rate (FDR) caused by random matching, and a common contaminant database was added to eliminate the influence of contaminating proteins on the identification results; the cleavage mode was set to trypsin (full); the number of missed cleavages was set to two; the minimum peptide length was set to 7 amino acid residues; the maximum number of modifications per peptide was set to five; the mass error tolerance of primary parent ions was set to 10 ppm for the first search and 5 ppm for the main search, and the mass error tolerance of secondary fragment ions was set to 0.02 Da. Carbamidomethylation of cysteine was set as a fixed modification, and the variable modifications were ['Acetyl (Protein N-term)', 'Oxidation (M)', 'Deamidation (NQ)']. The quantification method was set to label-free quantification (LFQ), and the FDR for protein identification and peptide-spectrum match (PSM) identification was set to 1%.
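To make the mass tolerance settings concrete, the sketch below checks whether an observed m/z falls within a parts-per-million tolerance (parent ions) or an absolute tolerance in Da (fragment ions). This illustrates the arithmetic only and is not the search engine's implementation; the function names and m/z values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_ppm(observed_mz, theoretical_mz, tol_ppm):
    """True if the parent-ion error is within a ppm tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def within_da(observed_mz, theoretical_mz, tol_da=0.02):
    """True if the fragment-ion error is within an absolute tolerance."""
    return abs(observed_mz - theoretical_mz) <= tol_da


# Hypothetical parent ion: theoretical m/z 800.000, observed m/z 800.006.
error = ppm_error(800.006, 800.0)
```

An error of about 7.5 ppm, as in this assumed example, passes a 10 ppm tolerance but fails a 5 ppm tolerance.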
3.5 functional Cluster analysis
GO functional clustering was performed to determine the biological functions of proteins whose expression levels differed significantly between the serum of sALS patients and healthy controls. A two-tailed Fisher's exact test was used to test the enrichment of proteins annotated with the GO database. In addition, a two-tailed Fisher's exact test was performed using the KEGG database (https://www.genome.jp/kegg/) to determine enriched pathways for pathway analysis. Protein domains were annotated using the InterPro database (https://www.ebi.ac.uk/interpro) and tested for enrichment with a two-tailed Fisher's exact test. For hierarchical cluster analysis, the functional categories (GO, domain and pathway) of the differentially expressed proteins (DEPs) were filtered to those enriched in at least one cluster. The function x = -log10(P-value) was used to transform the filtered P-value matrix. The x values of each category were then z-transformed. The z-scores were clustered using the one-way hierarchical clustering function of the Genesis software program. For visualization of cluster membership, a heatmap was generated using the "heatmap.2" function of the "gplots" R package. Molecular interactions between DEPs were analyzed using the STRING database (version 10.1; https://string-db.org) and visualized using the "networkD3" R package (version 0.4, https://cran.r-project.org/package=networkD3).
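The transformation applied before clustering can be sketched as follows: each enrichment P value is converted to x = -log10(P), and the x values of each functional category are then z-transformed. The P values below are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify which form was used.

```python
import math
import statistics


def neg_log10(p_values):
    """Transform P values with x = -log10(P)."""
    return [-math.log10(p) for p in p_values]


def z_transform(values):
    """Centre and scale one category's x values (sample SD, an assumption)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Rows are functional categories, columns are clusters (hypothetical P values).
p_matrix = {
    "category A": [0.001, 0.01, 0.1],
    "category B": [0.05, 0.5, 0.005],
}
z_matrix = {name: z_transform(neg_log10(ps)) for name, ps in p_matrix.items()}
```

The resulting z-score matrix is what would be passed on to hierarchical clustering and heatmap drawing.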
4. Verification Using ELISA
The expression levels of the following protein indicators were detected using the corresponding ELISA kits according to the manufacturer's instructions: cathelicidin antimicrobial peptide (CAMP, ELK5115, ELK Biotechnology, Wuhan, China), FK506 binding protein 1A (FKBP1A, ELK4323, ELK Biotechnology), hemoglobin subunit alpha (HBA1, ELK5194, ELK Biotechnology), hemoglobin subunit beta (HBB, ELK4071, ELK Biotechnology), CD84 (ELK8729, ELK Biotechnology), talin-1 (TLN1, ELK2345, ELK Biotechnology), translationally controlled tumor protein (TPT1, ELK4011, ELK Biotechnology) and zyxin (ZYX, ELK4891, ELK Biotechnology).
5. Statistical analysis
Categorical data were analyzed using the chi-square test or Fisher's exact test, and continuous data were evaluated using Student's t test or one-way ANOVA with a post-hoc least significant difference test. Multivariate logistic regression analysis was performed to develop diagnostic algorithms and scores for sALS. The area under the receiver operating characteristic (ROC) curve (AUC) was determined and compared using the Z test. Two-tailed P values <0.05 were considered to indicate significant differences. Statistical analysis was performed using SPSS software (version 18.0, SPSS Inc., Chicago, IL, USA).
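As an illustration of the AUC statistic mentioned above, the sketch below computes it directly from its pairwise interpretation: the probability that a randomly chosen patient has a higher score than a randomly chosen control, with ties counted as one half. The analysis in the text was performed in SPSS; this function and its example scores are hypothetical.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of case-control pairs ranked correctly."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical model scores for four patients and four controls.
auc = roc_auc([0.90, 0.55, 0.43, 0.30], [0.50, 0.41, 0.20, 0.10])
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.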
2. Construction and application of amyotrophic lateral sclerosis prediction and prognosis evaluation system
Based on the constructed multi-protein sALS prediction and prognosis evaluation model, an amyotrophic lateral sclerosis prediction and prognosis evaluation system is implemented as a computer program. The system comprises an input module, a prediction evaluation module and an output module, whose functions are realized when the computer program is executed.
The input module obtains the content of a plurality of candidate ALS serum biomarkers. In practical applications, concentration data of specific types of ALS serum biomarkers involved in sALS prediction and prognosis evaluation models based on a plurality of proteins need only be input.
The prediction evaluation module calculates a score from the input concentration data of the specific ALS serum biomarkers using the preset sALS prediction and prognosis evaluation model. It judges that amyotrophic lateral sclerosis is present when the score is greater than or equal to the specific judgment threshold, and that amyotrophic lateral sclerosis is not present when the score is less than the threshold.
Illustratively, (1) when the multi-protein sALS prediction and prognosis evaluation model preset in the computer program is:
Log(P)=0.002HBB+0.572CAMP+0.196TLN1+0.900ZYX+0.201TPT1-6.981
only the following five data need to be input: the protein concentration encoded by the HBB gene, the protein concentration encoded by the CAMP gene, the protein concentration encoded by the TLN1 gene, the protein concentration encoded by the ZYX gene and the protein concentration encoded by the TPT1 gene.
The corresponding judgment threshold value of the sALS prediction and prognosis evaluation model based on various proteins is 0.426. When the score is 0.426 or more, it is determined that amyotrophic lateral sclerosis is present, and when the score is less than 0.426, it is determined that amyotrophic lateral sclerosis is not present.
Illustratively, (2) when the multi-protein sALS prediction and prognosis evaluation model preset in the computer program is:
Log(P)=0.403TLN1+0.177TPT1-4.719
only the following two data need to be input: the protein concentration encoded by the TLN1 gene and the protein concentration encoded by the TPT1 gene.
The corresponding judgment threshold value of the sALS prediction and prognosis evaluation model based on various proteins is 0.207. When the score is 0.207 or more, it is determined that amyotrophic lateral sclerosis is present, and when the score is less than 0.207, it is determined that amyotrophic lateral sclerosis is not present.
The output module outputs the prediction evaluation result according to the judgment result of the prediction evaluation module: "having amyotrophic lateral sclerosis" or "not having amyotrophic lateral sclerosis".
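The three modules described above can be sketched as three small functions around a shared model definition. This is a minimal illustration, not the applicant's program: all class and function names are hypothetical, Log(P) is assumed to denote the logit of P, and the example concentrations are invented. The coefficients, thresholds and output strings are the ones stated in the text.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LogisticModel:
    coefficients: dict  # protein name -> coefficient
    intercept: float
    threshold: float


FIVE_PROTEIN = LogisticModel(
    {"HBB": 0.002, "CAMP": 0.572, "TLN1": 0.196, "ZYX": 0.900, "TPT1": 0.201},
    intercept=-6.981, threshold=0.426)
TWO_PROTEIN = LogisticModel(
    {"TLN1": 0.403, "TPT1": 0.177}, intercept=-4.719, threshold=0.207)


def input_module(raw, model):
    """Keep only the concentrations the chosen model needs."""
    missing = [name for name in model.coefficients if name not in raw]
    if missing:
        raise ValueError(f"missing concentrations: {missing}")
    return {name: float(raw[name]) for name in model.coefficients}


def prediction_module(conc, model):
    """Score the sample (assuming Log(P) is the logit) and apply the threshold."""
    z = model.intercept + sum(
        coef * conc[name] for name, coef in model.coefficients.items())
    score = 1.0 / (1.0 + math.exp(-z))
    return score >= model.threshold


def output_module(has_als):
    """Return the result string used by the output module."""
    if has_als:
        return "having amyotrophic lateral sclerosis"
    return "not having amyotrophic lateral sclerosis"


def evaluate(raw, model):
    return output_module(prediction_module(input_module(raw, model), model))


# Hypothetical concentrations; CD84 is ignored by both models.
sample = {"HBB": 100, "CAMP": 2, "TLN1": 5, "ZYX": 3, "TPT1": 10, "CD84": 1}
result = evaluate(sample, FIVE_PROTEIN)
```

Keeping the coefficients and threshold in one model object means the same three modules serve both the five-protein and the two-protein model.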
It should be noted that the embodiments of the present invention are preferred and not limited in any way, and any person skilled in the art may make use of the above-disclosed technical content to change or modify the same into equivalent effective embodiments without departing from the technical scope of the present invention, and any modification or equivalent change and modification of the above-described embodiments according to the technical substance of the present invention still falls within the scope of the technical scope of the present invention.

Claims (9)

1. A method for constructing a model for predicting and prognosticating amyotrophic lateral sclerosis, comprising:
step S1: preliminary screening a number of potential ALS serum biomarkers by label-free quantitative proteomics based on a test set consisting of ALS patient data and healthy control group data;
step S2: detecting the expression levels of the several potential ALS serum biomarkers by ELISA detection kit based on a validation set consisting of ALS patient data and healthy control group data, then sequentially performing a principal component dimension reduction analysis and an interaction analysis on ALS serum biomarkers whose expression levels are significantly up-regulated, thereby finally screening out several ALS serum biomarkers, and performing a receiver operating characteristic (ROC) curve analysis to evaluate the sensitivity and specificity of each biomarker in distinguishing sALS patients from healthy control groups;
step S3: and constructing a logistic regression model by using a likelihood ratio method based on two or more than two ALS serum biomarkers to generate a sALS prediction and prognosis evaluation model based on a plurality of proteins, wherein the sALS prediction and prognosis evaluation model based on the plurality of proteins is used for calculating a scoring result according to the concentration combination of the plurality of proteins, and judging that amyotrophic lateral sclerosis exists if the scoring result is greater than or equal to a specific judging threshold value.
2. The method for constructing a model for predicting amyotrophic lateral sclerosis and prognosis evaluation according to claim 1, wherein in step S1, differentially expressed proteins are identified using false discovery rate analysis; principal component analysis, gene ontology enrichment analysis, pathway enrichment analysis and protein interaction network analysis using the STRING database are performed to determine functional clusters of the differentially expressed proteins; and a number of potential ALS serum biomarkers are further screened based on the degree of differential expression, molecular function and organ expression specificity.
3. The method of claim 2, wherein the plurality of candidate ALS serum biomarkers comprises proteins encoded by FKBP1A, CD84, CAMP, ZYX, HBA1, HBB, TLN1, TPT1 genes, respectively.
4. The method for constructing a model for predicting and prognosticating amyotrophic lateral sclerosis according to claim 3, wherein the multiple protein-based sALS prediction and prognosis model is:
Log(P)=0.002HBB+0.572CAMP+0.196TLN1+0.900ZYX+0.201TPT1-6.981
wherein HBB in the formula represents the protein concentration encoded by the HBB gene, CAMP represents the protein concentration encoded by the CAMP gene, TLN1 represents the protein concentration encoded by the TLN1 gene, ZYX represents the protein concentration encoded by the ZYX gene, and TPT1 represents the protein concentration encoded by the TPT1 gene;
the judging threshold value corresponding to the sALS prediction and prognosis evaluation model based on the multiple proteins is 0.426.
5. The method for constructing a model for predicting and prognosticating amyotrophic lateral sclerosis according to claim 3, wherein the multiple protein-based sALS prediction and prognosis model is:
Log(P)=0.403TLN1+0.177TPT1-4.719
wherein TLN1 in the formula represents the protein concentration encoded by the TLN1 gene, and TPT1 represents the protein concentration encoded by the TPT1 gene;
the judging threshold value corresponding to the sALS prediction and prognosis evaluation model based on the multiple proteins is 0.207;
the model is used for prediction of early amyotrophic lateral sclerosis.
6. An amyotrophic lateral sclerosis prediction and prognosis evaluation system, comprising:
the input module is used for acquiring the contents of various candidate ALS serum biomarkers;
the prediction evaluation module is used for calculating a score according to the sALS prediction and prognosis evaluation model based on various proteins, judging that amyotrophic lateral sclerosis is present when the scoring result is greater than or equal to a specific judgment threshold value, and judging that amyotrophic lateral sclerosis is not present when the scoring result is smaller than the specific judgment threshold value;
and the output module is used for outputting the prediction evaluation result.
7. The amyotrophic lateral sclerosis prediction and prognosis evaluation system of claim 6, wherein the plurality of candidate ALS serum biomarkers are selected from two or more of the proteins encoded by FKBP1A, CD84, CAMP, ZYX, HBA1, HBB, TLN1, TPT1 genes.
8. The amyotrophic lateral sclerosis prediction and prognosis evaluation system according to claim 7, wherein the multiple protein-based sALS prediction and prognosis evaluation model is:
Log(P)=0.002HBB+0.572CAMP+0.196TLN1+0.900ZYX+0.201TPT1-6.981
wherein HBB in the formula represents the protein concentration encoded by the HBB gene, CAMP represents the protein concentration encoded by the CAMP gene, TLN1 represents the protein concentration encoded by the TLN1 gene, ZYX represents the protein concentration encoded by the ZYX gene, and TPT1 represents the protein concentration encoded by the TPT1 gene;
the judging threshold value corresponding to the sALS prediction and prognosis evaluation model based on the multiple proteins is 0.426.
9. The amyotrophic lateral sclerosis prediction and prognosis evaluation system according to claim 7, wherein the multiple protein-based sALS prediction and prognosis evaluation model is:
Log(P)=0.403TLN1+0.177TPT1-4.719
wherein TLN1 in the formula represents the protein concentration encoded by the TLN1 gene, and TPT1 represents the protein concentration encoded by the TPT1 gene;
the judging threshold value corresponding to the sALS prediction and prognosis evaluation model is 0.207;
the model is used for prediction of early amyotrophic lateral sclerosis.
CN202311509828.5A 2023-11-14 2023-11-14 Amyotrophic lateral sclerosis prediction and prognosis evaluation system Pending CN117637160A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311509828.5A CN117637160A (en) 2023-11-14 2023-11-14 Amyotrophic lateral sclerosis prediction and prognosis evaluation system


Publications (1)

Publication Number Publication Date
CN117637160A true CN117637160A (en) 2024-03-01

Family

ID=90022682

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311509828.5A Pending CN117637160A (en) 2023-11-14 2023-11-14 Amyotrophic lateral sclerosis prediction and prognosis evaluation system

Country Status (1)

Country Link
CN (1) CN117637160A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination