CN116732164A

CN116732164A - Biomarker combinations and their use in predicting ASD disease

Info

Publication number: CN116732164A
Application number: CN202310725801.3A
Authority: CN
Inventors: 李明珠
Original assignee: Shanghai Aipu Tikang Biotechnology Co ltd
Current assignee: Shanghai Aipu Tikang Biotechnology Co ltd
Priority date: 2023-06-16
Filing date: 2023-06-16
Publication date: 2023-09-12

Abstract

The application discloses a biomarker combination and its use in predicting autism spectrum disorder (ASD). The biomarker combination consists of 52 biomarkers, which are listed in the specification. The 52 biomarkers can be used for risk prediction and detection of autism spectrum disorder, offer high sensitivity and high specificity, and provide technical support for the prediction of, and intervention in, autism spectrum disorder.

Description

Biomarker combinations and their use in predicting ASD disease

Technical Field

The application belongs to the field of biomedical technology and diagnosis, and particularly relates to a biomarker combination and application thereof in prediction of ASD.

Background

Autism spectrum disorder (ASD) is a pervasive developmental disorder that arises in infancy and early childhood. It is characterized mainly by impaired social communication, restricted interests or activities, and repetitive stereotyped behavior, and is accompanied to varying degrees by intellectual disability, emotional instability, sleep disorders and other psychiatric and behavioral problems. Within this spectrum, childhood autism is one of the most severe childhood mental disorders. Over the last decade, diagnoses of ASD have become increasingly common worldwide. China accounts for 22% of the world population; as of 2020, the prevalence of ASD in China was 0.7%, significantly lower than the 1.7% reported in the U.S. Possible reasons for this difference include genetic factors, environmental factors and diagnostic methods.

Currently, commonly used means of diagnosing ASD include the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), the Autism Diagnostic Observation Schedule (ADOS), the Autism Behavior Checklist (ABC), the Childhood Autism Rating Scale (CARS), language and intelligence screening scales and other evaluation scales, as well as electroencephalography (EEG) and magnetic resonance imaging (MRI). Although the disease can be screened clinically from multiple angles, existing diagnostic means have major problems: the diagnostic scales are highly subjective and prone to misdiagnosis, the evaluation time is short and may not reflect the actual condition of some patients, and most patients show no abnormality on EEG or MRI. It is therefore important to find molecular markers for early screening and diagnosis of ASD by means of medical tests (e.g. blood tests).

Molecular markers generally refer to biochemical indicators that can be objectively measured and evaluated as characteristics of normal physiological processes, pathological processes, or responses to treatment. Blood tests can reveal changes in the expression levels of protein molecules in blood, which can help screen for and diagnose autism spectrum disorder at an early stage and can also offer new approaches for studying its pathogenesis. Timely intervention and treatment of autism spectrum disorder can be effective. The application therefore discloses a novel non-invasive kit for early auxiliary diagnosis of autism spectrum disorder, which has great social and economic significance and good prospects for clinical application.

Disclosure of Invention

In order to solve the technical problems, the application provides a biomarker combination and application thereof in predicting autism spectrum disorder diseases.

The present application provides in a first aspect the use of a biomarker combination consisting of ALDH2, ARF5, ATP6AP1, CD14, CDC37, COL1A2, CYBB, DST, F7, FAH, FGB, GLOD4, GNPDA2, HBE1, HMGCL, HSPA1L, IGHD, IMMT, ITGA6, MCAM, MRC1, MSL1, MYL12B, NUCB1, OGDH, PF4V1, PFN1, POTEJ, PROCR, PSMA3, PTPRG, RPS3, SCP2, TPI1, VASN, ANXA5, ASAH1, CALD1, CSE1L, EXOC1, FTH1, GNPDA1, HSPA8, ITGB1, MTA2, OLFM4, MAPK1, RPS27A, SOD1, TPM1 and UBC for the manufacture of a product for predicting autism spectrum disorder.

Autism spectrum disorder according to the present application may include both mild to moderate autism spectrum disorder and severe autism spectrum disorder.

In a second aspect the application provides a reagent combination for detecting a biomarker combination consisting of ALDH2, ARF5, ATP6AP1, CD14, CDC37, COL1A2, CYBB, DST, F7, FAH, FGB, GLOD4, GNPDA2, HBE1, HMGCL, HSPA1L, IGHD, IMMT, ITGA6, MCAM, MRC1, MSL1, MYL12B, NUCB1, OGDH, PF4V1, PFN1, POTEJ, PROCR, PSMA3, PTPRG, RPS3, SCP2, TPI1, VASN, ANXA5, ASAH1, CALD1, CSE1L, EXOC1, FTH1, GNPDA1, HSPA8, ITGB1, MTA2, OLFM4, MAPK1, RPS27A, SOD1, TPM1 and UBC.

In a preferred embodiment, the combination of reagents is used to detect the expression level of the biomarker combination.

In a preferred embodiment, the combination of reagents comprises a reagent that specifically binds to the biomarker, or comprises a biomolecular reagent that specifically hybridizes to a nucleic acid encoding the biomarker.

In a certain preferred embodiment, the combination of reagents comprises reagents for genomic, transcriptome and/or proteomic sequencing.

In a preferred embodiment, the expression level is a protein expression level and/or an mRNA transcription level, and/or the biomolecular reagent is selected from one or more of a primer, a probe and an antibody.

Preferably, the protein expression level is detected by one or more of mass spectrometry, chips such as protein chips or microfluidic chips, digital single molecule immunoassays, ELISA, radioimmunoassays, immunonephelometry, immunohistochemistry, and Western blotting.

In a third aspect the application provides a biomarker combination consisting of ALDH2, ARF5, ATP6AP1, CD14, CDC37, COL1A2, CYBB, DST, F7, FAH, FGB, GLOD4, GNPDA2, HBE1, HMGCL, HSPA1L, IGHD, IMMT, ITGA6, MCAM, MRC1, MSL1, MYL12B, NUCB1, OGDH, PF4V1, PFN1, POTEJ, PROCR, PSMA3, PTPRG, RPS3, SCP2, TPI1, VASN, ANXA5, ASAH1, CALD1, CSE1L, EXOC1, FTH1, GNPDA1, HSPA8, ITGB1, MTA2, OLFM4, MAPK1, RPS27A, SOD1, TPM1 and UBC.

In a fourth aspect the application provides a kit comprising a combination of reagents as described in the second aspect of the application and/or a combination of biomarkers as described in the third aspect of the application.

In a fifth aspect, the present application provides a method of constructing a prediction model for autism spectrum disorder, the method comprising: inputting protein expression data corresponding to the biomarker combination in samples into the caret package of the R language, using a logistic regression model, for machine learning to obtain an autism spectrum disorder prediction model;

wherein the biomarker combination consists of ALDH2, ARF5, ATP6AP1, CD14, CDC37, COL1A2, CYBB, DST, F7, FAH, FGB, GLOD4, GNPDA2, HBE1, HMGCL, HSPA1L, IGHD, IMMT, ITGA6, MCAM, MRC1, MSL1, MYL12B, NUCB1, OGDH, PF4V1, PFN1, POTEJ, PROCR, PSMA3, PTPRG, RPS3, SCP2, TPI1, VASN, ANXA5, ASAH1, CALD1, CSE1L, EXOC1, FTH1, GNPDA1, HSPA8, ITGB1, MTA2, OLFM4, MAPK1, RPS27A, SOD1, TPM1, and UBC.

In a preferred embodiment, the sample is a body fluid exosome.

In a preferred embodiment, the sample is from blood, urine, saliva or cerebrospinal fluid.

Preferably, the blood is serum or plasma.

In a preferred embodiment, before machine learning, the protein expression data of the samples are acquired in DIA mode and peptides are matched using Firmiana software.

In a certain preferred embodiment, the samples are from patients with autism spectrum disorder and from healthy persons.

In a preferred embodiment, the FOT values of the proteins corresponding to the biomarker combination are input as protein expression data into the R language caret package, using the logistic regression model, for machine learning.

In a preferred embodiment, the protein expression data entered into the logistic regression model satisfy the following: the expression of the protein corresponding to the biomarker combination in the sample is at least 1.5 times that of the corresponding protein in healthy persons, and the t-test p value is less than 0.05.

In a preferred embodiment, the peptide fragment matches utilize the UniProt human protein database.

In a preferred embodiment, the samples are grouped prior to machine learning to obtain a modeling set of samples for autism spectrum disorder disease prediction model construction and a validation set of samples for validating the autism spectrum disorder disease prediction model.

In a preferred embodiment, the protein expression data entered into the logistic regression model is protein abundance greater than or equal to 30%.

In a preferred embodiment, the step of validating with the validation set samples comprises: calculating the area under the receiver operating characteristic (ROC) curve (AUC), the sensitivity and the specificity for the protein expression data of the biomarker combination in the samples; and judging the accuracy of the prediction model according to the AUC, sensitivity and specificity.

In a preferred embodiment, the method further comprises determining whether the sample has autism spectrum disorder: the sample is judged as having the disorder when its predicted probability of autism spectrum disorder is greater than or equal to 0.5, and as normal when that probability is less than 0.5.

In a preferred embodiment of the application, the protein expression data is obtained by LC-MS technology, collected using DIA detection means.

Preferably, the peptide fragment matching is carried out on the data acquired in the DIA detection mode through Firmiana software. More preferably, the database of peptide segment matches is the UniProt human protein database.

In a preferred embodiment, the FOT values of the proteins corresponding to the biomarker combination are input as protein expression data into the R language caret package, using the logistic regression model, for machine learning.

Further preferably, the protein expression data after Firmiana processing are used as follows: protein quantification is performed using the label-free intensity-based absolute quantification (iBAQ) method; the FOT (fraction of total) is calculated for each protein, defined as the iBAQ of that protein divided by the total iBAQ of all identified proteins in the sample; and the FOT of each protein is entered as protein expression data into the logistic regression model.

In a sixth aspect the present application provides a predictive model of autism spectrum disorder disease, the predictive model being constructed by a method as described in the fifth aspect of the application.

A seventh aspect of the present application provides a prediction system for autism spectrum disorder, the prediction system comprising an analysis and judgment module, wherein the analysis and judgment module contains a prediction model constructed according to the fifth aspect of the present application and is used for determining the probability of autism spectrum disorder;

wherein the biomarker combination consists of ACADVL, ACTA1, ACTA2, ACTC1, AP2A2, ARHGEF6, BLVRB, CALR, CFHR3, CFHR4, ESD, CLIP1, COPE, BAG2, CSN1S1, DBI, DDR1, EIF5B, FBN1, FTH1, glad 2, GOSR2, H2AC21, H2AX, HRNR, KANK2, MAT2B, MCM2, PPIB, PSME1, QSOX1, MMRN2, RANBP2, SERPINB3, SPARC, SOD1, SUPT16H, SYNM, TUBB2A, TUBB B, TUBB4A, USP14, TXNRD1, and UTRN.

In a preferred embodiment, the prediction system further comprises an output module and/or a detection module, the output module outputs the judgment result of the analysis judgment module, and the detection module detects the protein expression level corresponding to the biomarker combination in the sample to be detected and transmits the expression level data to the analysis judgment module.

In an eighth aspect, the application provides a method of predicting autism spectrum disorder in a sample using a combination of reagents as described in the second aspect of the application, a combination of biomarkers as described in the third aspect of the application, a kit as described in the fourth aspect of the application, a predictive model as described in the sixth aspect of the application or a predictive system as described in the seventh aspect of the application.

A ninth aspect of the application provides the use of a combination of reagents as described in the second aspect of the application, a combination of biomarkers as described in the third aspect of the application, a kit as described in the fourth aspect of the application, a predictive model as described in the sixth aspect of the application or a predictive system as described in the seventh aspect of the application in the prediction of autism spectrum disorder.

In a tenth aspect the application provides the use of a combination of reagents according to the second aspect of the application for the preparation of a kit for the prediction or diagnosis of autism spectrum disorder; wherein the biomarker combination consists of ALDH2, ARF5, ATP6AP1, CD14, CDC37, COL1A2, CYBB, DST, F7, FAH, FGB, GLOD4, GNPDA2, HBE1, HMGCL, HSPA1L, IGHD, IMMT, ITGA6, MCAM, MRC1, MSL1, MYL12B, NUCB1, OGDH, PF4V1, PFN1, POTEJ, PROCR, PSMA3, PTPRG, RPS3, SCP2, TPI1, VASN, ANXA5, ASAH1, CALD1, CSE1L, EXOC1, FTH1, GNPDA1, HSPA8, ITGB1, MTA2, OLFM4, MAPK1, RPS27A, SOD1, TPM1, and UBC.

An eleventh aspect of the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, performs the steps of the method according to the eighth aspect of the present application or performs the function of the predictive model according to the sixth aspect of the present application or the predictive system according to the seventh aspect of the present application.

A twelfth aspect of the application provides an electronic device comprising a memory storing a computer program and a processor for executing the computer program, so as to carry out the steps of the method according to the eighth aspect of the application or to carry out the functions of the predictive model according to the sixth aspect of the application or the predictive system according to the seventh aspect of the application.

The inventor identified plasma biomarkers applicable to the clinical diagnosis of autism spectrum disorder by studying trends in protein expression levels in plasma samples from patients with autism spectrum disorder and from normal persons otherwise comparable to those patients, making early screening, diagnosis and intervention possible for patients with autism spectrum disorder.

On the basis of conforming to the common knowledge in the field, the above preferred conditions can be arbitrarily combined to obtain the preferred examples of the application.

The reagents and materials used in the present application are commercially available.

The beneficial effects of the application are as follows:

Experiments show that the expression levels of the 52 protein biomarkers provided by the application change markedly in blood samples of patients with autism spectrum disorder. The 52 protein biomarkers can therefore be used for risk prediction and detection of autism spectrum disorder, offer high sensitivity and high specificity, and provide technical support for early screening, diagnosis and intervention for patients with autism spectrum disorder.

An auxiliary early diagnosis kit developed on the basis of these plasma protein molecular markers of autism spectrum disorder has broad scientific research value and greatly facilitates early clinical diagnosis and intervention.

Drawings

Figure 1 is the ROC curve of the 52-protein biomarker combination on the training set, for the autism spectrum disorder group versus the healthy control group.

Figure 2 is the ROC curve of the 52-protein biomarker combination on the internal validation set, for the autism spectrum disorder group versus the healthy control group.

Fig. 3 is a schematic diagram of the architecture of a system for predicting the risk of autism spectrum disorder disease.

Fig. 4 is a schematic structural diagram of an electronic device.

Detailed Description

The application is further illustrated by means of the following examples, which are not intended to limit the scope of the application. The experimental methods, in which specific conditions are not noted in the following examples, were selected according to conventional methods and conditions, or according to the commercial specifications.

Example 1

The samples of the autism spectrum disorder group and the healthy control group used in the examples were all from the Third Affiliated Hospital of Zhengzhou University: 247 in the autism spectrum disorder group and 244 in the healthy control group. The design and implementation of this study were approved by vote and supervised by the Medical Ethics Committee of the Third Affiliated Hospital of Zhengzhou University. Written informed consent was obtained from all patients.

1. Separation of plasma

Whole blood samples were collected and mixed in EDTA anticoagulant tubes, then centrifuged at 1,600 × g for 10 min at 4 °C. The supernatant (plasma) was transferred to a new EP tube and centrifuged at 16,000 × g for 10 min to remove cell debris. The plasma was aliquoted into centrifuge tubes and frozen at -80 °C until use.

2. Plasma sample pretreatment

To 2 μL of plasma sample, 100 μL of 50 mM ammonium bicarbonate was added and mixed by vortexing for 1 min. The sample was incubated at 95 °C for 4 min to thermally denature the proteins and cooled to room temperature. 2 μg of trypsin was added, and the mixture was shaken at 37 °C for 18 h; 10 μL of aqueous ammonia was then added to stop the digestion. The digested peptide sample was desalted, dried under vacuum, and frozen at -80 °C until mass spectrometric detection.

3. Mass spectrometric detection of ASD plasma samples

The peptide samples were analyzed on an Orbitrap Fusion Lumos Tribrid high-resolution mass spectrometry system (Thermo Fisher Scientific, Rockford, USA) coupled to a high performance liquid chromatography system (EASY-nLC 1200, Thermo Fisher), and mass spectrometry data for the whole proteome corresponding to the peptide samples were obtained. The specific operation is as follows:

Nano-flow liquid chromatography was used with a self-packed C18 column (150 μm ID × 8 cm, 1.9 μm packing). The column oven temperature was 60 °C. The dried peptide powder was redissolved in loading buffer (0.1% formic acid in water), loaded, separated on the column, and eluted at 600 nL/min with a linear gradient of 6-30% mobile phase B (ACN with 0.1% formic acid); detection used liquid chromatography coupled to data-independent acquisition (DIA) mass spectrometry. The DIA mass spectrometry parameters were set as follows: positive ion mode; MS1 resolution 30K, maximum injection time 20 ms, AGC target 3e6, scan range 300-1400 m/z; MS2 resolution 15K, 30 variable isolation windows, collision energy 27%. Data acquisition on the liquid chromatography tandem mass spectrometry system was controlled by Xcalibur software.

4. Data analysis

All data were processed using Firmiana. Firmiana is a workflow based on the Galaxy system and consists of several functional modules, including a user login interface, raw data, identification and quantification, data analysis, and knowledge mining. DIA data were searched against the UniProt human protein database (updated 2013.07.04, 32015 entries) using FragPipe (v12.1) and MSFragger (2.2). The mass tolerance was 20 ppm for parent ions and 50 mmu for daughter ions. At most two missed cleavages were allowed. The search engine set carbamidomethylation of cysteine as a fixed modification, and N-acetylation and oxidation of methionine as variable modifications. The parent ion charge range was set to +2, +3, and +4. The false discovery rate (FDR) was set to 1%.
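The 20 ppm parent-ion tolerance above is a relative window that scales with m/z. A minimal sketch of that arithmetic follows; the function names and the example m/z values are hypothetical and are not taken from Firmiana, FragPipe or MSFragger.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=20.0):
    """True when the absolute ppm error falls inside the tolerance window."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Illustrative values only: a 0.005 m/z offset at m/z 500 is about 10 ppm.
accepted = within_tolerance(500.005, 500.0)
rejected = within_tolerance(500.02, 500.0)
```

At m/z 500 a 20 ppm window corresponds to about ±0.01 m/z, which is why the second illustrative value falls outside it.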

The quantification result for each identified peptide was recorded as the average of the chromatographic fragment-ion peak areas across all reference spectral libraries. Protein quantification was performed using the label-free intensity-based absolute quantification (iBAQ) method, with peptide peak areas assigned to their corresponding proteins. The fraction of total (FOT) was used to represent the normalized abundance of a particular protein in a sample. FOT is defined as the iBAQ of the protein divided by the total iBAQ of all identified proteins in the sample. Proteins with at least one unique peptide and 1% FDR were selected for further analysis.
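The FOT definition above reduces to a single normalization per sample. The sketch below assumes the iBAQ values are already available as a mapping from protein name to intensity; the function name and the numbers are made up for illustration.

```python
def fraction_of_total(ibaq):
    """Return FOT per protein: its iBAQ divided by the summed iBAQ of the sample."""
    total = sum(ibaq.values())
    if total <= 0:
        raise ValueError("total iBAQ must be positive")
    return {protein: value / total for protein, value in ibaq.items()}


# Made-up iBAQ intensities for one sample (not data from the application).
sample_ibaq = {"SOD1": 2.0e6, "TPM1": 1.0e6, "UBC": 1.0e6}
fot = fraction_of_total(sample_ibaq)
```

Because every FOT value is a share of the same per-sample total, the values of one sample sum to 1, which makes samples with different overall signal comparable.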

5. Establishing a predictive model

368 samples (75%) were randomly drawn from all samples as the training set (i.e., the modeling set), and the remaining 123 samples (25%) were used as the internal test set. First, 2631 broadly present proteins were retained by requiring a detection frequency > 30%. Molecules with significantly different expression between the autism spectrum disorder samples and the control samples (FOT fold change greater than 1.5 and t-test p value less than 0.05) were then selected, giving 355 proteins as candidate markers.
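The screening step above can be sketched as two filters applied per protein. This is an illustrative sketch under stated assumptions, not the applicant's code: undetected values are represented as `None` or 0 and counted as 0 in the group means, the fold change is taken as ASD mean over control mean, and the t-test (p < 0.05) is left out because the Python standard library has no t-distribution. All names and values are hypothetical.

```python
from statistics import mean


def detection_frequency(values):
    """Fraction of samples in which the protein was detected (value present and > 0)."""
    return sum(1 for v in values if v) / len(values)


def passes_screen(asd_fot, control_fot, min_freq=0.30, min_fold=1.5):
    """Apply the frequency and fold-change filters; the t-test is not included here."""
    if detection_frequency(asd_fot + control_fot) <= min_freq:
        return False
    asd_mean = mean(v or 0.0 for v in asd_fot)
    control_mean = mean(v or 0.0 for v in control_fot)
    if control_mean == 0:
        return asd_mean > 0
    return asd_mean / control_mean > min_fold


# Made-up FOT values: (ASD group, control group) per hypothetical protein.
candidates = {
    "protein_a": ([4e-5, 4e-5, 4e-5, 4e-5], [2e-5, 2e-5, 2e-5, 2e-5]),
    "protein_b": ([1e-5, None, None, None], [None, None, None, 1e-5]),
    "protein_c": ([2e-5, 2e-5, 2e-5, 2e-5], [2e-5, 2e-5, 2e-5, 2e-5]),
}
selected = [name for name, (asd, ctrl) in candidates.items() if passes_screen(asd, ctrl)]
```

Here only `protein_a` survives: `protein_b` is detected in too few samples and `protein_c` shows no fold change. A full reproduction would add the significance test from a statistics package.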

Based on a regression classifier and logistic regression analysis, the FOT values of the candidate markers were input into the caret R package (i.e., the R language caret package) to establish a prediction model.
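To make the model-fitting step concrete, the following is a minimal pure-Python logistic regression fitted by batch gradient descent. It is a stand-in for illustration only: the application uses the R caret package, whose fitting procedure is not reproduced here, and the two-feature toy data, learning rate and epoch count are assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and intercept by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(x, w, b):
    """Predicted probability of the positive (ASD) class for one sample."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy data: two hypothetical, pre-scaled marker features per sample; 1 = ASD, 0 = control.
X = [[0.2, 0.1], [0.3, 0.2], [0.1, 0.3], [1.2, 1.0], [1.1, 1.3], [1.4, 1.2]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
```

Real FOT values are very small fractions, so in practice they would be rescaled or transformed before fitting; that preprocessing is not described in the source and is omitted here.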

The following biomarker combinations were analyzed and screened through the R-package:

ALDH2, ARF5, ATP6AP1, CD14, CDC37, COL1A2, CYBB, DST, F7, FAH, FGB, GLOD4, GNPDA2, HBE1, HMGCL, HSPA1L, IGHD, IMMT, ITGA6, MCAM, MRC1, MSL1, MYL12B, NUCB1, OGDH, PF4V1, PFN1, POTEJ, PROCR, PSMA3, PTPRG, RPS3, SCP2, TPI1, VASN, ANXA5, ASAH1, CALD1, CSE1L, EXOC1, FTH1, GNPDA1, HSPA8, ITGB1, MTA2, OLFM4, MAPK1, RPS27A, SOD1, TPM1, and UBC.

6. Early screening protein biomarkers for autism spectrum disorder disease

A classifier for early screening of autism spectrum disorder based on protein molecules was established in two stages: discovery and testing.

In the examples, 491 blood samples from autism patients and healthy persons were randomly divided into a training set and an internal test set: 368 randomly drawn samples (75%) formed the training set, and the remaining 123 samples (25%) formed the internal test set. A classifier was constructed on the 368 training samples using the logistic regression algorithm. To estimate the error rate, 10-fold cross-validation was used: the 368 samples were first randomly divided into 10 equal parts; the model was built on 9 parts and tested on the remaining part (10% of the samples); this was repeated 10 times, and the ROC (receiver operating characteristic) curves of the 10 runs were averaged.
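The split sizes and the 10-fold scheme described above can be sketched as follows. The function names and the fixed seed are hypothetical, and caret's own resampling is not reproduced; the sketch only shows how 491 samples yield 368 training and 123 test samples and how the training set is rotated through 10 held-out parts.

```python
import random


def train_test_split_indices(n_samples, train_fraction=0.75, seed=0):
    """Shuffle sample indices and split them into training and test index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def k_fold_indices(indices, k=10):
    """Yield (train, held_out) index lists; fold sizes differ by at most one."""
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out


train_idx, test_idx = train_test_split_indices(491)
folds = list(k_fold_indices(train_idx))
```

Since 368 is not divisible by 10, the held-out parts contain 36 or 37 samples each rather than being exactly equal.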

The AUC (area under the ROC curve) was calculated from the ROC (receiver operating characteristic) curves of the relative expression levels of the 52 protein molecular markers (ALDH2, ARF5, ATP6AP1, CD14, CDC37, COL1A2, CYBB, DST, F7, FAH, FGB, GLOD4, GNPDA2, HBE1, HMGCL, HSPA1L, IGHD, IMMT, ITGA6, MCAM, MRC1, MSL1, MYL12B, NUCB1, OGDH, PF4V1, PFN1, POTEJ, PROCR, PSMA3, PTPRG, RPS3, SCP2, TPI1, VASN, ANXA5, ASAH1, CALD1, CSE1L, EXOC1, FTH1, GNPDA1, HSPA8, ITGB1, MTA2, OLFM4, MAPK1, RPS27A, SOD1, TPM1, UBC) in autism spectrum disorder plasma samples from the Third Affiliated Hospital of Zhengzhou University. For the analytical method, see Karimollah Hajian-Tilaki, Receiver Operating Characteristic (ROC) Curve Analysis for Medical Diagnostic Test Evaluation, Caspian J Intern Med, 2013; 4(2): 627-635. Models were built to analyze these markers.

The training set included 368 cases, auc=0.942, diagnostic sensitivity 84.46%, specificity 88.57% (see fig. 1), and the remaining 123 cases were internal validation sets, auc=0.867, diagnostic sensitivity 83.33%, and specificity 72.46% (see fig. 2).
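The three figures of merit reported above can be computed from labels and predicted probabilities as in the following sketch. The AUC is calculated here through its rank interpretation (the probability that a positive sample scores above a negative one), which equals the area under the ROC curve; the labels and scores are made up and do not reproduce the reported values.

```python
def auc_score(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for l, s in zip(labels, scores) if l == 1 and s >= threshold)
    fn = sum(1 for l, s in zip(labels, scores) if l == 1 and s < threshold)
    tn = sum(1 for l, s in zip(labels, scores) if l == 0 and s < threshold)
    fp = sum(1 for l, s in zip(labels, scores) if l == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example: 1 = ASD, 0 = healthy control.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = auc_score(labels, scores)
sens, spec = sensitivity_specificity(labels, scores)
```

AUC is threshold-free, whereas sensitivity and specificity depend on the chosen cutoff (0.5 in this sketch).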

From the above results, it can be seen that the 52 protein molecular markers (ALDH2, ARF5, ATP6AP1, CD14, CDC37, COL1A2, CYBB, DST, F7, FAH, FGB, GLOD4, GNPDA2, HBE1, HMGCL, HSPA1L, IGHD, IMMT, ITGA6, MCAM, MRC1, MSL1, MYL12B, NUCB1, OGDH, PF4V1, PFN1, POTEJ, PROCR, PSMA3, PTPRG, RPS3, SCP2, TPI1, VASN, ANXA5, ASAH1, CALD1, CSE1L, EXOC1, FTH1, GNPDA1, HSPA8, ITGB1, MTA2, OLFM4, MAPK1, RPS27A, SOD1, TPM1, UBC), used in combination in the plasma of patients with autism spectrum disorder, can be used for early screening and diagnosis of autism spectrum disorder.

For a sample to be tested, the FOT values of the protein molecular biomarkers acquired by DIA are input into the prediction model obtained above to obtain a prediction result for autism spectrum disorder: when the probability is greater than or equal to 0.5, the sample is judged as autism spectrum disorder; when the probability is less than 0.5, the sample is judged as normal.
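Scoring a new sample with a fitted logistic model and applying the 0.5 cutoff can be sketched as below. The coefficients, intercept and input values are hypothetical placeholders chosen for easy arithmetic; they are not parameters disclosed in the application, and real FOT inputs would be far smaller fractions.

```python
import math


def asd_probability(fot, coefficients, intercept):
    """Logistic-model probability from marker values; missing markers count as 0."""
    z = intercept + sum(coef * fot.get(marker, 0.0) for marker, coef in coefficients.items())
    return 1.0 / (1.0 + math.exp(-z))


def classify(probability, cutoff=0.5):
    """Apply the decision rule: probability >= cutoff means ASD, otherwise normal."""
    return "ASD" if probability >= cutoff else "normal"


# Hypothetical coefficients for illustration only; not values from the application.
coefficients = {"SOD1": 2.0, "TPM1": -1.0}
intercept = -0.5

boundary_sample = {"SOD1": 0.5, "TPM1": 0.5}   # linear score 0, so probability 0.5
low_sample = {"SOD1": 0.1, "TPM1": 0.5}        # linear score -0.8

boundary_call = classify(asd_probability(boundary_sample, coefficients, intercept))
low_call = classify(asd_probability(low_sample, coefficients, intercept))
```

The boundary sample shows the tie-handling in the rule: a probability of exactly 0.5 is judged as autism spectrum disorder.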

Example 2 System for predicting autism spectrum disorder disease

System 61 for predicting autism spectrum disorder comprises a data processing module 52 and a judging and outputting module 53, and further includes a data collection module 51 (Fig. 3).

The data collection module 51 is used to collect the expression level data of the biomarker combinations in the patient body fluid exosome sample and transmit them to the data processing module.

The data processing module 52 is configured to analyze the expression level data of the received or input biomarker combinations according to the data analysis method described in example 1 to obtain a calculation result. Wherein the expression level data of the biomarker combinations can be collected by the data collection module 51, and the expression level data of the biomarker combinations can also be obtained from other sources.

The judging and outputting module 53 is configured to judge whether the calculated result meets a preset judging condition, that is, the probability of suffering from the autism spectrum disorder disease is greater than or equal to the probability of not suffering from the autism spectrum disorder disease, so as to predict the risk of suffering from the autism spectrum disorder disease, and output a prediction result; wherein, in the judging and outputting module, when the expression level data satisfies that the probability of suffering from the autism spectrum disorder disease is greater than or equal to the probability of not suffering from the autism spectrum disorder disease, outputting a prediction result of "risk of suffering from the autism spectrum disorder disease"; when the expression level data does not meet the judgment condition, namely the probability of suffering from the autism spectrum disorder disease is smaller than the probability of not suffering from the autism spectrum disorder disease, outputting a prediction result as 'no risk of suffering from the autism spectrum disorder disease'.
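The module arrangement of this example can be sketched as a small composition of callables. The class and attribute names are hypothetical, and the two stub functions stand in for real data acquisition and for the data analysis method of example 1; only the wiring and the judging condition are illustrated.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class PredictionSystem:
    collect: Callable[[str], Dict[str, float]]    # role of data collection module 51
    process: Callable[[Dict[str, float]], float]  # role of data processing module 52

    def judge_and_output(self, sample_id: str) -> str:
        """Role of judging and outputting module 53: compare the two probabilities."""
        p_asd = self.process(self.collect(sample_id))
        p_not_asd = 1.0 - p_asd
        if p_asd >= p_not_asd:
            return "risk of suffering from the autism spectrum disorder disease"
        return "no risk of suffering from the autism spectrum disorder disease"


# Stubs for illustration: a fixed expression profile and fixed model outputs.
def stub_collect(sample_id):
    return {"SOD1": 0.5, "TPM1": 0.5}


high_risk = PredictionSystem(stub_collect, lambda expression: 0.7)
low_risk = PredictionSystem(stub_collect, lambda expression: 0.2)
```

Because the two probabilities sum to 1, the condition "probability of suffering is greater than or equal to the probability of not suffering" is the same test as a probability of at least 0.5.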

Example 3 electronic device

The present embodiment provides an electronic device, which may be expressed in the form of a computing device (e.g., may be a server device), including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor may implement the method for predicting autism spectrum disorder disease in embodiment 1 of the present application when executing the computer program.

Fig. 4 shows a schematic diagram of the hardware structure of the present embodiment; the electronic device 9 specifically includes:

at least one processor 91, at least one memory 92, and a bus 93 for connecting the different system components (including the processor 91 and the memory 92), wherein:

the bus 93 includes a data bus, an address bus, and a control bus.

The memory 92 includes volatile memory such as Random Access Memory (RAM) 921 and/or cache memory 922, and may further include Read Only Memory (ROM) 923.

Memory 92 also includes a program/utility 925 having a set (at least one) of program modules 924, such program modules 924 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.

The processor 91 executes various functional applications and data processing, such as the data analysis method of embodiment 1 of the present application, by running a computer program stored in the memory 92.

The electronic device 9 may further communicate with one or more external devices 94 (e.g., keyboard, pointing device, etc.). Such communication may occur through an input/output (I/O) interface 95. Also, the electronic device 9 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through a network adapter 96. The network adapter 96 communicates with other modules of the electronic device 9 via the bus 93. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in connection with the electronic device 9, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, data backup storage systems, and the like.

It should be noted that although several units/modules or sub-units/modules of an electronic device are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, the features and functionality of two or more units/modules described above may be embodied in one unit/module in accordance with embodiments of the present application. Conversely, the features and functions of one unit/module described above may be further divided into ones that are embodied by a plurality of units/modules.

Example 4 Computer-readable storage medium

An embodiment of the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of predicting autism spectrum disorder in embodiment 1 of the present application.

More specifically, the readable storage medium may include, but is not limited to: a portable disk, a hard disk, random access memory, read-only memory, erasable programmable read-only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

In a possible embodiment, the application may also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to carry out the steps of the method of predicting autism spectrum disorder in embodiment 1 of the application.

Wherein the program code for carrying out the application may be written in any combination of one or more programming languages, which program code may execute entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on the remote device or entirely on the remote device.

Finally, the above embodiments are only for illustrating the technical solution of the present application, and are not limiting.

Biomarker full names (refer to the GeneCards database)

ALDH2：Aldehyde Dehydrogenase 2 Family Member

ARF5：ADP Ribosylation Factor 5

ATP6AP1：ATPase H+Transporting Accessory Protein 1

CD14：CD14 Molecule

CDC37：Cell Division Cycle 37

COL1A2：Collagen Type I Alpha 2 Chain

CYBB：Cytochrome B-245 Beta Chain

DST：Dystonin

F7：Coagulation Factor VII

FAH：Fumarylacetoacetate Hydrolase

FGB：Fibrinogen Beta Chain

GLOD4：Glyoxalase Domain Containing 4

GNPDA2：Glucosamine-6-Phosphate Deaminase 2

HBE1：Hemoglobin Subunit Epsilon 1

HMGCL：3-Hydroxy-3-Methylglutaryl-CoA Lyase

HSPA1L：Heat Shock Protein Family A(Hsp70)Member 1 Like

IGHD：Immunoglobulin Heavy Constant Delta

IMMT：Inner Membrane Mitochondrial Protein

ITGA6：Integrin Subunit Alpha 6

MCAM：Melanoma Cell Adhesion Molecule

MRC1：Mannose Receptor C-Type 1

MSL1：MSL Complex Subunit 1

MYL12B：Myosin Light Chain 12B

NUCB1：Nucleobindin 1

OGDH：Oxoglutarate Dehydrogenase

PF4：Platelet Factor 4

PF4V1：Platelet Factor 4 Variant 1

PFN1：Profilin 1

POTEJ：POTE Ankyrin Domain Family Member J

PROCR：Protein C Receptor

PSMA3：Proteasome 20S Subunit Alpha 3

PTPRG：Protein Tyrosine Phosphatase Receptor Type G

RPS3：Ribosomal Protein S3

SCP2：Sterol Carrier Protein 2

TPI1：Triosephosphate Isomerase 1

VASN：Vasorin

ANXA5：Annexin A5

ASAH1：N-Acylsphingosine Amidohydrolase 1

CALD1：Caldesmon 1

CSE1L：Chromosome Segregation 1 Like

EXOC1：Exocyst Complex Component 1

FTH1：Ferritin Heavy Chain 1

GNPDA1：Glucosamine-6-Phosphate Deaminase 1

HSPA8：Heat Shock Protein Family A(Hsp70)Member 8

ITGB1：Integrin Subunit Beta 1

MTA2：Metastasis Associated 1 Family Member 2

OLFM4：Olfactomedin 4

MAPK1：Mitogen-Activated Protein Kinase 1

RPS27A：Ribosomal Protein S27a

SOD1：Superoxide Dismutase 1

TPM1：Tropomyosin 1

UBC：Ubiquitin C

Claims

1. Use of a biomarker combination consisting of ALDH2, ARF5, ATP6AP1, CD14, CDC37, COL1A2, CYBB, DST, F7, FAH, FGB, GLOD4, GNPDA2, HBE1, HMGCL, HSPA1L, IGHD, IMMT, ITGA6, MCAM, MRC1, MSL1, MYL12B, NUCB1, OGDH, PF4V1, PFN1, POTEJ, PROCR, PSMA3, PTPRG, RPS3, SCP2, TPI1, VASN, ANXA5, ASAH1, CALD1, CSE1L, EXOC1, FTH1, GNPDA1, HSPA8, ITGB1, MTA2, OLFM4, MAPK1, RPS27A, SOD1, TPM1 and UBC for the preparation of a product for predicting autism spectrum disorder disease.

2. A reagent combination for detecting a biomarker combination, characterized in that the biomarker combination consists of ALDH2, ARF5, ATP6AP1, CD14, CDC37, COL1A2, CYBB, DST, F7, FAH, FGB, GLOD4, GNPDA2, HBE1, HMGCL, HSPA1L, IGHD, IMMT, ITGA6, MCAM, MRC1, MSL1, MYL12B, NUCB1, OGDH, PF4V1, PFN1, POTEJ, PROCR, PSMA3, PTPRG, RPS3, SCP2, TPI1, VASN, ANXA5, ASAH1, CALD1, CSE1L, EXOC1, FTH1, GNPDA1, HSPA8, ITGB1, MTA2, OLFM4, MAPK1, RPS27A, SOD1, TPM1 and UBC.

3. The combination of reagents according to claim 2, wherein the combination of reagents is used to detect the expression level of the combination of biomarkers,

and/or, the combination of reagents comprises reagents for genomic, transcriptome and/or proteomic sequencing,

and/or the combination of reagents comprises a biomolecular reagent that specifically binds to the biomarker, or specifically hybridizes to a nucleic acid encoding the biomarker.

4. The combination of reagents according to claim 3, wherein the expression level is protein expression level and/or mRNA transcription level and/or the biomolecular reagent is selected from one or more of a primer, a probe and an antibody.

5. A biomarker combination, characterized in that the biomarker combination consists of ALDH2, ARF5, ATP6AP1, CD14, CDC37, COL1A2, CYBB, DST, F7, FAH, FGB, GLOD4, GNPDA2, HBE1, HMGCL, HSPA1L, IGHD, IMMT, ITGA6, MCAM, MRC1, MSL1, MYL12B, NUCB1, OGDH, PF4V1, PFN1, POTEJ, PROCR, PSMA3, PTPRG, RPS3, SCP2, TPI1, VASN, ANXA5, ASAH1, CALD1, CSE1L, EXOC1, FTH1, GNPDA1, HSPA8, ITGB1, MTA2, OLFM4, MAPK1, RPS27A, SOD1, TPM1 and UBC.

6. A kit comprising a combination of reagents according to any one of claims 2 to 4 and/or a biomarker combination according to claim 5.

7. A method of constructing a predictive model of autism spectrum disorder disease, the method comprising: inputting protein expression data corresponding to the biomarker combination in a sample into the caret package of the R language, which contains a logistic regression model, for machine learning to obtain an autism spectrum disorder disease prediction model.

8. The method of claim 7, wherein the sample is from blood, urine, saliva, or cerebrospinal fluid;

and/or, before machine learning, the protein expression data of the sample is acquired in DIA mode and peptide matching is carried out with Firmiana software;

and/or, the samples are from patients with autism spectrum disorder disease and from healthy persons;

and/or, the FOT of the protein corresponding to the biomarker combination is input as protein expression data into the R language caret package containing the logistic regression model for machine learning;

and/or grouping the samples before machine learning to obtain a modeling group sample and a verification group sample, wherein the modeling group sample is used for constructing the autism spectrum disorder disease prediction model, and the verification group sample is used for verifying the autism spectrum disorder disease prediction model.
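The grouping and model-fitting steps of claims 7 and 8 can be illustrated with a minimal sketch. It is written in Python rather than R, uses plain batch gradient descent in place of the caret package, and runs on synthetic two-feature data invented for illustration. The function names, the 40% validation fraction, the learning rate and the epoch count are all assumptions, not values taken from the application.

```python
import math
import random


def split_samples(labels, validation_fraction=0.4, seed=0):
    """Stratified split of sample indices into a modeling group and a
    validation group (the fraction is an assumed value)."""
    rng = random.Random(seed)
    modeling, validation = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_val = max(1, round(len(idx) * validation_fraction))
        validation.extend(idx[:n_val])
        modeling.extend(idx[n_val:])
    return sorted(modeling), sorted(validation)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Predicted probability of the positive (patient) class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression weights by batch gradient descent on the
    log-loss (a stand-in for the R caret workflow named in the claims)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic toy expression values (NOT data from the application):
# label 0 = healthy person, label 1 = patient.
X = [[0.2, 0.1], [0.3, 0.2], [0.1, 0.3], [0.25, 0.15], [0.2, 0.25],
     [1.1, 0.9], [1.3, 1.0], [0.9, 1.2], [1.2, 1.1], [1.0, 0.8]]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

modeling_idx, validation_idx = split_samples(y)
w, b = fit_logistic([X[i] for i in modeling_idx],
                    [y[i] for i in modeling_idx])
validation_probs = [predict_proba(w, b, X[i]) for i in validation_idx]
```

The split is stratified so that both groups contain patients and healthy persons; whether the application stratifies its grouping is not stated, so this is a design choice of the sketch.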

9. The method of claim 8, wherein the peptide fragment matching utilizes a UniProt human protein database;

and/or, the protein expression data input into the logistic regression model satisfies: the expression of the protein corresponding to the biomarker combination in the patient samples is at least 1.5 times the expression of the corresponding protein in healthy persons, and the t-test p value is less than 0.05;

and/or, the protein expression data input into the logistic regression model has a protein abundance of greater than or equal to 30%;

and/or the step of employing the validation group sample for validation comprises: calculating the area under the curve (AUC), the sensitivity and the specificity of the receiver operating characteristic (ROC) curve for the protein expression data of the biomarker combination in the sample; and judging the accuracy of the prediction model according to the area under the curve, the sensitivity and the specificity;

and/or, the method further comprises determining whether the sample has autism spectrum disorder: the sample is judged to be diseased when the predicted probability of autism spectrum disorder disease for the sample is greater than or equal to 0.5, and judged to be normal when the predicted probability is less than 0.5.
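The 0.5 decision rule and the validation metrics named in claim 9 can be sketched as follows. This is an illustrative Python sketch, not the application's R implementation: the function names are hypothetical, the labels and probabilities are invented, and the area under the curve is computed with the rank-based (Mann-Whitney) formulation, which is one standard way to obtain it.

```python
def classify(prob, threshold=0.5):
    """Return 1 (diseased) when prob >= threshold, else 0 (normal)."""
    return 1 if prob >= threshold else 0


def sensitivity_specificity(labels, probs, threshold=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    preds = [classify(p, threshold) for p in probs]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, probs):
    """Area under the curve, computed as the fraction of
    (patient, healthy) pairs in which the patient receives the higher
    predicted probability; ties count as half."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Invented validation-group outputs (NOT results from the application):
# label 1 = patient, label 0 = healthy person.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
probs = [0.92, 0.81, 0.50, 0.35, 0.45, 0.20, 0.62, 0.10]

sens, spec = sensitivity_specificity(labels, probs)
auc = roc_auc(labels, probs)
```

Note that a probability of exactly 0.5 is classified as diseased, matching the "greater than or equal to 0.5" wording of the claim.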

10. A predictive model of an autism spectrum disorder disease, wherein the predictive model is constructed by a method according to any one of claims 7 to 9.

11. A prediction system for autism spectrum disorder disease, characterized in that the prediction system comprises an analysis and judgment module,

the analysis and judgment module comprises the prediction model according to claim 10, and outputs a prediction result of whether the sample suffers from autism spectrum disorder.

12. The prediction system of claim 11, wherein the prediction system further comprises an output module and/or a detection module; the output module outputs the judgment result of the analysis and judgment module, and the detection module detects the protein expression level corresponding to the biomarker combination in the sample to be detected and transmits the expression level data to the analysis and judgment module.
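The data flow between the detection module, the analysis and judgment module and the output module of claims 11 and 12 can be sketched as below. This is only one possible arrangement, written in Python for illustration. The class names, the two-marker panel, the report strings and the coefficients of `toy_model` are all hypothetical and do not come from the application.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DetectionModule:
    """Supplies expression levels for the panel markers in a fixed order."""
    markers: List[str]

    def detect(self, measurements: Dict[str, float]) -> List[float]:
        return [measurements[m] for m in self.markers]


@dataclass
class AnalysisJudgmentModule:
    """Wraps a prediction model and applies the 0.5 decision threshold."""
    model: Callable[[List[float]], float]
    threshold: float = 0.5

    def judge(self, expression: List[float]) -> bool:
        return self.model(expression) >= self.threshold


class OutputModule:
    """Turns the judgment result into a human-readable report."""

    def report(self, diseased: bool) -> str:
        return "predicted: ASD" if diseased else "predicted: normal"


def toy_model(x: List[float]) -> float:
    # Hypothetical logistic model; the coefficients are made up.
    z = 2.0 * x[0] + 1.0 * x[1] - 3.0
    return 1.0 / (1.0 + math.exp(-z))


detector = DetectionModule(markers=["ALDH2", "ARF5"])
analysis = AnalysisJudgmentModule(model=toy_model)
output = OutputModule()

high = output.report(analysis.judge(detector.detect({"ALDH2": 1.5, "ARF5": 1.0})))
low = output.report(analysis.judge(detector.detect({"ALDH2": 0.5, "ARF5": 0.5})))
```

Passing the model in as a callable keeps the analysis and judgment module independent of how the model was trained.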

13. Use of the predictive model according to claim 10 or the predictive system according to claim 11 or 12 for the prediction of autism spectrum disorder diseases.

14. Use of a combination of reagents according to any one of claims 2 to 4 for the preparation of a kit for the prediction or diagnosis of autism spectrum disorder disease.

15. A computer readable storage medium storing a computer program, which, when executed by a processor, performs the function of the predictive model of claim 10 or the predictive system of claim 11 or 12.

16. An electronic device comprising a memory storing a computer program and a processor, wherein the processor is configured to execute the computer program to implement the functionality of the predictive model of claim 10 or the predictive system of claim 11 or 12.