Disclosure of Invention
In order to solve the technical problems, the invention provides a corresponding solution:
Because gout attacks can cause systemic metabolic changes, the invention uses metabonomics and machine learning algorithms to discover differential metabolites and potential metabolic pathways, provides biomarkers for distinguishing sporadic gout (infrequent gout flares, InGF) from frequent gout (frequent gout flares, FrGF), and establishes a prediction model based on these biomarkers so as to distinguish InGF from FrGF effectively.
The invention provides novel biomarkers for distinguishing sporadic gout from frequent gout, the biomarkers being 4-trimethylaminobutyric acid (4-trimethylammoniobutanoic acid), 5'-methylthioadenosine, arachidonic acid (arachidic acid), taurine, uridine and xanthine.
The prediction model for identifying the sporadic gout and the frequent gout, which is established according to the biomarker, is characterized in that the identification and judgment formula of the model is as follows:
$$\text{Prediction score} = \frac{e^{\operatorname{logit}(P)}}{1 + e^{\operatorname{logit}(P)}}$$
$$\operatorname{logit}(P) = 2.00 - 0.21\times[\text{Arachidic acid}] - 0.12\times[\text{Xanthine}] + 0.41\times[\text{4-Trimethylammoniobutanoic acid}] + 0.08\times[\text{Taurine}] + 0.07\times[\text{5'-Methylthioadenosine}] - 1.93\times[\text{Uridine}]$$
wherein the bracketed terms denote the levels of 4-trimethylaminobutyric acid (4-trimethylammoniobutanoic acid), 5'-methylthioadenosine, arachidonic acid (arachidic acid), taurine, uridine and xanthine;
if the score is > 0.5, the patient is more likely to have frequent gout; conversely, if the score is < 0.5, the patient is classified as having sporadic gout.
A kit for identifying sporadic and frequent gout, the kit comprising the biomarker described above.
Use of the biomarkers described above in the preparation of a kit for distinguishing sporadic gout from frequent gout.
Use of the model described above in the preparation of a kit for distinguishing sporadic gout from frequent gout.
A computer-readable storage medium having stored thereon a computer program for execution by a processor to calculate an identification judgment formula of:
$$\text{Prediction score} = \frac{e^{\operatorname{logit}(P)}}{1 + e^{\operatorname{logit}(P)}}$$
$$\operatorname{logit}(P) = 2.00 - 0.21\times[\text{Arachidic acid}] - 0.12\times[\text{Xanthine}] + 0.41\times[\text{4-Trimethylammoniobutanoic acid}] + 0.08\times[\text{Taurine}] + 0.07\times[\text{5'-Methylthioadenosine}] - 1.93\times[\text{Uridine}]$$
wherein the bracketed terms denote the levels of 4-trimethylaminobutyric acid (4-trimethylammoniobutanoic acid), 5'-methylthioadenosine, arachidonic acid (arachidic acid), taurine, uridine and xanthine;
if the score is > 0.5, the patient is more likely to have frequent gout; conversely, if the score is < 0.5, the patient is classified as having sporadic gout.
An apparatus comprising an input device and a computing device, wherein the input device is configured to input the levels of 4-trimethylaminobutyric acid (4-trimethylammoniobutanoic acid), 5'-methylthioadenosine, arachidonic acid (arachidic acid), taurine, uridine and xanthine;
the computing device is configured to calculate from the input biomarkers using the following formula:
$$\text{Prediction score} = \frac{e^{\operatorname{logit}(P)}}{1 + e^{\operatorname{logit}(P)}}$$
$$\operatorname{logit}(P) = 2.00 - 0.21\times[\text{Arachidic acid}] - 0.12\times[\text{Xanthine}] + 0.41\times[\text{4-Trimethylammoniobutanoic acid}] + 0.08\times[\text{Taurine}] + 0.07\times[\text{5'-Methylthioadenosine}] - 1.93\times[\text{Uridine}]$$
if the score is > 0.5, the patient is more likely to have frequent gout; conversely, if the score is < 0.5, the patient is classified as having sporadic gout.
A computer device comprising a memory, a processor, and a program stored on the memory and executable on the processor, wherein the processor performs the following formula calculations when executing the program:
$$\text{Prediction score} = \frac{e^{\operatorname{logit}(P)}}{1 + e^{\operatorname{logit}(P)}}$$
$$\operatorname{logit}(P) = 2.00 - 0.21\times[\text{Arachidic acid}] - 0.12\times[\text{Xanthine}] + 0.41\times[\text{4-Trimethylammoniobutanoic acid}] + 0.08\times[\text{Taurine}] + 0.07\times[\text{5'-Methylthioadenosine}] - 1.93\times[\text{Uridine}]$$
wherein the bracketed terms denote the levels of 4-trimethylaminobutyric acid (4-trimethylammoniobutanoic acid), 5'-methylthioadenosine, arachidonic acid (arachidic acid), taurine, uridine and xanthine;
if the score is > 0.5, the patient is more likely to have frequent gout; conversely, if the score is < 0.5, the patient is classified as having sporadic gout.
Benefits of the present application include, but are not limited to:
1. The invention uses a machine-learning-based multivariate selection method to identify potential metabolic biomarkers and, after further verification by targeted metabolomics, obtains six biomarkers: 4-trimethylaminobutyric acid, 5'-methylthioadenosine, arachidonic acid, taurine, uridine and xanthine.
2. Based on the six biomarkers, the areas under the receiver operating characteristic (ROC) curve for distinguishing InGF from FrGF are 0.88 in the discovery cohort and 0.67 in the validation cohort.
3. Through the model formula of the prediction model established from the biomarkers, the invention provides a feasible quantitative method for distinguishing sporadic gout from frequent gout.
4. The invention also covers a corresponding diagnostic kit, detection system, computing system and the like based on the six biomarkers, which make distinguishing sporadic gout from frequent gout practical and feasible.
5. The invention provides new insight into the metabolic basis of gout attack frequency, shows that the frequent gout and sporadic gout groups have distinct metabonomic profiles, and shows that metabonomic characteristics can distinguish different clinical manifestations of gout.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following description of the preferred embodiment is merely exemplary in nature and is in no way intended to limit the invention, its application, or uses. In the following examples, where specific conditions are not specified, the procedures are carried out under conventional conditions or the conditions recommended by the manufacturer. Unless specified otherwise, the raw materials, reagents and apparatus used are conventional, commercially available products, without reference to a particular manufacturer.
Study and sample collection
We enrolled 638 patients who visited the dedicated gout clinic of the Affiliated Hospital of Qingdao University from January 2019 to November 2021. The discovery cohort included 163 patients with sporadic gout and 239 patients with frequent gout, and the validation cohort included 97 patients with sporadic gout and 139 patients with frequent gout. The diagnosis of gout met the 2015 ACR/EULAR gout classification criteria and the 2019 Chinese guideline for the diagnosis and treatment of hyperuricemia and gout. Sporadic gout (InGF) was defined as ≤ 1 gout flare in the past year (patient self-report), and frequent gout (FrGF) was defined as ≥ 2 gout flares in the past year. All participants were males between 18 and 70 years of age. To minimize confounding by other metabolic diseases, the exclusion criteria were: (a) a current malignant tumor or a history of malignant tumor; (b) chronic renal failure (eGFR < 15 mL/min/1.73 m²); (c) abnormal liver function (alanine aminotransferase or total bilirubin ≥ 2 times the normal value). The study was approved by the Ethics Committee of the Affiliated Hospital of Qingdao University, and all subjects signed informed consent.
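The grouping rule above can be written as a short function. This is a minimal sketch in Python; the function name `classify_flare_group` is hypothetical and is not part of the study's own tooling.

```python
def classify_flare_group(flares_past_year: int) -> str:
    """Assign a patient to InGF or FrGF from self-reported flares in the past year.

    Follows the definition in the text: <= 1 flare is InGF, >= 2 flares is FrGF.
    """
    if flares_past_year < 0:
        raise ValueError("flare count cannot be negative")
    return "InGF" if flares_past_year <= 1 else "FrGF"


if __name__ == "__main__":
    for n in (0, 1, 2, 5):
        print(n, classify_flare_group(n))
```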
Demographic data and sample collection
Demographic and clinical data (age, height, weight, diastolic blood pressure (DBP), number of gout flares, and other biochemical indicators) were obtained from all participants. After an overnight fast, peripheral venous blood was collected from all participants in vacuum blood collection tubes and allowed to clot for 30 minutes at room temperature. The samples were then centrifuged at 3500 rpm for 10 min, and the supernatant was separated and stored at -80 °C for subsequent biochemical index measurement and LC-MS analysis.
Clinical biochemical index measurement
Alanine aminotransferase (ALT), aspartate aminotransferase (AST), glucose, triglycerides (TG), cholesterol (CH), blood urea nitrogen (BUN), creatinine (Cr) and uric acid (UA) were measured using an automated biochemical analyzer (Roche, Germany).
Sample processing
Serum from the discovery and validation cohorts was collected and pre-treated independently, with an interval of 6 months between the two cohorts. All samples were subjected to the same treatment procedure. Briefly, 50 μL of serum was mixed with 200 μL of methanol and an internal standard (fluorouracil) at 4 °C and incubated for 15 min to precipitate the proteins. After vortexing at 4 °C and centrifugation at 14000 rpm, the supernatant was collected and stored at -80 °C for analysis. Quality control (QC) samples were prepared by randomly collecting 200 aliquots of serum samples and pooling them.
Non-targeted metabonomics analysis
Non-targeted metabonomics analysis was performed by ultra-high-performance liquid chromatography (UHPLC) (Nexera UHPLC-30A, Shimadzu, Japan) coupled with a mass spectrometry system (TripleTOF 6600, AB SCIEX). The mass spectrometer was operated in positive and negative ion modes with data-dependent acquisition (DDA). The mass range of data acquisition was 60-1200 Da, the scanning frequency was 4 Hz, and the collision energy was 30 eV. UHPLC separation was performed using an ACQUITY UPLC BEH amide column (100 mm × 2.1 mm, 1.7 μm; Waters). Mobile phase A was ultrapure water containing 25 mmol/L ammonium hydroxide (NH4OH) and 25 mmol/L ammonium acetate (NH4Ac), and mobile phase B was pure acetonitrile (ACN). The flow rate was maintained at 0.5 mL/min and the column oven temperature at 25 °C. The separation was performed according to the following gradient: 0-0.5 min, 95% B; 0.5-7 min, 95% B to 65% B; 7-8 min, 65% B to 40% B; 8-9 min, 40% B; 9-9.1 min, 40% B to 95% B; 9.1-12 min, 95% B. The injection volume was 2 μL. A QC sample and a blank sample were inserted after every 8 samples, and this cycle was repeated, for data correction and instrument cleaning.
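To make the gradient program easier to check, the sketch below encodes the breakpoints listed above and returns the percentage of mobile phase B at a given time. It assumes that each ramp between two breakpoints is linear, which the text does not state explicitly; the names `GRADIENT` and `percent_b` are hypothetical.

```python
# Breakpoints (time in minutes, % mobile phase B) taken from the gradient in the text.
GRADIENT = [
    (0.0, 95.0),
    (0.5, 95.0),
    (7.0, 65.0),
    (8.0, 40.0),
    (9.0, 40.0),
    (9.1, 95.0),
    (12.0, 95.0),
]


def percent_b(t_min: float) -> float:
    """Return %B at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-12 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")


if __name__ == "__main__":
    for t in (0.0, 3.75, 7.5, 8.5, 12.0):
        print(f"{t:5.2f} min -> {percent_b(t):.1f}% B")
```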
The raw data (.wiff files) were converted to mzML format by the ProteoWizard software (https://proteowizard.sourceforge.io) and then analyzed using the XCMS package of R. The analysis procedure written in R was as follows: peaks were detected using the centWave algorithm, with an expected mass error of 20 ppm, a peak width of 5-30 s, and a signal-to-noise (S/N) threshold of 10; retention time (RT) was adjusted by a subset-based algorithm that corrects the RT of each sample using the QC injections before and after it; peak correspondence was performed by the peak density method, and a peak appearing in > 40% of samples was retained as a feature; missing values were retrieved from the original files and, if no value could be retrieved, filled with half of the minimum value; a normalization method based on support vector regression was used to eliminate batch effects. Features with a coefficient of variation exceeding 30% were excluded from the subsequent analysis.
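The last two filtering steps (half-minimum imputation and the 30% coefficient-of-variation cut-off) can be illustrated with a few lines of Python. This is only a sketch of the idea, not the XCMS-based R procedure itself. It assumes that the minimum is taken per feature and that the coefficient of variation is computed over repeated QC injections, neither of which the text specifies. All function names are hypothetical.

```python
from statistics import mean, stdev


def impute_half_min(values):
    """Replace missing entries (None) with half of the smallest observed value.

    Assumption: the minimum is taken over the same feature across samples.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("feature has no observed values")
    fill = min(observed) / 2
    return [fill if v is None else v for v in values]


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def keep_feature(qc_intensities, max_cv=0.30):
    """Keep a feature only if its CV does not exceed the 30% threshold.

    Assumption: the CV is evaluated on QC injections.
    """
    return coefficient_of_variation(qc_intensities) <= max_cv


if __name__ == "__main__":
    print(impute_half_min([4.0, None, 2.0]))
    print(keep_feature([100.0, 105.0, 95.0, 102.0]))
    print(keep_feature([100.0, 200.0, 50.0, 10.0]))
```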
Based on the precursor m/z and retention time, MS/MS spectra were extracted and matched to each MS feature. The spectra were first searched against our in-house metabolite database, which contains the exact m/z, retention time and MS/MS spectra of the metabolites. Such feature annotations can therefore be considered level 1 according to the Metabolomics Standards Initiative (MSI).
Next, the spectra were searched against several public databases (including HMDB, MoNA, MassBank and GNPS). Annotations with a spectral dot product > 0.3 were considered level 2; otherwise (≤ 0.3), they were considered level 3.
Furthermore, we analyzed the features and MS/MS spectra using MetDNA, which identifies metabolites using a metabolic reaction network-based recursive algorithm, and these annotations were assigned MSI level 4. If a feature received multiple annotations, the MSI level and the dot product were considered, in that order of priority, to select the most reliable annotation.
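The dot-product threshold used for public-database matches can be sketched as follows. The text does not say which dot-product variant was used, so this example assumes a normalized dot product (cosine similarity) between intensity vectors binned by m/z. The bin width of 0.01 and all function names are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.01):
    """Sum intensities of (m/z, intensity) peaks that fall into the same m/z bin."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return binned


def dot_product_score(query, reference, bin_width=0.01):
    """Normalized dot product (cosine similarity) of two binned MS/MS spectra."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    numerator = sum(q[k] * r[k] for k in q.keys() & r.keys())
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_r = math.sqrt(sum(v * v for v in r.values()))
    denominator = norm_q * norm_r
    return numerator / denominator if denominator else 0.0


def public_database_level(score, threshold=0.3):
    """Level 2 if the dot product exceeds 0.3, otherwise level 3 (as in the text)."""
    return 2 if score > threshold else 3


if __name__ == "__main__":
    spectrum = [(85.03, 100.0), (127.04, 40.0)]
    score = dot_product_score(spectrum, spectrum)
    print(score, public_database_level(score))
```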
Targeted metabonomics analysis
Targeted metabolomics was performed using the same chromatographic conditions as non-targeted metabolomics. In this case, however, HPLC was coupled with triple quadrupole mass spectrometry (LCMS-8050, Shimadzu, Kyoto, Japan) to quantify metabolites in multiple reaction monitoring (MRM) mode. For each metabolite, the ion polarity and the precursor and product ion transitions were extracted from the non-targeted metabonomics data, and the ion monitoring parameters were optimized individually. Data processing was performed with LabSolutions software (Shimadzu, Kyoto, Japan).
Statistical analysis
All statistical analyses were performed in R (4.0.5). Unsupervised principal component analysis (PCA) was performed with the mixOmics package of R. Least absolute shrinkage and selection operator (LASSO) regression was performed with the gate package of R to predict the frequency of gout flares as a continuous variable, using 10-fold cross-validation in model training to avoid overfitting. Supervised orthogonal partial least squares discriminant analysis (OPLS-DA) was performed with the ropls package of R to distinguish InGF from FrGF, and 200 permutation tests were performed to check for overfitting. For the Kruskal-Wallis test, the frequency of gout flares was treated as a discrete variable, and metabolites that differed significantly among flare frequencies were used for cluster analysis. To identify differential metabolites, we used the Mann-Whitney U test. Metabolite and pathway classes follow KEGG BRITE (br08001: compounds with biological roles; br08901: KEGG pathway maps). Enrichment analysis of metabolic pathways was performed by the global test according to MSEA, and only for pathways belonging to the Metabolism category in br08901. For metabolic network analysis, we constructed a KEGG-based whole metabolic network consisting of 5226 biological entities and 32020 connections using the FELLA package of R. Network analysis using the differential metabolites showed the top 250 perturbed nodes in the sub-network. Biomarkers were selected with the MUVR package of R (MUVR: multivariate methods with unbiased variable selection in R).
Briefly, the procedure used 20 external repetitions, and within each external repetition the samples were randomly split across 200 internal repetitions for model training and parameter tuning. In each iteration, the optimal model was evaluated, 70% of the variables were retained according to their importance in that model, and the retained variables were passed on to the next iteration. For each repetition, a random forest model was built and its parameters were tuned automatically. The area under the receiver operating characteristic curve (AUC) was calculated to assess model performance, and variable importance was ranked for comparison between repetitions. The predictive models were built with the caret package of R, and each model parameter was tuned, where available, to optimize performance. In each cohort, the intensity (non-targeted metabolomics) or the ratio to the internal standard (targeted metabolomics) was normalized to the range 0 to 1 before model building.
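Because the AUC is the main performance measure throughout, a small self-contained illustration may help. The sketch below computes the AUC from its rank interpretation: the probability that a randomly chosen FrGF patient receives a higher score than a randomly chosen InGF patient, with ties counted as one half. It is written in Python for illustration only and is not the R code used in the study. The function name `roc_auc` and the label coding (1 = FrGF, 0 = InGF) are assumptions.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (assumed here to be FrGF), 0 for the negative class.
    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


if __name__ == "__main__":
    print(roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for cohorts of a few hundred patients; a rank-based implementation would be preferable for much larger data sets.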
Analysis of results
Clinical characteristics of gout patients
To describe the serum metabolic characteristics of InGF and FrGF patients, we performed a serum metabonomic study on the 163 InGF patients and 239 FrGF patients in the discovery cohort. The overall flow of the study is shown in FIG. 1 (FIG. 1: overall flow chart of the study).
Next, we used machine learning algorithms to select potential metabolites for distinguishing InGF from FrGF. The predictive model was further optimized and validated using targeted metabonomics in an independent cohort of 236 subjects (97 InGF patients and 139 FrGF patients).
Table 1 summarizes the clinical characteristics of these two independent cohorts. The distribution of gout flares in the two cohorts was similar (table 1 and fig. 6A). As expected, the number of tophi and the serum uric acid level were significantly higher in FrGF patients than in InGF patients, and a significantly greater proportion of patients in the FrGF group took ULT and anti-inflammatory drugs. Other biochemical parameters were similar in the two cohorts.
Table 1 clinical information of study cohort
For categorical variables, data are presented as n (%). For continuous variables, values are presented as mean (±SD) when they conform to a normal distribution; otherwise, they are presented as median (quartile).
InGF: infrequent gout flares; FrGF: frequent gout flares; ULT: urate-lowering therapy; DBP: diastolic blood pressure; BMI: body mass index; ALT: alanine aminotransferase; AST: aspartate aminotransferase; GLU: glucose; TG: triglycerides; CH: cholesterol; BUN: blood urea nitrogen; CREA: creatinine; UA: uric acid; CCR: creatinine clearance rate; eGFR: estimated glomerular filtration rate.
a At least 20 cigarette packs in a lifetime or at least one cigarette a day for at least 1 year.
b Alcohol intake at least once a week for 6 months.
* indicates p < 0.05 (FrGF versus InGF); # indicates p < 0.05 (discovery versus validation cohort).
Serum metabolic profile that varies with frequency of onset
First, we performed metabonomic analysis on serum samples in the discovery cohort using a non-targeted metabonomics method, and 14141 metabolic features were determined in total in positive and negative ion modes (fig. 1, 6B). Interestingly, in principal component analysis (PCA), we observed an overall trend in the metabolic profile that depended on the frequency of gout flares (fig. 2A). Furthermore, the InGF and FrGF groups could be distinguished to some extent in PCA (FIG. 2B); in orthogonal partial least squares discriminant analysis (OPLS-DA), the global metabolic characteristics of the InGF and FrGF groups were completely separated without overfitting (fig. 2C and 6C). We then built a least absolute shrinkage and selection operator (LASSO) regression model to predict gout flare frequency from metabolic characteristics, and the predicted flare frequency matched the actual flare frequency very well (fig. 2D). More importantly, patients in the InGF group were clearly predicted to have ≤ 1 flare, while patients in the FrGF group were predicted to have ≥ 2 flares (FIG. 2D). Furthermore, we performed a cluster analysis on the 3560 annotated compounds and observed a clear difference between gout at different flare frequencies (fig. 2E). Consistent with the unsupervised and supervised models, the samples of the InGF group showed a highly similar pattern that differed from that of the FrGF group. Together, these results strongly support that metabolic profiles can distinguish InGF from FrGF.
Differential metabolites and deregulated metabolic pathways between InGF and FrGF
To explore the changes in metabolites and metabolic pathways between InGF and FrGF, we performed a Mann-Whitney U test to determine the differential metabolites between InGF and FrGF in the discovery cohort. Of the annotated metabolites, 116 were up-regulated (FDR < 0.05 and fold change > 4/3) and 323 were down-regulated (FDR < 0.05 and fold change < 3/4) in FrGF patients compared with InGF patients (FIG. 3A, supplementary Table 1 and supplementary Table 2).
Supplementary Table 1. Up-regulated metabolites in FrGF patients
Supplementary Table 2. Down-regulated metabolites in FrGF patients
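The up/down-regulation criteria above combine an FDR cut-off with a fold-change cut-off. The sketch below shows one way to apply them in Python. The text does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption, as is the reading of fold change as FrGF relative to InGF. The function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure, not stated in the text)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_direction(fold_change, fdr, alpha=0.05):
    """Apply the thresholds from the text: FDR < 0.05 with fold change > 4/3 or < 3/4."""
    if fdr < alpha and fold_change > 4 / 3:
        return "up"
    if fdr < alpha and fold_change < 3 / 4:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    fdrs = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
    folds = [1.5, 0.5, 1.1, 2.0]
    print([call_direction(fc, q) for fc, q in zip(folds, fdrs)])
```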
These differential metabolites play important roles in different biological functions. According to the biological function classification of the KEGG (Kyoto Encyclopedia of Genes and Genomes) knowledge database, the down-regulated metabolites in FrGF patients were mostly organic acids, lipids, steroids, hormones and transmitters, while the up-regulated metabolites were mostly carbohydrates (fig. 3B). We then performed a quantitative enrichment analysis on 64 KEGG metabolic pathways. As a result, 57 metabolic pathways changed significantly between InGF and FrGF (FDR < 0.05), and the significantly deregulated pathways were mostly involved in carbohydrate metabolism, amino acid metabolism and nucleotide metabolism (FIG. 3C). The citric acid cycle (TCA cycle), amino sugar and nucleotide sugar metabolism, glyoxylic acid and dicarboxylic acid metabolism, and glycolysis/gluconeogenesis accounted for the major differences in carbohydrate metabolism; several metabolites, such as pyruvic acid and oxaloacetic acid, were enriched in various carbohydrate metabolic pathways (fig. 3C, fig. 7). Amino acid metabolism, particularly alanine, aspartic acid and glutamic acid metabolism, showed significant changes, and a large number of amino acids and their derivatives were enriched in these pathways (fig. 3C, fig. 7). Nucleotide metabolism, mainly purine metabolism, was also enriched because of changes in several purine metabolites: xanthine, hypoxanthine and uric acid (fig. 3C, fig. 7). Next, we constructed a metabolic landscape in the discovery cohort (fig. 3D). Among the seven most significantly altered metabolic pathways (inner circles), the metabolic profiles of InGF and FrGF were, overall, clearly separated.
Next, we applied a network propagation-based algorithm, FELLA, to study the crosstalk between the individual metabolic pathways that were significantly perturbed between InGF and FrGF. This R package takes the statistically different metabolites as input and evaluates each node (metabolite, enzyme, and reaction) and each edge (hierarchical connection) in the KEGG overall metabolic network to determine the sub-network that is most perturbed between InGF and FrGF. Interestingly, the crosstalk between purine metabolism and caffeine metabolism was the most clearly perturbed sub-network (fig. 4). These two pathways converge on the metabolism of uric acid, which is one of the most distinctive clinical parameters distinguishing InGF from FrGF. Xanthine dehydrogenase (XDH) is the rate-limiting enzyme for uric acid formation and is therefore a therapeutic target for uric acid-lowering drugs such as febuxostat and allopurinol; it appears to play a key role in regulating caffeine and purine metabolism. Furthermore, the upregulation of XDH was responsible for elevated uric acid levels in FrGF compared to InGF (fig. 8). Upregulation of XDH also resulted in decreased levels of caffeine, 1,7-dimethylxanthine, theophylline and 1-methylxanthine in caffeine metabolism (FIG. 8). These findings indicate that xanthine is a key metabolite linking these two pathways. In addition to being an endogenous metabolite in purine metabolism, xanthine can be synthesized by enteric bacteria from ingested caffeine or xanthosine.
The changes in the alanine, aspartate and glutamate metabolic sub-network are related to the taurine and hypotaurine metabolic sub-network and to primary bile acid biosynthesis (fig. 4). We observed four key enzymes linking these three metabolic pathways: serine-glyoxylate transaminase, aspartate 1-decarboxylase, palmitoyl-CoA hydrolase, and bile acid-CoA:amino acid N-acyltransferase (FIG. 4, enzymes 15-18). Interestingly, bile acid synthesis is also affected by the intestinal microbiome, and the interactions of these sub-networks strongly suggest that interactions between intestinal bacteria and the host are involved in the inflammation of InGF and FrGF.
To further investigate the effect of drugs (table 1) on the metabolome, we repeated the analysis after excluding the patients treated with each drug in turn, and found that the various drug treatments had a limited effect on the number of differential metabolites between InGF and FrGF. Interestingly, the effect of allopurinol was much smaller than that of febuxostat: the differential metabolites overlapped by 98% and 69%, respectively, with those of patients not treated with any drug (fig. 9A). Importantly, most of the important metabolites involved in purine metabolism, arachidonic acid metabolism, bile acid metabolism and aspartic acid metabolism remained statistically significant (fig. 9B). Thus, the metabolic changes observed in the study are mainly caused by endogenous metabolic pathways.
Selection of metabolic biomarkers using targeted metabolomics to build predictive models and validation in a separate cohort
To screen metabolites and build predictive models that distinguish patients in the InGF and FrGF groups, we applied the multivariate selection algorithm MUVR to all identified metabolites annotated at MSI level 1 or 2, and tested them in machine learning (ML) models, including support vector machines (SVM), random forest (RF) and LASSO. To select the most predictive and robust metabolites, we performed 20 external repetitions using MUVR, each comprising 200 internal repetitions, for iterative variable selection according to importance ranking. The top 6 predictive variables were sufficient to construct a model with an AUC of 0.985, although including more variables increased the AUC further (fig. 5A and 5B). Across the iterations, 21 metabolites held a stable position and thus contributed most to the predictive model, while the other 35 metabolites, which held an unstable position, also showed predictive ability to varying degrees (fig. 10A-B). All of these metabolites are candidates for use in predictive models.
It is well known that non-targeted metabonomics is quantitatively limited, whereas triple quadrupole mass spectrometry based on Multiple Reaction Monitoring (MRM) is a quantitative gold standard. Next, we began to build predictive models for InGF and FrGF using a Multiple Reaction Monitoring (MRM) based approach. For each selected biomarker we constructed transitions from precursor ion to product ion pairs from non-targeted MS/MS spectra (fig. 5C) and manually optimized other MS parameters (supplementary table 3).
Supplementary Table 3. Precursor and product mass-to-charge ratios and retention times of the 25 metabolites
m/z:mass-to-charge ratio;RT:retention time.
In the discovery cohort, 25 biomarkers were measured in total, 14 of which showed a high correlation between the non-targeted and targeted methods (fig. 5D and 11). Next, we applied the same multivariate selection method to determine the most practical number of metabolites. We built the model using the discovery cohort and validated it using the validation cohort. The AUCs of both the discovery and validation cohorts tended to rise and then fall with the number of metabolites. When 6 metabolites [4-trimethylaminobutyric acid (4-trimethylammoniobutanoic acid), 5'-methylthioadenosine, arachidonic acid (arachidic acid), taurine, uridine and xanthine, fig. 5F] were selected, the model reached the best AUC values in both the discovery and validation cohorts (fig. 12A). After optimization of the multiple machine learning algorithms (fig. 12B), the AUC was 0.88 in the discovery cohort and 0.67 in the validation cohort (fig. 5E). Notably, we also tried to incorporate various drugs into the model, but no significant improvement was found (fig. 12C). Thus, the final 6 selected biomarkers (fig. 5F and 12) were included in the logistic regression model, and the following formula was derived:
$$\text{Prediction score} = \frac{e^{\operatorname{logit}(P)}}{1 + e^{\operatorname{logit}(P)}}$$
$$\operatorname{logit}(P) = 2.00 - 0.21\times[\text{Arachidic acid}] - 0.12\times[\text{Xanthine}] + 0.41\times[\text{4-Trimethylammoniobutanoic acid}] + 0.08\times[\text{Taurine}] + 0.07\times[\text{5'-Methylthioadenosine}] - 1.93\times[\text{Uridine}]$$
Each metabolite level in the above formula is normalized. If the score is > 0.5, the patient has a higher likelihood of belonging to FrGF; conversely, when the score is < 0.5, the patient will be classified as InGF.
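As an illustration of how the formula could be evaluated in software, the following Python sketch computes the prediction score from six normalized metabolite levels. The coefficients and intercept are those of the formula above. The min-max helper reflects the 0-to-1 normalization described in the statistical analysis section, but the reference minimum and maximum to use for a new sample are not given in the text and must be supplied by the user. A score of exactly 0.5 is not covered by the classification rule, so the sketch reports it as undetermined. All function and variable names are hypothetical.

```python
import math

INTERCEPT = 2.00
# Coefficients copied from the logit(P) formula; keys follow the names used in the formula.
COEFFICIENTS = {
    "Arachidic acid": -0.21,
    "Xanthine": -0.12,
    "4-Trimethylammoniobutanoic acid": 0.41,
    "Taurine": 0.08,
    "5'-Methylthioadenosine": 0.07,
    "Uridine": -1.93,
}


def min_max(value, lowest, highest):
    """Scale a raw level to the 0-1 range; lowest/highest are user-supplied reference values."""
    if highest <= lowest:
        raise ValueError("highest must be greater than lowest")
    return (value - lowest) / (highest - lowest)


def prediction_score(normalized_levels):
    """Logistic transform of logit(P); equivalent to e^logit / (1 + e^logit)."""
    logit = INTERCEPT + sum(
        coef * normalized_levels[name] for name, coef in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-logit))


def classify(score):
    """> 0.5 is FrGF, < 0.5 is InGF; exactly 0.5 is left undetermined (not defined in the text)."""
    if score > 0.5:
        return "FrGF"
    if score < 0.5:
        return "InGF"
    return "undetermined"


if __name__ == "__main__":
    # Hypothetical patient with already normalized levels (values invented for illustration).
    patient = {name: 0.5 for name in COEFFICIENTS}
    score = prediction_score(patient)
    print(round(score, 3), classify(score))
```

With all inputs restricted to the range 0 to 1, logit(P) lies between -0.26 and 2.56, so the score ranges from about 0.44 to about 0.93; scores below 0.5 arise only when the negatively weighted metabolites, uridine in particular, are high.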
Comprehensive analysis
The present invention discloses a systemic metabolic profile and related metabolic pathways that are able to distinguish InGF from FrGF. Gout flares are positively correlated with systemic changes in the serum metabolome, and the study has determined the metabolic profiles associated with InGF and FrGF. Next, we selected and validated a set of 6 metabolites using three machine learning algorithms; these metabolites differentiated InGF from FrGF in a separate validation cohort.
Systematic analysis of circulating metabolites using metabolomics revealed a variety of metabolic pathways associated with InGF and FrGF (FIG. 2). Among them, carbohydrate metabolism ranked highest among the significantly altered pathways, and was mainly represented by an increase in oxalosuccinic acid, oxalic acid and 2,3-diphosphoglyceric acid and a decrease in citric acid in the TCA cycle and glycolysis (supplementary table 1, supplementary table 2).
TCA and glycolysis are central to the metabolic activity of organisms and are involved in many metabolic diseases, arthritis and inflammation. For example, recombination (rewiring) to glycolysis accounts for macrophage activation and inflammatory factor release by MSU crystals. Glutamate and aspartate metabolism is one of the most altered amino acid metabolic pathways, consistent with previous studies. Interestingly, whole genome association analysis (GWAS) of gout revealed metabolic pathways similar to gout-related gene loci. Common missense variants of the CPS1 and GLS2 genes involved in glutamine metabolism have been found to be associated with lower plasma glutamine levels and identified as gout susceptibility gene loci. Glutamine, which serves as a substrate for the first and rate-limiting steps of the de novo purine biosynthesis, is used as an amino donor to produce 5-phosphoribosyl amine (5-PRA) and glutamic acid; further synthesis of purine and uric acid would use glutamic acid and aspartic acid. Furthermore, aspartic acid and glutamic acid are substrates for epigenomic reprogramming, which occurs in the "training" of the innate immune system by soluble uric acid, which makes the innate immune system more reactive towards MSU crystals. On the other hand, some lipids and fatty acids in FrGF are significantly reduced compared to InGF, such as arachidonic acid and eicosapentaenoic acid. Eicosanoids such as prostaglandin E2 and prostaglandin D2, or oxidized lipids (oxyipins), downstream metabolites of arachidonic acid, are involved in inflammatory and painful reactions associated with gout and various rheumatic diseases. A recent study found that some serum oxidized lipids are biomarkers for early onset of gout in adolescents. The bile acids (e.g., glycocholic acid and chenodeoxycholic acid) were significantly reduced in FrGF group patients, consistent with previous studies in rheumatoid arthritis and gout. 
Interestingly, the bile acid synthesis pathway is also affected by the intestinal microbiome. Taken together, these data strongly suggest that the interaction of the intestinal flora with the host and the epigenetic modification of certain key metabolic enzymes may be related to inflammation in InGF and FrGF.
Further network analysis using network propagation-based algorithms revealed cross-talk between different metabolic pathways, which may play a role in mediating the metabolic changes in InGF and FrGF and provides a systematic view of the underlying metabolic pathophysiology. Consistent with the pathway enrichment analysis, the overall perturbation is concentrated in a sub-network consisting of purine metabolism and caffeine metabolism. Previous studies have linked coffee intake to lower serum uric acid concentrations and reduced gout risk, associations that involve multiple alleles of several SNPs; however, the pathways linking caffeine metabolism to uric acid formation and gout remain to be established.
Furthermore, we identified a key role for XDH in linking these two pathways (FIG. 4). XDH is shared among multiple degradation steps of xanthine and caffeine derivatives. In agreement with this, the reduction of xanthosine and xanthine and the increase of uric acid indicate higher XDH activity in FrGF, which may be responsible for the reduction of 1,7-dimethylxanthine, theophylline and methylxanthine and the increase of serum uric acid in FrGF patients (FIG. 8).
In addition, there is an interaction among taurine and hypotaurine metabolism, primary bile acid biosynthesis, and alanine, aspartic acid and glutamic acid metabolism. Previous studies have shown reduced bile acid biosynthesis in gout patients and in rat models. Bile acids inhibit XDH via peroxisome proliferator-activated receptor-α (PPAR-α).
In summary, the study of the present invention reveals perturbations in multiple metabolic networks that distinguish InGF from FrGF. The TCA cycle and glycolysis provide energy and substrates for the synthesis of several amino acids and for other metabolic activities, while aspartate, glycine and threonine metabolism is involved in bile acid biosynthesis, which is an important regulator of XDH in purine metabolism. XDH, in turn, appears to be associated with changes in glycine, glutamine and aspartate during uric acid production and caffeine degradation. Taken together, these data again support the involvement of the intestinal microbiome and of epigenetic modification in trained immunity during the course of InGF and FrGF.
Metabolomics, combined with machine learning algorithms, has become a powerful tool for identifying metabolic biomarkers for disease diagnosis. A recent study used metabolomics and machine learning to predict the clinical outcome of eight common diseases. Using similar methods, we have recently revealed metabolic differences in the serum of hyperuricemia and gout patients. In addition to systematically analyzing the changes in the metabolic profiles of InGF and FrGF, the present invention also creates a predictive model to distinguish FrGF from InGF. Such a model may have a significant impact on the precise gout management advocated by several clinical guidelines, for which diagnostic tools are currently lacking. After strict variable selection and cross-validation in the non-targeted metabolomic analysis, multiple metabolites were identified as biomarkers; a diagnostic model was then built on targeted metabolomics data using machine learning algorithms to distinguish FrGF from InGF. The model includes six metabolites and achieved effective prediction in the discovery cohort (AUC = 0.88). More importantly, the model was validated in an independent validation cohort (AUC = 0.67).
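As an illustration only, the identification formula given in the disclosure (the logit(P) expression and the 0.5 threshold) can be sketched in Python as follows. The intercept and coefficients are copied from that formula; the dictionary keys, the function names and the input values are hypothetical, and the measurement units and any normalization of the metabolite levels are not specified here and are assumed to match those used when the model was fitted.

```python
import math

# Intercept and coefficients copied from the logit(P) formula in the disclosure.
# The key names are hypothetical labels chosen for this sketch.
INTERCEPT = 2.00
COEFFICIENTS = {
    "arachidic_acid": -0.21,
    "xanthine": -0.12,
    "4_trimethylammoniobutanoic_acid": 0.41,
    "taurine": 0.08,
    "5_methylthioadenosine": 0.07,
    "uridine": -1.93,
}


def prediction_score(levels):
    """Return e^logit(P) / (1 + e^logit(P)) for one patient.

    `levels` maps each biomarker label to its measured level (units assumed
    to be the same as those used to fit the model).
    """
    missing = set(COEFFICIENTS) - set(levels)
    if missing:
        raise ValueError(f"missing biomarker levels: {sorted(missing)}")
    logit = INTERCEPT + sum(
        coef * levels[name] for name, coef in COEFFICIENTS.items()
    )
    # Numerically stable evaluation of the logistic function.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def classify(levels, threshold=0.5):
    """Score > threshold is read as frequent gout (FrGF), otherwise InGF."""
    return "FrGF" if prediction_score(levels) > threshold else "InGF"


# Hypothetical input values, for demonstration only (not patient data).
example = {name: 0.0 for name in COEFFICIENTS}
example["uridine"] = 2.0
print(classify(example), round(prediction_score(example), 3))
```

Because uridine carries the largest negative coefficient, a higher uridine level lowers the score in this sketch, whereas a higher 4-trimethylammoniobutanoic acid level raises it.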
The invention provides unprecedented insight into the metabolic basis of gout attack frequency, demonstrates that the frequent gout and sporadic gout groups have distinct metabolomic profiles, and shows that these metabolomic characteristics can distinguish different clinical manifestations of gout.
The foregoing is merely an implementation of the present application, and the scope of protection of the present application is not limited by these specific examples, but is determined by the claims of the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the technical ideas and principles of the present application should be included in the protection scope of the present application.