CN106795480B - Biomarker for rheumatoid arthritis and application thereof - Google Patents


Info

Publication number
CN106795480B
Authority
CN
China
Prior art keywords
con
seq
nucleotide sequence
atcc
actinomyces
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201480082365.1A
Other languages
Chinese (zh)
Other versions
CN106795480A (en
Inventor
冯强
张东亚
贾慧珏
王东辉
王俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Shenzhen Co Ltd
Original Assignee
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Shenzhen Co Ltd
Publication of CN106795480A
Application granted
Publication of CN106795480B

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N1/00 Microorganisms, e.g. protozoa; Compositions thereof; Processes of propagating, maintaining or preserving microorganisms or compositions thereof; Processes of preparing or isolating a composition containing a microorganism; Culture media therefor
    • C12N1/20 Bacteria; Culture media therefor
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/569 Immunoassay; Biospecific binding assay; Materials therefor for microorganisms, e.g. protozoa, bacteria, viruses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/10 Musculoskeletal or connective tissue disorders
    • G01N2800/101 Diffuse connective tissue disease, e.g. Sjögren, Wegener's granulomatosis
    • G01N2800/102 Arthritis; Rheumatoid arthritis, i.e. inflammation of peripheral joints

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Biotechnology (AREA)
  • Genetics & Genomics (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Pathology (AREA)
  • Tropical Medicine & Parasitology (AREA)
  • Virology (AREA)
  • Urology & Nephrology (AREA)
  • Medicinal Chemistry (AREA)
  • Hematology (AREA)
  • Biophysics (AREA)
  • Food Science & Technology (AREA)
  • Cell Biology (AREA)
  • General Physics & Mathematics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

Biomarkers and methods for predicting disease associated with microbiota, particularly Rheumatoid Arthritis (RA), are provided.

Description

Biomarker for rheumatoid arthritis and application thereof
CROSS-REFERENCE TO RELATED APPLICATIONS
None
Technical Field
The present invention relates to the field of biomedicine and in particular to biomarkers and methods for predicting the risk of microbiota-related diseases, in particular Rheumatoid Arthritis (RA).
Background
Rheumatoid arthritis (RA) is a debilitating autoimmune disease that affects tens of millions of people worldwide and increases mortality through cardiovascular and other systemic complications, yet its cause remains unclear. Infectious pathogens have been implicated in RA. However, the identity and pathogenicity of RA-related pathogens are largely unknown, and the problem is further complicated by the recent recognition that humans are super-organisms hosting trillions of beneficial as well as harmful microorganisms. Although disease-modifying antirheumatic drugs (DMARDs) have successfully alleviated the condition of many RA patients, inadequate knowledge of the factors that trigger or promote the disease has hindered the development of specific and more effective treatments. Investigation of the microbiota may also reveal probiotics that prevent or mitigate RA.
RA is thought to originate at some other body site and remain latent for years before the onset of joint inflammation. The gut microbiota is a key environmental factor in human health, with established roles in obesity, diabetes, colon cancer, and other conditions. In addition to its role in nutrient and xenobiotic metabolism, the microbiota of the distal gut interacts with the neuro-immune-endocrine system and the bloodstream, affecting the whole body. The gut microbiota is stably associated with a given individual, which increases its value in disease-related studies. The heterogeneity of the gut microbiota across the population suggests that treatment should be individualized according to the gut microbiota, whose role in drug activation or inactivation, immunomodulation, and so on remains largely unclear. Compared with the gut microbiota, the oral microbiota is relatively understudied: the Human Microbiome Project (HMP) collected whole-genome shotgun (WGS) samples from only about 100 healthy individuals (Human Microbiome Project Consortium. A framework for human microbiome research. Nature 486, 215-21 (2012), incorporated herein by reference). Although dental and saliva samples are more readily available at outpatient visits than stool samples, metagenomic analysis of the role of the oral microbiota in disease has long been lacking. It is not known to what extent oral and gut microbial disease markers agree in identity or function.
Disclosure of Invention
Embodiments of the present disclosure aim to address, at least to some extent, at least one of the problems existing in the prior art.
The present invention is based on the following findings of the present inventors:
The evaluation and characterization of the gut microbiota has become a major research area for human diseases, including rheumatoid arthritis (RA). To analyze the gut microbial content of RA patients, the present inventors carried out a Metagenome-Wide Association Study (MGWAS) (Qin, J. et al. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490, 55-60 (2012), incorporated herein by reference) based on deep shotgun sequencing of microbial DNA from 212 individuals. Using a random forest model based on RA-associated gene markers, the inventors identified and validated gut, dental, and salivary marker sets of metagenomic linkage groups (MLGs): 29 gut MLGs, 28 dental MLGs, and 19 salivary MLGs. To assess the risk of RA intuitively from these 29 gut MLGs, 28 dental MLGs, and 19 salivary MLGs, the inventors calculated the probability of disease separately for each sample type with a random forest model based on the relative abundance profiles of the MLG markers in the training set. The inventors' data provide insight into the characteristics of the gut, dental, and salivary metagenomes associated with the risk of RA, provide an example for future studies of the pathophysiological role of these metagenomes in other related diseases, and point to potential uses of microbiota-based methods for assessing individuals at risk of such diseases.
It is believed that the RA-associated microbial markers (29 gut MLGs, 28 dental MLGs, and 19 salivary MLGs) are valuable for improving early-stage detection of RA, for the following reasons. First, the markers of the present invention are specific and sensitive. Second, stool analysis offers accuracy, safety, affordability, and patient compliance, and stool samples are transportable. Polymerase chain reaction (PCR)-based assays are comfortable and non-invasive, so people are more likely to take part in a given screening program. Third, the markers of the invention can also be used to monitor therapy in RA patients by detecting the response to treatment.
In one aspect, a biomarker panel is provided for predicting a disease associated with a microbiota in a subject, and according to an embodiment of the present disclosure, the biomarker panel consists of an intestinal biomarker, a dental biomarker, a saliva biomarker, or a microorganism having genomic DNA comprising at least a partial sequence of SEQ ID NOs 1 to 15843, wherein the intestinal biomarker comprises Bifidobacterium (Bifidobacterium), RA-2633, Enterococcus (Enterococcus sp.), RA-781, Gordonibacter palmaee, RA-3396, RA-6638, RA-2441, RA-527, Clostridium (Clostridium sp.), RA-2637, Citrobacter sp., Eubacterium sp., Citrobacter, RA-3215, Con-1722, Con-4360, Con-4212, Con-1261, Bifidobacterium bifidum (Bifidobacterium bifidum), Klebsiella pneumoniae (Klebsiella pneumoniae), Con-1423, Veillonella sp, Con-4095, Con-4103, Con-1735, Con-1710, Con-1832, Con-1170,
dental biomarkers include RA-10848, RA-9842, RA-9941, RA-9938, RA-10684, RA-9998, Con-7913, Con-20702, Con-11, Con-8169, Con-1708, Con-7847, Con-5233, Con-791, Con-5566, Con-4455, Con-13169, Con-6088, Con-5554, Con-14781, Con-2466, Con-483, Con-2562, Con-4701, Con-4824, Con-5030, Con-757, Con-530, and
salivary biomarkers include RA-27683, RA-9651, RA-13621, RA-27616, Con-6908, Con-305, Con-1559, Con-1374, Con-6746, Campylobacter rectus (Campylobacter rectus), Con-1141, Con-20, Streptococcus (Streptococcus sp.), Con-1238, Con-1073, Con-636, Con-1, Porphyromonas gingivalis (Porphyromonas gingivalis), Lactococcus (Lactococcus sp.),
or a microorganism whose genomic DNA comprises at least part of the sequence of SEQ ID NO 1 to 15843.
Alternatively, the biomarker panel consists of at least one of the species listed in Table 3-2, preferably at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or 100% of the species listed in Table 3-2.
According to an embodiment of the present disclosure, the dental biomarker comprises at least a partial sequence of SEQ ID NOs 1 to 15843 as set forth in table 6.
According to an embodiment of the present disclosure, the intestinal biomarkers include bifidobacterium dentale JCVIHMP022, Prevotella CB7(Prevotella copri CB7), DSM 18205, Enterococcus faecium E980(Enterococcus faecalis E980), Ruminococcus equi a2-162(Ruminococcus obtum a2-162), gordonium pamelaeae 7-10-1-bT, DSM 19378, Ruminococcus L2-63(Ruminococcus bronii L2-63), Eubacterium ventriculi ATCC 27560(Eubacterium verticinosum ATCC 27560), Klebsiella oxytoca tc 1686(Klebsiella oxytoca KCTC 6), Clostridium asaggparvum 81, Clostridium CB7(Prevotella coprinus 7), bacillus subtilis c 16805, Clostridium acetobacter Citrobacter 1687 (Clostridium sp 3. 35), Clostridium sp 3. 31, Clostridium sp 2. 31-3. sp 31, Clostridium 31. 35. sp 3. 35 (Clostridium acetobacter sp 3. 31. 35. sp 2. 3. f.31), Clostridium sp 3. 31. sp 3. f.3 Vibrio rosenbergii M50/1(Roseburia intestinalis M50/1), Dialfister invisus DSM 15470, bacteria within Plebeius M12, DSM 17135, Bifidobacterium bifidum S17(Bifidobacterium bifidum S17), Klebsiella pneumoniae NTUH-K2044(Klebsiella pneumoniae NTUH-K2044), Vellonella oral taxonom 158F0412 (Vellonella sp. orataxon F0412), Comamonas testosteroni KF-1(Comamonas testosteroni KF-1), Klebsiella pneumoniae NTUH-K2044(Klebsiella pneumoniae NTK 2044), atypical Veillus ACS-134-Colorum a (Veillonia cavala reticulata ATCC 700641), ATCC 700641 (ATCC 3623),
the dental biomarker includes Actinomyces oral taxa 180F0310(Actinomyces sp. oral taxon 180F0310), Rosematoloma DY-18(Rothia mucogirosa DY-18), Actinomyces gravenitzii C83, Actinomyces carinata ATCC 17982(Actinomyces odontosyticus ATCC 17982), atypical Veillonella virginiana ACS-134-V-Col7a (Vellonella atrovirginica ACS-134-V-Col7a), Actinomyces F0384(Actinomyces sp.F0384), Actinomyces oral taxa ACS 0332(Actinomyces sp. oral taxon 848F0332), Mycosa Mucosa M26(Neisseria mulosa M26), ATCC 25996, Actinomyces F0400(Actinomyces sp. Synthron 040448), Actinomyces sp.040448, Actinomyces ATCC 04037 (ATCC 04037), Actinomyces 04076A 040448 (ATCC 0401200A 040448), Actinomyces 040448 (ATCC 040448), Actinomyces 04076A 04076, ATCC 04078, ATCC 040448 (ATCC 04078), and ATCC 04076A 040448 (ATCC 04076A 04076, ATCC 04078), and ATCC 04078), Actinomyces mirabilis ATCC 51599 (Lautropopia mirabilis ATCC 51599), Cellophilus gingivalis ATCC 33624(Capnocytophaga gingivalis ATCC 33624), Acidobacterium anthropi ATCC 15826(Cardiobacter hominis ATCC 15826), Cellophilus gingivalis ATCC 33624(Capnocytophaga gingivalis ATCC 33624), Acidobacterium paraguariensis ATCC 51599 (Lautropopia mirabilis ATCC 51599), Acidobacterium carinatum ATCC 51276(Johnsonella ignavivalis ATCC 51276), Propionibacterium freudenrerii CIM-BIA 1(Propionibacterium freudenreichii Shermanii-BIA 04 1), Microspira denticola ATCC 35405 (Poronecticola ATCC 35405), Clostridium classification F0437 (oral cavity fungi ATCC 5151834), Acidobacterium paragonium ATCC 51599 (ATCC 51599), Porphyromonas erythomonas dimicola ATCC 599 (ATCC 23599), Porphyromonas pilosula australis ATCC 51515151599), Porphyra porphyria littoralis ATCC 599 (Porphyra porphyria littoralis ATCC 514382), Porphyra littoralis ATCC 599), Porphyra lita littoralis ATCC 599 (Porphyra lita littoralis ATCC 51515143276), Porphyra lita littoralis ATCC 599), Porphyra littoralis ATCC 599 (Porphyra littoralis ATCC 599), Porphyromonae ATCC 5932), Pseudomonas aeruginosa 36541, 
Pseudomonas aeruginosa ATCC 599 (Porphyra littoralis, Bulleidia extructta W1219,
salivary biomarkers include Geotrichum haemolyticus ATCC 10379 (Gemela haemolyticus ATCC 10379), atypical Veillonella ACS-049-V-Sch6(Veillonella typica ACS-049-V-Sch6), Actinomyces carinatans ATCC 17982(Actinomyces odontolyticus ATCC 17982), Actinomyces carinatans ATCC 17982(Actinomyces odontolyticus ATCC 17982), Treponema denticola ATCC 35405(Treponema denticola ATCC 35405), Actinomyces oral taxa 448F0400(Actinomyces sp.oral taxon 448F 040448 0), Treponema venenum ATCC 35580(Treponema vincenii ATCC 35580), Streptococcus australis ATCC 700641(Streptococcus mutans ATCC 700641), Campylobacter rectus RM3267 (Catulus RM3267), Actinomyces Actinomyces 20446, Streptococcus mutans ATCC 2047 (VMstreptococcus arboricus ATCC 897), Streptococcus mutans ATCC 820337 (VMnospora c 82171), Streptomyces edulcoralis ATCC 857, Actinomyces ATCC 04076 (Mycosenomycetes ATCC 04035), Streptomyces edulcoralis ATCC 0337, Mycospora C8285, Actinomyces ATCC 0338, Actinomyces ATCC No. 18, Actinomyces ATCC No. M18, Actinomyces oral taxa 448F0400(Actinomyces sp.orataxon 448F0400), Neisseria baculoides ATCC BAA-1200(Neisseria bacillus ATCC BAA-1200), Burkholderia rhinocerns PRL-20(Burkholderia mallei PRL-20), Porphyromonas gingivalis TDC60(Porphyromonas gingivalis 60), Lactococcus lactis subspecies KF147(Lactococcus lactis KF 147).
In another aspect of the present disclosure, there is provided a biomarker panel for predicting a disease associated with microbiota in a subject, the biomarker panel consisting of intestinal, dental and salivary biomarkers according to embodiments of the present disclosure, wherein
The dental biomarker comprises at least a partial sequence of SEQ ID NOs 1 to 15843.
According to an embodiment of the present disclosure, the disease is rheumatoid arthritis or a related disease.
In another aspect of the present disclosure, there is provided a kit for determining the above gene marker set, comprising primers for PCR amplification and designed according to DNA sequences listed below:
the dental biomarker comprises at least a partial sequence of SEQ ID NOs 1 to 15843.
In another aspect of the present disclosure, there is provided a kit for determining the above gene marker set, comprising one or more probes designed according to the genes listed below: the dental biomarker comprises at least a partial sequence of SEQ ID NOs 1 to 15843.
In another aspect of the present disclosure, there is provided a use of the above gene marker panel for predicting the risk of rheumatoid arthritis or related diseases in a test subject, comprising:
(1) collecting a sample from a test subject;
(2) determining relative abundance information of each biomarker of a biomarker panel according to any one of claims 1 to 5 in the sample obtained in step (1);
(3) obtaining a probability of rheumatoid arthritis by comparing the relative abundance information of each biomarker of the test subject with a training data set using a multivariate statistical model,
wherein a probability of rheumatoid arthritis greater than a threshold value indicates that the test subject has or is at risk of developing rheumatoid arthritis or a related disease.
According to an embodiment of the present disclosure, the training data set is constructed using a multivariate statistical model based on information of relative abundance of each biomarker for a plurality of subjects with rheumatoid arthritis and a plurality of normal subjects, optionally the multivariate statistical model is a random forest model.
According to an embodiment of the present disclosure, the training dataset is a matrix in which each row represents a respective biomarker of the biomarker panel according to any of claims 1 to 5, each column represents a sample, and each cell represents the relative abundance of that biomarker in that sample; the disease status of the samples is a vector in which 1 represents rheumatoid arthritis and 0 represents a control.
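The matrix layout described above can be sketched as follows. This is only an illustration: the marker names, sample names, and abundance values are hypothetical and are not taken from Tables 9-1 or 9-2. It shows the marker-by-sample orientation, the aligned 0/1 status vector, and a transposition into one feature row per sample, which is the orientation most classifiers expect.

```python
# Hypothetical marker and sample names; abundance values are invented for illustration.
markers = ["MLG_A", "MLG_B", "MLG_C"]
samples = ["S1", "S2", "S3", "S4"]

# Rows = biomarkers, columns = samples, cells = relative abundance.
abundance = [
    [0.010, 0.000, 0.020, 0.001],  # MLG_A
    [0.000, 0.030, 0.000, 0.025],  # MLG_B
    [0.005, 0.004, 0.006, 0.003],  # MLG_C
]

# Disease status vector aligned with the columns: 1 = RA, 0 = control.
status = [1, 0, 1, 0]


def to_sample_rows(matrix):
    """Transpose marker-by-sample data into one feature row per sample."""
    return [list(column) for column in zip(*matrix)]


X = to_sample_rows(abundance)

# Every sample needs exactly one label and one value per marker.
assert len(X) == len(samples) == len(status)
assert all(len(row) == len(markers) for row in X)
```

Keeping the status vector in column order means a sample and its label can never drift apart when rows are added or removed from the marker list.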
According to embodiments of the present disclosure, the relative abundance information of each of RA-10848, RA-9842, RA-9941, RA-9938, RA-10684, RA-9998, Con-7913, Con-20702, Con-11, Con-8169, Con-1708, Con-7847, Con-5233, Con-791, Con-5566, Con-4455, Con-13169, Con-6088, Con-5554, Con-14781, Con-2466, Con-483, Con-2562, Con-4701, Con-4824, Con-5030, Con-757 and Con-530, such as Actinomyces oral taxa 180F0310, Gliocladium viscidum DY-18, Nomyces granitzii C83, Actinomyces Actinomyces ATCC 17982, Veillus ATCC 7 a-0384F 03134, Col-0325F-357, Con-17982, Con-14781, Con-483-4, Con-2562, Con-C, Actinomyces oral taxa 848F0332, Neisseria mucosae M26, ATCC 25996, Actinomyces oral taxa 448F0400, Fostainer ATCC 43037, Actinomyces oral taxa 448F0400, Neisseria baculoides ATCC BAA-1200, bacteria of the phylum Intertrophic SGP1, Actinomyces mirabilis ATCC 51599, Cellophilus gingivalis ATCC 33624, human cardiac bacterium ATCC 15826, Cellophilus gingivalis ATCC 33624, Actinomyces mirabilis ATCC 51599, Actinomyces lazulii ATCC 51276, Propionibacterium freudenreri CIRM-BIA1, Treponema tartarum ATCC 35405, Clostridium oral taxa 370F0437, Actinomyces mirabilis ATCC 51599, Acheuma caldariffa ATCC 23834, harmful Pseudomonas crescent ATCC 43541, Porphyromonas 23370, Bullexilia 1219 are the relative abundance information obtained according to DSM ID 1581 to DSM 15843.
According to an embodiment of the present disclosure, the training dataset is at least one of tables 9-1 and 9-2, and a probability of rheumatoid arthritis of at least 0.5 indicates that the test subject has or is at risk of developing rheumatoid arthritis or a related disease.
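As a rough illustration of how a probability of RA could be derived from relative abundances and compared with the 0.5 cutoff, the sketch below uses a deliberately simplified stand-in for a random forest: an ensemble of one-split decision stumps, each fitted on a bootstrap resample with one randomly chosen marker. It is not the model used in the disclosure, and the training values, function names, and parameters (`n_trees`, `seed`) are all assumptions made for the example.

```python
import random


def fit_stump(X, y, feature):
    """Find the cut point and direction on one feature that classify the most samples correctly."""
    values = sorted({row[feature] for row in X})
    # Candidate cuts: midpoints between consecutive distinct values.
    cuts = [(a + b) / 2 for a, b in zip(values, values[1:])] or [values[0]]
    best = None
    for cut in cuts:
        for sign in (1, -1):  # sign = 1 predicts RA above the cut, -1 below it
            correct = sum(
                (sign * (row[feature] - cut) > 0) == (label == 1)
                for row, label in zip(X, y)
            )
            if best is None or correct > best[0]:
                best = (correct, cut, sign)
    return feature, best[1], best[2]


def fit_ensemble(X, y, n_trees=25, seed=0):
    """Fit stumps on bootstrap resamples, each using one randomly chosen feature."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        feature = rng.randrange(n_features)
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return stumps


def predict_probability(stumps, row):
    """Fraction of stumps voting for RA, used here as the probability of RA."""
    votes = sum(1 for f, cut, sign in stumps if sign * (row[f] - cut) > 0)
    return votes / len(stumps)


def at_risk(probability, threshold=0.5):
    """Apply the cutoff: a probability of at least the threshold indicates RA or risk of RA."""
    return probability >= threshold


# Invented training data: marker 0 is enriched in RA, marker 1 in controls.
X_train = [[0.020, 0.001], [0.030, 0.002], [0.025, 0.000],
           [0.001, 0.030], [0.000, 0.020], [0.002, 0.025]]
y_train = [1, 1, 1, 0, 0, 0]

stumps = fit_ensemble(X_train, y_train)
p_ra = predict_probability(stumps, [0.028, 0.001])
p_control = predict_probability(stumps, [0.001, 0.028])
print(p_ra, at_risk(p_ra), p_control, at_risk(p_control))
```

A real random forest grows full decision trees and is usually taken from an established statistics library; the stump ensemble is used here only to keep the voting-to-probability step visible in a few lines.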
In another aspect of the present disclosure, there is provided a use of the above gene marker for preparing a kit for predicting the risk of rheumatoid arthritis or related diseases in a test subject, comprising:
(1) collecting a sample from a test subject;
(2) determining relative abundance information of each biomarker of a biomarker panel according to any one of claims 1 to 5 in the sample obtained in step (1);
(3) obtaining a probability of rheumatoid arthritis by comparing the relative abundance information of each biomarker of the test subject with a training data set using a multivariate statistical model,
wherein a probability of rheumatoid arthritis greater than a threshold value indicates that the test subject has or is at risk of developing rheumatoid arthritis or a related disease.
According to an embodiment of the present disclosure, the training data set is constructed using a multivariate statistical model based on information of relative abundance of each biomarker for a plurality of subjects with rheumatoid arthritis and a plurality of normal subjects, optionally the multivariate statistical model is a random forest model.
According to an embodiment of the present disclosure, the training dataset is a matrix in which each row represents a respective biomarker of the biomarker panel according to any of claims 1 to 5, each column represents a sample, and each cell represents the relative abundance of that biomarker in that sample; the disease status of the samples is a vector in which 1 represents rheumatoid arthritis and 0 represents a control.
According to an embodiment of the present disclosure, wherein the relative abundance information of each of RA-10848, RA-9842, RA-9941, RA-9938, RA-10684, RA-9998, Con-7913, Con-20702, Con-11, Con-8169, Con-1708, Con-7847, Con-5233, Con-791, Con-5566, Con-4455, Con-13169, Con-6088, Con-5554, Con-14781, Con-2466, Con-483, Con-2562, Con-4701, Con-4824, Con-5030, Con-757 and Con-530, for example Actinomyces oral taxa 180F0310, Gliocladium DY-18, Nomyces granitziC 83, Actinomyces Actinomyces 17982, atypical Vececroisseria ATCC 7a, Col 3884-0325F 03134, Col-0325, Con-17982, Actinomyces oral taxa 848F0332, Neisseria mucosae M26, ATCC 25996, Actinomyces oral taxa 448F0400, Fostainer ATCC 43037, Actinomyces oral taxa 448F0400, Neisseria baculoides ATCC BAA-1200, bacteria of the phylum Intertrophic SGP1, Actinomyces mirabilis ATCC 51599, Cellophilus gingivalis ATCC 33624, human cardiac bacterium ATCC 15826, Cellophilus gingivalis ATCC 33624, Actinomyces mirabilis ATCC 51599, Actinomyces lazulii ATCC 51276, Propionibacterium freudenreri CIRM-BIA1, Treponema tartarum ATCC 35405, Clostridium oral taxa 370F0437, Actinomyces mirabilis ATCC 51599, Acheuma caldariffa ATCC 23834, harmful Pseudomonas crescent ATCC 43541, Porphyromonas 23370, Bullexilia 1219 are the relative abundance information obtained according to DSM ID 1581 to DSM 15843.
According to an embodiment of the present disclosure, the training dataset is at least one of tables 9-1 and 9-2, and a probability of rheumatoid arthritis of at least 0.5 indicates that the test subject has or is at risk of developing rheumatoid arthritis or a related disease.
In another aspect of the present disclosure, there is provided a method of diagnosing whether a subject has or is at risk of developing an abnormal state associated with a microbiota, comprising:
determining the relative abundance of the above biomarkers in a sample from the subject, and
determining whether the subject has an abnormal state associated with microbiota or is at risk for developing an abnormal state associated with microbiota based on the relative abundance.
According to an embodiment of the present disclosure, the method includes:
(1) collecting a sample from a test subject;
(2) determining relative abundance information of each biomarker of a biomarker panel according to any one of claims 1 to 5 in the sample obtained in step (1);
(3) obtaining a probability of rheumatoid arthritis by comparing the relative abundance information of each biomarker of the test subject with a training data set using a multivariate statistical model,
wherein a probability of rheumatoid arthritis greater than a threshold value indicates that the test subject has or is at risk of developing rheumatoid arthritis or a related disease.
According to an embodiment of the present disclosure, the training data set is constructed using a multivariate statistical model based on information of relative abundance of each biomarker for a plurality of subjects with rheumatoid arthritis and a plurality of normal subjects, optionally the multivariate statistical model is a random forest model.
According to an embodiment of the present disclosure, the training dataset is a matrix in which each row represents a respective biomarker of the biomarker panel according to any of claims 1 to 5, each column represents a sample, and each cell represents the relative abundance of that biomarker in that sample; the disease status of the samples is a vector in which 1 represents rheumatoid arthritis and 0 represents a control.
According to an embodiment of the present disclosure, wherein the relative abundance information of each of RA-10848, RA-9842, RA-9941, RA-9938, RA-10684, RA-9998, Con-7913, Con-20702, Con-11, Con-8169, Con-1708, Con-7847, Con-5233, Con-791, Con-5566, Con-4455, Con-13169, Con-6088, Con-5554, Con-14781, Con-2466, Con-483, Con-2562, Con-4701, Con-4824, Con-5030, Con-757 and Con-530, for example Actinomyces oral taxa 180F0310, Gliocladium DY-18, Nomyces granitziC 83, Actinomyces Actinomyces 17982, atypical Vececroisseria ATCC 7a, Col 3884-0325F 03134, Col-0325, Con-17982, Actinomyces oral taxa 848F0332, Neisseria mucosae M26, ATCC 25996, Actinomyces oral taxa 448F0400, Fostainer ATCC 43037, Actinomyces oral taxa 448F0400, Neisseria baculoides ATCC BAA-1200, bacteria of the phylum Intertrophic SGP1, Actinomyces mirabilis ATCC 51599, Cellophilus gingivalis ATCC 33624, human cardiac bacterium ATCC 15826, Cellophilus gingivalis ATCC 33624, Actinomyces mirabilis ATCC 51599, Actinomyces lazulii ATCC 51276, Propionibacterium freudenreri CIRM-BIA1, Treponema tartarum ATCC 35405, Clostridium oral taxa 370F0437, Actinomyces mirabilis ATCC 51599, Acheuma caldariffa ATCC 23834, harmful Pseudomonas crescent ATCC 43541, Porphyromonas 23370, Bullexilia 1219 are the relative abundance information obtained according to DSM ID 1581 to DSM 15843.
According to an embodiment of the present disclosure, the training dataset is at least one of tables 9-1 and 9-2, and a probability of rheumatoid arthritis of at least 0.5 indicates that the test subject has or is at risk of developing rheumatoid arthritis or a related disease.
Drawings
These and other aspects and advantages of the present disclosure will become apparent and more readily appreciated from the following description, taken in conjunction with the accompanying drawings, wherein:
Fig. 1. Gut or oral MLGs classify RA patients against healthy controls. (a, d, f) ROC curves for the training sets of stool (a), dental (d), and saliva (f) samples, consisting of treatment-naive RA cases and unrelated healthy controls (n = 157, 100, and 94 for stool, dental, and saliva samples, respectively). Dots mark the false positive and true positive rates at the best probability threshold. (b) Classification of the stool test set, consisting of 17 controls and 17 RA cases, with or without kinship between case and control. (c, e, g) Classification of stool (c), dental (e), and saliva (g) samples from RA patients after DMARD treatment (n = 40, 38, and 24 for stool, dental, and saliva samples, respectively). DAS28 < 2.6 indicates remission according to the European League Against Rheumatism (EULAR) criteria. The classification results for all samples are listed in Table 12.
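The ROC points and the marked "best threshold" in the figure can be illustrated with a small sketch. The caption does not say how the best threshold was chosen; the code below assumes Youden's J (true positive rate minus false positive rate), which is one common criterion, and the probabilities and labels are invented for the example.

```python
def roc_points(probs, labels):
    """Return (threshold, false_positive_rate, true_positive_rate) for each observed probability."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in sorted(set(probs), reverse=True):
        # A sample is called RA when its probability is at least the threshold.
        tp = sum(1 for p, label in zip(probs, labels) if p >= t and label == 1)
        fp = sum(1 for p, label in zip(probs, labels) if p >= t and label == 0)
        points.append((t, fp / negatives, tp / positives))
    return points


def best_threshold(points):
    """Pick the point maximizing TPR - FPR (Youden's J), an assumed criterion."""
    return max(points, key=lambda pt: pt[2] - pt[1])


# Invented classifier outputs: 1 = RA, 0 = control.
probs = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 1, 0, 1, 0]

points = roc_points(probs, labels)
threshold, fpr, tpr = best_threshold(points)
print(threshold, fpr, tpr)  # prints 0.7 0.0 0.75
```

Plotting the `(fpr, tpr)` pairs in order of decreasing threshold traces the ROC curve, and the selected point corresponds to the dot described in the caption.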
Detailed Description
Examples
The terms used herein have the meanings commonly understood by those of ordinary skill in the art to which the present invention pertains. Terms such as "a," "an," and "the" are not intended to refer to only a singular entity, but include the general class of which a specific example may be used for illustration. The terminology herein is used to describe specific embodiments of the invention, but its usage does not limit the invention, except as outlined in the claims.
The invention is further illustrated in the following non-limiting examples. Unless otherwise indicated, parts and percentages are by weight and temperatures are in degrees Celsius. As will be understood by one of ordinary skill in the art, these examples, while indicating preferred embodiments of the invention, are given by way of illustration only, and the reagents are all commercially available.
Example 1 identification and validation of biomarkers for assessing risk of rheumatoid arthritis
1. Materials and methods
1.1 sample Collection and DNA extraction
The inventors collected a total of 212 stool samples (Table 1-1, which covers stool, dental plaque, and saliva samples), comprising a training set (n = 157; 77 treatment-naive RA cases and 80 healthy controls) and a test set (n = 34 for the case-control pairs, i.e. 8 case-control pairs with kinship and 9 case-control pairs without kinship; n = 21 for DMARD-treated RA patients).
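The cohort sizes quoted above can be cross-checked with a few lines of arithmetic. The dictionary keys are labels chosen for this sketch; the counts are the ones stated in the text, with each case-control pair counted as two samples.

```python
# Counts as stated in the text; key names are labels chosen for this sketch.
training = {"treatment_naive_RA": 77, "healthy_control": 80}
test_pairs = {"pairs_with_kinship": 8, "pairs_without_kinship": 9}
dmard_treated = 21

n_training = sum(training.values())          # 77 + 80
n_test = 2 * sum(test_pairs.values())        # 17 pairs, two samples per pair
total = n_training + n_test + dmard_treated

print(n_training, n_test, total)  # prints 157 34 212
```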
Stool samples were collected at Peking Union Medical College Hospital, transported frozen, and extracted at BGI-Shenzhen as described previously (Qin, J. et al. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490, 55-60 (2012), incorporated herein by reference). Dental plaque was scraped from the tooth surface with ophthalmic forceps until a volume of 3 μl was obtained. The samples were transferred to 200 μl of 1× lysis buffer containing 10 mM Tris, 1 mM EDTA, 0.5% Tween 20, and 200 μg/ml proteinase K (Fermentas) and incubated at 55 °C for 2 hours. Lysis was terminated by incubation at 95 °C for 10 minutes, and the samples were frozen at -80 °C before transport. DNA was extracted following the protocol used for the stool samples. For saliva, 100 μl of saliva was added to 100 μl of 2× lysis buffer; the posterior pharyngeal wall was swabbed and the swab added to the same tube; the samples were then lysed and extracted as for the dental samples.
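The buffer volumes above imply that saliva and plaque end up in roughly the same working concentrations. The sketch below checks this under two assumptions not stated in the text: that the 2× buffer contains every component at exactly twice the 1× concentration, and that volumes are additive with the sample contributing none of the buffer components.

```python
# 1x lysis buffer composition as stated in the text.
ONE_X = {
    "Tris_mM": 10.0,
    "EDTA_mM": 1.0,
    "Tween20_percent": 0.5,
    "proteinase_K_ug_per_ml": 200.0,
}


def scale(buffer, factor):
    """Concentrate a buffer recipe by a constant factor."""
    return {name: conc * factor for name, conc in buffer.items()}


def mix(buffer, buffer_ul, sample_ul):
    """Final concentrations after adding a sample, assuming additive volumes."""
    dilution = buffer_ul / (buffer_ul + sample_ul)
    return {name: conc * dilution for name, conc in buffer.items()}


two_x = scale(ONE_X, 2)  # assumed definition of the 2x buffer

# Saliva: 100 ul of sample into 100 ul of 2x buffer gives 1x working strength.
saliva_mix = mix(two_x, buffer_ul=100, sample_ul=100)

# Plaque: 3 ul of sample into 200 ul of 1x buffer is diluted by only about 1.5%.
plaque_mix = mix(ONE_X, buffer_ul=200, sample_ul=3)

print(saliva_mix)
print(plaque_mix)
```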
RA was diagnosed at Peking Union Medical College Hospital according to the 2010 ACR/EULAR classification criteria. All phenotypic information was collected at the subject's initial visit to the hospital according to standard procedures. RA patients between 18 and 65 years of age with a disease duration of at least 6 weeks, at least one swollen joint and three tender joints were enrolled. Patients were excluded if they had a history of chronic severe infection, any current infection or any type of cancer. Pregnant or lactating women were excluded. All patients were informed of the risk of infertility, and patients who wanted children were excluded. Although some patients had had RA for many years, they were DMARD-naive because they had not been diagnosed with RA at their local hospital before visiting Peking Union Medical College Hospital, and they had only taken analgesics to alleviate RA symptoms.
Of the 212 samples used for the construction of the gut microbiome gene catalogue, only 21 stool samples from DMARD-treated patients were obtained, and they were not analyzed in this paper.
This study was approved by the institutional review boards of Peking Union Medical College Hospital and BGI-Shenzhen.
TABLE 1-1 Samples for gene catalogue construction
[Table image BDA0001259110000000101 appears here in the original publication.]
1.2 Metagenomic sequencing and assembly
Paired-end metagenomic sequencing was performed on the Illumina platform (insert size 350 bp, read length 100 bp). The sequencing reads were quality-controlled and assembled de novo into contigs using SOAPdenovo v2.04 (Luo, R. et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience 1, 18 (2012), incorporated herein by reference), as previously described (Qin et al. 2012, supra). The average rate of host contamination was 0.37% for the stool samples, 5.55% for the dental samples, and 40.85% for the saliva samples.
1.3 Gene catalog construction
Genes were predicted from the assembled contigs using GeneMark v2.7d. Redundant genes were removed using BLAT (Kent, W.J. BLAT - the BLAST-like alignment tool. Genome Res. 12, 656-64 (2002), incorporated herein by reference) at a threshold of 90% overlap and 95% identity (no gaps allowed), yielding a non-redundant gene catalogue of 3,800,011 genes for the 212 stool samples (including the 21 DMARD-treated samples) and a catalogue of 3,234,997 genes for the 203 oral samples (105 dental plaque samples and 98 saliva samples). The gene catalogue from the stool samples was merged with an existing gut microbiome reference catalogue containing 4.3 million genes using BLAT (95% identity, 90% overlap) to form a final catalogue containing 5.9 million genes (Qin et al. 2012, supra). The relative abundance of genes was determined by aligning high-quality sequencing reads to the gut or oral reference gene catalogue using the same procedure as in the published T2D paper (Qin et al. 2012, supra).
1.4 Taxonomic annotation and abundance calculation
Taxonomic assignment of the predicted genes was performed against the IMG database (v400) using the in-house pipeline detailed previously (Qin et al. 2012, supra), with 70% overlap and 65% identity for assignment to phylum, 85% identity for assignment to genus, and 95% identity for assignment to species. The relative abundance of a taxon was calculated from the relative abundance of its genes.
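By way of non-limiting illustration, the identity cut-offs stated above (65%, 85% and 95%, each with 70% overlap) can be expressed as a small decision rule. The following is a minimal sketch only and is not the in-house pipeline itself; the function name `assign_rank`, the use of inclusive comparisons, and the reading of the 95% level as species rank are assumptions made for illustration.

```python
from typing import Optional

# Identity cut-offs (percent) taken from the text, most specific rank first.
# Treating the 95% level as "species" is an assumption of this sketch.
RANK_THRESHOLDS = [("species", 95.0), ("genus", 85.0), ("phylum", 65.0)]
MIN_OVERLAP = 70.0  # percent of the query covered by the alignment


def assign_rank(identity: float, overlap: float) -> Optional[str]:
    """Return the most specific rank supported by one alignment, or None.

    Inclusive comparisons (>=) are an assumption; the text does not say
    whether the cut-offs are strict.
    """
    if overlap < MIN_OVERLAP:
        return None
    for rank, min_identity in RANK_THRESHOLDS:
        if identity >= min_identity:
            return rank
    return None


if __name__ == "__main__":
    for identity, overlap in [(96.0, 80.0), (90.0, 80.0), (70.0, 75.0), (99.0, 50.0)]:
        print(identity, overlap, assign_rank(identity, overlap))
```

An alignment that fails the overlap requirement is left unassigned regardless of its identity, which mirrors the way the overlap condition is stated once for all three ranks.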
Significant differences in the relative abundance of taxa between patients and healthy controls were determined by Wilcoxon rank-sum test (where p < 0.05).
1.5 Metagenome-wide association study (MGWAS)
For the case-control comparison of the fecal microbiota, genes detected in fewer than 6 samples (n = 157) were removed, resulting in a set of 3,110,085 genes. Of these, 83,858 genes differed in relative abundance between controls and cases (p < 0.01, Wilcoxon rank-sum test, FDR = 0.3285). These marker genes were clustered into metagenomic linkage groups (MLGs) based on their abundance variation across all samples (Qin et al. 2012, supra). For the construction of dental MLGs, 209,820 marker genes (p < 0.01, Wilcoxon rank-sum test, FDR = 0.072) were selected from 2,247,835 genes (present in at least 6 samples, n = 105). For salivary MLGs, the inventors selected 206,399 marker genes (p < 0.01, Wilcoxon rank-sum test, FDR = 0.088) from 2,404,726 genes (present in at least 6 samples, n = 98).
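By way of non-limiting illustration, the prevalence filter described above (a gene is retained only if it is present in at least 6 samples) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `filter_by_prevalence` and the toy data are hypothetical, and a gene is treated as "detected" in a sample when its relative abundance is greater than zero, which the text does not specify.

```python
from typing import Dict, List


def filter_by_prevalence(
    abundance: Dict[str, List[float]], min_samples: int = 6
) -> Dict[str, List[float]]:
    """Keep genes detected (abundance > 0, an assumption) in >= min_samples samples."""
    kept = {}
    for gene, values in abundance.items():
        detected = sum(1 for v in values if v > 0)
        if detected >= min_samples:
            kept[gene] = values
    return kept


# Hypothetical toy profile: one list of relative abundances per gene.
toy = {
    "gene_a": [0.1] * 6 + [0.0] * 4,  # detected in 6 samples, retained
    "gene_b": [0.2] * 5 + [0.0] * 5,  # detected in 5 samples, removed
}

if __name__ == "__main__":
    print(sorted(filter_by_prevalence(toy)))
```

The differential-abundance test itself (Wilcoxon rank-sum) would then be applied only to the retained genes.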
As previously described (Qin et al. 2012, supra), taxonomic assignment and abundance profiling of the MLGs were performed based on the taxonomy and relative abundance of their constituent genes. In brief, assignment of an MLG to a species requires that more than 90% of the genes in the MLG align to the genome of that species with more than 95% identity and 70% overlap of the query. Assignment of an MLG to a genus requires that more than 80% of its genes align to a genome with 85% identity in both DNA and protein sequences. The average identity to the genome, calculated from all genes, is shown for reference only. MLGs were further clustered according to the Kendall correlation between their abundances in all samples, regardless of case-control status, and the co-occurrence network was visualized with Cytoscape 3.0.2.
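By way of non-limiting illustration, the species-level rule described above (more than 90% of the genes in an MLG aligning to one genome at the 95% identity level with 70% query overlap) can be sketched as follows. The function name `mlg_assigned_to_species`, the input format (one `(identity, overlap)` pair per gene that has an alignment to the candidate genome), and the inclusive comparison on identity and overlap are assumptions of this sketch, not details taken from the source.

```python
from typing import List, Tuple


def mlg_assigned_to_species(
    hits: List[Tuple[float, float]],
    n_genes: int,
    min_identity: float = 95.0,
    min_overlap: float = 70.0,
    min_fraction: float = 0.90,
) -> bool:
    """Return True if enough genes of the MLG align to the candidate genome.

    hits    : (identity %, query overlap %) for each gene with an alignment
    n_genes : total number of genes in the MLG, including genes with no hit
    """
    if n_genes <= 0:
        return False
    supported = sum(
        1 for identity, overlap in hits
        if identity >= min_identity and overlap >= min_overlap
    )
    # "More than 90% of the genes" is read as a strict inequality.
    return supported / n_genes > min_fraction


if __name__ == "__main__":
    hits = [(97.0, 80.0)] * 19 + [(80.0, 80.0)]
    print(mlg_assigned_to_species(hits, n_genes=20))
```

The genus-level rule could be sketched in the same way by changing the defaults to 85% identity and a fraction of 0.80.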
1.6 MLG-based classifier
Random forest models (R 2.14, randomForest 4.6-7 package) (Liaw, Andy & Wiener, Matthew. Classification and Regression by randomForest. R News (2002), Vol. 2/3, p. 18, incorporated herein by reference) were trained using the MLG abundance profiles of the training cohort (Table 1-2) to select the optimal set of MLG markers. The model was tested on the test sets and the prediction error was calculated.
For the random forest model, the randomForest 4.6-7 package in R version 2.14 was used. The inputs were the training data set (i.e., the relative abundance profiles of the selected MLGs in the training samples), the sample disease status (a vector for the training samples, in which 1 denotes RA and 0 denotes control), and the test set (only the relative abundance profiles of the selected MLGs in the test samples). The inventors then built a classifier using the randomForest function of the randomForest package, and predicted the test set using the predict function. The output is the prediction (the probability of disease; the threshold is 0.5, and if the probability of disease is ≥ 0.5, the subject is at risk of RA).
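By way of non-limiting illustration, the final decision step described above (a predicted probability of disease of at least 0.5 indicates risk of RA) and the calculation of a prediction error can be sketched as follows. This sketch does not train a random forest; it assumes that predicted probabilities are already available from the classifier. The function names `classify` and `error_rate` and the example values are hypothetical.

```python
from typing import List


def classify(probabilities: List[float], threshold: float = 0.5) -> List[int]:
    """Map predicted probabilities of disease to labels (1 = RA, 0 = control)."""
    return [1 if p >= threshold else 0 for p in probabilities]


def error_rate(predicted: List[int], actual: List[int]) -> float:
    """Fraction of samples whose predicted label differs from the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)


if __name__ == "__main__":
    probs = [0.91, 0.50, 0.49, 0.12]   # hypothetical classifier output
    truth = [1, 1, 1, 0]               # hypothetical disease status vector
    labels = classify(probs)
    print(labels, error_rate(labels, truth))
```

The inclusive comparison follows the stated rule that a probability of disease of at least 0.5 places the subject at risk.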
TABLE 1-2 Sample information for the training set (selected from the samples for gene catalogue construction in Table 1-1)
[Table image BDA0001259110000000121 appears here in the original publication.]
2. Results
Microbiota-based identification and validation of RA patients
To further illustrate the diagnostic or prognostic value of the RA-associated microbiota, the inventors first constructed a random forest disease classifier based on gut MLGs. The model using 29 of the 85 gut MLG markers (each containing at least 100 genes) from controls and cases gave the lowest prediction error in the training set (n = 157) (Fig. 1a, Table 2-1, Table 2-2, Table 5, Table 8-1, Table 8-2), with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.977. For the test set consisting of related and unrelated case-control pairs (n = 34, Table 1-3), the overall error rate was 32% (Fig. 1b, Table 11) and the AUC was 0.706. Thus, the performance of the gut MLG-based model on the training set and, where applicable, the test set is comparable to or exceeds that of existing RA serum marker-based classifiers (van der Helm-van Mil, A.H.M. Risk estimation in rheumatoid arthritis: from bench to bedside. Nat. Rev. Rheumatol. (2014). doi:10.1038/nrrheum.2013.215, incorporated herein by reference).
Similarly, 28 MLGs (Table 3-1, Table 3-2, Table 6, Table 9-1, Table 9-2) selected from 171 dental MLGs (each containing at least 100 genes) gave an AUC of 0.864 in the training set (Fig. 1d), and 19 MLGs (Table 4-1, Table 4-2, Table 7, Table 10-1, Table 10-2) selected from 142 salivary MLGs (each containing at least 100 genes) gave an AUC of 0.898 (Fig. 1f). These results indicate that stool, dental, and salivary microbial markers are all very useful for diagnosing RA.
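By way of non-limiting illustration, an AUC of the kind reported above can be computed from predicted probabilities and true labels as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The following is a minimal sketch of that rank-based definition; the function name `roc_auc` and the example values are hypothetical, and the source does not state which software was used to compute the reported AUC values.

```python
from typing import List


def roc_auc(scores: List[float], labels: List[int]) -> float:
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("both classes must be present")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0      # case ranked above control
            elif c == k:
                wins += 0.5      # tie counts as half
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    scores = [0.9, 0.4, 0.6, 0.1]   # hypothetical predicted probabilities
    labels = [1, 1, 0, 0]           # 1 = RA, 0 = control
    print(roc_auc(scores, labels))
```

An AUC of 1.0 corresponds to perfect separation of cases from controls and 0.5 to chance-level ranking.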
Furthermore, when samples from DMARD-treated patients were tested (Table 1-3), the gut and dental MLG classifiers still identified most of them as RA patients, while dental samples from patients with low disease activity (DAS28) were more often classified as healthy (Fig. 1c, 1e, Table 12), indicating that the dental microbiota faithfully reflected the effect of DMARD treatment. In addition, saliva samples from DMARD-treated patients were generally classified as controls, possibly due to direct modulation of the salivary microbiota by DMARDs (Fig. 1g, Table 12). Taken together, the results indicate that gut and oral MLGs can distinguish between effective and ineffective treatment and facilitate the assessment of treatment strategies.
TABLE 1-3 Sample information for the test sets
[Table images BDA0001259110000000131 to BDA0001259110000000191 appear here in the original publication.]
TABLE 5. SEQ ID NOs of the 29 optimal gut markers
MLG ID SEQ ID NO: Number of genes
mlg_id:2441 1~159 159
mlg_id:4103 160~304 145
mlg_id:4212 305~709 405
mlg_id:1047 710~856 147
mlg_id:1735 857~1536 680
mlg_id:4360 1537~1646 110
mlg_id:1796 1647~1798 152
mlg_id:3396 1799~2071 273
mlg_id:2472 2072~2309 238
mlg_id:1261 2310~2991 682
mlg_id:1832 2992~3093 102
mlg_id:6638 3094~3214 121
mlg_id:1722 3215~3353 139
mlg_id:1423 3354~3455 102
mlg_id:1170 3456~3558 103
mlg_id:3215 3559~3739 181
mlg_id:4095 3740~4381 642
mlg_id:2637 4382~4754 373
mlg_id:905 4755~4885 131
mlg_id:4111 4886~6743 1858
mlg_id:1710 6744~6862 119
mlg_id:2633 6863~7113 251
mlg_id:819 7114~7425 312
mlg_id:4158 7426~7736 311
mlg_id:527 7737~7854 118
mlg_id:784 7855~8048 194
mlg_id:2473 8049~8758 710
mlg_id:781 8759~8869 111
mlg_id:5 8870~9319 450
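By way of non-limiting illustration, the rows of the table above can be checked for internal consistency, since the length of each SEQ ID NO range (end minus start plus one) should equal the count in the third column. The following is a minimal sketch; the function name `row_is_consistent` is hypothetical, and the three rows used in the example are copied from the table above.

```python
def row_is_consistent(row: str) -> bool:
    """Check that 'mlg_id:X start~end count' has end - start + 1 == count."""
    _mlg_id, seq_range, count = row.split()
    start, end = (int(part) for part in seq_range.split("~"))
    return end - start + 1 == int(count)


# Rows copied from the table above.
rows = [
    "mlg_id:2441 1~159 159",
    "mlg_id:4111 4886~6743 1858",
    "mlg_id:5 8870~9319 450",
]

if __name__ == "__main__":
    for row in rows:
        print(row, row_is_consistent(row))
```

A row whose range end is smaller than its start, or whose range length does not match the count, would be reported as inconsistent.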
TABLE 6. SEQ ID NOs of the 28 optimal dental markers
[Table images BDA0001259110000000201 and BDA0001259110000000211 appear here in the original publication.]
TABLE 7. SEQ ID NOs of the 19 optimal salivary markers
MLG ID SEQ ID NO: Number of genes
mlg_id:1238 1~126 126
mlg_id:1559 127~231 105
mlg_id:6908 232~360 129
mlg_id:1141 361~519 159
mlg_id:6746 520~697 178
mlg_id:1 698~5680 4983
mlg_id:27683 5681~5851 171
mlg_id:1374 5852~6032 181
mlg_id:13 6033~8482 2450
mlg_id:1073 8483~9597 1115
mlg_id:29 9598~10469 872
mlg_id:636 10470~11246 777
mlg_id:9651 11247~11383 137
mlg_id:305 11384~11485 102
mlg_id:12 11486~14228 2743
mlg_id:20 14229~16239 2011
mlg_id:2831 16240~17605 1366
mlg_id:13621 17606~18115 510
mlg_id:27616 18116~18238 123
[Table images BDA0001259110000000221 to BDA0001259110000000431 appear here in the original publication.]
Thus, the present inventors have identified and validated marker sets (29 gut MLGs, 28 dental MLGs and 19 salivary MLGs) by random forest models based on RA-related gene markers. The inventors have also constructed RA classifiers that assess the risk of RA disease based on these RA-associated gut microbiota.
Although exemplary embodiments have been shown and described, it will be understood by those skilled in the art that the above embodiments are not to be construed as limiting the present disclosure and that changes, substitutions and alterations can be made to the embodiments without departing from the spirit, principles and scope of the present disclosure.

Claims (12)

1. A biomarker panel for predicting disease associated with microbiota in a subject, consisting of dental biomarkers including RA-10848, RA-9842, RA-9941, RA-9938, RA-10684, RA-9998, Con-7913, Con-20702, Con-11, Con-8169, Con-1708, Con-7847, Con-5233, Con-791, Con-5566, Con-4455, Con-13169, Con-6088, Con-5554, Con-14781, Con-2466, Con-483, Con-2562, Con-4701, Con-4824, Con-5030, Con-757 and Con-530,
wherein the nucleotide sequence of RA-10848 is shown as SEQ ID NO: 12044-12154, the nucleotide sequence of RA-9842 is shown as SEQ ID NO: 11671-11923, the nucleotide sequence of RA-9941 is shown as SEQ ID NO: 5884-6099, the nucleotide sequence of RA-9938 is shown as SEQ ID NO: 1473-1750, the nucleotide sequence of RA-10684 is shown as SEQ ID NO: 11537-11670, the nucleotide sequence of RA-9998 is shown as SEQ ID NO: 1861-1968, the nucleotide sequence of Con-7913 is shown as SEQ ID NO: 1751-1860, the nucleotide sequence of Con-20702 is shown as SEQ ID NO: 1969-2395, the nucleotide sequence of Con-11 is shown as SEQ ID NO: 1-119, the nucleotide sequence of Con-8169 is shown as SEQ ID NO: 11924-12043, the nucleotide sequence of Con-1708 is shown as SEQ ID NO: 15454-15843, the nucleotide sequence of Con-7847 is shown as SEQ ID NO: 12941-13266, the nucleotide sequence of Con-5233 is shown as SEQ ID NO: 10235-10572, the nucleotide sequence of Con-791 is shown as SEQ ID NO: 13267-13570, the nucleotide sequence of Con-5566 is shown as SEQ ID NO: 6100-8372, the nucleotide sequence of Con-4455 is shown as SEQ ID NO: 9016-9734, the nucleotide sequence of Con-13169 is shown as SEQ ID NO: 1328-1472, the nucleotide sequence of Con-6088 is shown as SEQ ID NO: 11333-11536, the nucleotide sequence of Con-5554 is shown as SEQ ID NO: 8373-9015, the nucleotide sequence of Con-14781 is shown as SEQ ID NO: 533-1225, the nucleotide sequence of Con-2466 is shown as SEQ ID NO: 13571-15453, the nucleotide sequence of Con-483 is shown as SEQ ID NO: 12155-12331, the nucleotide sequence of Con-2562 is shown as SEQ ID NO: 10573-11332, the nucleotide sequence of Con-4701 is shown as SEQ ID NO: 2396-5883, the nucleotide sequence of Con-4824 is shown as SEQ ID NO: 1226-1327, the nucleotide sequence of Con-5030 is shown as SEQ ID NO: 120-532, the nucleotide sequence of Con-757 is shown as SEQ ID NO: 9735-10234, and the nucleotide sequence of Con-530 is shown as SEQ ID NO: 12332-12940.
2. The biomarker panel for predicting a subject's microbiota-associated disease according to claim 1, wherein the dental biomarkers comprise the sequences of SEQ ID NOs 1 to 15843.
3. A biomarker panel for predicting microbiota-associated disease in a subject, wherein the dental biomarkers comprise Actinomyces sp. oral taxon 180 F0310, Rothia mucilaginosa DY-18, Actinomyces graevenitzii C83, Actinomyces odontolyticus ATCC 17982, Veillonella atypica ACS-134-V-Col7a, Actinomyces sp. F0384, Actinomyces sp. oral taxon 848 F0332, Neisseria mucosa M26 ATCC 25996, Actinomyces sp. oral taxon 448 F0400, Tannerella forsythensis ATCC 43037, Actinomyces sp. oral taxon 448 F0400, Neisseria bacilliformis ATCC BAA-1200, Synergistetes bacterium SGP1, Lautropia mirabilis ATCC 51599, Capnocytophaga gingivalis ATCC 33624, Cardiobacterium hominis ATCC 15826, Capnocytophaga gingivalis ATCC 33624, Lautropia mirabilis ATCC 51599, Johnsonella ignava ATCC 51276, Propionibacterium freudenreichii shermanii CIRM-BIA1, Treponema denticola ATCC 35405, Fusobacterium sp. oral taxon 370 F0437, Lautropia mirabilis ATCC 51599, Eikenella corrodens ATCC 23834, Selenomonas noxia ATCC 43541, Porphyromonas levii DSM 23370 and Bulleidia extructa W1219.
4. A biomarker panel for predicting a subject's microbiota-associated disease, consisting of dental biomarkers comprising the sequences of SEQ ID NOs 1 to 15843.
5. A kit for determining the biomarker panel of any one of claims 1 to 4, comprising primers for PCR amplification and designed according to the dental biomarkers as described in claim 4.
6. A kit for determining the biomarker panel of any one of claims 1 to 4, comprising a probe designed according to the dental biomarker as described in claim 4.
7. Use of a reagent for detecting the biomarker panel according to any one of claims 1 to 4, in the preparation of a kit for predicting the risk of rheumatoid arthritis in a subject to be tested, comprising:
(1) collecting a sample from the test subject;
(2) determining relative abundance information of each biomarker of the biomarker panel according to any one of claims 1 to 4 in the sample obtained in step (1);
(3) obtaining the probability of rheumatoid arthritis by comparing the relative abundance information of each biomarker of the test subject with a training data set using a multivariate statistical model,
wherein a probability of the rheumatoid arthritis being greater than a threshold value indicates that the test subject has or is at risk of developing the rheumatoid arthritis.
8. The use of claim 7, wherein the training data set is constructed using the multivariate statistical model based on relative abundance information of individual biomarkers from a plurality of subjects with rheumatoid arthritis and a plurality of normal subjects.
9. Use according to claim 8, wherein the multivariate statistical model is a random forest model.
10. Use according to claim 8, wherein the training data set is a matrix, wherein each row represents a respective biomarker of the biomarker panel according to any of claims 1 to 4, each column represents a sample, each cell represents a relative abundance spectrum of the biomarker in the sample, and the sample disease status is a vector, wherein 1 represents rheumatoid arthritis and 0 represents a control.
11. The use according to claim 8, wherein the nucleotide sequence of RA-10848 is shown as SEQ ID NO: 12044-12154, the nucleotide sequence of RA-9842 is shown as SEQ ID NO: 11671-11923, the nucleotide sequence of RA-9941 is shown as SEQ ID NO: 5884-6099, the nucleotide sequence of RA-9938 is shown as SEQ ID NO: 1473-1750, the nucleotide sequence of RA-10684 is shown as SEQ ID NO: 11537-11670, the nucleotide sequence of RA-9998 is shown as SEQ ID NO: 1861-1968, the nucleotide sequence of Con-7913 is shown as SEQ ID NO: 1751-1860, the nucleotide sequence of Con-20702 is shown as SEQ ID NO: 1969-2395, the nucleotide sequence of Con-11 is shown as SEQ ID NO: 1-119, the nucleotide sequence of Con-8169 is shown as SEQ ID NO: 11924-12043, the nucleotide sequence of Con-1708 is shown as SEQ ID NO: 15454-15843, the nucleotide sequence of Con-7847 is shown as SEQ ID NO: 12941-13266, the nucleotide sequence of Con-5233 is shown as SEQ ID NO: 10235-10572, the nucleotide sequence of Con-791 is shown as SEQ ID NO: 13267-13570, the nucleotide sequence of Con-5566 is shown as SEQ ID NO: 6100-8372, the nucleotide sequence of Con-4455 is shown as SEQ ID NO: 9016-9734, the nucleotide sequence of Con-13169 is shown as SEQ ID NO: 1328-1472, the nucleotide sequence of Con-6088 is shown as SEQ ID NO: 11333-11536, the nucleotide sequence of Con-5554 is shown as SEQ ID NO: 8373-9015, the nucleotide sequence of Con-14781 is shown as SEQ ID NO: 533-1225, the nucleotide sequence of Con-2466 is shown as SEQ ID NO: 13571-15453, the nucleotide sequence of Con-483 is shown as SEQ ID NO: 12155-12331, the nucleotide sequence of Con-2562 is shown as SEQ ID NO: 10573-11332, the nucleotide sequence of Con-4701 is shown as SEQ ID NO: 2396-5883, the nucleotide sequence of Con-4824 is shown as SEQ ID NO: 1226-1327, the nucleotide sequence of Con-5030 is shown as SEQ ID NO: 120-532, the nucleotide sequence of Con-757 is shown as SEQ ID NO: 9735-10234, and the nucleotide sequence of Con-530 is shown as SEQ ID NO: 12332-12940.
12. The use of claim 9, wherein the training dataset is at least one of tables 9-1 and 9-2, and a probability of the rheumatoid arthritis of at least 0.5 indicates that the test subject has or is at risk of developing the rheumatoid arthritis.
CN201480082365.1A 2014-09-30 2014-09-30 Biomarker for rheumatoid arthritis and application thereof Active CN106795480B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2014/088069 WO2016049937A1 (en) 2014-09-30 2014-09-30 Biomarkers for rheumatoid arthritis and usage therof

Publications (2)

Publication Number Publication Date
CN106795480A CN106795480A (en) 2017-05-31
CN106795480B true CN106795480B (en) 2021-05-07

Family

ID=55629360

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201480082365.1A Active CN106795480B (en) 2014-09-30 2014-09-30 Biomarker for rheumatoid arthritis and application thereof

Country Status (2)

Country Link
CN (1) CN106795480B (en)
WO (1) WO2016049937A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3786305A4 (en) * 2018-04-24 2022-07-13 BGI Shenzhen Biomarker for depression and use thereof
CN108877937A (en) * 2018-05-02 2018-11-23 广州元亿国际生物科技有限公司 A kind of method and apparatus that medical information is generated according to Tiny ecosystem testing result
CN109266765B (en) * 2018-09-28 2022-06-28 人和未来生物科技(长沙)有限公司 Microbial flora for predicting precancerous lesion risk of oral cavity and application

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013056222A1 (en) * 2011-10-14 2013-04-18 New York University Causative agents and diagnostic methods relating to rheumatoid arthritis
WO2014019271A1 (en) * 2012-08-01 2014-02-06 Bgi Shenzhen Biomarkers for diabetes and usages thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101886132B (en) * 2009-07-15 2013-09-18 北京百迈客生物科技有限公司 Method for screening molecular markers correlative with properties based on sequencing technique and BSA (Bulked Segregant Analysis) technique
CN101921748B (en) * 2010-06-30 2012-11-14 上海华大基因科技有限公司 DNA molecular label for high-throughput detection of human papilloma virus

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
Analysis of Fecal Lactobacillus Community Structure in Patients with Early Rheumatoid Arthritis;Xiaofei Liu et al.;《Curr Microbiol》;20130313;第67卷;第170-176页 *
Association between chronic periodontitis and rheumatoid arthritis: a hospital-based case–control study;Rosamma Joseph et al.;《Rheumatol Int》;20120107;第33卷;第103-109页 *
Microbiome and mucosal inflammation as extra-articular triggers for rheumatoid arthritis and autoimmunity;Samuel B Brusca et al.;《Curr Opin Rheumatol.》;20140131;第26卷(第1期);第101-107页 *
Oral status in patients with early rheumatoid arthritis: a prospective, case-control study;Bjorn Wolff et al.;《Rheumatology》;20131123;第53卷(第3期);第526-531页 *
Periodontal Disease and the Oral Microbiota in New-Onset Rheumatoid Arthritis;Jose U. Scher, MD et al.;《Arthritis Rheum.》;20121031;第64卷(第10期);第3083-3094页 *
Rheumatoid Arthritis and Salivary Biomarkers of Periodontal Disease;Jeffrey Mirrielees et al.;《J Clin Periodontol.》;20101231;第37卷(第12期);第1068-1074页 *

Also Published As

Publication number Publication date
WO2016049937A1 (en) 2016-04-07
CN106795480A (en) 2017-05-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 518083 11F-3, Beishan industrial complex, 146 Beishan Road, Yantian District, Shenzhen, Guangdong

Applicant after: BGI SHENZHEN Co.,Ltd.

Applicant after: BGI SHENZHEN

Address before: 518083 11F-3, Beishan industrial complex, 146 Beishan Road, Yantian District, Shenzhen, Guangdong

Applicant before: BGI SHENZHEN Co.,Ltd.

Applicant before: BGI SHENZHEN

GR01 Patent grant