US20230407409A1 - Microbiome based identification, monitoring and enhancement of fermentation processes and products - Google Patents
- Publication number
- US20230407409A1 (application US17/823,435)
- Authority
- US
- United States
- Prior art keywords
- bacteroides
- samples
- pseudomonas
- sequencing
- sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
- 244000005700 microbiome Species 0.000 title claims abstract description 196
- 238000000855 fermentation Methods 0.000 title abstract description 112
- 230000004151 fermentation Effects 0.000 title abstract description 112
- 238000012544 monitoring process Methods 0.000 title abstract description 6
- 238000000034 method Methods 0.000 claims abstract description 231
- 238000012163 sequencing technique Methods 0.000 claims abstract description 100
- 239000000463 material Substances 0.000 claims abstract description 32
- 238000004519 manufacturing process Methods 0.000 claims abstract description 26
- 235000013305 food Nutrition 0.000 claims abstract description 9
- 108090000623 proteins and genes Proteins 0.000 claims description 133
- 239000002689 soil Substances 0.000 claims description 88
- 230000008569 process Effects 0.000 claims description 52
- 108091023242 Internal transcribed spacer Proteins 0.000 claims description 49
- 150000007523 nucleic acids Chemical class 0.000 claims description 37
- 241000894006 Bacteria Species 0.000 claims description 36
- 102000039446 nucleic acids Human genes 0.000 claims description 24
- 108020004707 nucleic acids Proteins 0.000 claims description 24
- 238000011109 contamination Methods 0.000 claims description 16
- 230000001580 bacterial effect Effects 0.000 claims description 14
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 13
- 239000003153 chemical reaction reagent Substances 0.000 claims description 13
- 238000010801 machine learning Methods 0.000 claims description 13
- 241000233866 Fungi Species 0.000 claims description 12
- 238000012512 characterization method Methods 0.000 claims description 12
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 claims description 10
- 239000012773 agricultural material Substances 0.000 claims description 7
- 238000010009 beating Methods 0.000 claims description 7
- 241000203069 Archaea Species 0.000 claims description 6
- 238000000265 homogenisation Methods 0.000 claims description 6
- 235000015097 nutrients Nutrition 0.000 claims description 6
- 229910052757 nitrogen Inorganic materials 0.000 claims description 5
- 238000011176 pooling Methods 0.000 claims description 5
- 230000004044 response Effects 0.000 claims description 5
- 201000010099 disease Diseases 0.000 claims description 4
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims description 4
- 230000002538 fungal effect Effects 0.000 claims description 4
- 238000012706 support-vector machine Methods 0.000 claims description 4
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 claims description 2
- 229920002472 Starch Polymers 0.000 claims description 2
- 229910052799 carbon Inorganic materials 0.000 claims description 2
- 235000019698 starch Nutrition 0.000 claims description 2
- 239000008107 starch Substances 0.000 claims description 2
- 238000007671 third-generation sequencing Methods 0.000 claims description 2
- 230000001747 exhibiting effect Effects 0.000 claims 1
- 238000004458 analytical method Methods 0.000 abstract description 77
- 230000000694 effects Effects 0.000 abstract description 11
- 230000009286 beneficial effect Effects 0.000 abstract description 8
- 239000002551 biofuel Substances 0.000 abstract description 6
- 241000606125 Bacteroides Species 0.000 description 287
- 239000000523 sample Substances 0.000 description 176
- 241000589516 Pseudomonas Species 0.000 description 146
- 230000000813 microbial effect Effects 0.000 description 87
- 241000588722 Escherichia Species 0.000 description 71
- 241000607768 Shigella Species 0.000 description 67
- 241000194033 Enterococcus Species 0.000 description 62
- 108020004414 DNA Proteins 0.000 description 61
- 102000053602 DNA Human genes 0.000 description 61
- 241000196324 Embryophyta Species 0.000 description 60
- 235000014101 wine Nutrition 0.000 description 60
- 239000000203 mixture Substances 0.000 description 54
- 241000894007 species Species 0.000 description 53
- 241000588914 Enterobacter Species 0.000 description 49
- 102000004169 proteins and genes Human genes 0.000 description 42
- 241000194036 Lactococcus Species 0.000 description 36
- 108020004465 16S ribosomal RNA Proteins 0.000 description 35
- 241000186000 Bifidobacterium Species 0.000 description 34
- 239000000243 solution Substances 0.000 description 34
- 239000002585 base Substances 0.000 description 33
- 239000003550 marker Substances 0.000 description 32
- 238000012165 high-throughput sequencing Methods 0.000 description 31
- 241000588923 Citrobacter Species 0.000 description 30
- 241000588748 Klebsiella Species 0.000 description 30
- 238000003752 polymerase chain reaction Methods 0.000 description 30
- 238000005516 engineering process Methods 0.000 description 29
- 241001202853 Blautia Species 0.000 description 28
- 241000194017 Streptococcus Species 0.000 description 28
- 241000186660 Lactobacillus Species 0.000 description 27
- 229940039696 lactobacillus Drugs 0.000 description 27
- 239000002773 nucleotide Substances 0.000 description 26
- 125000003729 nucleotide group Chemical group 0.000 description 26
- 238000012545 processing Methods 0.000 description 26
- 238000012360 testing method Methods 0.000 description 25
- 241000607720 Serratia Species 0.000 description 24
- 230000008901 benefit Effects 0.000 description 24
- 239000006228 supernatant Substances 0.000 description 23
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 23
- 238000011156 evaluation Methods 0.000 description 22
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 20
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 20
- 239000008188 pellet Substances 0.000 description 20
- 238000012546 transfer Methods 0.000 description 19
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 18
- 235000009754 Vitis X bourquina Nutrition 0.000 description 17
- 235000012333 Vitis X labruscana Nutrition 0.000 description 17
- 240000006365 Vitis vinifera Species 0.000 description 17
- 235000014787 Vitis vinifera Nutrition 0.000 description 17
- 238000006243 chemical reaction Methods 0.000 description 15
- 230000007613 environmental effect Effects 0.000 description 15
- 239000007788 liquid Substances 0.000 description 15
- 238000002360 preparation method Methods 0.000 description 15
- 229920002477 rna polymer Polymers 0.000 description 15
- 238000011514 vinification Methods 0.000 description 15
- 241000186394 Eubacterium Species 0.000 description 14
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 14
- 239000000126 substance Substances 0.000 description 14
- 241000701474 Alistipes Species 0.000 description 13
- 239000012634 fragment Substances 0.000 description 13
- 230000002068 genetic effect Effects 0.000 description 13
- 239000000047 product Substances 0.000 description 13
- 238000003306 harvesting Methods 0.000 description 12
- 241001227086 Anaerostipes Species 0.000 description 11
- 241000160321 Parabacteroides Species 0.000 description 11
- 241000607734 Yersinia <bacteria> Species 0.000 description 11
- 238000013459 approach Methods 0.000 description 11
- 238000005070 sampling Methods 0.000 description 11
- 241000282414 Homo sapiens Species 0.000 description 10
- 241000589323 Methylobacterium Species 0.000 description 10
- 230000003321 amplification Effects 0.000 description 10
- 238000004422 calculation algorithm Methods 0.000 description 10
- 230000008859 change Effects 0.000 description 10
- 238000000605 extraction Methods 0.000 description 10
- 230000012010 growth Effects 0.000 description 10
- 239000011159 matrix material Substances 0.000 description 10
- 238000003199 nucleic acid amplification method Methods 0.000 description 10
- 238000005406 washing Methods 0.000 description 10
- QTBSBXVTEAMEQO-UHFFFAOYSA-N Acetic acid Chemical compound CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 9
- 108091093088 Amplicon Proteins 0.000 description 9
- 241001493533 Streptophyta Species 0.000 description 9
- 230000000052 comparative effect Effects 0.000 description 9
- 238000012175 pyrosequencing Methods 0.000 description 9
- 238000003908 quality control method Methods 0.000 description 9
- 230000002441 reversible effect Effects 0.000 description 9
- 238000003860 storage Methods 0.000 description 9
- 238000000018 DNA microarray Methods 0.000 description 8
- 241001657509 Eggerthella Species 0.000 description 8
- 241001309423 Lactonifactor Species 0.000 description 8
- 241000219094 Vitaceae Species 0.000 description 8
- 235000021021 grapes Nutrition 0.000 description 8
- JVTAAEKCZFNVCJ-UHFFFAOYSA-N lactic acid Chemical compound CC(O)C(O)=O JVTAAEKCZFNVCJ-UHFFFAOYSA-N 0.000 description 8
- 239000002609 medium Substances 0.000 description 8
- 239000002207 metabolite Substances 0.000 description 8
- 239000004094 surface-active agent Substances 0.000 description 8
- 238000012800 visualization Methods 0.000 description 8
- 241000186046 Actinomyces Species 0.000 description 7
- 238000007400 DNA extraction Methods 0.000 description 7
- 102000004190 Enzymes Human genes 0.000 description 7
- 108090000790 Enzymes Proteins 0.000 description 7
- 241000605861 Prevotella Species 0.000 description 7
- 241001478280 Rahnella Species 0.000 description 7
- 241000192031 Ruminococcus Species 0.000 description 7
- 241000607142 Salmonella Species 0.000 description 7
- 230000001476 alcoholic effect Effects 0.000 description 7
- 238000013461 design Methods 0.000 description 7
- 235000013399 edible fruits Nutrition 0.000 description 7
- 230000006870 function Effects 0.000 description 7
- 230000000007 visual effect Effects 0.000 description 7
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 6
- 241001216243 Butyricimonas Species 0.000 description 6
- 241001464956 Collinsella Species 0.000 description 6
- 241000605716 Desulfovibrio Species 0.000 description 6
- 241001608234 Faecalibacterium Species 0.000 description 6
- 241000662772 Flavonifractor Species 0.000 description 6
- 241000192132 Leuconostoc Species 0.000 description 6
- 241000321184 Raoultella Species 0.000 description 6
- 241000605947 Roseburia Species 0.000 description 6
- 230000015572 biosynthetic process Effects 0.000 description 6
- KRKNYBCHXYNGOX-UHFFFAOYSA-N citric acid Chemical compound OC(=O)CC(O)(C(O)=O)CC(O)=O KRKNYBCHXYNGOX-UHFFFAOYSA-N 0.000 description 6
- 238000002474 experimental method Methods 0.000 description 6
- 230000002550 fecal effect Effects 0.000 description 6
- 238000001914 filtration Methods 0.000 description 6
- 230000000968 intestinal effect Effects 0.000 description 6
- 238000005259 measurement Methods 0.000 description 6
- 238000007481 next generation sequencing Methods 0.000 description 6
- 108020004418 ribosomal RNA Proteins 0.000 description 6
- 238000012549 training Methods 0.000 description 6
- 241000390531 Cloacibacillus Species 0.000 description 5
- 241001464948 Coprococcus Species 0.000 description 5
- 238000001712 DNA sequencing Methods 0.000 description 5
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 5
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 5
- 241001535083 Dialister Species 0.000 description 5
- 241001143779 Dorea Species 0.000 description 5
- 241000588731 Hafnia Species 0.000 description 5
- 241001465754 Metazoa Species 0.000 description 5
- 108091034117 Oligonucleotide Proteins 0.000 description 5
- 241000280572 Pseudoflavonifractor Species 0.000 description 5
- 241001136694 Subdoligranulum Species 0.000 description 5
- 241001148134 Veillonella Species 0.000 description 5
- 239000012620 biological material Substances 0.000 description 5
- 239000000872 buffer Substances 0.000 description 5
- 239000003086 colorant Substances 0.000 description 5
- 230000000295 complement effect Effects 0.000 description 5
- 150000001875 compounds Chemical class 0.000 description 5
- 238000010276 construction Methods 0.000 description 5
- 230000001419 dependent effect Effects 0.000 description 5
- 238000001514 detection method Methods 0.000 description 5
- 238000009826 distribution Methods 0.000 description 5
- 238000007710 freezing Methods 0.000 description 5
- 230000008014 freezing Effects 0.000 description 5
- CJNBYAVZURUTKZ-UHFFFAOYSA-N hafnium(IV) oxide Inorganic materials O=[Hf]=O CJNBYAVZURUTKZ-UHFFFAOYSA-N 0.000 description 5
- 238000013507 mapping Methods 0.000 description 5
- 230000002906 microbiologic effect Effects 0.000 description 5
- XEBWQGVWTUSTLN-UHFFFAOYSA-M phenylmercury acetate Chemical compound CC(=O)O[Hg]C1=CC=CC=C1 XEBWQGVWTUSTLN-UHFFFAOYSA-M 0.000 description 5
- 239000004033 plastic Substances 0.000 description 5
- 229920003023 plastic Polymers 0.000 description 5
- 108020004463 18S ribosomal RNA Proteins 0.000 description 4
- 241000607534 Aeromonas Species 0.000 description 4
- 241000611330 Chryseobacterium Species 0.000 description 4
- 241001443882 Coprobacillus Species 0.000 description 4
- 241000410239 Coraliomargarita Species 0.000 description 4
- 241000588698 Erwinia Species 0.000 description 4
- 241001520576 Gordonibacter Species 0.000 description 4
- 241000235796 Granulicatella Species 0.000 description 4
- 240000002605 Lactobacillus helveticus Species 0.000 description 4
- 235000013967 Lactobacillus helveticus Nutrition 0.000 description 4
- 241001647840 Leclercia Species 0.000 description 4
- 241001387858 Lentisphaera Species 0.000 description 4
- 241000890160 Limnobacter Species 0.000 description 4
- 241000741587 Luteolibacter Species 0.000 description 4
- 241000605635 Lutispora Species 0.000 description 4
- 241000025362 Marinifilum Species 0.000 description 4
- 241000183029 Mariprofundus Species 0.000 description 4
- 241000927544 Olsenella Species 0.000 description 4
- 241000843248 Oscillibacter Species 0.000 description 4
- 241000202386 Pseudobutyrivibrio Species 0.000 description 4
- 108020001027 Ribosomal DNA Proteins 0.000 description 4
- 241001657520 Slackia Species 0.000 description 4
- 238000007792 addition Methods 0.000 description 4
- 235000013351 cheese Nutrition 0.000 description 4
- 239000003795 chemical substances by application Substances 0.000 description 4
- 210000003763 chloroplast Anatomy 0.000 description 4
- 230000002759 chromosomal effect Effects 0.000 description 4
- 239000002299 complementary DNA Substances 0.000 description 4
- 230000002596 correlated effect Effects 0.000 description 4
- 230000007423 decrease Effects 0.000 description 4
- 238000011161 development Methods 0.000 description 4
- 230000018109 developmental process Effects 0.000 description 4
- ZJYYHGLJYGJLLN-UHFFFAOYSA-N guanidinium thiocyanate Chemical compound SC#N.NC(N)=N ZJYYHGLJYGJLLN-UHFFFAOYSA-N 0.000 description 4
- 230000006872 improvement Effects 0.000 description 4
- 230000001965 increasing effect Effects 0.000 description 4
- 239000003112 inhibitor Substances 0.000 description 4
- 230000003993 interaction Effects 0.000 description 4
- 239000004310 lactic acid Substances 0.000 description 4
- 229960000448 lactic acid Drugs 0.000 description 4
- 235000014655 lactic acid Nutrition 0.000 description 4
- 239000002184 metal Substances 0.000 description 4
- 238000010606 normalization Methods 0.000 description 4
- 239000002994 raw material Substances 0.000 description 4
- 238000000926 separation method Methods 0.000 description 4
- 244000000000 soil microbiome Species 0.000 description 4
- 239000007787 solid Substances 0.000 description 4
- 238000003786 synthesis reaction Methods 0.000 description 4
- 238000010200 validation analysis Methods 0.000 description 4
- BJEPYKJPYRNKOW-REOHCLBHSA-N (S)-malic acid Chemical compound OC(=O)[C@@H](O)CC(O)=O BJEPYKJPYRNKOW-REOHCLBHSA-N 0.000 description 3
- 241001495178 Acetivibrio Species 0.000 description 3
- 241001468161 Acetobacterium Species 0.000 description 3
- CSCPPACGZOOCGX-UHFFFAOYSA-N Acetone Chemical compound CC(C)=O CSCPPACGZOOCGX-UHFFFAOYSA-N 0.000 description 3
- 241000466670 Adlercreutzia Species 0.000 description 3
- 241001013579 Anaerotruncus Species 0.000 description 3
- 102000007347 Apyrase Human genes 0.000 description 3
- 108010007730 Apyrase Proteins 0.000 description 3
- 241000927512 Barnesiella Species 0.000 description 3
- 241001495171 Bilophila Species 0.000 description 3
- 239000002028 Biomass Substances 0.000 description 3
- 241001557932 Butyricicoccus Species 0.000 description 3
- CURLTUGMZLYLDI-UHFFFAOYSA-N Carbon dioxide Chemical compound O=C=O CURLTUGMZLYLDI-UHFFFAOYSA-N 0.000 description 3
- 108020004998 Chloroplast DNA Proteins 0.000 description 3
- 238000007399 DNA isolation Methods 0.000 description 3
- 241000588724 Escherichia coli Species 0.000 description 3
- 241000206602 Eukaryota Species 0.000 description 3
- 241000589565 Flavobacterium Species 0.000 description 3
- 102000013446 GTP Phosphohydrolases Human genes 0.000 description 3
- 108091006109 GTPases Proteins 0.000 description 3
- 241001185600 Gemmiger Species 0.000 description 3
- 241000605016 Herbaspirillum Species 0.000 description 3
- 241000862469 Holdemania Species 0.000 description 3
- 241001148465 Janthinobacterium Species 0.000 description 3
- 241000588752 Kluyvera Species 0.000 description 3
- 241001233595 Lachnobacterium Species 0.000 description 3
- 240000006024 Lactobacillus plantarum Species 0.000 description 3
- 235000013965 Lactobacillus plantarum Nutrition 0.000 description 3
- 108060001084 Luciferase Proteins 0.000 description 3
- 239000005089 Luciferase Substances 0.000 description 3
- 241000202987 Methanobrevibacter Species 0.000 description 3
- 108020005196 Mitochondrial DNA Proteins 0.000 description 3
- 241000098695 Moryella Species 0.000 description 3
- 229910019142 PO4 Inorganic materials 0.000 description 3
- 241000520272 Pantoea Species 0.000 description 3
- 241000168733 Paraeggerthella Species 0.000 description 3
- 241000192001 Pediococcus Species 0.000 description 3
- 241001660097 Pedobacter Species 0.000 description 3
- 102000035195 Peptidases Human genes 0.000 description 3
- 108091005804 Peptidases Proteins 0.000 description 3
- 241000605894 Porphyromonas Species 0.000 description 3
- 241000186429 Propionibacterium Species 0.000 description 3
- 239000004365 Protease Substances 0.000 description 3
- 241000192142 Proteobacteria Species 0.000 description 3
- 241000588769 Proteus <enterobacteria> Species 0.000 description 3
- 241000588671 Psychrobacter Species 0.000 description 3
- 241000232299 Ralstonia Species 0.000 description 3
- 241000589180 Rhizobium Species 0.000 description 3
- 240000000111 Saccharum officinarum Species 0.000 description 3
- 235000007201 Saccharum officinarum Nutrition 0.000 description 3
- 241000605036 Selenomonas Species 0.000 description 3
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 3
- 241000191940 Staphylococcus Species 0.000 description 3
- 241000122971 Stenotrophomonas Species 0.000 description 3
- 241000187747 Streptomyces Species 0.000 description 3
- 241000207194 Vagococcus Species 0.000 description 3
- 241000607598 Vibrio Species 0.000 description 3
- IRLPACMLTUPBCL-FCIPNVEPSA-N adenosine-5'-phosphosulfate Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@@H](CO[P@](O)(=O)OS(O)(=O)=O)[C@H](O)[C@H]1O IRLPACMLTUPBCL-FCIPNVEPSA-N 0.000 description 3
- 239000003463 adsorbent Substances 0.000 description 3
- BJEPYKJPYRNKOW-UHFFFAOYSA-N alpha-hydroxysuccinic acid Natural products OC(=O)C(O)CC(O)=O BJEPYKJPYRNKOW-UHFFFAOYSA-N 0.000 description 3
- 238000003556 assay Methods 0.000 description 3
- 235000011089 carbon dioxide Nutrition 0.000 description 3
- 230000015556 catabolic process Effects 0.000 description 3
- 210000004027 cell Anatomy 0.000 description 3
- 230000003196 chaotropic effect Effects 0.000 description 3
- 230000001276 controlling effect Effects 0.000 description 3
- 230000000875 corresponding effect Effects 0.000 description 3
- 238000002790 cross-validation Methods 0.000 description 3
- 238000007418 data mining Methods 0.000 description 3
- 230000003247 decreasing effect Effects 0.000 description 3
- 238000006731 degradation reaction Methods 0.000 description 3
- 210000003608 fece Anatomy 0.000 description 3
- 239000011888 foil Substances 0.000 description 3
- 239000000446 fuel Substances 0.000 description 3
- 201000011243 gastrointestinal stromal tumor Diseases 0.000 description 3
- 230000014509 gene expression Effects 0.000 description 3
- 238000010348 incorporation Methods 0.000 description 3
- 230000002401 inhibitory effect Effects 0.000 description 3
- 238000002347 injection Methods 0.000 description 3
- 239000007924 injection Substances 0.000 description 3
- 230000000670 limiting effect Effects 0.000 description 3
- 239000010871 livestock manure Substances 0.000 description 3
- 239000001630 malic acid Substances 0.000 description 3
- 235000011090 malic acid Nutrition 0.000 description 3
- 239000012528 membrane Substances 0.000 description 3
- 238000002705 metabolomic analysis Methods 0.000 description 3
- 230000001431 metabolomic effect Effects 0.000 description 3
- 102000042567 non-coding RNA Human genes 0.000 description 3
- 239000002245 particle Substances 0.000 description 3
- 239000012071 phase Substances 0.000 description 3
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 3
- 239000010452 phosphate Substances 0.000 description 3
- 238000004321 preservation Methods 0.000 description 3
- 230000012846 protein folding Effects 0.000 description 3
- 108700022487 rRNA Genes Proteins 0.000 description 3
- 239000000700 radioactive tracer Substances 0.000 description 3
- 238000007637 random forest analysis Methods 0.000 description 3
- 239000000758 substrate Substances 0.000 description 3
- 230000009466 transformation Effects 0.000 description 3
- 241000201860 Abiotrophia Species 0.000 description 2
- 241001420322 Acetanaerobacterium Species 0.000 description 2
- 241000203024 Acholeplasma Species 0.000 description 2
- 241000604451 Acidaminococcus Species 0.000 description 2
- 241000726121 Acidianus Species 0.000 description 2
- 241000726119 Acidovorax Species 0.000 description 2
- 241000589291 Acinetobacter Species 0.000 description 2
- 241000606750 Actinobacillus Species 0.000 description 2
- 241001156739 Actinobacteria <phylum> Species 0.000 description 2
- 241001663413 Aequorivita Species 0.000 description 2
- 241000567147 Aeropyrum Species 0.000 description 2
- 241001024600 Aggregatibacter Species 0.000 description 2
- 241000145069 Agreia Species 0.000 description 2
- 241000702460 Akkermansia Species 0.000 description 2
- 241000037909 Alkalibaculum Species 0.000 description 2
- 241000825535 Alkaliflexus Species 0.000 description 2
- 241001258698 Allisonella Species 0.000 description 2
- 101100166957 Anabaena sp. (strain L31) groEL2 gene Proteins 0.000 description 2
- 241000511612 Anaerofilum Species 0.000 description 2
- 241000079561 Anaerofustis Species 0.000 description 2
- 241000204018 Anaeroplasma Species 0.000 description 2
- 241000848219 Aquincola Species 0.000 description 2
- 241001135163 Arcobacter Species 0.000 description 2
- 241000186063 Arthrobacter Species 0.000 description 2
- 241000493436 Asaccharobacter Species 0.000 description 2
- 241000203081 Asteroleplasma Species 0.000 description 2
- 241000193818 Atopobium Species 0.000 description 2
- 241001313703 Bacteriovorax Species 0.000 description 2
- 235000017166 Bambusa arundinacea Nutrition 0.000 description 2
- 235000017491 Bambusa tulda Nutrition 0.000 description 2
- 241000053357 Bavariicoccus Species 0.000 description 2
- 241000604933 Bdellovibrio Species 0.000 description 2
- 241001648895 Bergeriella <Ciliophora> Species 0.000 description 2
- 241000186018 Bifidobacterium adolescentis Species 0.000 description 2
- 241000589173 Bradyrhizobium Species 0.000 description 2
- 241001236205 Brenneria Species 0.000 description 2
- 241000186146 Brevibacterium Species 0.000 description 2
- 241001622847 Buttiauxella Species 0.000 description 2
- 241000605902 Butyrivibrio Species 0.000 description 2
- 241000428792 Caldimicrobium Species 0.000 description 2
- 241001672012 Caldisericum Species 0.000 description 2
- 241000589876 Campylobacter Species 0.000 description 2
- 241000282465 Canis Species 0.000 description 2
- 241000190890 Capnocytophaga Species 0.000 description 2
- 241000206594 Carnobacterium Species 0.000 description 2
- 241001468185 Caryophanon Species 0.000 description 2
- 241000946390 Catenibacterium Species 0.000 description 2
- 241000159556 Catonella Species 0.000 description 2
- 241000863012 Caulobacter Species 0.000 description 2
- 241001496942 Cellulophaga Species 0.000 description 2
- 241000065744 Cellulosilyticum Species 0.000 description 2
- 241001051186 Cetobacterium Species 0.000 description 2
- 241001135720 Chelatococcus Species 0.000 description 2
- 241000191366 Chlorobium Species 0.000 description 2
- 241000298828 Cloacibacterium Species 0.000 description 2
- 241000589519 Comamonas Species 0.000 description 2
- 241001425834 Conexibacter Species 0.000 description 2
- 241000186216 Corynebacterium Species 0.000 description 2
- 241001445332 Coxiella <snail> Species 0.000 description 2
- 241001528480 Cupriavidus Species 0.000 description 2
- IGXWBGJHJZYPQS-SSDOTTSWSA-N D-Luciferin Chemical compound OC(=O)[C@H]1CSC(C=2SC3=CC=C(O)C=C3N=2)=N1 IGXWBGJHJZYPQS-SSDOTTSWSA-N 0.000 description 2
- 241001245615 Dechloromonas Species 0.000 description 2
- CYCGRDQQIOGCKX-UHFFFAOYSA-N Dehydro-luciferin Natural products OC(=O)C1=CSC(C=2SC3=CC(O)=CC=C3N=2)=N1 CYCGRDQQIOGCKX-UHFFFAOYSA-N 0.000 description 2
- 102000016911 Deoxyribonucleases Human genes 0.000 description 2
- 108010053770 Deoxyribonucleases Proteins 0.000 description 2
- 241000205145 Desulfobacterium Species 0.000 description 2
- 241000605802 Desulfobulbus Species 0.000 description 2
- 241001410212 Desulfopila Species 0.000 description 2
- 241000024397 Dysgonomonas Species 0.000 description 2
- 241001552883 Enhydrobacter Species 0.000 description 2
- 241001522957 Enterococcus casseliflavus Species 0.000 description 2
- 241001560646 Enterorhabdus Species 0.000 description 2
- 241001350695 Ethanoligenens Species 0.000 description 2
- 241000195623 Euglenida Species 0.000 description 2
- 241001478891 Filibacter Species 0.000 description 2
- 241000192125 Firmicutes Species 0.000 description 2
- BJGNCJDXODQBOB-UHFFFAOYSA-N Fivefly Luciferin Natural products OC(=O)C1CSC(C=2SC3=CC(O)=CC=C3N=2)=N1 BJGNCJDXODQBOB-UHFFFAOYSA-N 0.000 description 2
- 241000605909 Fusobacterium Species 0.000 description 2
- 241000207202 Gardnerella Species 0.000 description 2
- 101710171049 Gene 21 protein Proteins 0.000 description 2
- 241001135750 Geobacter Species 0.000 description 2
- 241000606790 Haemophilus Species 0.000 description 2
- 241001548491 Haliea Species 0.000 description 2
- 241000509618 Hallella Species 0.000 description 2
- 241000972946 Hespellia Species 0.000 description 2
- 241000623267 Howardella Species 0.000 description 2
- 241000216643 Hydrogenophaga Species 0.000 description 2
- 241001280002 Ilumatobacter Species 0.000 description 2
- 241000316163 Jeotgalicoccus Species 0.000 description 2
- 241000588915 Klebsiella aerogenes Species 0.000 description 2
- 241000579722 Kocuria Species 0.000 description 2
- 241000186809 Kurthia Species 0.000 description 2
- 241001147746 Lactobacillus delbrueckii subsp. lactis Species 0.000 description 2
- 241000186840 Lactobacillus fermentum Species 0.000 description 2
- 241000186606 Lactobacillus gasseri Species 0.000 description 2
- 241000968140 Lactobacillus hominis Species 0.000 description 2
- 241001468157 Lactobacillus johnsonii Species 0.000 description 2
- 241000016642 Lactobacillus manihotivorans Species 0.000 description 2
- 241000394636 Lactobacillus mucosae Species 0.000 description 2
- 241000186784 Lactobacillus oris Species 0.000 description 2
- 241000186605 Lactobacillus paracasei Species 0.000 description 2
- 241000866650 Lactobacillus paraplantarum Species 0.000 description 2
- 241000186684 Lactobacillus pentosus Species 0.000 description 2
- 241001495404 Lactobacillus pontis Species 0.000 description 2
- 241000186604 Lactobacillus reuteri Species 0.000 description 2
- 241000218588 Lactobacillus rhamnosus Species 0.000 description 2
- 241000186869 Lactobacillus salivarius Species 0.000 description 2
- 241000186868 Lactobacillus sanfranciscensis Species 0.000 description 2
- 235000013864 Lactobacillus sanfrancisco Nutrition 0.000 description 2
- 241000186783 Lactobacillus vaginalis Species 0.000 description 2
- 241000577554 Lactobacillus zeae Species 0.000 description 2
- 241000371451 Lactococcus fujiensis Species 0.000 description 2
- 241000194040 Lactococcus garvieae Species 0.000 description 2
- 241000685814 Lactonifactor longoviformis Species 0.000 description 2
- 241000192003 Leuconostoc carnosum Species 0.000 description 2
- 241001468192 Leuconostoc citreum Species 0.000 description 2
- 241001376276 Leuconostoc garlicum Species 0.000 description 2
- 241000192131 Leuconostoc gelidum Species 0.000 description 2
- 241000201465 Leuconostoc gelidum subsp. gasicomitatum Species 0.000 description 2
- 241000779470 Leuconostoc inhae Species 0.000 description 2
- 241000192129 Leuconostoc lactis Species 0.000 description 2
- 241000192130 Leuconostoc mesenteroides Species 0.000 description 2
- 241001468196 Leuconostoc pseudomesenteroides Species 0.000 description 2
- DDWFXDSYGUXRAY-UHFFFAOYSA-N Luciferin Natural products CCc1c(C)c(CC2NC(=O)C(=C2C=C)C)[nH]c1Cc3[nH]c4C(=C5/NC(CC(=O)O)C(C)C5CC(=O)O)CC(=O)c4c3C DDWFXDSYGUXRAY-UHFFFAOYSA-N 0.000 description 2
- 102100032131 Lymphocyte antigen 6E Human genes 0.000 description 2
- 241000206589 Marinobacter Species 0.000 description 2
- 241000356711 Marinobacter arcticus Species 0.000 description 2
- 241001046559 Marvinbryantia Species 0.000 description 2
- 241000043362 Megamonas Species 0.000 description 2
- 241000604449 Megasphaera Species 0.000 description 2
- 241001468189 Melissococcus Species 0.000 description 2
- 241000204677 Methanosphaera Species 0.000 description 2
- 240000003433 Miscanthus floridulus Species 0.000 description 2
- 241000509624 Mitsuokella Species 0.000 description 2
- 241000588771 Morganella <proteobacterium> Species 0.000 description 2
- 241000592260 Moritella Species 0.000 description 2
- 241000186359 Mycobacterium Species 0.000 description 2
- LRHPLDYGYMQRHN-UHFFFAOYSA-N N-Butanol Chemical compound CCCCO LRHPLDYGYMQRHN-UHFFFAOYSA-N 0.000 description 2
- 241000909297 Negativicoccus Species 0.000 description 2
- 241000383839 Novosphingobium Species 0.000 description 2
- 101710163270 Nuclease Proteins 0.000 description 2
- 241000785902 Odoribacter Species 0.000 description 2
- 241001193704 Orbus Species 0.000 description 2
- 241000089925 Oribacterium Species 0.000 description 2
- 241000108522 Owenweeksia Species 0.000 description 2
- 241000605937 Oxalobacter Species 0.000 description 2
- 241000740708 Paludibacter Species 0.000 description 2
- 241001520808 Panicum virgatum Species 0.000 description 2
- 241000588912 Pantoea agglomerans Species 0.000 description 2
- 241001446614 Papillibacter Species 0.000 description 2
- 241001267970 Paraprevotella Species 0.000 description 2
- 241001267951 Parasutterella Species 0.000 description 2
- 241000191992 Peptostreptococcus Species 0.000 description 2
- 241001464921 Phascolarctobacterium Species 0.000 description 2
- 241000607568 Photobacterium Species 0.000 description 2
- 244000082204 Phyllostachys viridis Species 0.000 description 2
- 235000015334 Phyllostachys viridis Nutrition 0.000 description 2
- 241000398992 Pilibacter Species 0.000 description 2
- 241000589952 Planctomyces Species 0.000 description 2
- 241000351212 Planomicrobium Species 0.000 description 2
- 241000607000 Plesiomonas Species 0.000 description 2
- 241000192696 Porphyrobacter Species 0.000 description 2
- 241001008590 Proteiniborus Species 0.000 description 2
- 241001161494 Proteiniphilum Species 0.000 description 2
- 241001078830 Pseudochrobactrum Species 0.000 description 2
- 241001183539 Pyramidobacter Species 0.000 description 2
- 241000588746 Raoultella planticola Species 0.000 description 2
- 241000186813 Renibacterium Species 0.000 description 2
- 241000316848 Rhodococcus <scale insect> Species 0.000 description 2
- 241000084440 Rhodopirellula Species 0.000 description 2
- 241001135259 Rikenella Species 0.000 description 2
- 241001662542 Robinsoniella Species 0.000 description 2
- 241000741591 Roseibacillus Species 0.000 description 2
- 241001453443 Rothia <bacteria> Species 0.000 description 2
- 241001400155 Rubritalea Species 0.000 description 2
- 241000888053 Saccharofermentans Species 0.000 description 2
- 241000193000 Salinicoccus Species 0.000 description 2
- 241000823229 Salinimicrobium Species 0.000 description 2
- 241000277331 Salmonidae Species 0.000 description 2
- 241001486845 Scardovia Species 0.000 description 2
- 241000543650 Schwartzia <Bacteria> Species 0.000 description 2
- 241000778933 Sedimenticola Species 0.000 description 2
- 241000199512 Sediminibacter Species 0.000 description 2
- 241000863430 Shewanella Species 0.000 description 2
- 241001571329 Solibacillus Species 0.000 description 2
- 241000549372 Solobacterium Species 0.000 description 2
- 241000383837 Sphingobium Species 0.000 description 2
- 241000736131 Sphingomonas Species 0.000 description 2
- 241001397065 Sporacetigenium Species 0.000 description 2
- 241000168515 Sporobacter Species 0.000 description 2
- 241000194049 Streptococcus equinus Species 0.000 description 2
- 244000057717 Streptococcus lactis Species 0.000 description 2
- 235000014897 Streptococcus lactis Nutrition 0.000 description 2
- 241001617354 Streptococcus lutetiensis Species 0.000 description 2
- 238000000692 Student's t-test Methods 0.000 description 2
- 241000124839 Succiniclasticum Species 0.000 description 2
- 102000004523 Sulfate Adenylyltransferase Human genes 0.000 description 2
- 108010022348 Sulfate adenylyltransferase Proteins 0.000 description 2
- NINIDFKCEFEMDL-UHFFFAOYSA-N Sulfur Chemical compound [S] NINIDFKCEFEMDL-UHFFFAOYSA-N 0.000 description 2
- 241000289862 Sulfuricella Species 0.000 description 2
- 241000580834 Sulfurospirillum Species 0.000 description 2
- 241000123710 Sutterella Species 0.000 description 2
- 101100439396 Synechococcus sp. (strain ATCC 27144 / PCC 6301 / SAUG 1402/1) groEL1 gene Proteins 0.000 description 2
- 241001656784 Syntrophococcus Species 0.000 description 2
- 241000606017 Syntrophomonas Species 0.000 description 2
- 241000158541 Syntrophus <bacteria> Species 0.000 description 2
- 241001470488 Tannerella Species 0.000 description 2
- 241001622829 Tatumella Species 0.000 description 2
- 241000205174 Thermofilum Species 0.000 description 2
- 241000194510 Thermogymnomonas Species 0.000 description 2
- 241000374781 Thermovirga Species 0.000 description 2
- 241001453270 Thiomonas Species 0.000 description 2
- 241000384691 Thorsellia Species 0.000 description 2
- 241001635318 Trichococcus Species 0.000 description 2
- 241000223259 Trichoderma Species 0.000 description 2
- 241001425419 Turicibacter Species 0.000 description 2
- 241001568331 Vampirovibrio Species 0.000 description 2
- 241001085041 Varibaculum Species 0.000 description 2
- 241001478283 Variovorax Species 0.000 description 2
- 241000703752 Victivallis Species 0.000 description 2
- 241000202221 Weissella Species 0.000 description 2
- 241000607479 Yersinia pestis Species 0.000 description 2
- 239000002253 acid Substances 0.000 description 2
- 239000000654 additive Substances 0.000 description 2
- 239000011425 bamboo Substances 0.000 description 2
- 230000004888 barrier function Effects 0.000 description 2
- 239000011324 bead Substances 0.000 description 2
- 230000008827 biological function Effects 0.000 description 2
- 239000000090 biomarker Substances 0.000 description 2
- 239000007795 chemical reaction product Substances 0.000 description 2
- 238000005352 clarification Methods 0.000 description 2
- 239000004927 clay Substances 0.000 description 2
- 238000004140 cleaning Methods 0.000 description 2
- 239000000470 constituent Substances 0.000 description 2
- 230000001186 cumulative effect Effects 0.000 description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 2
- 238000007405 data analysis Methods 0.000 description 2
- 238000000354 decomposition reaction Methods 0.000 description 2
- 238000012217 deletion Methods 0.000 description 2
- 230000037430 deletion Effects 0.000 description 2
- 238000010790 dilution Methods 0.000 description 2
- 239000012895 dilution Substances 0.000 description 2
- 230000009977 dual effect Effects 0.000 description 2
- 229910052571 earthenware Inorganic materials 0.000 description 2
- 230000002708 enhancing effect Effects 0.000 description 2
- 230000002255 enzymatic effect Effects 0.000 description 2
- 239000004744 fabric Substances 0.000 description 2
- 239000003337 fertilizer Substances 0.000 description 2
- 239000012530 fluid Substances 0.000 description 2
- 101150077981 groEL gene Proteins 0.000 description 2
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 2
- 230000036541 health Effects 0.000 description 2
- 230000020868 induced systemic resistance Effects 0.000 description 2
- 239000003317 industrial substance Substances 0.000 description 2
- 238000003780 insertion Methods 0.000 description 2
- 230000037431 insertion Effects 0.000 description 2
- 238000002955 isolation Methods 0.000 description 2
- 229940012969 lactobacillus fermentum Drugs 0.000 description 2
- 229940054346 lactobacillus helveticus Drugs 0.000 description 2
- 229940072205 lactobacillus plantarum Drugs 0.000 description 2
- 229940001882 lactobacillus reuteri Drugs 0.000 description 2
- 229920002521 macromolecule Polymers 0.000 description 2
- 108020004999 messenger RNA Proteins 0.000 description 2
- 238000002493 microarray Methods 0.000 description 2
- 238000010208 microarray analysis Methods 0.000 description 2
- 244000000010 microbial pathogen Species 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 239000013642 negative control Substances 0.000 description 2
- 235000019520 non-alcoholic beverage Nutrition 0.000 description 2
- 108091027963 non-coding RNA Proteins 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 238000013081 phylogenetic analysis Methods 0.000 description 2
- 230000035479 physiological effects, processes and functions Effects 0.000 description 2
- 102000054765 polymorphisms of proteins Human genes 0.000 description 2
- 102000040430 polynucleotide Human genes 0.000 description 2
- 108091033319 polynucleotide Proteins 0.000 description 2
- 239000002157 polynucleotide Substances 0.000 description 2
- 239000003755 preservative agent Substances 0.000 description 2
- 108090000765 processed proteins & peptides Proteins 0.000 description 2
- 102000004196 processed proteins & peptides Human genes 0.000 description 2
- 210000003705 ribosome Anatomy 0.000 description 2
- 210000004708 ribosome subunit Anatomy 0.000 description 2
- 239000013049 sediment Substances 0.000 description 2
- 238000002864 sequence alignment Methods 0.000 description 2
- 239000010865 sewage Substances 0.000 description 2
- 238000012421 spiking Methods 0.000 description 2
- 230000006641 stabilisation Effects 0.000 description 2
- 238000011105 stabilization Methods 0.000 description 2
- 229910052717 sulfur Inorganic materials 0.000 description 2
- 239000011593 sulfur Substances 0.000 description 2
- 230000002194 synthesizing effect Effects 0.000 description 2
- 230000021918 systemic acquired resistance Effects 0.000 description 2
- 238000012353 t test Methods 0.000 description 2
- 238000010257 thawing Methods 0.000 description 2
- -1 transcripts Proteins 0.000 description 2
- 230000007704 transition Effects 0.000 description 2
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 2
- 238000011144 upstream manufacturing Methods 0.000 description 2
- 239000011534 wash buffer Substances 0.000 description 2
- 239000002699 waste material Substances 0.000 description 2
- 239000002023 wood Substances 0.000 description 2
- 235000013618 yogurt Nutrition 0.000 description 2
- 108020005097 23S Ribosomal RNA Proteins 0.000 description 1
- 241000201856 Abiotrophia defectiva Species 0.000 description 1
- 241001420307 Acetanaerobacterium elongatum Species 0.000 description 1
- 241001468163 Acetobacterium woodii Species 0.000 description 1
- 241000604450 Acidaminococcus fermentans Species 0.000 description 1
- 241000132982 Acidianus brierleyi Species 0.000 description 1
- 241001528273 Acinetobacter guillouiae Species 0.000 description 1
- 241000122230 Acinetobacter junii Species 0.000 description 1
- 241000197362 Actinomyces lingnae Species 0.000 description 1
- 241000186066 Actinomyces odontolyticus Species 0.000 description 1
- 241000452716 Adlercreutzia equolifaciens Species 0.000 description 1
- 241000193798 Aerococcus Species 0.000 description 1
- 241000607528 Aeromonas hydrophila Species 0.000 description 1
- 241000567139 Aeropyrum pernix Species 0.000 description 1
- 241000222518 Agaricus Species 0.000 description 1
- 241000145068 Agreia bicolorata Species 0.000 description 1
- 241000702462 Akkermansia muciniphila Species 0.000 description 1
- 241001580959 Alistipes finegoldii Species 0.000 description 1
- 241000801627 Alistipes indistinctus Species 0.000 description 1
- 241000030713 Alistipes onderdonkii Species 0.000 description 1
- 241001135230 Alistipes putredinis Species 0.000 description 1
- 241000623794 Alistipes senegalensis Species 0.000 description 1
- 241000030716 Alistipes shahii Species 0.000 description 1
- 241001258672 Allisonella histaminiformans Species 0.000 description 1
- 241001041926 Alloscardovia Species 0.000 description 1
- 241001041927 Alloscardovia omnicolens Species 0.000 description 1
- 108700028939 Amino Acyl-tRNA Synthetases Proteins 0.000 description 1
- 102000052866 Amino Acyl-tRNA Synthetases Human genes 0.000 description 1
- 241000224489 Amoeba Species 0.000 description 1
- 241000501812 Ampelomyces Species 0.000 description 1
- 241001580965 Anaerofustis stercorihominis Species 0.000 description 1
- 241001522777 Anaerostipes butyraticus Species 0.000 description 1
- 241001505572 Anaerostipes caccae Species 0.000 description 1
- 241000428313 Anaerotruncus colihominis Species 0.000 description 1
- 108091023037 Aptamer Proteins 0.000 description 1
- 108010014885 Arginine-tRNA ligase Proteins 0.000 description 1
- 241000882105 Asaccharobacter celatus Species 0.000 description 1
- 241000235349 Ascomycota Species 0.000 description 1
- 241000228245 Aspergillus niger Species 0.000 description 1
- 241001377138 Atlantibacter subterranea Species 0.000 description 1
- 241000965595 Atopobacter Species 0.000 description 1
- 241000965596 Atopobacter phocae Species 0.000 description 1
- 241000193838 Atopobium parvulum Species 0.000 description 1
- 241000193836 Atopobium rimae Species 0.000 description 1
- 241000894008 Azorhizobium Species 0.000 description 1
- 241000589941 Azospirillum Species 0.000 description 1
- 241000589151 Azotobacter Species 0.000 description 1
- 241000193830 Bacillus <bacterium> Species 0.000 description 1
- 108091032955 Bacterial small RNA Proteins 0.000 description 1
- 241001674039 Bacteroides acidifaciens Species 0.000 description 1
- 241000168635 Bacteroides barnesiae Species 0.000 description 1
- 241000217846 Bacteroides caccae Species 0.000 description 1
- 241001032450 Bacteroides cellulosilyticus Species 0.000 description 1
- 241000801600 Bacteroides clarus Species 0.000 description 1
- 241001220439 Bacteroides coprocola Species 0.000 description 1
- 241000545821 Bacteroides coprophilus Species 0.000 description 1
- 241001105998 Bacteroides dorei Species 0.000 description 1
- 241001135322 Bacteroides eggerthii Species 0.000 description 1
- 241000337516 Bacteroides faecichinchillae Species 0.000 description 1
- 241000956551 Bacteroides faecis Species 0.000 description 1
- 241000402140 Bacteroides finegoldii Species 0.000 description 1
- 241000606124 Bacteroides fragilis Species 0.000 description 1
- 241000168642 Bacteroides gallinarum Species 0.000 description 1
- 241001109645 Bacteroides helcogenes Species 0.000 description 1
- 241000047484 Bacteroides intestinalis Species 0.000 description 1
- 241001195773 Bacteroides massiliensis Species 0.000 description 1
- 241001122266 Bacteroides nordii Species 0.000 description 1
- 241000801630 Bacteroides oleiciplenus Species 0.000 description 1
- 241001135228 Bacteroides ovatus Species 0.000 description 1
- 241000828416 Bacteroides paurosaccharolyticus Species 0.000 description 1
- 241001220441 Bacteroides plebeius Species 0.000 description 1
- 241001652281 Bacteroides rodentium Species 0.000 description 1
- 241001122267 Bacteroides salyersiae Species 0.000 description 1
- 241000911892 Bacteroides sartorii Species 0.000 description 1
- 241000337504 Bacteroides stercorirosoris Species 0.000 description 1
- 241000204294 Bacteroides stercoris Species 0.000 description 1
- 241000606123 Bacteroides thetaiotaomicron Species 0.000 description 1
- 241000606219 Bacteroides uniformis Species 0.000 description 1
- 241000606215 Bacteroides vulgatus Species 0.000 description 1
- 241000115153 Bacteroides xylanisolvens Species 0.000 description 1
- 241000260432 Barnesiella intestinihominis Species 0.000 description 1
- 241000927510 Barnesiella viscericola Species 0.000 description 1
- 241000221198 Basidiomycota Species 0.000 description 1
- 241000223679 Beauveria Species 0.000 description 1
- 235000016068 Berberis vulgaris Nutrition 0.000 description 1
- 241000335053 Beta vulgaris Species 0.000 description 1
- 241000537222 Betabaculovirus Species 0.000 description 1
- 241000186014 Bifidobacterium angulatum Species 0.000 description 1
- 241001134770 Bifidobacterium animalis Species 0.000 description 1
- 241000186016 Bifidobacterium bifidum Species 0.000 description 1
- 241000186012 Bifidobacterium breve Species 0.000 description 1
- 241000186011 Bifidobacterium catenulatum Species 0.000 description 1
- 241001495388 Bifidobacterium choerinum Species 0.000 description 1
- 241000186022 Bifidobacterium coryneforme Species 0.000 description 1
- 241000186020 Bifidobacterium dentium Species 0.000 description 1
- 241000186156 Bifidobacterium indicum Species 0.000 description 1
- 241001089584 Bifidobacterium kashiwanohense Species 0.000 description 1
- 241001608472 Bifidobacterium longum Species 0.000 description 1
- 241001312344 Bifidobacterium merycicum Species 0.000 description 1
- 241000186150 Bifidobacterium minimum Species 0.000 description 1
- 241001134772 Bifidobacterium pseudocatenulatum Species 0.000 description 1
- 241000186148 Bifidobacterium pseudolongum Species 0.000 description 1
- 241001312954 Bifidobacterium pullorum Species 0.000 description 1
- 241001312356 Bifidobacterium ruminantium Species 0.000 description 1
- 241001311520 Bifidobacterium saeculare Species 0.000 description 1
- 241000270732 Bifidobacterium saguini Species 0.000 description 1
- 241000042873 Bifidobacterium scardovii Species 0.000 description 1
- 241001051647 Bifidobacterium simiae Species 0.000 description 1
- 241000270734 Bifidobacterium stellenboschense Species 0.000 description 1
- 241001495172 Bilophila wadsworthia Species 0.000 description 1
- 241000186560 Blautia coccoides Species 0.000 description 1
- 241001449840 Blautia glucerasea Species 0.000 description 1
- 241000194002 Blautia hansenii Species 0.000 description 1
- 241000028537 Blautia luti Species 0.000 description 1
- 241000123777 Blautia obeum Species 0.000 description 1
- 241001464894 Blautia producta Species 0.000 description 1
- 241001051189 Blautia schinkii Species 0.000 description 1
- 241001038648 Blautia wexlerae Species 0.000 description 1
- 241000722885 Brettanomyces Species 0.000 description 1
- 241000206605 Brochothrix Species 0.000 description 1
- 241000206604 Brochothrix thermosphacta Species 0.000 description 1
- 241001453380 Burkholderia Species 0.000 description 1
- 241001622816 Buttiauxella gaviniae Species 0.000 description 1
- 241000132392 Butyricimonas synergistica Species 0.000 description 1
- 241000132393 Butyricimonas virosa Species 0.000 description 1
- 241000605900 Butyrivibrio fibrisolvens Species 0.000 description 1
- 241001102661 Butyrivibrio hungatei Species 0.000 description 1
- 101150001086 COB gene Proteins 0.000 description 1
- 241000589877 Campylobacter coli Species 0.000 description 1
- 241001290832 Campylobacter hominis Species 0.000 description 1
- 241000222120 Candida <Saccharomycetales> Species 0.000 description 1
- 241000189502 Candidatus Carsonella Species 0.000 description 1
- 244000025254 Cannabis sativa Species 0.000 description 1
- 235000012766 Cannabis sativa ssp. sativa var. sativa Nutrition 0.000 description 1
- 235000012765 Cannabis sativa ssp. sativa var. spontanea Nutrition 0.000 description 1
- 241001135749 Carnobacterium alterfunditum Species 0.000 description 1
- 241001443867 Catenibacterium mitsuokai Species 0.000 description 1
- 102100036568 Cell cycle and apoptosis regulator protein 2 Human genes 0.000 description 1
- ZAMOUSCENKQFHK-UHFFFAOYSA-N Chlorine atom Chemical compound [Cl] ZAMOUSCENKQFHK-UHFFFAOYSA-N 0.000 description 1
- 241000223782 Ciliophora Species 0.000 description 1
- 241001494522 Citrobacter amalonaticus Species 0.000 description 1
- 241000580513 Citrobacter braakii Species 0.000 description 1
- 241000949030 Citrobacter farmeri Species 0.000 description 1
- 241000588919 Citrobacter freundii Species 0.000 description 1
- 241000949040 Citrobacter gillenii Species 0.000 description 1
- 241000949041 Citrobacter murliniae Species 0.000 description 1
- 241001055101 Citrobacter sp. TNT4 Species 0.000 description 1
- 241000949039 Citrobacter werkmanii Species 0.000 description 1
- 241000222290 Cladosporium Species 0.000 description 1
- 241001262170 Collinsella aerofaciens Species 0.000 description 1
- 241000589518 Comamonas testosteroni Species 0.000 description 1
- 108020004635 Complementary DNA Proteins 0.000 description 1
- 241001312183 Coniothyrium Species 0.000 description 1
- 108091035707 Consensus sequence Proteins 0.000 description 1
- 241000222511 Coprinus Species 0.000 description 1
- 241001443880 Coprobacillus cateniformis Species 0.000 description 1
- 241000220677 Coprococcus catus Species 0.000 description 1
- 241000949098 Coprococcus comes Species 0.000 description 1
- 241001464949 Coprococcus eutactus Species 0.000 description 1
- 241000190633 Cordyceps Species 0.000 description 1
- 241001529717 Corticium <basidiomycota> Species 0.000 description 1
- 241000880909 Corynebacterium durum Species 0.000 description 1
- 241000989055 Cronobacter Species 0.000 description 1
- 241000989066 Cronobacter dublinensis Species 0.000 description 1
- 241001135265 Cronobacter sakazakii Species 0.000 description 1
- 241000988642 Cronobacter turicensis Species 0.000 description 1
- 241001657377 Cryptobacterium Species 0.000 description 1
- 241001657376 Cryptobacterium curtum Species 0.000 description 1
- 241000203813 Curtobacterium Species 0.000 description 1
- 241000186427 Cutibacterium acnes Species 0.000 description 1
- 102000004403 Cysteine-tRNA ligases Human genes 0.000 description 1
- 108090000918 Cysteine-tRNA ligases Proteins 0.000 description 1
- FBPFZTCFMRRESA-KVTDHHQDSA-N D-Mannitol Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)[C@H](O)CO FBPFZTCFMRRESA-KVTDHHQDSA-N 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 102100028559 Death domain-associated protein 6 Human genes 0.000 description 1
- 241001600129 Delftia Species 0.000 description 1
- 241000605739 Desulfovibrio desulfuricans Species 0.000 description 1
- 241000168528 Desulfovibrio fairfieldensis Species 0.000 description 1
- 241000604463 Desulfovibrio piger Species 0.000 description 1
- QSJXEFYPDANLFS-UHFFFAOYSA-N Diacetyl Chemical group CC(=O)C(C)=O QSJXEFYPDANLFS-UHFFFAOYSA-N 0.000 description 1
- 241001624700 Dialister invisus Species 0.000 description 1
- 241000260433 Dialister succinatiphilus Species 0.000 description 1
- 241001531200 Dorea formicigenerans Species 0.000 description 1
- 241000016537 Dorea longicatena Species 0.000 description 1
- 241001277594 Duganella Species 0.000 description 1
- 241000024398 Dysgonomonas gadei Species 0.000 description 1
- 241000607473 Edwardsiella <enterobacteria> Species 0.000 description 1
- 241000607471 Edwardsiella tarda Species 0.000 description 1
- 241001657508 Eggerthella lenta Species 0.000 description 1
- 241000988238 Eggerthella sinensis Species 0.000 description 1
- 241000881810 Enterobacter asburiae Species 0.000 description 1
- 241000982938 Enterobacter cancerogenus Species 0.000 description 1
- 241000588697 Enterobacter cloacae Species 0.000 description 1
- 241000043309 Enterobacter hormaechei Species 0.000 description 1
- 241001245440 Enterobacter kobei Species 0.000 description 1
- 241001217893 Enterobacter ludwigii Species 0.000 description 1
- 241000305071 Enterobacterales Species 0.000 description 1
- 241000588921 Enterobacteriaceae Species 0.000 description 1
- 241000580502 Enterococcus asini Species 0.000 description 1
- 241001468179 Enterococcus avium Species 0.000 description 1
- 241001643761 Enterococcus azikeevi Species 0.000 description 1
- 241001584862 Enterococcus canis Species 0.000 description 1
- 241001343527 Enterococcus devriesei Species 0.000 description 1
- 241000178337 Enterococcus dispar Species 0.000 description 1
- 241000520130 Enterococcus durans Species 0.000 description 1
- 241000194032 Enterococcus faecalis Species 0.000 description 1
- 241000194031 Enterococcus faecium Species 0.000 description 1
- 241000194030 Enterococcus gallinarum Species 0.000 description 1
- 241001059855 Enterococcus hermanniensis Species 0.000 description 1
- 241000194029 Enterococcus hirae Species 0.000 description 1
- 241001026002 Enterococcus italicus Species 0.000 description 1
- 241001106597 Enterococcus lactis Species 0.000 description 1
- 241001235140 Enterococcus malodoratus Species 0.000 description 1
- 241000009792 Enterococcus moraviensis Species 0.000 description 1
- 241000520134 Enterococcus mundtii Species 0.000 description 1
- 241000178338 Enterococcus pseudoavium Species 0.000 description 1
- 241001235138 Enterococcus raffinosus Species 0.000 description 1
- 241000238765 Enterococcus rotai Species 0.000 description 1
- 241000134765 Enterococcus saccharolyticus Species 0.000 description 1
- 241000194027 Enterococcus sulfureus Species 0.000 description 1
- 241001026958 Enterococcus thailandicus Species 0.000 description 1
- 241000911891 Enterorhabdus caecimuris Species 0.000 description 1
- 241000556426 Erwinia rhapontici Species 0.000 description 1
- 241000400604 Erwinia tasmaniensis Species 0.000 description 1
- 241001240954 Escherichia albertii Species 0.000 description 1
- 241000588720 Escherichia fergusonii Species 0.000 description 1
- 241001350691 Ethanoligenens harbinense Species 0.000 description 1
- 241000520740 Eubacterium callanderi Species 0.000 description 1
- 241000186398 Eubacterium limosum Species 0.000 description 1
- 108060002716 Exonuclease Proteins 0.000 description 1
- 241000605980 Faecalibacterium prausnitzii Species 0.000 description 1
- 241001134569 Flavonifractor plautii Species 0.000 description 1
- BDAGIHXWWSANSR-UHFFFAOYSA-M Formate Chemical compound [O-]C=O BDAGIHXWWSANSR-UHFFFAOYSA-M 0.000 description 1
- 241000589601 Francisella Species 0.000 description 1
- 241000245629 Francisella noatunensis Species 0.000 description 1
- 241000605986 Fusobacterium nucleatum Species 0.000 description 1
- 241000192128 Gammaproteobacteria Species 0.000 description 1
- 241000207201 Gardnerella vaginalis Species 0.000 description 1
- 241001223495 Gemmiger formicilis Species 0.000 description 1
- 241001205828 Gilliamella Species 0.000 description 1
- 241001486261 Gordonibacter pamelaeae Species 0.000 description 1
- 241000201858 Granulicatella adiacens Species 0.000 description 1
- 241001620099 Granulicatella para-adiacens Species 0.000 description 1
- 241000588729 Hafnia alvei Species 0.000 description 1
- 241000206596 Halomonas Species 0.000 description 1
- 241000605014 Herbaspirillum seropedicae Species 0.000 description 1
- 241001051780 Hespellia porcina Species 0.000 description 1
- 241001052168 Hespellia stercorisuis Species 0.000 description 1
- 241000238631 Hexapoda Species 0.000 description 1
- 241000862470 Holdemania filiformis Species 0.000 description 1
- 102100038145 Homeobox protein goosecoid-2 Human genes 0.000 description 1
- 101710150873 Homeobox protein goosecoid-2 Proteins 0.000 description 1
- 241000282412 Homo Species 0.000 description 1
- 101000715194 Homo sapiens Cell cycle and apoptosis regulator protein 2 Proteins 0.000 description 1
- 101000915428 Homo sapiens Death domain-associated protein 6 Proteins 0.000 description 1
- 101000975428 Homo sapiens Inositol 1,4,5-trisphosphate receptor type 1 Proteins 0.000 description 1
- 101100126680 Homo sapiens KRT78 gene Proteins 0.000 description 1
- 101000777301 Homo sapiens Uteroglobin Proteins 0.000 description 1
- 241000623265 Howardella ureilytica Species 0.000 description 1
- 241000339806 Hydrogenoanaerobacterium Species 0.000 description 1
- 241001529428 Hydrogenoanaerobacterium saccharovorans Species 0.000 description 1
- 102100024039 Inositol 1,4,5-trisphosphate receptor type 1 Human genes 0.000 description 1
- 108091029795 Intergenic region Proteins 0.000 description 1
- 102100023969 Keratin, type II cytoskeletal 78 Human genes 0.000 description 1
- 241001534216 Klebsiella granulomatis Species 0.000 description 1
- 241000259812 Klebsiella milletis Species 0.000 description 1
- 241000588747 Klebsiella pneumoniae Species 0.000 description 1
- 241001014264 Klebsiella variicola Species 0.000 description 1
- 241000588773 Kluyvera cryocrescens Species 0.000 description 1
- 241001245439 Kosakonia cowanii Species 0.000 description 1
- 241000630162 Kosakonia oryzae Species 0.000 description 1
- 240000001046 Lactobacillus acidophilus Species 0.000 description 1
- 235000013956 Lactobacillus acidophilus Nutrition 0.000 description 1
- 240000001929 Lactobacillus brevis Species 0.000 description 1
- 235000013957 Lactobacillus brevis Nutrition 0.000 description 1
- 244000199866 Lactobacillus casei Species 0.000 description 1
- 235000013958 Lactobacillus casei Nutrition 0.000 description 1
- 241000218492 Lactobacillus crispatus Species 0.000 description 1
- 241001134659 Lactobacillus curvatus Species 0.000 description 1
- 241000186673 Lactobacillus delbrueckii Species 0.000 description 1
- 241000881808 Lelliottia amnigena Species 0.000 description 1
- 108010071170 Leucine-tRNA ligase Proteins 0.000 description 1
- 102100023342 Leucine-tRNA ligase, mitochondrial Human genes 0.000 description 1
- 108090000364 Ligases Proteins 0.000 description 1
- 102000003960 Ligases Human genes 0.000 description 1
- 101150053771 MT-CYB gene Proteins 0.000 description 1
- 240000003183 Manihot esculenta Species 0.000 description 1
- 235000016735 Manihot esculenta subsp esculenta Nutrition 0.000 description 1
- 229930195725 Mannitol Natural products 0.000 description 1
- 241000223201 Metarhizium Species 0.000 description 1
- 241000202974 Methanobacterium Species 0.000 description 1
- 241000607174 Methanobacterium subterraneum Species 0.000 description 1
- 241001531418 Methanobrevibacter arboriphilus Species 0.000 description 1
- 241000586182 Methanobrevibacter millerae Species 0.000 description 1
- 241001160906 Methanobrevibacter olleyae Species 0.000 description 1
- 241000936900 Methanobrevibacter oralis Species 0.000 description 1
- 241000202985 Methanobrevibacter smithii Species 0.000 description 1
- 101100038261 Methanococcus vannielii (strain ATCC 35089 / DSM 1224 / JCM 13029 / OCM 148 / SB) rpo2C gene Proteins 0.000 description 1
- 241000204676 Methanosphaera stadtmanae Species 0.000 description 1
- 241000342654 Methylobacterium adhaesivum Species 0.000 description 1
- 241000357226 Methylobacterium oryzae Species 0.000 description 1
- 241001430258 Methylobacterium radiotolerans Species 0.000 description 1
- 241001467578 Microbacterium Species 0.000 description 1
- 108091092878 Microsatellite Proteins 0.000 description 1
- 241001123604 Mitsuokella jalaludinii Species 0.000 description 1
- 238000000342 Monte Carlo simulation Methods 0.000 description 1
- 241000588772 Morganella morganii Species 0.000 description 1
- 241000235575 Mortierella Species 0.000 description 1
- 241000115065 Moryella indoligenes Species 0.000 description 1
- 241000235395 Mucor Species 0.000 description 1
- 241000187479 Mycobacterium tuberculosis Species 0.000 description 1
- 241000204031 Mycoplasma Species 0.000 description 1
- 238000005481 NMR spectroscopy Methods 0.000 description 1
- 241000244206 Nematoda Species 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 244000061176 Nicotiana tabacum Species 0.000 description 1
- 235000002637 Nicotiana tabacum Nutrition 0.000 description 1
- 241000605122 Nitrosomonas Species 0.000 description 1
- 241000605120 Nitrosomonas eutropha Species 0.000 description 1
- 241000801628 Odoribacter laneus Species 0.000 description 1
- 241001135232 Odoribacter splanchnicus Species 0.000 description 1
- 108020005187 Oligonucleotide Probes Proteins 0.000 description 1
- 241000233654 Oomycetes Species 0.000 description 1
- 241000605936 Oxalobacter formigenes Species 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 241000179039 Paenibacillus Species 0.000 description 1
- 240000004371 Panax ginseng Species 0.000 description 1
- 241001480342 Pantoea eucalypti Species 0.000 description 1
- 241001446611 Papillibacter cinnamivorans Species 0.000 description 1
- 241000606210 Parabacteroides distasonis Species 0.000 description 1
- 241000030714 Parabacteroides goldsteinii Species 0.000 description 1
- 241001216709 Parabacteroides gordonii Species 0.000 description 1
- 241000543747 Parabacteroides johnsonii Species 0.000 description 1
- 241000204306 Parabacteroides merdae Species 0.000 description 1
- 241000583469 Paraeggerthella hongkongensis Species 0.000 description 1
- 241000789910 Paraprevotella clara Species 0.000 description 1
- 241000789906 Paraprevotella xylaniphila Species 0.000 description 1
- 241000260425 Parasutterella excrementihominis Species 0.000 description 1
- 241000531155 Pectobacterium Species 0.000 description 1
- 241000588701 Pectobacterium carotovorum Species 0.000 description 1
- 241000556406 Pectobacterium wasabiae Species 0.000 description 1
- 241000937146 Pedobacter daechungensis Species 0.000 description 1
- 241000228143 Penicillium Species 0.000 description 1
- 241000192035 Peptostreptococcus anaerobius Species 0.000 description 1
- 241000684246 Peptostreptococcus stomatis Species 0.000 description 1
- 241001464924 Phascolarctobacterium faecium Species 0.000 description 1
- 241000425347 Phyla <beetle> Species 0.000 description 1
- 240000008299 Pinus lambertiana Species 0.000 description 1
- 241001112744 Planococcaceae Species 0.000 description 1
- 241000219000 Populus Species 0.000 description 1
- 241001135211 Porphyromonas asaccharolytica Species 0.000 description 1
- 241000302292 Porphyromonas bennonis Species 0.000 description 1
- 241001635667 Porphyromonas somerae Species 0.000 description 1
- 241001135215 Prevotella bivia Species 0.000 description 1
- 241001135206 Prevotella buccalis Species 0.000 description 1
- 241000385060 Prevotella copri Species 0.000 description 1
- 102100036134 Probable arginine-tRNA ligase, mitochondrial Human genes 0.000 description 1
- 241000186428 Propionibacterium freudenreichii Species 0.000 description 1
- 241000588733 Pseudescherichia vulneris Species 0.000 description 1
- 241000202384 Pseudobutyrivibrio ruminis Species 0.000 description 1
- 241001528479 Pseudoflavonifractor capillosus Species 0.000 description 1
- 241000589517 Pseudomonas aeruginosa Species 0.000 description 1
- 241000204715 Pseudomonas agarici Species 0.000 description 1
- 241000218935 Pseudomonas azotoformans Species 0.000 description 1
- 241000180027 Pseudomonas cedrina Species 0.000 description 1
- 241000218936 Pseudomonas corrugata Species 0.000 description 1
- 241000121962 Pseudomonas cuatrocienegasensis Species 0.000 description 1
- 241000706677 Pseudomonas deceptionensis Species 0.000 description 1
- 241000429405 Pseudomonas extremorientalis Species 0.000 description 1
- 241000589540 Pseudomonas fluorescens Species 0.000 description 1
- 241000589538 Pseudomonas fragi Species 0.000 description 1
- 241000231049 Pseudomonas gingeri Species 0.000 description 1
- 241001300822 Pseudomonas jessenii Species 0.000 description 1
- 241001277052 Pseudomonas libanensis Species 0.000 description 1
- 241001670039 Pseudomonas lundensis Species 0.000 description 1
- 241000394642 Pseudomonas marginalis pv. marginalis Species 0.000 description 1
- 241001144909 Pseudomonas poae Species 0.000 description 1
- 241000530526 Pseudomonas psychrophila Species 0.000 description 1
- 241000589776 Pseudomonas putida Species 0.000 description 1
- 241000589614 Pseudomonas stutzeri Species 0.000 description 1
- 241000589615 Pseudomonas syringae Species 0.000 description 1
- 241000218903 Pseudomonas taetrolens Species 0.000 description 1
- 241001148199 Pseudomonas tolaasii Species 0.000 description 1
- 241001144907 Pseudomonas trivialis Species 0.000 description 1
- 241001183540 Pyramidobacter piscolens Species 0.000 description 1
- 235000014443 Pyrus communis Nutrition 0.000 description 1
- 241000233639 Pythium Species 0.000 description 1
- 238000002123 RNA extraction Methods 0.000 description 1
- 238000010802 RNA extraction kit Methods 0.000 description 1
- 241001478271 Rahnella aquatilis Species 0.000 description 1
- 241000589194 Rhizobium leguminosarum Species 0.000 description 1
- 241000187561 Rhodococcus erythropolis Species 0.000 description 1
- 102000006382 Ribonucleases Human genes 0.000 description 1
- 108010083644 Ribonucleases Proteins 0.000 description 1
- 241001478225 Riemerella Species 0.000 description 1
- 241001478212 Riemerella anatipestifer Species 0.000 description 1
- 241001052237 Robinsoniella peoriensis Species 0.000 description 1
- 241000605944 Roseburia cecicola Species 0.000 description 1
- 241000872831 Roseburia faecis Species 0.000 description 1
- 241000872832 Roseburia hominis Species 0.000 description 1
- 241000398180 Roseburia intestinalis Species 0.000 description 1
- 241001394655 Roseburia inulinivorans Species 0.000 description 1
- 241000282849 Ruminantia Species 0.000 description 1
- 241000192029 Ruminococcus albus Species 0.000 description 1
- 241000123753 Ruminococcus bromii Species 0.000 description 1
- 241000123754 Ruminococcus callidus Species 0.000 description 1
- 241000061145 Ruminococcus champanellensis Species 0.000 description 1
- 241000192026 Ruminococcus flavefaciens Species 0.000 description 1
- 241000100220 Ruminococcus gauvreauii Species 0.000 description 1
- 241000202356 Ruminococcus lactaris Species 0.000 description 1
- 241000235070 Saccharomyces Species 0.000 description 1
- 101100422768 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) SUL2 gene Proteins 0.000 description 1
- 241000218998 Salicaceae Species 0.000 description 1
- 241001138501 Salmonella enterica Species 0.000 description 1
- 241000531795 Salmonella enterica subsp. enterica serovar Paratyphi A Species 0.000 description 1
- 238000012300 Sequence Analysis Methods 0.000 description 1
- 241000644453 Serratia aquatilis Species 0.000 description 1
- 241000881765 Serratia ficaria Species 0.000 description 1
- 241000218654 Serratia fonticola Species 0.000 description 1
- 241001622810 Serratia grimesii Species 0.000 description 1
- 241000607717 Serratia liquefaciens Species 0.000 description 1
- 241000607715 Serratia marcescens Species 0.000 description 1
- 241001622809 Serratia plymuthica Species 0.000 description 1
- 241001135258 Serratia proteamaculans Species 0.000 description 1
- 241000975299 Serratia quinivorans Species 0.000 description 1
- 241000878021 Shewanella baltica Species 0.000 description 1
- 241000607766 Shigella boydii Species 0.000 description 1
- 241000607764 Shigella dysenteriae Species 0.000 description 1
- 241000607762 Shigella flexneri Species 0.000 description 1
- 241000607760 Shigella sonnei Species 0.000 description 1
- 108020004682 Single-Stranded DNA Proteins 0.000 description 1
- 241001135312 Sinorhizobium Species 0.000 description 1
- 241001191217 Slackia isoflavoniconvertens Species 0.000 description 1
- 244000061456 Solanum tuberosum Species 0.000 description 1
- 235000002595 Solanum tuberosum Nutrition 0.000 description 1
- 241001464874 Solobacterium moorei Species 0.000 description 1
- 241001446760 Sporobacterium Species 0.000 description 1
- 241001446757 Sporobacterium olearium Species 0.000 description 1
- 241000191963 Staphylococcus epidermidis Species 0.000 description 1
- 108010090804 Streptavidin Proteins 0.000 description 1
- 241000193985 Streptococcus agalactiae Species 0.000 description 1
- 241001147738 Streptococcus alactolyticus Species 0.000 description 1
- 241000194008 Streptococcus anginosus Species 0.000 description 1
- 241001291896 Streptococcus constellatus Species 0.000 description 1
- 241000194043 Streptococcus criceti Species 0.000 description 1
- 241000191981 Streptococcus cristatus Species 0.000 description 1
- 241000193992 Streptococcus downei Species 0.000 description 1
- 241000194042 Streptococcus dysgalactiae Species 0.000 description 1
- 241000194048 Streptococcus equi Species 0.000 description 1
- 241001288016 Streptococcus gallolyticus Species 0.000 description 1
- 241000194026 Streptococcus gordonii Species 0.000 description 1
- 241001473878 Streptococcus infantarius Species 0.000 description 1
- 241000194046 Streptococcus intermedius Species 0.000 description 1
- 241001134658 Streptococcus mitis Species 0.000 description 1
- 241000194019 Streptococcus mutans Species 0.000 description 1
- 241000194025 Streptococcus oralis Species 0.000 description 1
- 241000193991 Streptococcus parasanguinis Species 0.000 description 1
- 241000252846 Streptococcus phocae Species 0.000 description 1
- 241000193998 Streptococcus pneumoniae Species 0.000 description 1
- 241000194053 Streptococcus porcinus Species 0.000 description 1
- 241000193996 Streptococcus pyogenes Species 0.000 description 1
- 241000194024 Streptococcus salivarius Species 0.000 description 1
- 241000194023 Streptococcus sanguinis Species 0.000 description 1
- 241000193987 Streptococcus sobrinus Species 0.000 description 1
- 241000194021 Streptococcus suis Species 0.000 description 1
- 241000194020 Streptococcus thermophilus Species 0.000 description 1
- 241000194051 Streptococcus vestibularis Species 0.000 description 1
- 241000318923 Streptomyces malaysiensis Species 0.000 description 1
- 241001580973 Subdoligranulum variabile Species 0.000 description 1
- 241000123713 Sutterella wadsworthensis Species 0.000 description 1
- 241000204007 Syntrophomonas bryantii Species 0.000 description 1
- 241000120020 Tela Species 0.000 description 1
- 244000299461 Theobroma cacao Species 0.000 description 1
- 235000009470 Theobroma cacao Nutrition 0.000 description 1
- 241001678097 Turicibacter sanguinis Species 0.000 description 1
- 101150013568 US16 gene Proteins 0.000 description 1
- 102100031083 Uteroglobin Human genes 0.000 description 1
- 241001533204 Veillonella dispar Species 0.000 description 1
- 241001148135 Veillonella parvula Species 0.000 description 1
- 241001592639 Veillonella tobetsuensis Species 0.000 description 1
- 241000703751 Victivallis vadensis Species 0.000 description 1
- 241001648876 Wandonia Species 0.000 description 1
- 241001216854 Wandonia haliotis Species 0.000 description 1
- 241000975185 Weissella cibaria Species 0.000 description 1
- 241000186675 Weissella confusa Species 0.000 description 1
- 241000399000 Weissella oryzae Species 0.000 description 1
- 241001148126 Yersinia aldovae Species 0.000 description 1
- 241000063704 Yersinia aleksiciae Species 0.000 description 1
- 241000607475 Yersinia bercovieri Species 0.000 description 1
- 241000607447 Yersinia enterocolitica Species 0.000 description 1
- 241000290086 Yersinia entomophaga Species 0.000 description 1
- 241001148127 Yersinia frederiksenii Species 0.000 description 1
- 241000607481 Yersinia intermedia Species 0.000 description 1
- 241001135251 Yersinia kristensenii Species 0.000 description 1
- 241001050874 Yersinia massiliensis Species 0.000 description 1
- 241001464926 Yersinia mollaretii Species 0.000 description 1
- 241001321500 Yersinia nurmii Species 0.000 description 1
- 241000622923 Yersinia pekkanenii Species 0.000 description 1
- 241000607477 Yersinia pseudotuberculosis Species 0.000 description 1
- 241001148128 Yersinia rohdei Species 0.000 description 1
- 241001148129 Yersinia ruckeri Species 0.000 description 1
- 240000008042 Zea mays Species 0.000 description 1
- 235000005824 Zea mays ssp. parviglumis Nutrition 0.000 description 1
- 235000002017 Zea mays subsp mays Nutrition 0.000 description 1
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 1
- 241000557616 [Eubacterium] infirmum Species 0.000 description 1
- 241000498616 [Eubacterium] saphenum Species 0.000 description 1
- 241001282046 [Eubacterium] sulci Species 0.000 description 1
- 241000509594 [Hallella] seregens Species 0.000 description 1
- CCPIKNHZOWQALM-DLQJRSQOSA-N [[(2r,3s,5r)-5-(6-aminopurin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphinothioyl] phosphono hydrogen phosphate Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=S)OP(O)(=O)OP(O)(O)=O)O1 CCPIKNHZOWQALM-DLQJRSQOSA-N 0.000 description 1
- 238000009825 accumulation Methods 0.000 description 1
- 150000007513 acids Chemical class 0.000 description 1
- 239000008186 active pharmaceutical agent Substances 0.000 description 1
- 230000000996 additive effect Effects 0.000 description 1
- 229960000643 adenine Drugs 0.000 description 1
- 239000000443 aerosol Substances 0.000 description 1
- 230000032683 aging Effects 0.000 description 1
- 235000013334 alcoholic beverage Nutrition 0.000 description 1
- 238000000137 annealing Methods 0.000 description 1
- 230000008485 antagonism Effects 0.000 description 1
- 239000012298 atmosphere Substances 0.000 description 1
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 1
- 235000013405 beer Nutrition 0.000 description 1
- 229940118852 bifidobacterium animalis Drugs 0.000 description 1
- 229940002008 bifidobacterium bifidum Drugs 0.000 description 1
- 229940009291 bifidobacterium longum Drugs 0.000 description 1
- 239000012148 binding buffer Substances 0.000 description 1
- 239000012472 biological sample Substances 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000005460 biophysical method Methods 0.000 description 1
- 235000020279 black tea Nutrition 0.000 description 1
- 239000007844 bleaching agent Substances 0.000 description 1
- 235000008429 bread Nutrition 0.000 description 1
- 239000006227 byproduct Substances 0.000 description 1
- 238000010804 cDNA synthesis Methods 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 235000009120 camo Nutrition 0.000 description 1
- 235000014633 carbohydrates Nutrition 0.000 description 1
- 150000001720 carbohydrates Chemical class 0.000 description 1
- 230000003197 catalytic effect Effects 0.000 description 1
- 238000006555 catalytic reaction Methods 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 235000013339 cereals Nutrition 0.000 description 1
- 235000005607 chanvre indien Nutrition 0.000 description 1
- 238000012993 chemical processing Methods 0.000 description 1
- 238000012824 chemical production Methods 0.000 description 1
- 239000000460 chlorine Substances 0.000 description 1
- 229910052801 chlorine Inorganic materials 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 235000016213 coffee Nutrition 0.000 description 1
- 235000013353 coffee beverage Nutrition 0.000 description 1
- 238000004040 coloring Methods 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 230000002860 competitive effect Effects 0.000 description 1
- 230000006835 compression Effects 0.000 description 1
- 238000007906 compression Methods 0.000 description 1
- 238000010205 computational analysis Methods 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 239000012141 concentrate Substances 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 235000005822 corn Nutrition 0.000 description 1
- 238000012937 correction Methods 0.000 description 1
- 244000038559 crop plants Species 0.000 description 1
- 101150006264 ctb-1 gene Proteins 0.000 description 1
- 238000012258 culturing Methods 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 1
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 description 1
- 235000013365 dairy product Nutrition 0.000 description 1
- 238000013523 data management Methods 0.000 description 1
- 238000013079 data visualisation Methods 0.000 description 1
- 238000005202 decontamination Methods 0.000 description 1
- 230000003588 decontaminative effect Effects 0.000 description 1
- 230000002939 deleterious effect Effects 0.000 description 1
- 239000003599 detergent Substances 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 235000019621 digestibility Nutrition 0.000 description 1
- XPPKVPWEQAFLFU-UHFFFAOYSA-J diphosphate(4-) Chemical compound [O-]P([O-])(=O)OP([O-])([O-])=O XPPKVPWEQAFLFU-UHFFFAOYSA-J 0.000 description 1
- 235000011180 diphosphates Nutrition 0.000 description 1
- 238000004821 distillation Methods 0.000 description 1
- 238000005315 distribution function Methods 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 238000010828 elution Methods 0.000 description 1
- 239000012149 elution buffer Substances 0.000 description 1
- 239000000839 emulsion Substances 0.000 description 1
- 229940092559 enterobacter aerogenes Drugs 0.000 description 1
- 229940032049 enterococcus faecalis Drugs 0.000 description 1
- 102000013165 exonuclease Human genes 0.000 description 1
- 239000000835 fiber Substances 0.000 description 1
- LIYGYAHYXQDGEP-UHFFFAOYSA-N firefly oxyluciferin Natural products Oc1csc(n1)-c1nc2ccc(O)cc2s1 LIYGYAHYXQDGEP-UHFFFAOYSA-N 0.000 description 1
- 239000000796 flavoring agent Substances 0.000 description 1
- 235000019634 flavors Nutrition 0.000 description 1
- 230000002431 foraging effect Effects 0.000 description 1
- 235000011389 fruit/vegetable juice Nutrition 0.000 description 1
- 101150111615 ftsZ gene Proteins 0.000 description 1
- 239000007789 gas Substances 0.000 description 1
- 238000002309 gasification Methods 0.000 description 1
- 238000012100 gene-based analysis Methods 0.000 description 1
- 238000012252 genetic analysis Methods 0.000 description 1
- 230000007614 genetic variation Effects 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- 150000004676 glycans Chemical class 0.000 description 1
- 235000019674 grape juice Nutrition 0.000 description 1
- 235000009569 green tea Nutrition 0.000 description 1
- 239000001963 growth medium Substances 0.000 description 1
- 244000005709 gut microbiome Species 0.000 description 1
- 239000011487 hemp Substances 0.000 description 1
- 244000005702 human microbiome Species 0.000 description 1
- 238000009396 hybridization Methods 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 239000012770 industrial material Substances 0.000 description 1
- 239000004434 industrial solvent Substances 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- 238000011081 inoculation Methods 0.000 description 1
- 239000002054 inoculum Substances 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 230000009545 invasion Effects 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 239000004922 lacquer Substances 0.000 description 1
- 229940039695 lactobacillus acidophilus Drugs 0.000 description 1
- 229940017800 lactobacillus casei Drugs 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 238000007477 logistic regression Methods 0.000 description 1
- 238000012423 maintenance Methods 0.000 description 1
- 238000007726 management method Methods 0.000 description 1
- 239000000594 mannitol Substances 0.000 description 1
- 235000010355 mannitol Nutrition 0.000 description 1
- 238000004949 mass spectrometry Methods 0.000 description 1
- 238000007620 mathematical function Methods 0.000 description 1
- 238000000691 measurement method Methods 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- 230000005055 memory storage Effects 0.000 description 1
- 238000010197 meta-analysis Methods 0.000 description 1
- 230000002503 metabolic effect Effects 0.000 description 1
- 239000004005 microsphere Substances 0.000 description 1
- 235000013336 milk Nutrition 0.000 description 1
- 239000008267 milk Substances 0.000 description 1
- 210000004080 milk Anatomy 0.000 description 1
- 230000003278 mimic effect Effects 0.000 description 1
- 210000003470 mitochondria Anatomy 0.000 description 1
- 230000002438 mitochondrial effect Effects 0.000 description 1
- 230000004879 molecular function Effects 0.000 description 1
- 150000002772 monosaccharides Chemical class 0.000 description 1
- 229940076266 morganella morganii Drugs 0.000 description 1
- 101150088166 mt:Cyt-b gene Proteins 0.000 description 1
- 210000003205 muscle Anatomy 0.000 description 1
- 238000002703 mutagenesis Methods 0.000 description 1
- 231100000350 mutagenesis Toxicity 0.000 description 1
- 230000035772 mutation Effects 0.000 description 1
- 230000007935 neutral effect Effects 0.000 description 1
- 230000009022 nonlinear effect Effects 0.000 description 1
- 239000002751 oligonucleotide probe Substances 0.000 description 1
- 239000005416 organic matter Substances 0.000 description 1
- OFPXSFXSNFPTHF-UHFFFAOYSA-N oxaprozin Chemical compound O1C(CCC(=O)O)=NC(C=2C=CC=CC=2)=C1C1=CC=CC=C1 OFPXSFXSNFPTHF-UHFFFAOYSA-N 0.000 description 1
- JJVOROULKOMTKG-UHFFFAOYSA-N oxidized Photinus luciferin Chemical compound S1C2=CC(O)=CC=C2N=C1C1=NC(=O)CS1 JJVOROULKOMTKG-UHFFFAOYSA-N 0.000 description 1
- 239000001301 oxygen Substances 0.000 description 1
- 229910052760 oxygen Inorganic materials 0.000 description 1
- 239000003973 paint Substances 0.000 description 1
- 230000001575 pathological effect Effects 0.000 description 1
- 229940074571 peptostreptococcus anaerobius Drugs 0.000 description 1
- 230000035699 permeability Effects 0.000 description 1
- 150000002989 phenols Chemical class 0.000 description 1
- 239000003016 pheromone Substances 0.000 description 1
- 239000002367 phosphate rock Substances 0.000 description 1
- 238000013439 planning Methods 0.000 description 1
- 230000008635 plant growth Effects 0.000 description 1
- 229920000642 polymer Polymers 0.000 description 1
- 229920001184 polypeptide Polymers 0.000 description 1
- 229920001282 polysaccharide Polymers 0.000 description 1
- 239000005017 polysaccharide Substances 0.000 description 1
- 239000000843 powder Substances 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 230000002335 preservative effect Effects 0.000 description 1
- ULWHHBHJGPPBCO-UHFFFAOYSA-N propane-1,1-diol Chemical class CCC(O)O ULWHHBHJGPPBCO-UHFFFAOYSA-N 0.000 description 1
- 229940055019 propionibacterium acne Drugs 0.000 description 1
- 238000013138 pruning Methods 0.000 description 1
- 239000008213 purified water Substances 0.000 description 1
- 238000001303 quality assessment method Methods 0.000 description 1
- 238000011002 quantification Methods 0.000 description 1
- 239000011541 reaction mixture Substances 0.000 description 1
- 230000002829 reductive effect Effects 0.000 description 1
- 239000013643 reference control Substances 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 238000010839 reverse transcription Methods 0.000 description 1
- 239000011435 rock Substances 0.000 description 1
- 101150085857 rpo2 gene Proteins 0.000 description 1
- 101150090202 rpoB gene Proteins 0.000 description 1
- 210000004767 rumen Anatomy 0.000 description 1
- 239000004576 sand Substances 0.000 description 1
- 238000007480 sanger sequencing Methods 0.000 description 1
- 230000007017 scission Effects 0.000 description 1
- 239000013535 sea water Substances 0.000 description 1
- 238000007789 sealing Methods 0.000 description 1
- 238000010187 selection method Methods 0.000 description 1
- 238000011451 sequencing strategy Methods 0.000 description 1
- 229940115939 shigella sonnei Drugs 0.000 description 1
- 239000000377 silicon dioxide Substances 0.000 description 1
- 150000003384 small molecules Chemical class 0.000 description 1
- 239000011122 softwood Substances 0.000 description 1
- 238000004856 soil analysis Methods 0.000 description 1
- 239000004016 soil organic matter Substances 0.000 description 1
- 239000007790 solid phase Substances 0.000 description 1
- 239000002904 solvent Substances 0.000 description 1
- 238000000527 sonication Methods 0.000 description 1
- 235000021262 sour milk Nutrition 0.000 description 1
- 125000006850 spacer group Chemical group 0.000 description 1
- 238000009987 spinning Methods 0.000 description 1
- 238000005507 spraying Methods 0.000 description 1
- 229910001220 stainless steel Inorganic materials 0.000 description 1
- 239000010935 stainless steel Substances 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 239000000021 stimulant Substances 0.000 description 1
- 229940115920 streptococcus dysgalactiae Drugs 0.000 description 1
- 229940115921 streptococcus equinus Drugs 0.000 description 1
- 229940031000 streptococcus pneumoniae Drugs 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 230000001629 suppression Effects 0.000 description 1
- 239000002352 surface water Substances 0.000 description 1
- 230000004083 survival effect Effects 0.000 description 1
- 229920001864 tannin Polymers 0.000 description 1
- 235000018553 tannin Nutrition 0.000 description 1
- 239000001648 tannin Substances 0.000 description 1
- 230000008685 targeting Effects 0.000 description 1
- 230000001225 therapeutic effect Effects 0.000 description 1
- 108010037277 thymic shared antigen-1 Proteins 0.000 description 1
- 210000001519 tissue Anatomy 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 239000001226 triphosphate Substances 0.000 description 1
- 235000011178 triphosphate Nutrition 0.000 description 1
- 125000002264 triphosphate group Chemical class [H]OP(=O)(O[H])OP(=O)(O[H])OP(=O)(O[H])O* 0.000 description 1
- 235000013311 vegetables Nutrition 0.000 description 1
- 239000000052 vinegar Substances 0.000 description 1
- 235000021419 vinegar Nutrition 0.000 description 1
- 238000003260 vortexing Methods 0.000 description 1
- 238000012070 whole genome sequencing analysis Methods 0.000 description 1
- 229940098232 yersinia enterocolitica Drugs 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6888—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
- C12Q1/689—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6888—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B10/00—ICT specially adapted for evolutionary bioinformatics, e.g. phylogenetic tree construction or analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
- G16B30/10—Sequence alignment; Homology search
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
Definitions
- the embodiments described herein relate to novel and unique methods, systems and processes for identifying, analyzing, monitoring, and controlling fermentation activities. Fermentation activities entail a wide range of endeavors directed toward agriculture, manufacturing, and chemical processing.
- the herein described process includes systems and methods for determining and characterizing the microbiome of a fermentation operation or setting, obtaining microbiome information, and converting such information so that it is useful for controlling, enhancing, and monitoring the fermentation process, detecting deviations in it, and predicting its performance.
- Fermentation is a process in which an agent causes transformation of a raw material into a finished product.
- organic matter is decomposed in the absence or presence of air (oxygen) producing an accumulation of resulting fermentation product.
- Some of these products (for example, alcohol and lactic acid) are of practical value, and fermentation has therefore been used for their manufacture on an industrial scale.
- Microorganisms like yeast, molds, and bacteria play an important role in the alcohol fermentation process for creating beer and wine, and the formation of acetic acid (vinegar). Lactic fermentation is driven by lactic-acid bacteria which break down monosaccharides into lactic acid. Lactic fermentation is used in the preparation of various sour milk products, yogurt, cheese, and bread. Many mold fungi (for example, Aspergillus niger ) ferment sugar, resulting in the formation of citric acid. A large proportion of the citric acid used in the food processing industry is obtained by microbiological means. Ethanol fuel is produced from the fermentation by yeast of common crops such as sugar cane, potato, cassava and corn to produce ethanol which is further processed to become fuel.
- sewage is digested by enzymes secreted by bacteria, to produce liquid and solid fertilizers, and biogas.
- Fungi have been employed to break down cellulosic wastes to increase protein content and improve in vitro digestibility.
- a wide variety of agroindustrial waste products can be fermented to use as food for animals, especially ruminants.
- Winemaking or vinification is the production of wine by fermentation of raw material; for grape wine, that starts with the grapes.
- Factors affecting grape quality, known as the grape's terroir, include the variety of grapes, the weather during the growing season, soil, time of harvest, and methods of pruning.
- the primary fermentation can be done with natural yeast normally already present on the grapes, visible as a powdery substance, or with cultured yeast added to the must.
- the sugar content of the grapes is monitored during fermentation and can be adjusted (by addition of sugar) since it affects both the taste and end product, as well as the speed of the fermentation.
- a secondary, or malolactic fermentation can be initiated by inoculation of desired bacteria which convert malic acid into lactic acid.
- This fermentation step can improve the taste of wine.
- fermentation continues very slowly in either stainless steel vessels or oak barrels.
- Prior to bottling, the wine is usually filtered. Filtration results in clarification and microbial stabilization. In clarification, large particles that affect the visual appearance of the wine are removed. In microbial stabilization, the amounts of yeast and bacteria are adjusted to reduce the likelihood of refermentation or spoilage.
- the present invention addresses the longstanding and unfulfilled need for better monitoring, analysis and control of fermentation activities, including, among others, those directed toward agriculture, biofuels, and food production.
- microbiome, microbiome information, microbiome data, microbiome population, microbiome panel and similar terms are used in the broadest possible sense, unless expressly stated otherwise, and would include: a census of currently present microorganisms, both living and nonliving, which may have been present months, years, millennia or longer; a census of components of the microbiome other than bacteria and archaea, e.g., viruses and microbial eukaryotes; population studies and characterizations of microorganisms, genetic material, and biologic material; a census of any detectable biological material; and information that is derived or ascertained from genetic material, biomolecular makeup, fragments of genetic material, DNA, RNA, protein, carbohydrate, metabolite profile, fragment of biological materials and combinations and variations of these.
- real-time microbiome data or information includes microbiome information that is collected or obtained at a particular setting during the fermentation process, for example soil, plant/fruit samples taken during planting or harvesting, must, sampling of wine during alcoholic fermentation (beginning, middle and end, or depending on parameters such as alcohol content, amount of sugar, density), sampling during malolactic fermentation (beginning, middle and end, or depending on amount of malic and acetic acid), barrel (beginning, middle and end, or months) and bottling.
- derived microbiome information and derived microbiome data are to be given their broadest possible meaning, unless specified otherwise, and include any real-time microbiome information that has been computationally linked or used to create a relationship, such as, for example, evaluating the microbiome of milk before, during, and after fermentation, or evaluating the microbiome between planting and harvesting of grapes.
- derived microbiome information provides information about the fermentation process setting or activity that may not be readily ascertained from nonderived information.
- predictive microbiome information and predictive microbiome data are to be given their broadest possible meaning, unless specified otherwise, and includes information that is based upon combinations and computational links or processing of historic, predictive, real-time, and derived microbiome information, data, and combinations, variations and derivatives of these, which information predicts, forecasts, directs, or anticipates a future occurrence, event, state, or condition in the industrial setting, or allows interpretation of a current or past occurrence.
- predictive microbiome information would include: a determination and comparison of real-time microbiome information and the derived microbiome information of quality of wine, i.e. abundance of a specific microorganism in a sample and possible positive or negative effect on the fermentation process; a comparison of real-time microbiome information collected during the fermentation of cheese and the quality of cheese.
- Real time, derived, and predicted data can be collected and stored, and thus, become historic data for ongoing or future decision-making for a process, setting, or application.
- a method of classifying a microorganism comprising: obtaining a nucleic acid sequence of a 16S ribosomal subunit, an internal transcribed spacer (ITS), and optionally, a single copy marker gene, of a first microbe; and comparing said nucleic acid sequence of the first microbe to a reference; and identifying the first microbe at the strain level or sub-strain level based on the comparing.
- a novel method of profiling a microbiome in a sample comprising: obtaining nucleic acid sequences of a 16S ribosomal subunit, an ITS, and a marker gene, from at least one microorganism in a sample; analyzing said at least one microorganism within said sample based upon the nucleic acid sequences obtained; and determining a profile of the microbiome based on said analyzing.
- 16S rDNA in combination with another single-copy marker gene provides prokaryotic species boundaries at higher resolution and allows identification of microbial diversity at the strain level.
- the novelty of this method is that, unlike what is currently taught and used in the art, which combines measurement of the 16S region with a functional gene, we combine the 16S region with single-copy marker genes (described in Sunagawa et al., 2013, Nature Methods 10, 1196-1199).
- The methodology of Sunagawa et al. required sequencing all the DNA in a sample in order to get a high phylogenetic resolution level.
- the method described herein reduces the amount of sequencing data needed to identify species at high phylogenetic resolution because the 16S amplicons and the single-copy marker genes produce an alignment rate below 7% and a false discovery rate below 10%.
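- The two-marker idea above can be sketched as a two-stage lookup: the 16S sequence narrows the candidate to a species, and the single-copy marker gene then resolves the strain. This is only an illustration under stated assumptions: the reference tables, sequences, and strain names are hypothetical placeholders, and exact string matching stands in for the alignment against a reference that a real workflow would use.

```python
# Hypothetical reference tables; sequences and names are placeholders.
REF_16S = {"ACGTAC": "Lactobacillus plantarum"}
REF_MARKER = {
    ("Lactobacillus plantarum", "TTGA"): "strain A",
    ("Lactobacillus plantarum", "TTGC"): "strain B",
}


def classify(seq_16s, seq_marker):
    """Return (species, strain); strain is None if the marker is unknown.

    Returns None when the 16S sequence matches no reference species.
    Exact matching is an assumption made to keep the sketch short.
    """
    species = REF_16S.get(seq_16s)
    if species is None:
        return None
    strain = REF_MARKER.get((species, seq_marker))
    return (species, strain)
```

In this sketch the 16S sequence alone cannot tell "strain A" from "strain B"; only the second marker separates them.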
- a novel method for sequencing two libraries in one sequencing run by pooling the prepared 16S and ITS libraries, and providing appropriate primers for sequencing both 16S and ITS in a sequencing method.
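- When 16S and ITS libraries are pooled into one run, the reads have to be separated again before analysis. A minimal sketch of that separation follows, assuming each read still begins with its amplification primer; the primer prefixes shown are shortened placeholders, not the primers used by the method.

```python
from collections import defaultdict

# Placeholder primer prefixes (hypothetical); real primers depend on the protocol.
PRIMERS = {"16S": "GTGCCAGC", "ITS": "CTTGGTCA"}


def split_pooled_reads(reads, primers=PRIMERS):
    """Assign each read to a marker by its leading primer and trim the primer."""
    bins = defaultdict(list)
    for read in reads:
        for marker, primer in primers.items():
            if read.startswith(primer):
                bins[marker].append(read[len(primer):])
                break
        else:
            # No primer matched: keep the read aside rather than guess.
            bins["unassigned"].append(read)
    return dict(bins)
```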
- determining a profile of the microbiome in said sample can be based on 50 or fewer microbes, 55 or fewer microbes, 60 or fewer microbes, 65 or fewer microbes, 70 or fewer microbes, 75 or fewer microbes, 80 or fewer microbes, 85 or fewer microbes, 90 or fewer microbes, 100 or fewer microbes, 200 or fewer microbes, 300 or fewer microbes, 400 or fewer microbes, 500 or fewer microbes, 600 or fewer microbes, 700 or fewer microbes, or 800 or fewer microbes. In some embodiments determining a profile of the microbiome in said sample has an accuracy greater than 70% based on the measurements. In some embodiments, analyzing uses long read sequencing platforms.
- a process including: analyzing a material from a location associated with a fermentation process; obtaining microbiome information, selected from real time microbiome information, derived microbiome information and predictive microbiome information; and performing an evaluation on the microbiome information, the evaluation including: a relationship based processing including a related genetic material component and a fermentation setting component; and a bioinformatics stage; whereby the evaluation provides information to direct the fermentation process.
- operations and methods having one or more of the following features: wherein the real time microbiome information is selected from material selected from the group consisting of soil samples, soil sample taken during a planting, soil sample taken during growth, soil sample taken during harvesting, fermentation sample taken at the beginning of a fermentation process, in the middle of a fermentation process, at the end of a fermentation process, any time during a fermentation process; wherein the bioinformatics stage has one or more of the following: submitting the raw DNA sequencing data to a bioinformatics pipeline for performing microbiome analysis, including demultiplexing and quality filtering, OTU picking, taxonomic assignment, phylogenetic reconstruction, compiling metadata, diversity analysis, and visualization.
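- Two of the pipeline stages named above, quality filtering and OTU picking, can be sketched in a few lines. The sketch assumes reads arrive as (sequence, quality string) pairs with Phred+33 encoded qualities, uses a hypothetical mean-quality cutoff of 20, and replaces similarity-based OTU clustering with exact-match dereplication, which is a deliberate simplification.

```python
from collections import Counter


def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def quality_filter(records, min_mean_q=20):
    """Keep sequences whose mean quality reaches the (hypothetical) cutoff."""
    return [seq for seq, qual in records if qual and mean_phred(qual) >= min_mean_q]


def pick_otus(sequences):
    """Count identical sequences; a stand-in for similarity-based OTU picking."""
    return Counter(sequences)
```

The resulting count table is the input that later stages such as taxonomic assignment and diversity analysis would consume.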
- a method of controlling a fermentation operation including: analyzing a material from a location associated with a fermentation operation to provide a first microbiome information; associating the first microbiome information with a condition of the operation; obtaining a second microbiome information; associating the second microbiome information with the first microbiome information; and, evaluating the first microbiome information, the associated condition, and the second microbiome information, the evaluation including a bioinformatics pipeline for performing microbiome analysis including demultiplexing and quality filtering, OTU picking, taxonomic assignment, phylogenetic reconstruction, compiling metadata, diversity analysis, and visualization; whereby the evaluation identifies a characteristic of the operation; and, directing the fermentation operation based in part on the identified characteristic of operation; whereby the fermentation operation is based upon the evaluation of microbiome information.
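- One simple way to relate a first and a second microbiome measurement is to compare their diversity. The sketch below computes the Shannon index for two count tables and reports the change between them; the choice of the Shannon index and the example counts are assumptions for illustration, since the text names "diversity analysis" without fixing a metric.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    h = 0.0
    for count in counts.values():
        if count > 0:
            p = count / total
            h -= p * math.log(p)
    return h


def diversity_shift(first, second):
    """Change in Shannon diversity from the first to the second measurement."""
    return shannon_index(second) - shannon_index(first)


# Hypothetical counts at the start and the end of a fermentation.
first = {"Saccharomyces": 50, "Hanseniaspora": 50}
second = {"Saccharomyces": 100}
```

A negative shift, as in this made-up example, indicates that the community became less diverse between the two measurements.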
- a method for directing a fermentation operation including: analyzing a sample from a location associated with a fermentation operation; obtaining microbiome information; and, performing an evaluation on the microbiome information, whereby the evaluation provides information to direct the fermentation operation.
- the microbiome information has real time microbiome information; wherein, the microbiome information has derived microbiome information; wherein, the microbiome information has predictive microbiome information; wherein the analysis has selection and sequencing of the material; wherein the analysis has extracting genetic material from the material; wherein the analysis has preparation of libraries; wherein the analysis has extracting material including genetic material selected from the group consisting of a 16S rRNA gene and an internal transcribed spacer (ITS); wherein the analysis has providing a phylogenetic tree; wherein the analysis has a correction step; wherein the analysis has an extraction procedure selected from the group consisting of beating, sonicating, freezing and thawing, and chemical disruption; wherein the analysis has amplification of at least a portion of the material; wherein the analysis has providing a genetic barcode to a sample of the material; wherein the microbiome information defines a phylogenetic tree; wherein the microbiome information has
- the methods of the invention allow the identification of microorganisms capable of imparting one or more beneficial property to one or more phases of a fermentation process.
- the variability in the microbial populations present in the sample can be used to support a directed process of selection of one or more microorganisms for use in a phase of a fermentation process and for identifying particular combinations and abundances of microorganisms which are of benefit for a particular purpose, and which may never have been recognized using conventional techniques.
- the methods of the invention may be used as a part of a plant breeding program.
- the methods may allow for, or at least assist with, the selection of plants which have a particular genotype/phenotype which is influenced by the microbial flora, in addition to identifying microorganisms and/or compositions that are capable of imparting one or more property to one or more plants.
- the invention relates to a method for the selection of one or more microorganism(s) which are capable of imparting one or more beneficial property to a plant to be used as raw material in a fermentation process.
- the process will allow for enrichment of suitable microorganisms within the plant microbiome.
- microorganism(s) may be contained within a plant, on a plant, and/or within the plant's growing soil or water.
- a “beneficial property to a plant” should be interpreted broadly to mean any property which is beneficial for any particular purpose including properties which may be beneficial to human beings, other animals, the environment, a habitat, an ecosystem, the economy, of commercial benefit, or of any other benefit to any entity or system.
- the term should be taken to include properties which may suppress, decrease or block one or more characteristic of a plant, including suppressing, decreasing or inhibiting the growth or growth rate of a plant.
- the invention may be described herein, by way of example only, in terms of identifying positive benefits to one or more plants or improving plants. However, it should be appreciated that the invention is equally applicable to identifying negative benefits that can be conferred to plants.
- beneficial properties include, but are not limited to, for example: improved growth, health and/or survival characteristics, suitability or quality of the plant for a particular purpose, structure, color, chemical composition or profile, taste, smell, improved quality.
- beneficial properties include, but are not limited to, for example: decreasing, suppressing or inhibiting the growth of a plant; constraining the height and width of a plant to a desirable size; regulating production of and/or response to plant pheromones (resulting in increased tannin production in the surrounding plant community and decreased appeal to foraging species).
- “improved” should be taken broadly to encompass improvement of a characteristic of a plant or a fermentation process which may already exist in a plant or process prior to application of the invention, or the presence of a characteristic which did not exist in a plant or process prior to application of the invention.
- “improved” growth should be taken to include growth of a plant where the plant was not previously known to grow under the relevant conditions.
- inhibiting and suppressing should be taken broadly and should not be construed to require complete inhibition or suppression, although this may be desired in some embodiments.
- microbes refers to any single-celled organisms, bacteria, archaea, protozoa, and unicellular fungi and protists.
- the microorganisms may include Proteobacteria (such as Pseudomonas, Enterobacter, Stenotrophomonas, Burkholderia, Rhizobium, Herbaspirillum, Pantoea, Serratia, Rahnella, Azospirillum, Azorhizobium, Azotobacter, Duganella, Delftia, Bradyrhizobium, Sinorhizobium and Halomonas), Firmicutes (such as Bacillus, Paenibacillus, Lactobacillus, Mycoplasma, and Acetobacterium), Actinobacteria (such as Streptomyces, Rhodococcus, Microbacterium, and Curtobacterium),
- the present disclosure provides a method for detecting contamination in a fermentation sample, comprising determining the microbiome from a fermentation sample, wherein the method comprises detecting at least one marker of a microorganism and preferably two markers of a microorganism; and a computer system for determining a microbiome profile in a sample, the computer system comprising: a memory unit for receiving data comprising measurement of a microbiome panel from a sample; computer-executable instructions for analyzing the measurement data according to a method described herein; and computer-executable instructions for determining potential microbial contamination in the sample or fermentation process based upon said analyzing.
- the computer system further comprises computer-executable instructions to generate a report of the presence or absence of the at least one contamination microorganism in the sample.
- the computer system can further comprise a user interface configured to communicate or display said report to a user.
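- The reporting step described above can be sketched as a function that checks a measured profile against a list of contamination organisms and produces a presence/absence report. The profile format (taxon mapped to relative abundance), the 1% detection threshold, and the example taxa are all hypothetical choices made for this illustration.

```python
def contamination_report(profile, contaminants, min_abundance=0.01):
    """Return one 'taxon: present/absent' line per contaminant, sorted by name.

    profile maps taxon name to relative abundance; min_abundance is a
    hypothetical detection threshold, not a value taken from the method.
    """
    lines = []
    for taxon in sorted(contaminants):
        present = profile.get(taxon, 0.0) >= min_abundance
        lines.append(f"{taxon}: {'present' if present else 'absent'}")
    return "\n".join(lines)


# Illustrative data only.
profile = {"Brettanomyces": 0.05, "Saccharomyces": 0.90}
report = contamination_report(profile, {"Brettanomyces", "Acetobacter"})
```

The returned string could then be shown through whatever user interface the system provides.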
- the present disclosure provides a computer readable medium comprising: computer-executable instructions for analyzing data comprising measurement of a microbiome profile from a fermentation sample obtained from a fermentation process or environment, wherein the microbiome profile comprises at least one marker and preferably two markers selected from at least one microbe; and computer-executable instructions for determining a presence or absence of a contamination in the fermentation process based upon the analyzing.
- examples of machine learning algorithms include, but are not limited to: elastic networks, random forests, support vector machines, and logistic regression.
- the algorithms provided herein can aid in selection of important microbes and transform the underlying measurements into a score or probability relating to, for example, grape quality, wine quality, presence or absence of contamination, treatment response, and/or classification of organic soil status.
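- As an illustration of turning microbial measurements into a probability, the sketch below applies a logistic model to a profile of relative abundances. The weights and bias are hypothetical and would normally come from training on labeled samples; the sketch shows only the scoring step, not the training of any of the algorithms listed above.

```python
import math


def quality_probability(abundances, weights, bias=0.0):
    """Logistic score in (0, 1) from taxon abundances and per-taxon weights.

    Taxa without a weight contribute nothing; weights and bias are assumed
    to have been learned elsewhere.
    """
    z = bias + sum(weights.get(taxon, 0.0) * value
                   for taxon, value in abundances.items())
    return 1.0 / (1.0 + math.exp(-z))
```

Taxa with large absolute weights are the ones the model treats as most informative, which is one way such a model can aid selection of important microbes.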
- kits comprising: one or more compositions for use in measuring a microbiome profile in a fermentation sample obtained from a fermentation process or environment thereof, wherein the microbiome profile comprises at least one marker and preferably two markers to at least one microbe; and instructions for performing any of the preceding methods.
- a kit can further comprise a computer readable medium.
- Kit reagents may in one embodiment comprise at least one contiguous oligonucleotide that hybridizes to a fragment of the genome of a microorganism.
- the kit comprises at least one pair of oligonucleotides that hybridizes to opposite strands of a genomic segment of a microorganism, wherein each oligonucleotide primer pair is designed to selectively amplify a fragment of the 16S, ITS, and/or marker gene of the organism present in the sample.
- the oligonucleotide is completely complementary to the genome of the individual.
- the kit further contains buffer and enzyme for amplifying said segment.
- the reagents further comprise a label for detecting said fragment.
- FIG. 1 is a 3-dimensional illustration providing comparative representations of microbiome profiles of bacteria for differing soil samples.
- FIG. 2 is a 3-dimensional illustration providing comparative representations of microbiome profiles of yeast species for differing soil samples.
- FIG. 3 is a bar chart illustration of the visual comparative representations of microbiome profiles of bacteria found in different soil samples.
- FIG. 4 is a bar chart illustration of the visual comparative representations of microbiome profiles of yeast species found in different soil samples.
- the words “comprising” (and any form of comprising, such as “comprise” and “comprises”), “having” (and any form of having, such as “have” and “has”), “including” (and any form of including, such as “includes” and “include”) or “containing” (and any form of containing, such as “contains” and “contain”) are inclusive or open-ended and do not exclude additional, unrecited elements or method steps.
- any numerical value recited herein includes all values from the lower value to the upper value, i.e., all possible combinations of numerical values between the lowest value and the highest value enumerated are to be considered to be expressly stated in this application. For example, if a range is stated as 1% to 50%, it is intended that values such as 2% to 40%, 10% to 30%, or 1% to 3%, etc., are expressly enumerated in this specification.
- Contacting refers to the process of bringing into contact at least two distinct species such that they can react. It should be appreciated, however, that the resulting reaction product can be produced directly from a reaction between the added reagents or from an intermediate from one or more of the added reagents which can be produced in the reaction mixture.
- Nucleic acid refers to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides.
- nucleic acid is used interchangeably with gene, cDNA, and mRNA encoded by a gene.
- microbiome refers to the ecological community of commensal, symbiotic, or pathogenic microorganisms in a sample.
- genome refers to the entirety of an organism's hereditary information that is encoded in its primary DNA sequence.
- the genome includes both the genes and the non-coding sequences.
- the genome may represent a microbial genome or a mammalian genome.
- DNA region should be understood as a reference to a specific section of genomic DNA. These DNA regions are specified either by reference to a gene name or a set of chromosomal coordinates. Both the gene names and the chromosomal coordinates would be well known to, and understood by, the person of skill in the art. In general, a gene can be routinely identified by reference to its name, via which both its sequences and chromosomal location can be routinely obtained, or by reference to its chromosomal coordinates, via which both the gene name and its sequence can also be routinely obtained.
- genes/DNA regions detailed above should be understood as a reference to all forms of these molecules and to fragments or variants thereof.
- some genes are known to exhibit allelic variation or single nucleotide polymorphisms.
- SNPs encompass insertions and deletions of varying size and simple sequence repeats, such as dinucleotide and trinucleotide repeats.
- Variants include nucleic acid sequences from the same region sharing at least 90%, 95%, 98%, or 99% sequence identity, i.e., having one or more deletions, additions, substitutions, inverted sequences, etc. relative to the DNA regions described herein.
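- The identity thresholds above can be made concrete with a small calculation. The sketch assumes the two sequences have already been aligned to equal length and counts matching positions; it does not perform the alignment itself, and the function names are illustrative.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of matching positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def is_variant(seq_a, seq_b, threshold=90.0):
    """True if identity meets the threshold (90% is the lowest value listed)."""
    return percent_identity(seq_a, seq_b) >= threshold
```

For example, two 10-base sequences that differ at a single position share 90% identity and would meet the lowest threshold but not the 95% one.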
- the present invention should be understood to extend to such variants which, in terms of the present applications, achieve the same outcome despite the fact that minor genetic variations between the actual nucleic acid sequences may exist between different bacterial strains.
- the present invention should therefore be understood to extend to all forms of DNA which arise from any other mutation, polymorphic or allelic variation.
- sequencing refers to sequencing methods for determining the order of the nucleotide bases (adenine, guanine, cytosine, and thymine) in a nucleic acid molecule (e.g., a DNA or RNA nucleic acid molecule).
- barcode refers to any unique, non-naturally occurring, nucleic acid sequence that may be used to identify the originating genome of a nucleic acid fragment.
- biochip or “array” can refer to a solid substrate having a generally planar surface to which an adsorbent is attached.
- a surface of the biochip can comprise a plurality of addressable locations, each of which location may have the adsorbent bound there.
- Biochips can be adapted to engage a probe interface, and therefore, function as probes.
- Protein biochips are adapted for the capture of polypeptides and can comprise surfaces having chromatographic or biospecific adsorbents attached thereto at addressable locations.
- Microarray chips are generally used for DNA and RNA gene expression detection. Microbiome profiling can further comprise use of a biochip.
- Biochips can be used to screen a large number of macromolecules.
- Biochips can be designed with immobilized nucleic acid molecules, full-length proteins, antibodies, affibodies (small molecules engineered to mimic monoclonal antibodies), aptamers (nucleic acid-based ligands) or chemical compounds.
- a chip could be designed to detect multiple macromolecule types on one chip.
- a chip could be designed to detect nucleic acid molecules, proteins and metabolites on one chip.
- the biochip can be designed and used to simultaneously analyze a panel of microbes in a single sample.
- a “computer-readable medium” is an information storage medium that can be accessed by a computer using a commercially available or custom-made interface.
- Exemplary computer-readable media include memory (e.g., RAM, ROM, flash memory, etc.), optical storage media (e.g., CD-ROM), magnetic storage media (e.g., computer hard drives, floppy disks, etc.), punch cards, or other commercially available media.
- Information may be transferred between a system of interest and a medium, between computers, or between computers and the computer-readable medium for storage or access of stored information. Such transmission can be electrical, or by other available methods, such as IR links, wireless connections, etc.
- Any microbiome profile described herein can include one or more, but are not limited to the following microbes: Abiotrophia, Abiotrophia defectiva, Abiotrophia, Acetanaerobacterium, Acetanaerobacterium elongatum, Acetanaerobacterium, Acetivibrio, Acetivibrio bacterium, Acetivibrio, Acetobacterium, Acetobacterium, Acetobacterium woodii, Acholeplasma, Acholeplasma, Acidaminococcus, Acidaminococcus fermentans, Acidaminococcus, Acidianus, Acidianus brierleyi, Acidianus, Acidovorax, Acidovorax, Acinetobacter, Acinetobacter guillouiae, Acinetobacter junii, Acinetobacter, Actinobacillus, Actinobacillus MI933/96/1, Actinomyces, Actinomyces ICM34, Actinomyces ICM41, Actinomyces
- Bacteroides fragilis Bacteroides gallinarum, Bacteroides helcogenes, Bacteroides ic1292, Bacteroides intestinalis, Bacteroides massiliensis, Bacteroides mpnisolate, Bacteroides NB-8, Bacteroides new, Bacteroides nlaezlc13, Bacteroides nlaezlc158, Bacteroides nlaezlc159, Bacteroides nlaezlc161 , Bacteroides nlaezlc163, Bacteroides nlaezlc167, Bacteroides nlaezlc172, Bacteroides nlaezlc18, Bacteroides nlaezlc182, Bacteroides nlaezlc190, Bacteroides nlaezlc198, Bacteroides nlaezlc204, Bacteroides nlaezlc205, Bacteroides nlaezlc206, Bacteroides nlaezl
- FIGS. 1 and 2 are 3-dimensional illustrations providing comparative representations of microbiome profiles. These microbiomes were found in differing soil samples coming from exemplary vineyards in California, United States, and Spain, in accordance with certain embodiments.
- FIG. 1 is the profile for bacteria.
- FIG. 2 is the profile for yeast species.
- Each winery is represented by a greyscale color on the respective legends as shown. The legends provide the number of samples for each winery, along with a code assigned to each winery.
- samples coming from the same winery have greater similarities among themselves as compared to other samples. Additionally, the samples coming from wineries from the same region have greater similarities as compared to samples coming from other wine regions.
- the samples illustrate clustering, for both bacteria and yeast species, demonstrating that applying the methodologies herein provides a scientific-based identity to the terroir concept in winemaking and provides validation to certain assumptions concerning the existence of bio-wine regions upon observation of microbiome profiles of soil.
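- The kind of similarity that underlies such clustering can be illustrated with a pairwise dissimilarity between profiles. The sketch uses Bray-Curtis dissimilarity, which is an assumption: the text does not name the metric behind the figures. The sample counts and taxon labels are invented for the example.

```python
def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two count profiles (0 = identical)."""
    taxa = set(profile_a) | set(profile_b)
    diff = sum(abs(profile_a.get(t, 0) - profile_b.get(t, 0)) for t in taxa)
    total = sum(profile_a.get(t, 0) + profile_b.get(t, 0) for t in taxa)
    return diff / total if total else 0.0


# Hypothetical soil samples: two from one winery, one from another.
winery_a_1 = {"taxon_x": 30, "taxon_y": 10}
winery_a_2 = {"taxon_x": 28, "taxon_y": 12}
winery_b_1 = {"taxon_x": 5, "taxon_z": 35}

within = bray_curtis(winery_a_1, winery_a_2)
between = bray_curtis(winery_a_1, winery_b_1)
```

In this made-up data the within-winery dissimilarity is smaller than the between-winery one, which is the pattern the figures are described as showing.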
- FIGS. 3 and 4 are bar charts providing visual comparative representations of the microbiome profiles found in different soil samples.
- FIG. 3 is a bar chart profile for bacteria.
- FIG. 4 is a bar chart profile for yeast species.
- the x-axis provides sample identification codes, namely codes assigned to the different soil samples from vineyards.
- the y-axis provides the respective abundances of the microbial species for each given vineyard sample, with each greyscale color representing a different microbiological species.
- FIGS. 3 and 4 are visual comparative representations of respective microbiome profiles found in the differing soil samples, with one bar profile per sample, derived from the exemplary vineyards.
- the vertical order of these species, shown in greyscale, is the same across the samples to allow the visual comparison of similarities among the microbiome profiles of the samples.
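- The data behind such stacked bars can be sketched as relative abundances computed in one fixed species order, so that every bar stacks the same species at the same position. The function name and the example counts are hypothetical; plotting is left out.

```python
def relative_abundances(counts, taxa_order):
    """Fractions for one sample, listed in a fixed taxon order.

    Using the same taxa_order for every sample keeps the stacking order
    identical across bars; taxa absent from a sample get 0.0.
    """
    total = sum(counts.get(taxon, 0) for taxon in taxa_order)
    if total == 0:
        return [0.0] * len(taxa_order)
    return [counts.get(taxon, 0) / total for taxon in taxa_order]


order = ["A", "B", "C"]
sample_1 = relative_abundances({"A": 3, "B": 1}, order)
sample_2 = relative_abundances({"B": 2, "C": 2}, order)
```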
- the methods provided herein can provide classification at a genus, species, or sub-strain level of one or more microbes in a sample with an accuracy of greater than 1%, 20%, 30%, 40%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 99.2%, 99.5%, 99.7%, or 99.9%.
- the methods provided herein can provide quantification at a genus, species, or sub-strain level of one or more microbes in a sample with an accuracy of greater than 1%, 20%, 30%, 40%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 99.2%, 99.5%, 99.7%, or 99.9%.
- the present inventions further relates to systems and methods for determining and characterizing the microbiomes of fermentation settings, and in particular determining through relationship-based processing, which include custom and unique analytics tools and algorithms, data management, cleansing, filtering, and quality control, which in turn provide information about the fermentation setting.
- Such characterized information for example, can have, and be used for, predictive, historical, analytic, development, control and monitoring purposes.
- This information, data, processing algorithms, supporting software such as human machine interface (HMI) programs and graphic programs, and databases may be cloud-based, locally-based, hosted on remote systems other than cloud-based systems, and combinations and variations of these.
- a computer system may be used to implement one or more steps including, sample collection, sample processing, detecting, quantifying one or more microbes, generating a profile data, comparing said data to a reference, generating a subject-specific microbiome profile, comparing the sample-specific profile to a reference profile, receiving sample-related data, receiving and storing data obtained by one or more methods described herein, analyzing said data, generating a report, and reporting results to a receiver.
- real-time, derived, and predicted data may be collected and stored and thus become historic data for an ongoing process, setting, or application.
- the collection, use, and computational links can create a real-time situation in which machine learning can be applied to further enhance and refine the fermentation activities or processes.
- real-time, derived, predictive, and historic data can be, and preferably is, associated with other data and information.
- the microbiome information can be associated with GPS data; location data, e.g., particular components and subsystems in a fermentation process such as for example a particular barrel type for wine storage; processing stage or step such as filtration of fermentation broth; geological parameters including formation permeability and porosity; soil moisture, nutrient, and rainfall conditions in agricultural processes; chemicals in wine, for example, sulfur acid.
- microbiome information may be further combined or processed with these other sources of information and data regarding the fermentation setting or process to provide combined, derived, and predictive information.
- the microbiome information is used in combination with other data and information to provide for unique and novel ways to conduct fermentation operations, to develop or plan fermentation operations, to refine and enhance existing fermentation operations and combinations of these and other activities.
- these various types of information and data are combined where one or more may become metadata for the other.
- information may be linked in a manner that provides for rapid, efficient, and accurate processing to provide useful information relating to the fermentation setting.
- the GPS location down to the square yard of a large farm may be linked as metadata to the real-time microbiome information during planting and compared with similarly linked metadata obtained during harvesting along with crop yield for that acre to refine and enhance the agricultural processing of the field in which the acre is located.
- historic microbiome data may be obtained from known databases or it may be obtained from conducting population studies or censuses of the microbiome for the particular fermentation setting.
- samples of biological materials are collected and characterized.
- This characterized information is then processed and stored.
- the data is processed and stored in a manner that provides for ready and efficient access and utilization in subsequent steps, often using auxiliary data structures such as indexes or hashes.
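- A minimal sketch of the kind of auxiliary data structure mentioned above: a hash index keyed by sample identifier plus a secondary index keyed by collection site. The class name, field names, and example values are hypothetical and only illustrate the idea of indexed storage for later retrieval.

```python
from collections import defaultdict


class SampleStore:
    """In-memory store with a primary hash index and a secondary site index."""

    def __init__(self):
        self._records = {}                # sample_id -> record
        self._by_site = defaultdict(set)  # site -> set of sample ids

    def add(self, sample_id, site, profile):
        # Remove any stale secondary-index entry before overwriting a record.
        old = self._records.get(sample_id)
        if old is not None:
            self._by_site[old["site"]].discard(sample_id)
        self._records[sample_id] = {"site": site, "profile": profile}
        self._by_site[site].add(sample_id)

    def get(self, sample_id):
        return self._records[sample_id]

    def ids_for_site(self, site):
        return sorted(self._by_site.get(site, ()))


store = SampleStore()
store.add("S1", "barrel-room", {"Saccharomyces": 0.9})
store.add("S2", "barrel-room", {"Saccharomyces": 0.6})
store.add("S3", "vineyard", {"Bacillus": 0.4})
```

Both lookups are constant-time on average, which is the property that makes historic data readily usable in subsequent steps.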
- real-time microbiome data may be obtained from conducting population studies or censuses of the microbiome as it exists at a particular point in time, or over a timeseries, for the particular fermentation setting.
- samples of biological materials are collected and characterized.
- This characterized information is then processed and stored.
- the data is processed and utilized in subsequent steps or may be stored as historic data in a manner that provides for ready and efficient access and utilization in subsequent steps.
- microbiome information may be contained in any type of data file that is utilized by current sequencing systems or that is a universal data format such as for example FASTQ (including quality scores), FASTA (omitting quality scores), GFF (for feature tables), etc.
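- To illustrate the difference between the two sequence formats named above, the following sketch reads simple four-line FASTQ records and writes them as FASTA, dropping the quality scores. It assumes unwrapped records (one sequence line and one quality line per read); the function names and example reads are hypothetical.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, quality) from unwrapped four-line FASTQ records."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record near: " + header)
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ for: " + header)
        yield header[1:], seq, qual


def to_fasta(records):
    """Render records as FASTA text, omitting quality scores."""
    return "".join(f">{read_id}\n{seq}\n" for read_id, seq, _ in records)


example = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!!!\n"
fasta = to_fasta(read_fastq(io.StringIO(example)))
```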
- This data or these files may then be combined, using various software and computational techniques, with identifiers or other data. Examples of such software and identifiers for combining these various types of information include the BIOM file format and the MIxS family of standards developed by the Genomic Standards Consortium.
- data from a harvesting combine regarding yield, microbiome information, and commodities price information may be displayed or stored or used for further processing.
- the combination and communication of these various systems can be implemented by various data processing techniques, conversions of files, compression techniques, data transfer techniques, and other techniques for the efficient, accurate, combination, signal processing and overlay of large data streams and packets.
- n-dimensional space a mathematical construct having 2, 3, 5, 12, 1000, or more dimensions
- the embodiments of the present invention provide further analysis to this n-dimensional space information, which analysis renders this information to a format which is more readily usable and processable and understandable.
- the n-dimensional space information is analyzed and studied for patterns of significance pertinent to a particular fermentation setting and then converted to more readily usable data such as for example a 2-dimensional color-coded plot for presentation through a HMI (Human-Machine Interface).
- the n-dimensional space information may be related, e.g., transformed or correlated with, physical, environmental, or other data such as the conditions under which a particular plant was grown, either by projection into the same spatial coordinates or by relation of the coordinate systems themselves, or by feature extraction or other machine learning or multivariate statistical techniques.
- This related n-dimensional space information may then be further processed into a more readily usable format such as a 2-dimensional representation. Further, this 2-dimensional representation and processing may, for example, be based upon particular factors or features that are of significance in a particular fermentation setting. The 2-dimensional information may also be further viewed and analyzed for determining particular factors or features of significance for a system. Yet further, either of these types of 2-dimensional information may be still further processed using for example mathematical transformation functions to return them to an n-dimensional space, which mathematical functions may be based upon known or computationally determined factors or features.
- the present inventions provide for derived and predicted information that can be based upon the computational distillation of complex n-dimensional space microbiome information, which may be further combined with other data.
- This computationally distilled data or information may then be displayed and used for operational purposes in the fermentation setting; it may be combined with additional data and displayed and used for operational purposes in the fermentation setting; it may, alone or in combination with additional information, be subjected to trend analysis to determine features or factors of significance; and it may be used for planning and operational purposes, in combinations and variations of these and other utilizations.
- the selection and sequencing of particular regions or portions of genetic materials may be used, including for example, the SSU rRNA gene (16S or 18S), the LSU rRNA gene (23S or 28S), the ITS in the rRNA operon, cpn60, gene marker regions such as metal-dependent proteases with possible chaperone activity, and various other segments consisting of base pairs, peptides or polysaccharides for use in characterizing the microbial community and the relationships among its constituents.
- an embodiment of a method of the present invention may include one or more of the following steps which may be conducted in various orders: sample preparation including obtaining the sample at the designated location, and manipulating the sample; extraction of the genetic material and other biomolecules from the microbial communities in the sample; preparation of libraries with identifiers such as an appropriate barcode such as DNA libraries, metabolite libraries, and protein libraries of the material; sequence elucidation of the material (including, for example, DNA, RNA, and protein) of the microbial communities in the sample; processing and analysis of the sequencing and potentially other molecular data; and exploitation of the information for fermentation uses.
- sampling may be, for example, from agricultural, food, surface, or water sources.
- the samples can include for example solid samples such as soil, sediment, rock, and food.
- the samples can include for example liquid samples such as surface water, and subsurface water, other liquid to be fermented or in a certain stage of fermentation, such as must, barrel fermented wine, yogurt, to name a few.
- the sample once obtained has the genetic material isolated or obtained from the sample, which for example can be DNA, RNA, proteins and fragments of these.
- Primers can be prepared by a variety of methods including, but not limited to, cloning of appropriate sequences and direct chemical synthesis using methods well known in the art (Narang et al., Methods Enzymol. 68:90 (1979); Brown et al., Methods Enzymol. 68:109 (1979)). Primers can also be obtained from commercial sources such as Integrated DNA Technologies, Operon Technologies, Amersham Pharmacia Biotech, Sigma, and Life Technologies.
- Computer programs can also be used to design primers, including but not limited to Array Designer Software (Arrayit Inc.), Oligonucleotide Probe Sequence Design Software for Genetic Analysis (Olympus Optical Co.), NetPrimer, and DNAsis from Hitachi Software Engineering. Primers that can be used to analyze the 16S ribosomal RNA gene include but are not limited to those described in the Examples below.
- Microbial diversity can be further described by approaches analyzing the intergenic region between 16S ribosomal RNA and 23S ribosomal RNA.
- Primers can be designed to specifically amplify any identified variable regions in a microbe or similar distinguishing genetic element.
- Primers or probes described herein can also include polynucleotides having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% homology to any of the nucleic acid sequences described herein.
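- As a hedged illustration of how a percentage of this kind might be computed, the sketch below reports percent identity between two sequences that are assumed to be already aligned and of equal length. Percent identity is used here as a simple stand-in for the "homology" figure in the text, which does not specify a calculation method; the function name and example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions at which two pre-aligned, equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(seq_a.upper(), seq_b.upper()) if x == y)
    return 100.0 * matches / len(seq_a)


# Hypothetical 10-nt primer and a variant differing at the last base.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```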
- a library is prepared from the genetic material.
- the library can be prepared by use of amplification, shotgun, or whole molecule techniques, among others. Additionally, amplification to add adapters for sequencing, and barcoding for sequences, can be performed. Shotgun fragmentation by sonication or enzymatic cleavage may be performed. Whole molecules can also be used to sequence all DNA in a sample.
- Sequencing is performed.
- the sequencing is with a high-throughput system, such as for example 454, Illumina, PacBio, IonTorrent, or Nanopore, to name a few.
- Sequence analysis is prepared. This analysis preferably can be performed using tools such as QIIME Analysis Pipeline, Machine learning, and UniFrac. Preferably, there is assigned a sequence to the sample via barcode, for among other things quality control of sequence data.
- the analysis is utilized in a fermentation application.
- the applications can include for example, cheese production, alcoholic and non-alcoholic beverage production, biofuel production, and alternative energy.
- the processing and analysis further involves matching the sequences to the samples, aligning the sequences to each other, and using the aligned sequences to build a phylogenetic tree, further distilling the data to form an n-dimensional plot and then a two or three dimensional plot or other graphical displays, including displays of the results of machine learning and multivariate statistical routines, and using the two or three-dimensional plot or other graphical displays to visualize patterns of the microbial communities in a particular sample over time and geographic space.
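- One step in distilling community data toward a low-dimensional plot is computing pairwise distances between samples. The text names UniFrac, which requires a phylogenetic tree; the sketch below instead uses Bray-Curtis dissimilarity, a simpler non-phylogenetic measure swapped in purely for illustration. The sample counts are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors with the same taxa order."""
    if len(a) != len(b):
        raise ValueError("count vectors must have the same length")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0  # two empty samples are treated as identical
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


def distance_matrix(profiles):
    """Square matrix of pairwise dissimilarities, in the given sample order."""
    return [[bray_curtis(p, q) for q in profiles] for p in profiles]


# Hypothetical counts for three samples over the same two taxa.
profiles = [[6, 4], [4, 6], [10, 0]]
matrix = distance_matrix(profiles)
```

A matrix of this kind is the usual input to an ordination method that produces the two- or three-dimensional plot.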
- Although HMI-type presentation of this information is presently preferred, it should be understood that such plots may be communicated directly to a computational means such as a large computer or computing cluster for performing further analysis to provide predictive information.
- the matched sequence samples would be an example of real-time or historic microbiome information
- the phylogenetic tree would be an example of derived microbiome information
- portions of the graphical displays which have derived microbial information combined with other data would be an example of predictive microbiome information.
- a phylum is a group of organisms at the formal taxonomic level of Phylum based on sequence identity, physiology, and other such characteristics. There are approximately fifty bacterial phyla, which include Actinobacteria, Proteobacteria, and Firmicutes.
- Phylum is the classification that is a level below Kingdom, in terms of classifications of organisms. For example, for E. coli the taxonomy string is Kingdom: Bacteria; Phylum: Proteobacteria; Class: Gammaproteobacteria; Order: Enterobacteriales; Family: Enterobacteriaceae; Genus: Escherichia ; and Species: coli.
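- A taxonomy string of the form shown above can be split into rank and name pairs. The sketch below assumes the "Rank: Name; Rank: Name" layout used in the E. coli example; the function name is hypothetical.

```python
def parse_taxonomy(taxonomy):
    """Split a 'Rank: Name; Rank: Name' string into an ordered rank -> name mapping."""
    ranks = {}
    for part in taxonomy.split(";"):
        part = part.strip()
        if not part:
            continue
        rank, _, name = part.partition(":")
        ranks[rank.strip()] = name.strip()
    return ranks


lineage = parse_taxonomy(
    "Kingdom: Bacteria; Phylum: Proteobacteria; Class: Gammaproteobacteria; "
    "Order: Enterobacteriales; Family: Enterobacteriaceae; "
    "Genus: Escherichia; Species: coli"
)
```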
- phylogeny refers to the evolutionary relationship between a set of organisms. This relationship can be based on morphology, biochemical features, and/or nucleic acid (DNA or RNA) sequence.
- phylotype also referred to as operational taxonomic unit (“OTU”)
- phylotypes can also be defined at other taxonomic levels and these other levels are sometimes critical for identifying microbial community features relevant to a specific analysis.
- reads short DNA, RNA or protein sequences
- these sequences may not accurately identify many organisms to the level of species, or even strain (the most detailed level of phylogenetic resolution, which is sometimes important because different strains can have different molecular functions).
- when a “phylotype” matches a sequence or group of sequences from a known organism in the databases, it can be used to say that a particular sequence is from an organism like, for example, E. coli.
- taxon is a group of organisms at any level of taxonomic classification.
- taxon plural: taxa
- taxon is a catchall term used in order to obviate the usage of the organism names repeatedly and to provide generality across taxonomic levels.
- Microbial community diversity and composition may vary considerably across fermentation environments and settings, and the embodiments of the present invention link these changes to biotic or abiotic factors and other factors and conditions in the fermentation environment to create derived and predictive information.
- these patterns of microbial communities, for example geological patterns of microbial communities or patterns of microbial communities in a fermentation system (microbiosystem metrics), which are determined by the present invention can give rise to predictive information for use in the fermentation setting.
- Examinations of microbial populations may provide insights into the physiologies, environmental tolerances, and ecological strategies of microbial taxa, particularly those taxa which are difficult to culture and that often dominate in natural environments.
- this type of derived data is utilized in combination with other data in order to form predictive information.
- Microbes are diverse, ubiquitous, and abundant, yet their population patterns and the factors driving these patterns were, prior to the present inventions, not readily understood in fermentation settings and thus, it is believed, were never effectively used for the purpose of ascertaining predictive information.
- Microorganisms, just like macroorganisms (i.e., plants and animals), exhibit no single shared population pattern.
- the specific population patterns shown by microorganisms are variable and depend on a number of factors, including, the degree of phylogenetic resolution at which the communities are examined (e.g., Escherichia ), the taxonomic group in question, the specific genes and metabolic capabilities that characterize the taxon, and the taxon's interactions with members of other taxa.
- population patterns can be determined in fermentation settings and utilized as derived data for the purposes of ascertaining predictive information.
- biogeography e.g., microbial populations for example as determined from a census
- structure and diversity of soil bacterial communities have been found to be closely related to soil environmental characteristics such as soil pH.
- a comprehensive assessment of the biogeographical patterns of, for example, soil bacterial communities requires 1) surveying individual communities at a reasonable level of phylogenetic detail (depth), and 2) examining a sufficiently large number of samples to assess spatial patterns (breadth).
- the study of biogeographical patterns is not limited to soil, and will be extended to other environments, including but not limited to, any part of a living organism, bodies of water, ice, the atmosphere, energy sources, factories, laboratories, farms, processing plants, hospitals, and other locations, systems and areas.
- samples will be collected in a manner ensuring that microbes from the target source are the most numerous in the samples while minimizing the contamination of the sample by the storage container, sample collection device, the sample collector, other target or other non-target sources that may introduce microbes into the sample from the target source. Further, samples will be collected in a manner to ensure the target source is accurately represented by single or multiple samples at an appropriate depth (if applicable) to meet the needs of the microbiome analysis, or with known reference controls for possible sources of contamination that can be subtracted by computational analysis. Precautions should be taken to minimize sample degradation during shipping by using commercially available liquids, dry ice or other freezing methods for the duration of transit.
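- The computational subtraction of known contamination controls mentioned above could take many forms. The sketch below shows one deliberately simple approach, assumed only for illustration: counts observed in a control are subtracted from the sample, and taxa that do not exceed the control are dropped. The function name, taxa, and counts are hypothetical.

```python
def subtract_control(sample_counts, control_counts):
    """Subtract control counts from sample counts, dropping taxa reduced to zero or below."""
    cleaned = {}
    for taxon, count in sample_counts.items():
        remaining = count - control_counts.get(taxon, 0)
        if remaining > 0:
            cleaned[taxon] = remaining
    return cleaned


# Hypothetical counts: taxon "B" appears only at the level seen in the blank.
sample = {"A": 100, "B": 5, "C": 3}
blank = {"B": 5, "C": 1}
cleaned = subtract_control(sample, blank)
```

In practice the control and the sample would need comparable sequencing depth for a direct subtraction to be meaningful.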
- samples can be collected in sterile, DNA/DNase/RNA/RNase-free primary containers with leak resistant caps or lids and placed in a second leak resistant vessel to limit any leakage during transport.
- Appropriate primary containers can include any plastic container with a tight fitting lid or cap that is suitable for work in microbiology or molecular biology considered to be sterile and free of microbial DNA (or have as little as possible) at minimum. (However, it should be noted that human DNA contamination, depending upon the markers or specific type microbe that is being looked at may not present a problem.)
- the primary container can also be comprised of metal, clay, earthenware, fabric, wood, etc.
- so long as the container can be sterilized and tested to ensure that it is ideally DNA/DNase/RNA/RNase-free (or at least contains levels of nucleic acid much lower than the biomass to be studied, and low enough concentration of nuclease that the nucleic acids collected are not degraded) and can be closed with a tight-fitting and leak resistant lid, cap or top, then it can be used as a primary container.
- the primary container with the sample can then be placed into a secondary container, if appropriate.
- secondary containers can include plastic screw top vessels with tight fitting lids or caps and plastic bags such as freezer-grade zip-top type bags.
- the secondary container can also be comprised of metal, clay, earthenware, fabric, wood, etc. So long as the container can be closed or sealed with a tight-fitting and leak resistant lid, cap or top, then it can be used as a secondary container.
- the secondary container can also form a seal on itself or it can be fastened shut for leak resistance.
- the samples should generally be collected with minimal contact between the target sample and the sample collector to minimize contamination.
- the sample collector, if human, should generally collect the target sample using gloves or other barrier methods to reduce contamination of the samples with microbes from the skin.
- the sample can also be collected with instruments that have been cleaned.
- the sample collector, if a machine, should be cleaned and sterilized with UV light and/or by chemical means prior to each sample collection. If the machine sample collector requires any maintenance from a human or another machine, the machine sample collector must be additionally subjected to cleaning prior to collecting any samples.
- the samples will be preserved.
- One method of preservation is by freezing on dry ice or liquid nitrogen to between 4° C. and −80° C.
- Another method of preservation is the addition of preservatives such as RNAstable™, LifeGuard™ or another commercial preservative, and following the respective instructions. So long as the preservation method will allow for the microbial nucleic acid to remain stable upon storage and upon later usage, then the method can be used.
- the samples will be shipped in an expedient method to the testing facility.
- the testing of the sample can be done on location.
- the sample testing should be performed within a time period before there is substantial degradation of the microbial material within the sample. So long as the sample remains preserved and there is no substantial degradation of the microbial material, any method of transport in a reasonable period of time is sufficient.
- Tracers will be added to the inflow of a sampling catchment to identify the organisms present in the system that are not from the target source.
- the tracer can be microorganisms or anything that will allow for analysis of the flow path.
- a tracer can be used to calibrate the effectiveness of a flooding operation (water, CO2, chemical, steam, etc.).
- the tracer will be used to determine factors such as the amount of injection fluid flowing through each zone at the production wellbore and the path of the injection fluid flow from the injection site to the production bore.
- the extraction of genetic material will be performed using methods with the ability to separate nucleic acids from other, unwanted cellular and sample matter in a way to make the genetic material suitable for library construction. For example, this can be done with methods including one or more of the following, but not limited to, mechanical disruption such as bead beating, sonicating, freezing and thawing cycles; chemical disruption by detergents, acids, bases, and enzymes; other organic or inorganic chemicals. Isolation of the genetic material can be done through methods including one or more of the following, but not limited to, binding and elution from silica matrices, washing and precipitation by organic or inorganic chemicals, electroelution or electrophoresis or other methods capable of isolating genetic material.
- Extractions will be done in an environment suitable to exclude microbes residing in the air or on other surfaces in the work area where the extraction is taking place. Care will be taken to ensure that all work surfaces and instruments are cleaned to remove unwanted microbes, nucleases and genetic material.
- Cleaning work surfaces and instruments can include, but is not limited to, spraying and/or wiping surfaces with a chlorine bleach solution, commercially available liquids such as DNAse AWAY™ or RNase AWAY™ or similar substances that are acceptable in routine decontamination of molecular biology work areas.
- aerosol barrier pipette tips used in manual, semi-automated or automated extraction process will be used to limit transfer of genetic material between instruments and samples.
- Controls for reagents for extractions and/or primary containers will be tested to ensure they are free of genetic material.
- Testing of the reagents includes, but is not limited to performing extraction “blanks” where only the reagents are used in the extraction procedure.
- When necessary primary collection containers may also be tested for the presence of genetic material serving as one type of ‘negative control’ in PCR of the genetic material of the sample.
- testing the blank or negative control for the presence of genetic material may be accomplished by, but is not limited to, spectrophotometric, fluorometric, electrophoretic, PCR or other assays capable of detecting genetic material.
- PCR polymerase chain reaction
- PCR amplifies a single or a few copies of a piece of DNA across several orders of magnitude, generating thousands to millions, or more, of copies of a particular DNA sequence using a thermostable DNA polymerase.
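- The "orders of magnitude" figure follows from near-doubling at each cycle. The sketch below uses the idealized model in which every template is copied with a fixed per-cycle efficiency; the function name and the cycle counts are illustrative assumptions, and real reactions plateau well before the ideal value at high cycle numbers.

```python
def expected_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized PCR yield: each cycle multiplies the template count by (1 + efficiency)."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# A single template after 20 and 30 ideal cycles.
after_20 = expected_copies(1, 20)
after_30 = expected_copies(1, 30)
```

Under perfect doubling, 20 cycles give about one million copies and 30 cycles about one billion.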
- PCR will be used to amplify a portion of specific gene from the genome of the microbes present in the sample. Any method which can amplify genetic material quickly and accurately can be used for library preparation.
- the PCR primer will be designed carefully to meet the goals of the sequencing method.
- the PCR primer will contain a length of nucleotides specific to the target gene, may contain an adapter that will allow the amplicon, also known as the PCR product, to bind and be sequenced on a high-throughput sequencing platform, and additional nucleotides to facilitate sequencing.
- the portion of the gene with adapters, barcode and necessary additional nucleotides is known as the “amplicon.” It being understood that future systems may not use, or need, adaptors.
- forward and reverse primers as shown in the examples are used.
- the microbial ribosome is made up of component proteins and non-coding RNA molecules, one of which is referred to as the 16S ribosomal RNA (or 16S rRNA).
- the 16S subunit is a component of the small subunit (SSU) of bacterial and archaeal ribosomes. It is 1.542 kb (or 1542 nucleotides) in length.
- the gene encoding the 16S subunit is referred to as the 16S rRNA gene.
- the 16S rRNA gene is used for reconstructing phylogenies because it is highly conserved between different species of bacteria and archaea, meaning that all of these organisms encode it in their genomes and it can be easily identified in genomic sequences, but it additionally contains regions that are highly variable, so there is a phylogenetic signature in the sequence of the gene.
- batch sequencing of all of the 16S rRNA gene sequences in a sample containing many microbial taxa is informative about which microbial taxa are present.
- 16S rRNA-based studies are extremely valuable given that they can be used to discover and record unexplored biodiversity and the ecological characteristics of either whole communities or individual microbial taxa. 16S rRNA phylogenies tend to correspond well to trends in overall gene content. Therefore the ability to relate trends at the species level to host or environmental parameters has proven exceptionally powerful to understanding the relationships between the microbes and the world.
- microbiome measurement techniques provide important information that is complementary to 16S rRNA or other marker-gene data: shotgun metagenomics provides genome content for the entire microbiome; transcriptomics measures gene expression by microbes, indicating which genes are actually being used by the microbes; proteomics measures actual production of enzymes and other functional proteins in the microbiome; metabolomics directly measures metabolite content in a sample.
- SSU small subunit
- LSU ribosomal gene
- ITS and mitochondrial marker such as Cytb or cox1
- ITS gene internal transcribed spacer gene
- LSU large subunit ribosomal gene
- the genetic material for any analysis could be derived from DNA or cDNA (i.e., complementary DNA) produced from the reverse transcription of RNA isolated from the target sample or samples.
- primer bias due to differential annealing leads to the over- or underrepresentation of specific taxa, and can lead to some groups being missed entirely if they match the consensus sequence poorly. Issues of primer bias can be important. Comparisons of relative abundance among different studies should thus be treated with caution. However, meta-analysis of presence/absence data from different studies is particularly useful for revealing broad trends, even when different studies use different primers.
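- A presence/absence comparison of the kind described above ignores abundance and therefore is less sensitive to primer-driven over- or underrepresentation. As a hedged sketch, the Jaccard index below compares the sets of taxa detected in two studies; the choice of Jaccard and the taxon names are illustrative assumptions rather than a method stated in the text.

```python
def jaccard_similarity(taxa_a, taxa_b):
    """Shared taxa divided by all taxa seen in either set (1.0 means identical sets)."""
    set_a, set_b = set(taxa_a), set(taxa_b)
    union = set_a | set_b
    if not union:
        return 1.0  # two empty sets are treated as identical
    return len(set_a & set_b) / len(union)


# Hypothetical taxa detected by two studies using different primers.
study_1 = {"Bacillus", "Pseudomonas", "Lactobacillus"}
study_2 = {"Pseudomonas", "Lactobacillus", "Acetobacter"}
similarity = jaccard_similarity(study_1, study_2)
```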
- the primers designed for amplification will be well-suited for the phylogenetic analysis of sequencing reads.
- the primer design will be based on the system of sequencing, e.g., chain termination (Sanger) sequencing or high-throughput sequencing.
- chain termination (Sanger) sequencing or high-throughput sequencing.
- the sequencing can be performed by, but is not limited to, the 454 Life Sciences™ Genome Sequencer FLX (Roche) machine or the Illumina™ platforms (MiSeq™ or HiSeq™), IonTorrent, Nanopore or PacBio. These will be described more in the Sequencing section below.
- High-throughput sequencing, described below, has revolutionized many sequencing efforts, including studies of microbial community diversity.
- High-throughput sequencing is advantageous because it eliminates the labor-intensive step of producing clone libraries and generates hundreds of thousands of sequences in a single run.
- two primary factors limit culture-independent marker gene-based analysis of microbial community diversity through high-throughput sequencing: 1) each individual run is high in cost, and 2) separating a single plate across multiple runs is difficult.
- Double index barcoding protocol is used in the examples below.
- a unique tag will be added to each primer before PCR amplification.
- each sample will be amplified with a known tagged (barcoded) primer
- an equimolar mixture of PCR-amplified DNA can be sequenced from each sample and sequences can be assigned to samples based on these unique barcodes.
- the presence of these assigned barcodes allows for independent samples to be combined for sequencing, with subsequent bioinformatic separation of the sequencer output. By not relying on physical separators, this procedure maximizes sequence space and multiplexing capabilities.
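- The bioinformatic separation of pooled reads can be sketched as a lookup on the barcode portion of each read. The example assumes a fixed-length barcode at the start of the read and exact matching only; the barcode sequences, sample names, and function name are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_len):
    """Assign reads to samples by their leading barcode; return (by_sample, unassigned)."""
    by_sample = defaultdict(list)
    unassigned = []
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned.append(read)  # unknown or error-containing barcode
        else:
            by_sample[sample].append(insert)  # keep the read with the barcode trimmed off
    return dict(by_sample), unassigned


barcodes = {"ACGTACGT": "vineyard_A", "TTGGCCAA": "vineyard_B"}
reads = ["ACGTACGTGGGAAA", "TTGGCCAACCCTTT", "ACGTACGTTTTT", "NNNNNNNNAAAA"]
by_sample, unassigned = demultiplex(reads, barcodes, barcode_len=8)
```

Reads with unrecognized barcodes are set aside here; an error-correcting barcode design would allow some of them to be recovered.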
- This technique will be used to process many samples (e.g., 25, 200, 1000, and above) in a single high-throughput sequencing run. This number will be increased depending on advances in high-throughput sequencing technology, without limit to the number of samples to be sequenced in a single high-throughput sequencing run.
- Barcodes, or unique DNA sequence identifiers, have traditionally been used in different experimental contexts, such as sequence-tagged mutagenesis (STM) screens where a sequence barcode acts as an identifier or type specifier in a heterogeneous cell-pool or organism-pool.
- STM barcodes are usually 20-60 bases (or nt) long, are pre-selected or follow ambiguity codes, and exist as one unit or split into pairs. Such long barcodes are not particularly compatible with available high-throughput sequencing platforms because of restrictions on read length.
- Sample identifiers will be encoded with redundant parity bits. Then the sample identifiers will be “transmitted” as codewords. Each base (A, T, G, C) will be encoded using 2 bits and using 8 bases for each codeword. Therefore, 16-bit codewords will be transmitted.
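The 2-bits-per-base encoding can be illustrated with a short sketch. The particular assignment of bit pairs to bases below is an assumption; the text states only that each base uses 2 bits and that 8 bases form a 16-bit codeword.

```python
# Hypothetical bit assignment; the source does not say which base maps to which pair.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def barcode_to_bits(barcode):
    """Convert a DNA barcode into a bit string, 2 bits per base."""
    return "".join(BASE_TO_BITS[base] for base in barcode.upper())


def bits_to_barcode(bits):
    """Convert a bit string of even length back into a DNA barcode."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be a multiple of 2")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


codeword = barcode_to_bits("ACGTTGCA")  # 8 bases give a 16-bit codeword
```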
- the number of bits and bases per codeword is not limited to these numbers, as any number of bits and codewords can be designed by a person of ordinary skill in the art.
- the design of the barcode is based on the goals of the method.
- Hamming codes are unique in that they use only a subset of the possible codewords, particularly those that lie at the center of multidimensional spheres (hyperspheres) in a binary subspace. Single bit errors fall within hyperspheres associated with each codeword, and thus they can be corrected. Double bit errors do not fall within hyperspheres associated with each codeword, and thus they can be detected but not corrected.
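The correct-one, detect-two behavior can be shown with a small sketch. It uses an extended Hamming(8,4) code (4 data bits, 3 parity bits, 1 overall parity bit), chosen here only because it is compact. It is not the 16-bit barcode design described above, and the function names are illustrative.

```python
def hamming84_encode(data):
    """Encode 4 data bits as an 8-bit extended Hamming codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    word = [p1, p2, d1, p4, d2, d3, d4]  # positions 1..7
    overall = 0
    for bit in word:
        overall ^= bit  # extra parity bit over the first 7 bits
    return word + [overall]


def hamming84_decode(word):
    """Return (data, status); data is None when a double error is detected."""
    bits = list(word[:7])
    syndrome = 0
    for position, bit in enumerate(bits, start=1):
        if bit:
            syndrome ^= position  # nonzero syndrome names the flipped position
    overall = 0
    for bit in word:
        overall ^= bit  # 0 when the total number of 1s is even
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        if syndrome != 0:
            bits[syndrome - 1] ^= 1  # single error: flip it back
        status = "corrected"  # syndrome 0 means only the parity bit was hit
    else:
        return None, "double error detected"
    return [bits[2], bits[4], bits[5], bits[6]], status


sent = hamming84_encode([1, 0, 1, 1])
one_error = list(sent)
one_error[4] ^= 1
two_errors = list(one_error)
two_errors[6] ^= 1
```

A single flipped bit lands inside the "sphere" of exactly one codeword and is repaired; two flipped bits land between spheres, so the decoder can only report the problem.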
- Golay codes of 12 bases can correct all triple-bit errors and detect all quadruple-bit errors.
- the extended binary Golay code encodes 12 bits of data in a 24-bit word in such a way that any 3-bit errors can be corrected or any 7-bit errors can be detected.
- the perfect binary Golay code has codewords of length 23 and is obtained from the extended binary Golay code by deleting one coordinate position (conversely, the extended binary Golay code is obtained from the perfect binary Golay code by adding a parity bit). In standard code notation the codes have parameters [24, 12, 8] and [23, 12, 7], corresponding to the length of the codewords, the dimension of the code, and the minimum Hamming distance between two codewords, respectively.
- the primer will be designed to include nucleotides specific for the sequencing platform; nucleotides specific for the gene of interest; nucleotides for the barcode chosen; and the nucleotides of the gene.
- Upon amplification, one contiguous string of nucleotides known as the “forward” primer will be formed from the platform specific sequencing adaptors and the gene specific primer and linker. Additionally formed upon amplification will be one contiguous string of nucleotides known as the “reverse” primer formed from the platform specific sequencing adaptors, the gene specific primer and linker, and the barcode.
- PCR using barcoded primers is known in the art.
- Other error-correcting codes may be utilized such as Gray codes, low-density parity check codes, etc.
- the barcoded high-throughput sequencing technique provides a robust description of the changes in bacterial community structure across the sample set.
- a high-throughput sequencing run is expensive, and the large number of custom primers required only adds to this cost.
- the barcoding technique allows for thousands of samples to be analyzed simultaneously, with each community analyzed in considerable detail.
- the barcoded high-throughput sequencing method may not allow for the identification of bacterial taxa at the finest levels of taxonomic resolution. However, with increasing read lengths in sequencing, this constraint will gradually become less relevant.
- Because 16S rRNA phylogenies tend to correspond well to trends in overall gene content, the ability to relate trends at the species level to host or environmental parameters has proven exceptionally powerful.
- the DNA encoding the 16S rRNA gene has been widely used to specify bacterial taxa, since the region can be amplified using PCR primers that bind to conserved sites in most or all species, and large databases are available relating 16S rRNA sequences to bacterial phylogenies.
- other genes can be used to specify the taxa, such as 18S, LSU, ITS, and SSU (e.g., 16S).
- cpn60 or ftsZ, or other markers may also be utilized.
- this limitation can be at least partially overcome by using a reference tree based on full-length sequences, such as the tree from the Greengenes 16S rRNA ARB Database, and then using an algorithm such as parsimony insertion to add the short sequence reads to this reference tree.
- These procedures are necessarily approximate, and may lead to errors in phylogenetic reconstruction that could affect later conclusions about which communities are more similar or different.
- One substantial concern is that because different regions of the rRNA sequence differ in variability, conclusions drawn about the similarities between communities from different studies might be affected more by the region of the 16S rRNA that was chosen for sequencing than by the underlying biological reality.
- the increase in number of sequences per run from parallel high-throughput sequencing technologies such as the Roche 454 GS FLX™ to Illumina GAIIx™ is on the order of 1,000-fold and greater than the increase in the number of sequences per run from Sanger to 454™.
- the transition from Sanger sequencing to 454™ sequencing has opened new frontiers in microbial community analysis by making it possible to collect hundreds of thousands of sequences spanning hundreds of samples.
- a transition to the Illumina™ platform allows for more extensive sequencing than has previously been feasible, with the possibility of detecting even OTUs that are very rare.
- By using a variant of the barcoding strategy used for 454™ with the Illumina™ platform, thousands of samples could be analyzed in a single run, with each of the samples analyzed in unprecedented depth.
- a few sequencing runs using 454™/Roche's pyrosequencing platform can generate sufficient coverage for assembling entire microbial genomes, for the discovery, identification and quantitation of small RNAs, and for the detection of rare variations in cancers, among many other applications.
- the coverage provided by this system becomes unnecessary for phylogenetic classification.
- the 454/Roche™ pyrosequencers can accommodate a maximum of only 16 independent samples, which have to be physically separated using manifolds on the sequencing medium, drastically limiting its utility in the effort to elucidate the diverse microbial communities in each sample. Relatively speaking, the Illumina™ platforms are experiencing the most growth.
- the method described herein will be used with any high-throughput sequencing platform that is currently available or that will become available in the future.
- the method described herein will be applied to a sequencing method wherein the genetic material will be sequenced without barcoding by simply placing the DNA or RNA directly into a sequencing machine.
- high-throughput sequencing technology allows for the characterization of microbial communities orders of magnitude faster and more cheaply than has previously been possible.
- the ability to barcode amplicons from individual samples means that hundreds of samples can be sequenced in parallel, further reducing costs and increasing the number of samples that can be analyzed.
- Because high-throughput sequencing reads tend to be short compared to those produced by the Sanger method, the sequencing effort is best focused on gathering more short sequences (less than 150 base pairs or less than 100 base pairs) rather than fewer longer ones, as much of the diversity of microbial communities lies within the “rare biosphere,” also known as the “long tail,” that traditional culturing and sequencing technologies are slow to detect due to the limited amount of data generated from these techniques.
- the length of the read of a sequence describes the number of nucleotides in a row that the sequencer is able to obtain in one read. This length can determine the type of OTU obtained (e.g., family, genus or species). For example, a read length of approximately 300 base pairs will probably provide family information but not a species determination.
- Depth of coverage in DNA sequencing refers to the number of times a nucleotide is read during the sequencing process. On a genome basis, it means that, on average, each base has been sequenced a certain number of times (10×, 20×, ...). For a specific nucleotide, it represents the number of sequences that added information about that nucleotide.
- Coverage is the average number of reads representing a given nucleotide in the reconstructed sequence. Depth can be calculated from the length of the original genome (G), the number of reads (N), and the average read length (L) as N × L/G. For example, a hypothetical genome with 2,000 base pairs reconstructed from 8 reads with an average length of 500 nucleotides will have 2× redundancy. This parameter also enables estimation of other quantities, such as the percentage of the genome covered by reads (coverage). Sometimes a distinction is made between sequence coverage and physical coverage. Sequence coverage is the average number of times a base is read. Physical coverage is the average number of times a base is read or spanned by mate paired reads.
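The depth formula N × L/G can be checked against the worked example in the text with a one-line function. The function and parameter names are illustrative.

```python
def mean_depth(genome_length, n_reads, mean_read_length):
    """Average sequencing depth, computed as N x L / G."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return n_reads * mean_read_length / genome_length


# The example from the text: 8 reads of 500 nt over a 2,000 bp genome.
depth = mean_depth(genome_length=2000, n_reads=8, mean_read_length=500)
```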
- Organisms of lower abundance rank can be detected if more sequence reads are collected. To verify that these sequences are present, a higher read depth (i.e. more sequences) must be obtained. Analyzing the rare biosphere is attainable because the sequencing depth provided by high-throughput sequencing allows for the detection of microbes that would otherwise be detected only occasionally by chance with traditional techniques. Thus high-throughput sequencing will allow, within a time-frame feasible for industrial settings, for the analysis of the rarer members (low abundance organisms) of any environment, which may play a critical role in a fermentation process important in food production, agriculture and other industries where microbes are present.
- Pyrosequencing, based on the “sequencing by synthesis” principle, is a method of DNA sequencing widely used in microbial sequencing studies. Pyrosequencing involves taking a single strand of the DNA to be sequenced and then synthesizing its complementary strand enzymatically. The pyrosequencing method is based on observing the activity of DNA polymerase, which is a DNA synthesizing enzyme, with another chemiluminescent enzyme.
- the single stranded DNA template is hybridized to a sequencing primer and incubated with the enzymes DNA polymerase, ATP sulfurylase, luciferase and apyrase, and with the substrates adenosine 5′ phosphosulfate (APS) and luciferin. Synthesis of the complementary strand along the template DNA allows for sequencing of a single strand of DNA, one base pair at a time, by the detection of which base was actually added at each step.
- the template DNA is immobile, and solutions of A, C, G, and T nucleotides are sequentially added and removed from the reaction.
- the templates for pyrosequencing can be made both by solid phase template preparation (streptavidin-coated magnetic beads) and enzymatic template preparation (apyrase+exonuclease).
- dNTPs deoxynucleoside triphosphates
- dATPαS, which is not a substrate for luciferase, is added instead of dATP
- DNA polymerase incorporates the correct, complementary dNTPs onto the template. This base incorporation releases pyrophosphate (PPi) stoichiometrically.
- ATP sulfurylase quantitatively converts PPi to ATP in the presence of adenosine 5′ phosphosulfate.
- This ATP acts as a substrate for the luciferase-mediated conversion of luciferin to oxyluciferin, which generates visible light in amounts that are proportional to the amount of ATP.
- Light is produced only when the nucleotide solution complements the particular unpaired base of the template.
- the light output in the luciferase-catalyzed reaction is detected by a camera and analyzed in a program.
- the sequence of solutions which produce chemiluminescent signals allows the sequence determination of the template. Unincorporated nucleotides and ATP are degraded by the apyrase, and the reaction can restart with another nucleotide.
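The way a sequence is read from the order of light-producing nucleotide additions can be sketched with a toy simulation. It assumes ideal chemistry (signal exactly equal to the number of bases incorporated in a flow), a fixed cyclic dispensing order, and a template written so that its base-by-base complement is the synthesized strand. All names are illustrative, and real instruments must also handle noise and signal decay.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simulate_pyrogram(template, order="ACGT"):
    """Return (nucleotide, signal) pairs for cyclic nucleotide flows.

    Assumption: the signal equals the number of bases incorporated in
    that flow, so a homopolymer run gives a proportionally larger value.
    """
    if set(order) != set("ACGT"):
        raise ValueError("order must contain each of A, C, G, T")
    # Strand synthesized against the template, one complement per base.
    target = "".join(COMPLEMENT[base] for base in template.upper())
    flows = []
    position = 0
    while position < len(target):
        for nucleotide in order:
            run = 0
            while position < len(target) and target[position] == nucleotide:
                run += 1
                position += 1
            flows.append((nucleotide, run))  # zero means no light in this flow
            if position >= len(target):
                break
    return flows


def call_sequence(flows):
    """Rebuild the synthesized strand from the flow signals."""
    return "".join(nucleotide * signal for nucleotide, signal in flows)


flows = simulate_pyrogram("TGCCA")
```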
- Illumina™ sequencing by synthesis (SBS) technology with TruSeq technology supports massively parallel sequencing using a proprietary reversible terminator-based method that enables detection of single bases as they are incorporated into growing DNA strands.
- a fluorescently labeled terminator is imaged as each dNTP is added and then cleaved to allow incorporation of the next base. Since all four reversible terminator-bound dNTPs are present during each sequencing cycle, natural competition minimizes incorporation bias. The end result is true base-by-base sequencing. Although this is similar to pyrosequencing, the differences between the platforms are noteworthy.
- the method described herein can be applied to any high-throughput sequencing technology, past, present or future. Pyrosequencing and SBS are merely examples and do not limit the application of the method in terms of sequencing.
- Sequence data can be analyzed in a manner in which sequences are identified and labeled as being from a specific sample using the unique barcode introduced during library preparation, if barcodes are used, or sample identifiers will be associated with each run directly if barcodes are not used. Once sequences have been identified as belonging to a specific sample, the relationship between each pair of samples will be determined based on the distance between the collection of microbes present in each sample.
- QIIME is designed to take users from raw sequencing data (for example, as generated on the Illumina™ and 454™ platforms) through the processing steps mentioned above, leading to quality statistics and visualizations used for interpretation of the data. Because QIIME scales to billions of sequences and runs on systems ranging from laptops to high-performance computer clusters, it will continue to keep pace with advances in sequencing technologies to facilitate characterization of microbial community patterns ranging from normal variations to pathological disturbances in many human, animal and environmental ecosystems.
- the first step in the bioinformatics stage of a microbial community analysis study is to consolidate the sample metadata in a spreadsheet.
- the sample metadata is all per-sample information, including technical information such as the barcode assigned to each sample, and “environmental” metadata.
- This environmental metadata will differ depending on the types of samples that are being analyzed. If, for example, the study is of microbial communities in soils, the pH and latitude where the soil was collected will be environment metadata categories. Alternatively, if the samples are of the wine microbiome, environmental metadata may include barrel and/or bottling identifiers and collection times.
- This spreadsheet will be referred to as the sample metadata mapping file in the following sections.
- sequence barcodes will be read to identify the source sample of each sequence, poor quality regions of sequence reads will be trimmed, and poor quality reads will be discarded. These steps will be combined for computational efficiency.
- the features included in quality filtering include whether the barcode will unambiguously be mapped to a sample barcode, per-base quality scores, and the number of ambiguous (N) base calls.
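A quality filter of this kind can be sketched as below. The thresholds are hypothetical placeholders, not QIIME's defaults, and the rule of truncating at the first low-quality base is one simple policy among several.

```python
def quality_filter(sequence, qualities, min_q=20, min_len=50, max_n=0):
    """Trim a read at its first low-quality base, then accept or reject it.

    All thresholds are illustrative assumptions. Returns the trimmed
    read, or None when the read should be discarded.
    """
    keep = len(sequence)
    for index, score in enumerate(qualities):
        if score < min_q:
            keep = index  # truncate just before the first poor base
            break
    trimmed = sequence[:keep]
    if len(trimmed) < min_len:
        return None  # too short after trimming
    if trimmed.upper().count("N") > max_n:
        return None  # too many ambiguous base calls
    return trimmed


# Short reads are used here, so min_len is lowered for the illustration.
kept = quality_filter("ACGTACGT", [30, 30, 30, 30, 30, 10, 30, 30], min_len=4)
```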
- the default settings for all quality control parameters in QIIME will be determined by benchmarking combinations of these parameters on artificial (i.e., “mock”) community data, where microbial communities were created in the lab from known concentrations of cultured microbes, and the composition of the communities is thus known in advance.
- sequences will be clustered into OTUs (Operational Taxonomic Units). This is typically the most computationally expensive step in microbiome data analysis, and will be performed to reduce the computational complexity at subsequent steps.
- OTUs Operational Taxonomic Units
- Highly similar sequences e.g., those that are greater than 97% identical to one another
- the count of sequences contained in each cluster will be retained, and a single representative sequence from each cluster will then be chosen for use in downstream analysis steps such as taxonomic assignment and phylogenetic tree construction.
- This process of clustering sequences is referred to as OTU picking, where the OTUs (i.e., the clusters of sequences) are considered to represent taxonomic units such as species.
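A greedy, centroid-based form of de novo OTU picking can be sketched as follows. This is a simplification and not uclust itself: identity is computed position by position on equal-length sequences with no alignment, and a sequence joins the first centroid it matches at or above the threshold. Function names are illustrative.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions; assumes equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def pick_otus(reads, threshold=0.97):
    """Cluster reads greedily, seeding clusters with the most abundant sequences."""
    otus = []  # each OTU keeps its representative sequence and a read count
    for sequence, count in Counter(reads).most_common():
        for otu in otus:
            if identity(sequence, otu["representative"]) >= threshold:
                otu["count"] += count  # absorbed by an existing OTU
                break
        else:
            otus.append({"representative": sequence, "count": count})
    return otus


seq_a = "ACGT" * 25                # 100 nt
seq_a_variant = "T" + seq_a[1:]    # one mismatch, 99% identical to seq_a
seq_b = "TTGA" * 25                # unrelated sequence
otus = pick_otus([seq_a] * 3 + [seq_a_variant] + [seq_b] * 2)
```

Seeding with the most abundant sequences means each representative is also the most common sequence in its cluster.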
- SILVA, a comprehensive on-line resource for quality checked and aligned ribosomal RNA sequence data, provides regularly updated datasets of aligned small (16S/18S, SSU) and large subunit (23S/28S, LSU) ribosomal RNA (rRNA) sequences for all three domains of life (Bacteria, Archaea and Eukarya).
- De novo OTU picking cannot be used if the comparison is between non-overlapping amplicons, such as the V2 and the V4 regions of the 16S rRNA gene or for very large data sets, like a full HiSeq™ 2000 run. Although technically, de novo OTU picking can be used for very large data sets, the program would take too long to run to be practical.
- pick_closed_reference_otus.py is the primary interface for closed-reference OTU picking in QIIME. If the user provides taxonomic assignments for sequences in the reference database, those are assigned to OTUs. Closed-reference OTU picking must be used if non-overlapping amplicons, such as the V2 and the V4 regions of the 16S rRNA, will be compared to each other. The reference sequences must span both of the regions being sequenced.
- Closed-reference OTU picking cannot be used if there is no reference sequence collection to cluster against, for example because an infrequently used marker gene is being used.
- a benefit of closed-reference OTU picking is speed in that the picking is fully parallelizable, and therefore useful for extremely large data sets. Another benefit is that because all OTUs are already defined in the reference sequence collection, a trusted tree and taxonomy for those OTUs may already exist. There is the option of using those, or building a tree and taxonomy from the sequence data.
- a drawback to reference-based OTU picking is that there is an inability to detect novel diversity with respect to the reference sequence collection. Because reads that do not hit the reference sequence collection are discarded, the analyses only focus on the diversity that is already known.
- a small fraction of the reads e.g., discarding 1-10% of the reads is common for 16S-based human microbiome studies, where databases like Greengenes cover most of the organisms that are typically present
- a large fraction of your reads e.g., discarding 50-80% of the reads has been observed for “unusual” environments like the Guerrero Negro microbial mats
- pick_open_reference_otus.py is the primary interface for open-reference OTU picking in QIIME, and includes taxonomy assignment, sequence alignment, and tree-building steps.
- Open-reference OTU picking with pick_open_reference_otus.py is the preferred strategy for OTU picking.
- Open-reference OTU picking cannot be used for comparing non-overlapping amplicons, such as the V2 and the V4 regions of the 16S rRNA, or when there is no reference sequence collection to cluster against, for example because an infrequently used marker gene is being used.
- a benefit of open-reference OTU picking is that all reads are clustered. Another benefit is speed. Open-reference OTU picking is partially run in parallel. In particular, the subsampled open reference OTU picking process implemented in pick_open_reference_otus.py is much faster than pick_de_novo_otus.py as some strategies are applied to run several pieces of the workflow in parallel. However, a drawback of open-reference OTU picking is also speed. Some steps of this workflow run serially. For data sets with a lot of novel diversity with respect to the reference sequence collection, this can still take days to run.
- uclust is the preferred method for performing OTU picking.
- QIIME's uclust-based open reference OTU picking protocol will be used when circumstances allow (i.e., when none of the cases above, where open reference OTU picking is not possible, apply).
- the OTU-picking protocol described above is used for processing taxonomic marker gene sequences such as those from the 16S rRNA, ITS and LSU genes as well as other marker genes.
- the sequences themselves are not used to identify biological functions performed by members of the microbial community; they are instead used to identify which kinds of organisms are present.
- the data obtained are random fragments of all genomic DNA present in a given microbiome. These can be compared to reference genomes to identify the types of organisms present in a manner similar to marker gene sequences, but they may also be used to infer biological functions encoded by the genomes of microbes in the community.
- RNA rather than the DNA
- physical or chemical steps to deplete particular classes of sequence such as eukaryotic messenger RNA or ribosomal RNA are often used prior to library construction for sequencing.
- protein fragments are obtained and matched to reference databases.
- metabolites are obtained by biophysical methods including nuclear magnetic resonance or mass spectrometry.
- centroid sequence in each OTU will be selected as the representative sequence for that OTU.
- the centroid sequence will be chosen so that all sequences are within the similarity threshold to their representative sequence, and the centroid sequences are specifically chosen to be the most abundant sequence in each OTU.
- the OTU representative sequences will next be aligned using an alignment algorithm such as the PyNAST software package.
- PyNAST is a reference-based alignment approach, and is chosen because it achieves similar quality alignments to non-reference-based alignment approaches (e.g., muscle), where quality is defined as the effect of the alignment algorithm choice on the results of phylogenetic diversity analyses, but is easily run in parallel, which is not the case for non-reference-based alignment algorithms.
- positions that mostly contain gaps, or too high or too low variability, will be stripped to create a position-filtered alignment.
- This position-filtered alignment will be used to construct a phylogenetic tree using FastTree. This tree relates the OTUs to one another, will be used in phylogenetic diversity calculations (discussed below), and is referred to below as the OTU phylogenetic tree.
- OTU representative sequences will have taxonomy assigned to them. This can be performed using a variety of techniques, though our currently preferred approach is the uclust-based consensus taxonomy assigner implemented in QIIME.
- all representative sequences (the “query” sequences) are queried against a reference database (e.g., Greengenes, which contains near-full length 16S rRNA gene sequences with human-curated taxonomic assignments; UNITE database for ITS; SILVA for 18S rRNA) with uclust.
- the taxonomy assignments of the three best database hits for each query sequences are then compared, and a consensus of those assignments is assigned to the query sequence.
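The consensus step can be sketched as below. This is an illustrative reimplementation, not QIIME's code: lineages are lists ordered from broadest to finest rank, and a rank is kept only while at least `min_fraction` of the hits share the same lineage down to that rank. The 0.51 cutoff is an assumed parameter.

```python
from collections import Counter


def consensus_taxonomy(hit_lineages, min_fraction=0.51):
    """Return the deepest lineage prefix shared by a majority of the hits.

    Assumption: min_fraction is above 0.5, so at most one prefix can
    qualify at each rank.
    """
    if not hit_lineages:
        return []
    consensus = []
    depth = min(len(lineage) for lineage in hit_lineages)
    for rank in range(depth):
        prefixes = Counter(tuple(lineage[:rank + 1]) for lineage in hit_lineages)
        best, votes = prefixes.most_common(1)[0]
        if votes / len(hit_lineages) < min_fraction:
            break  # hits disagree from this rank downward
        consensus = list(best)
    return consensus


# Three best hits for one query (made-up example lineages).
hits = [
    ["Bacteria", "Firmicutes", "Bacilli"],
    ["Bacteria", "Firmicutes", "Clostridia"],
    ["Bacteria", "Firmicutes", "Bacilli"],
]
assigned = consensus_taxonomy(hits)
```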
- BIOM Biological Observation Matrix
- Once a BIOM table, an OTU phylogenetic tree, and a sample metadata mapping file (n-dimensional plot) are compiled, the microbial communities present in each sample will be analyzed and compared.
- analyses include, but are not limited to, summarizing the taxonomic composition of the samples, understanding the “richness” and “evenness” of samples (defined below), understanding the relative similarity of communities, and identifying organisms or groups of organisms that are significantly different across community types. The different types of analysis on soil microbial community data will be illustrated in the Examples below.
- the taxonomic composition of samples is often something that researchers are most immediately interested in. This can be studied at various taxonomic levels (e.g., phylum, class, species) by collapsing OTUs in the BIOM table based on their taxonomic assignments. The abundance of each taxon on a per-sample basis is then typically presented in bar charts, area charts or pie charts, though this list is not comprehensive.
- Alpha diversity refers to diversity of single samples (i.e., within-sample diversity), including features such as taxonomic richness and evenness.
- the species richness is a measure of the number of different species of microbes in a given sample.
- Species evenness refers to how close in numbers the abundance of each species in an environment is.
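Richness and evenness can be computed from a vector of per-taxon counts for one sample. The sketch below uses observed richness and Pielou's evenness (Shannon entropy divided by its maximum), which is one common choice and an assumption here, since the text does not name a specific evenness index.

```python
import math


def richness(counts):
    """Number of taxa observed at least once in the sample."""
    return sum(1 for count in counts if count > 0)


def shannon_entropy(counts):
    """Shannon index H = -sum(p * ln p) over the observed taxa."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """H divided by ln(richness); 1.0 means all taxa are equally abundant."""
    observed = richness(counts)
    if observed < 2:
        return 0.0  # evenness is not meaningful for fewer than two taxa
    return shannon_entropy(counts) / math.log(observed)


even_sample = [10, 10, 10, 10]
skewed_sample = [97, 1, 1, 1]
```

Both example samples have the same richness (4 taxa), but the skewed sample is dominated by one taxon and so scores much lower on evenness.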
- Alpha diversity scores have been shown to differ in different types of communities, for example, from different human body habitats. For instance, skin-surface bacterial communities have been found to be significantly richer (i.e., containing more species) in females than in males, and at dry sites rather than sebaceous sites, and the gut microbiomes of lean individuals have been found to be significantly richer than those of obese individuals.
- alpha diversity in the context of environmental metadata, for example, the degree of phylogenetic diversity in a sample (a phylogeny-aware measure of richness) changes with soil pH, ranging from pH around 6.5 through 9.5, with a peak in richness around neutral pH of 7.
- alpha diversity scores will be useful input features for building predictive models via supervised classifiers.
- Beta diversity metrics provide a measure of community dissimilarity, allowing investigators to determine the relative similarity of microbial communities. Metrics of beta diversity are pairwise, operating on two samples at a time.
- the difference in overall community composition between each pair of samples can be determined using the phylogenetically-aware UniFrac distance metric, which allows researchers to address many of these broader questions about the composition of microbial communities.
- UniFrac calculates the fraction of branch length unique to a sample across a phylogenetic tree constructed from each pair of samples.
- the UniFrac metric measures the distance between communities as the percentage of branch length that leads to descendants from only one of a pair of samples represented in a single phylogenetic tree, or the fraction of evolution that is unique to one of the microbial communities.
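The unweighted UniFrac calculation can be sketched on a small hand-built tree. The tree encoding (each node mapped to its parent and the length of the branch above it) and all names are assumptions made for this illustration; production implementations operate on full phylogenies and are far more efficient.

```python
def branches_leading_to(tree, otus):
    """Collect every branch on the path from the given OTUs up to the root.

    A branch is identified by the node at its lower end.
    """
    covered = set()
    for node in otus:
        while node is not None and node not in covered:
            parent, _length = tree[node]
            if parent is not None:
                covered.add(node)
            node = parent
    return covered


def unweighted_unifrac(tree, otus_a, otus_b):
    """Unique branch length divided by total branch length for two samples."""
    branches_a = branches_leading_to(tree, otus_a)
    branches_b = branches_leading_to(tree, otus_b)
    total = sum(tree[node][1] for node in branches_a | branches_b)
    if total == 0:
        return 0.0
    unique = sum(tree[node][1] for node in branches_a ^ branches_b)
    return unique / total


# Toy tree: node -> (parent, length of the branch above the node).
tree = {
    "root": (None, 0.0),
    "X": ("root", 1.0),
    "Y": ("root", 1.0),
    "otu1": ("X", 0.5),
    "otu2": ("X", 0.5),
    "otu3": ("Y", 1.0),
}
distance = unweighted_unifrac(tree, {"otu1", "otu2"}, {"otu2", "otu3"})
```

In this toy case 2.5 of the 4.0 units of branch length lead to only one of the two samples.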
- Phylogenetic techniques for comparing microbial communities avoid some of the pitfalls associated with comparing communities at only a single level of taxonomic resolution and provide a more robust index of community distances than traditional taxon-based methods, such as the Jaccard and Sørensen indices.
- species-based methods that measure the distance between communities based solely on the number of shared taxa do not consider the amount of evolutionary divergence between taxa, which can vary widely in diverse microbial populations.
- P Phylogenetic
- Fst test Pairwise significance tests are limited because they cannot be used to relate many samples simultaneously.
- phylogenetically-aware techniques such as UniFrac offer significant benefits
- techniques lacking phylogenetic awareness can also be implemented with success: after an alternative distance metric (e.g. Bray-Curtis, Jensen-Shannon divergence) has been applied, the resulting inter-sample distance matrix is processed in the same way as a UniFrac distance matrix as described below.
- QIIME implements the UniFrac metric and uses multivariate statistical techniques to determine whether groups of microbial communities are significantly different.
- the UniFrac distances between all pairs of communities are computed to derive a distance matrix (using UniFrac or other distances) for all samples.
- This will be an n × n matrix, which is symmetric (because the distance between sample A and sample B is always equal to the distance between sample B and sample A) and will have zeros on the diagonal (because the distance between any sample and itself is always zero).
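Building the pairwise matrix can be sketched with any distance function. Bray-Curtis, one of the non-phylogenetic alternatives mentioned above, is used here because it needs only the count vectors. The sample counts are made up for illustration.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two vectors of taxon counts."""
    denominator = sum(a + b for a, b in zip(u, v))
    if denominator == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / denominator


def distance_matrix(samples, metric):
    """Fill an n x n matrix; only the upper triangle is actually computed."""
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]  # the diagonal stays at zero
    for i in range(n):
        for j in range(i + 1, n):
            d = metric(samples[i], samples[j])
            matrix[i][j] = d
            matrix[j][i] = d  # symmetry: d(A, B) == d(B, A)
    return matrix


counts = [[10, 0, 5], [5, 5, 5], [0, 10, 0]]  # three samples, three taxa
matrix = distance_matrix(counts, bray_curtis)
```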
- n e.g., n>5
- Ordination techniques such as principal coordinates analysis (PCoA) and non-metric multidimensional scaling (NMDS), together with approximations to these techniques that reduce computational cost or improve parallelism, will be used to summarize these patterns in two or three dimensional scatter plots.
- the patterns can also be represented in two dimensions using, for example, line graphs, bar graphs, pie charts, Venn diagrams, etc. This is a non-exhaustive list.
- the patterns can also be represented in three dimensions using, for example, wire frame, ball and stick models, 3-D monitors, etc. This list is also non-exhaustive and does not limit the 2-D or 3-D forms by which the data can be represented.
- PCoA is a multivariate statistical technique for finding the most important orthogonal axes along which samples vary. Distances are converted into points in a space with a number of dimensions one less than the number of samples. The principal components, in descending order, describe how much of the variation (technically, the inertia) each of the axes in this new space explains. The first principal component separates the data as much as possible; the second principal component provides the next most separation along an orthogonal axis, and so forth. QIIME returns information on all principal component axes in a data table. It also allows easy visualization of that data in interactive scatter plots that allow users to choose which principal components to display.
- the points are typically marked with colored symbols, and users can interactively change the colors of the points to detect associations between sample microbial composition and sample metadata.
- PCoA often reveals patterns of similarity that are difficult to see in a distance matrix, and the axes along which variation occurs can sometimes be correlated with environmental variables such as pH or temperature.
- Industrial variables, or control data can include presence of oil, pressure, viscosity, etc. These control data can be filtered or removed in order to observe other control data factors to visualize possible patterns.
- QIIME 1.8.0 (released in December 2013) introduces several powerful tools to assist in visualizations of the results of PCoA, primarily the Emperor 3D scatter plot viewer. This includes (i) the ability to color large collections of samples using different user-defined subcategories (for example, coloring environmental samples according to temperature or pH), (ii) automatic scaled/unscaled views, which accentuate dimensions that explain more variance, (iii) the ability to interactively explore tens of thousands of points (and user-configurable labels) in 3D, and (iv) parallel coordinates displays that allow the dimensions that separate particular groups of environments to be readily identified.
- the significance of patterns identified in PCoA can be tested with a variety of methods.
- the significance of the clusters identified by UniFrac can be established using Monte Carlo based t-tests, where samples are grouped into categories based on their metadata, and distributions of distances within and between categories are compared. For example, if microbial communities are being compared between soils from a vineyard and soils unassociated with a vineyard, the distribution of UniFrac distances between soils from the same group can be compared to those between soils from different groups by computing a t-score (the actual t-score).
- the sample labels (vineyard, non-vineyard) can then be randomly shuffled 10,000 times, and a t-score calculated for each of these randomized data sets (the randomized t-scores). If the vineyard soils and non-vineyard soils are significantly different from one another in composition, the actual t-score should be higher than the vast majority of the randomized t-scores. A p-value will be computed by dividing the number of randomized t-scores that are better than the actual t-score by 9999.
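The label-shuffling test can be sketched as follows. Several details are assumptions rather than the method as specified: Welch's t statistic compares between-group with within-group distances, 999 shuffles are used by default, and the p-value uses the common (hits + 1) / (permutations + 1) convention, which differs slightly from the division described in the text.

```python
import math
import random
import statistics


def welch_t(x, y):
    """Welch's t statistic for two lists of numbers (at least 2 values each)."""
    difference = statistics.mean(x) - statistics.mean(y)
    standard_error = math.sqrt(
        statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    )
    if standard_error == 0:
        return 0.0 if difference == 0 else math.copysign(math.inf, difference)
    return difference / standard_error


def between_vs_within_t(distances, labels):
    """t-score comparing between-category with within-category distances."""
    within, between = [], []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            same = labels[i] == labels[j]
            (within if same else between).append(distances[i][j])
    return welch_t(between, within)


def monte_carlo_p_value(distances, labels, n_permutations=999, seed=0):
    """Shuffle labels repeatedly and count t-scores at least as large as the actual one."""
    rng = random.Random(seed)
    actual = between_vs_within_t(distances, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if between_vs_within_t(distances, shuffled) >= actual:
            hits += 1
    return actual, (hits + 1) / (n_permutations + 1)
```

The function takes any symmetric distance matrix, such as a UniFrac matrix, plus one category label per sample. Each shuffle is independent of the others, which is why the procedure parallelizes well.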
- the Monte Carlo simulations described here will be run in parallel, and are not limited to pairs of sample categories, so they support analysis of many different sample types.
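The label-shuffling test described above can be sketched as follows. This is a minimal illustration and not the pipeline's implementation: the distance matrix is toy data standing in for UniFrac distances, the function names are hypothetical, a Welch-style t statistic is assumed, and the p-value uses the common (hits + 1) / (permutations + 1) convention, which is an assumption rather than the divisor stated above.

```python
import random
from math import sqrt
from statistics import mean, variance


def t_score(a, b):
    """Welch-style two-sample t statistic (assumed form of the t-score)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def split_distances(dist, labels):
    """Split pairwise distances into within-category and between-category lists."""
    within, between = [], []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            (within if labels[i] == labels[j] else between).append(dist[i][j])
    return within, between


def monte_carlo_p(dist, labels, n_perm=999, seed=0):
    """Return the actual t-score and a permutation p-value."""
    rng = random.Random(seed)
    within, between = split_distances(dist, labels)
    actual = t_score(between, within)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # randomize the category labels
        w, b = split_distances(dist, shuffled)
        if t_score(b, w) >= actual:
            hits += 1
    return actual, (hits + 1) / (n_perm + 1)


# Toy data: eight samples placed on a line; distances stand in for UniFrac.
positions = [0.0, 0.1, 0.25, 0.3, 1.0, 1.2, 1.35, 1.5]
dist = [[abs(a - b) for b in positions] for a in positions]
labels = ["vineyard"] * 4 + ["non-vineyard"] * 4

actual_t, p_value = monte_carlo_p(dist, labels)
print(actual_t, p_value)
```

Because each shuffle is independent of the others, the loop body could be distributed across workers, which is consistent with running the simulations in parallel.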
- Supervised classification is a machine learning approach for developing predictive models from training data.
- Each training data point consists of a set of input features, for example, the relative abundance of taxa, and a qualitative dependent variable giving the correct classification of that data point.
- classifications might include soil nutrients, predominant weather patterns, disease states, therapeutic results, or forensic identification.
- the goal of supervised classification is to derive some function from the training data that can be used to assign the correct class or category labels to novel inputs (e.g. new samples), and to learn which features, for example, taxa, discriminate between classes.
- Common applications of supervised learning include text classification, microarray analysis, and other bioinformatics analyses. For example, when microbiologists use the Ribosomal Database Project website to classify 16S rRNA gene sequences taxonomically, a form of supervised classification is used.
- the primary goal of supervised learning is to build a model from a set of categorized data points that can predict the appropriate category membership of unlabeled future data.
- the category labels can be any type of important metadata, such as sugar content, viscosity, pH or temperature.
- the ability to classify unlabeled data is useful whenever alternative methods for obtaining data labels are difficult or expensive.
- a common way to estimate the expected prediction error (EPE) of a particular model is to fit the model to a subset (e.g., 90%) of the data and then test its predictive accuracy on the other 10% of the data. This can provide an idea of how well the model would perform on future data sets if the goal is to fit it to the entire current data set. To improve the estimate of the EPE, this process will be repeated ten times so that each data point is part of the held-out validation data once. This procedure, known as cross-validation, will allow for the comparison of models that use very different inner machinery or different subsets of input features. Of course, if many different models are tried and the one that provides the lowest cross-validation error for the entire data set is selected, it is likely that the reported EPE will be too optimistic.
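A ten-fold cross-validation of the kind described above might look like the following sketch. The helper names, the toy relative-abundance vectors, and the 1-nearest-neighbor classifier used as the model are all illustrative assumptions.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_1nn(X, y):
    """A 1-nearest-neighbor 'model' is simply the stored training data."""
    return list(zip(X, y))


def predict_1nn(model, x):
    """Return the label of the training point closest to x."""
    nearest = min(model, key=lambda xy: sum((a - b) ** 2 for a, b in zip(xy[0], x)))
    return nearest[1]


def cross_val_error(X, y, fit, predict, k=10, seed=0):
    """Fraction of samples misclassified when each fold is held out once."""
    errors = 0
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        errors += sum(predict(model, X[i]) != y[i] for i in held_out)
    return errors / len(X)


# Toy data: two well-separated groups of two-feature samples.
X = [[0.8 + 0.01 * i, 0.2 - 0.01 * i] for i in range(10)] + \
    [[0.2 + 0.01 * i, 0.8 - 0.01 * i] for i in range(10)]
y = ["vineyard"] * 10 + ["non-vineyard"] * 10

folds = k_fold_indices(len(X), k=10)
cv_error = cross_val_error(X, y, fit_1nn, predict_1nn, k=10)
print(cv_error)
```

If several candidate models are compared this way and the best one is reported, a further held-out set (or nested cross-validation) would be needed to avoid the optimism noted above.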
- Machine learning classification techniques will be applied to many types of microbial community data, for example, to the analysis of soil samples.
- the samples will be classified according to environment type using support vector machines (SVMs) and k-nearest neighbors (KNN).
- Supervised learning will be used extensively in other classification domains with high-dimensional data, such as macroscopic ecology, microarray analysis, and text classification.
- the goal of feature selection will be to find the combination of the model parameters and the feature subset that provides the lowest expected error on novel input data.
- Feature selection will be of utmost importance in the realm of microbiome classification due to the generally large number of features (i.e., constituent species-level taxa, or genes, or transcripts, or metabolites, or some combination of these): in addition to improving predictive accuracy, reducing the number of features leads to the production of more interpretable models.
- Approaches to feature selection are known to people in the art and are typically divided into three categories: filter methods, wrapper methods, and embedded methods.
- filter methods are completely agnostic to the choice of learning algorithm being used; that is, they treat the classifier as a black box.
- Filter methods use a two-step process. First a univariate test (e.g. t-test) or multivariate test (e.g., a linear classifier built with each unique pair of features) will be performed to estimate the relevance of each feature, and (1) all features whose scores exceed a predetermined threshold will be selected or (2) the best n features for inclusion in the model will be selected; then a classifier on the reduced feature set will be run. The choice of n can be determined using a validation data set or cross-validation on the training set.
- Filter methods have several benefits, including their low computational complexity, their ease of implementation, and their potential, in the case of multivariate filters, to identify important interactions between features.
- the fact that the filter has no knowledge about the classifier is advantageous in that it provides modularity, but it can also be disadvantageous, as there is no guarantee that the filter and the classifier will have the same optimal feature subsets.
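The two-step filter process can be illustrated with a short sketch that scores each feature with a univariate t-test and keeps the best n. The taxon names and abundance values are invented for illustration, and the Welch form of the t statistic is an assumption.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t statistic."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def filter_top_n(X, y, feature_names, n):
    """Rank features by |t| between the two classes and keep the best n."""
    classes = sorted(set(y))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    scores = {}
    for j, name in enumerate(feature_names):
        a = [row[j] for row, label in zip(X, y) if label == classes[0]]
        b = [row[j] for row, label in zip(X, y) if label == classes[1]]
        scores[name] = abs(welch_t(a, b))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n], scores


# Toy relative abundances for three hypothetical taxa in six samples.
feature_names = ["taxon_A", "taxon_B", "taxon_C"]
X = [
    [0.50, 0.20, 0.30],
    [0.55, 0.25, 0.20],
    [0.60, 0.15, 0.25],
    [0.10, 0.22, 0.28],
    [0.15, 0.18, 0.22],
    [0.05, 0.24, 0.31],
]
y = ["vineyard"] * 3 + ["non-vineyard"] * 3

selected, scores = filter_top_n(X, y, feature_names, n=2)
print(selected)
```

Any classifier can then be trained on the selected columns only, which is the modularity the filter approach offers; n itself would be chosen on validation data or by cross-validation on the training set.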
- a linear filter e.g., correlation-based
- RF random forest
- the purpose of a filter will be to identify features that are generally predictive of the response variable, or to remove features that are noisy or uninformative.
- Common filters include, but are not limited to, the between-class chi-squared (χ²) test, information gain (decrease in entropy when the feature is removed), various standard classification performance measures such as precision, recall, and the F-measure, the accuracy of a univariate classifier, and the bi-normal separation (BNS). BNS treats the univariate true-positive rate and false-positive rate (tpr, fpr, based on document presence/absence in text classification) as though they were cumulative probabilities from the standard normal cumulative distribution function F, and the difference between their respective z-scores, F⁻¹(tpr) − F⁻¹(fpr), will be used as a measure of that variable's relevance to the classification task.
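As a worked illustration of the BNS score, the sketch below computes the difference of inverse-normal z-scores from a feature's presence/absence counts. The function name and the clipping of rates away from 0 and 1 (needed because the inverse CDF is undefined there; the value 0.0005 is an assumed choice) are not taken from the text above.

```python
from statistics import NormalDist


def bns(tp, fn, fp, tn, eps=0.0005):
    """Bi-normal separation: F^-1(tpr) - F^-1(fpr) for one feature.

    tp/fn count positive-class samples with/without the feature;
    fp/tn count negative-class samples with/without the feature.
    """
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    # Clip so the inverse normal CDF stays finite (assumed convention).
    tpr = min(max(tpr, eps), 1 - eps)
    fpr = min(max(fpr, eps), 1 - eps)
    inv = NormalDist().inv_cdf
    return inv(tpr) - inv(fpr)


# A taxon present in 90 of 100 vineyard soils and 10 of 100 other soils.
informative = bns(tp=90, fn=10, fp=10, tn=90)
# A taxon equally common in both groups carries no signal.
uninformative = bns(tp=50, fn=50, fp=50, tn=50)
print(informative, uninformative)
```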
- Wrapper methods are usually the most computationally intensive and perhaps the least elegant of the feature selection methods.
- a wrapper method like a filter method, will treat the classifier as a black box, but instead of using a simple univariate or multivariate test to determine which features are important, a wrapper will use the classifier itself to evaluate subsets of features.
- wrappers would be superior to filters because they would be able to find the optimal combination of features and classifier parameters.
- the search will not be tractable for high-dimensional data sets; hence, the wrapper will use heuristics during the search to find the optimal feature subset.
- wrappers instead of filters, namely that the wrapper can interact with the underlying classifier, is shared by embedded methods, and the additional computational cost incurred by wrappers therefore makes such methods unattractive.
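One heuristic a wrapper might use is greedy forward selection, sketched below: features are added one at a time, and each candidate subset is scored by running the classifier itself. The choice of heuristic, the leave-one-out 1-nearest-neighbor scorer, and the toy data are assumptions for illustration only.

```python
def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier on `features`."""
    correct = 0
    for i, xi in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((X[j][f] - xi[f]) ** 2 for f in features),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)


def forward_select(X, y, n_features, score=loo_accuracy):
    """Greedily add the feature that most improves the classifier's score."""
    selected, best_score = [], 0.0
    remaining = set(range(n_features))
    while remaining:
        cand, cand_score = max(
            ((f, score(X, y, selected + [f])) for f in sorted(remaining)),
            key=lambda fs: fs[1],
        )
        if cand_score <= best_score:
            break  # no candidate improves on the current subset
        selected.append(cand)
        best_score = cand_score
        remaining.remove(cand)
    return selected, best_score


# Toy data: column 0 separates the classes; columns 1 and 2 are noise.
X = [
    [0.50, 0.20, 0.30],
    [0.55, 0.25, 0.20],
    [0.60, 0.15, 0.25],
    [0.10, 0.22, 0.28],
    [0.15, 0.18, 0.22],
    [0.05, 0.24, 0.31],
]
y = ["vineyard"] * 3 + ["non-vineyard"] * 3

chosen, accuracy = forward_select(X, y, n_features=3)
print(chosen, accuracy)
```

Each round retrains and re-evaluates the classifier once per remaining feature, which illustrates why wrappers are the most computationally intensive of the three families.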
- Embedded approaches to feature selection will perform an integrated search over the joint space of model parameters and feature subsets so that feature selection becomes an integral part of the learning process.
- Embedded feature selection will have the advantage over filters that it has the opportunity to search for the globally optimal parameter-feature combination. This is because feature selection will be performed with knowledge of the parameter selection process, whereas filter and wrapper methods treat the classifier as a “black box.”
- performing the search over the whole joint parameter-feature space is generally intractable, but embedded methods will use knowledge of the classifier structure to inform the search process, while in the other methods the classifier must be built from scratch for every feature set.
- the method described herein will be useful in a plethora of industrial settings.
- the scope of the information obtained can vary, based on the type of goal to be obtained.
- the method can be applied on a macro scale, for example, sampling and analysis from all vineyards throughout the world.
- the method can also be applied on a regional scale, for example, sampling and analysis of vineyards in a region of the United States.
- the method can be applied on a local scale, for example, sampling and analysis in a vineyard in Virginia.
- the method can be applied on a run-based scale, for example, sampling and analysis of different harvests in one winery.
- Vintners rely heavily on the soil for the growth of their vineyards. With microbiome analysis of particular soil that yielded a successful harvest generally or that was especially resistant to climatic variation, a vintner will use this information to predict a number of things. First, the vintner will use the microbiome information from a successful harvest of the previous season and compare with the soil on his vineyard currently to see if the soil is likely to yield a successful harvest this season. Second, if the soil microbiome is much different, he will use that information to plant a different grape variety that will flourish in the soil. This data will be obtained from previous years' soil analysis.
- the soil microbiome of the prospective vineyard will be tested to see which grape varieties have growth potential in that particular soil. If the vintner desires to plant a specific grape variety, the analysis of the soil may steer him away from the new land if the microbiome of the soil is more likely to yield a successful season of a different variety. Fourth, a particular high-end variety in which the vintner is interested in cultivating may only grow in certain soil conditions. An analysis of the soil (including the microbiome) where the particular crop has thrived compared to the vintner's current soil will inform the vintner of the feasibility of the new crop. Precision oenology is one of the advantages of the embodiments of this invention.
- the information related to the fermentation species identified in the soil is used to provide advice to vintners and winemakers to improve the organoleptic properties of the wine.
- the soil being the repository of most of the fermentation species, the value of the soil/harvest could fluctuate depending on a Micro-Wine-Makers index identifying the percentage of fermentation species relevant for the specific winemaking process. The index would provide information on the optimal microbiome community needed in the soil to launch the fermentation process.
- a certain plant may naturally grow in sandy soil or sand of high salinity, or under extreme temperatures, or with little water, or it may be resistant to certain pests or disease present in the environment, and it may be desirable for a commercial crop to be grown in such conditions, particularly if they are, for example, the only conditions available in a particular geographic location.
- the microorganisms may be collected from commercial crops grown in such environments, or more specifically from individual crop plants best displaying a trait of interest amongst a crop grown in any specific environment: for example the fastest-growing plants amongst a crop grown in saline-limiting soils, or the least damaged plants in crops exposed to severe insect damage or disease epidemic, or plants having desired quantities of certain metabolites and other compounds, including fibre content, oil content, and the like, or plants displaying desirable colours, taste or smell.
- the microorganisms may be collected from a plant of interest or any material occurring in the environment of interest, including fungi and other animal and plant biota, soil, water, sediments, and other elements of the environment as referred to previously.
- a microorganism or a combination of microorganisms of use in the methods of the invention may be selected from a pre-existing collection of individual microbial species or strains based on some knowledge of their likely or predicted benefit to a plant.
- the microorganism may be predicted to: improve nitrogen fixation; release phosphate from the soil organic matter; release phosphate from the inorganic forms of phosphate (e.g.
- “fix carbon” in the root microsphere live in the rhizosphere of the plant thereby assisting the plant in absorbing nutrients from the surrounding soil and then providing these more readily to the plant; increase the number of nodules on the plant roots and thereby increase the number of symbiotic nitrogen fixing bacteria (e.g.
- Rhizobium species per plant and the amount of nitrogen fixed by the plant; elicit plant defensive responses such as ISR (induced systemic resistance) or SAR (systemic acquired resistance) which help the plant resist the invasion and spread of pathogenic microorganisms; compete with microorganisms deleterious to plant growth or health by antagonism, or competitive utilization of resources such as nutrients or space; change the color of one or more part of the plant, or change the chemical profile of the plant, its smell, taste or one or more other quality.
- individual isolates should be taken to mean a composition or culture comprising a predominance of a single genus, species or strain of microorganism, following separation from one or more other microorganisms. The phrase should not be taken to indicate the extent to which the microorganism has been isolated or purified. However, “individual isolates” preferably comprise substantially only one genus, species, or strain of microorganism.
- microorganisms can be isolated from a plant or plant material, surface or growth media associated with a selected plant using any appropriate techniques known in the art, including but not limited to those techniques described herein. For example, a whole plant could be obtained and optionally processed, such as mulched or crushed. Alternatively, individual tissues or parts of selected plants (such as leaves, stems, roots, and seeds) may be processed.
- a unit is defined as a parcel of land with the same grape variety, type of soil, culture techniques, and climate characteristics. If the vineyard is on the side of a hill, it should be divided into different independent units and different sampling kits used.
- Liquid samples: We are developing tests with different preservative buffers to identify the ideal additive to inactivate microbial activity in a sample.
- the ideal buffer should be in the form of a powder instead of a liquid: easier to preserve and easier to deliver.
- Each sample should be identified with a unique ID so that it can be tracked individually throughout the workflow.
- Sample ID has been conceived as a combination of six alphanumeric fields. The first three digits identify the client and the last three digits identify the sample number. With this unique code it is possible to create almost 50,000 sample IDs per client. If we run out of sample IDs, a new client ID could be assigned if necessary for the same client.
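The sample ID scheme can be sketched as below. The text does not state the character set; an alphabet of the digits 0-9 plus the uppercase letters A-Z is assumed here because three such characters give 36³ = 46,656 combinations, which is consistent with "almost 50,000" sample IDs per client. The function names are hypothetical.

```python
import string

# Assumed alphabet: 10 digits + 26 uppercase letters = 36 symbols.
ALPHABET = string.digits + string.ascii_uppercase
CAPACITY = len(ALPHABET) ** 3  # sample numbers available per client code


def encode(n, width=3):
    """Encode a non-negative integer as a fixed-width base-36 string."""
    if not 0 <= n < len(ALPHABET) ** width:
        raise ValueError("sample number out of range for this client code")
    chars = []
    for _ in range(width):
        n, r = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(reversed(chars))


def sample_id(client_code, sample_number):
    """Build a six-character ID: three for the client, three for the sample."""
    if len(client_code) != 3 or any(c not in ALPHABET for c in client_code):
        raise ValueError("client code must be three characters from ALPHABET")
    return client_code + encode(sample_number)


print(CAPACITY)
print(sample_id("A1B", 37))
```

When `encode` raises for a client whose range is exhausted, assigning that client an additional client code, as described above, restores another full range of sample numbers.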
- the first step is to extract the DNA by breaking the molecular union of cells, releasing the DNA and concentrating it.
- RNA PowerSoil® Total RNA Isolation Kit MO BIO Laboratories, Inc. Carlsbad, CA
- Wash step Dilute the pellet using 1.5 ml of PBS and transfer to a 1.5 ml eppendorf.
- step 3-4 twice. Note: In this step you have to be aware of the pellet quantity, so if you get little pellet, avoid repeating the wash step and proceed to step 6. If you are processing must, avoid the wash step.
- SEQ ID NO. 1 SA501 AATGATACGGCGACCACCGAGATCTACACAT CGTACGGAATAGTTGGGAGTGYCAGCMGCCGCGGTAA 15f.
- SEQ ID NO. 2 SA502 AATGATACGGCGACCACCGAGATCTACACAC TATCTGGAATAGTTGGGAGTGYCAGCMGCCGCGGTAA 15f.
- SEQ ID NO. 3 SA503 AATGATACGGCGACCACCGAGATCTACACTA GCGAGTGAATAGTTGGGAGTGYCAGCMGCCGCGGTAA 15f.
- thermocycler lid: Five minutes after the start of the first cycle, open the thermocycler lid and, without removing the plate, add 10 µl of Five Prime Hot Master Mix per well.
- thermocycler Conditions for 96 well thermocyclers
- Shotgun Metagenomic Library Prep Workflow for a Bottled Wine Sample
- The 16S and ITS protocols are dual-index PCR protocols; with only 20 different primers it is possible to sequence 96 samples.
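The arithmetic behind the dual-index claim can be checked with a short sketch: if the 20 primers are split into 8 forward and 12 reverse index primers, every well of a 96-well plate receives a unique forward/reverse pair. The 8 × 12 split and the plate layout are assumptions; the reverse primer names are hypothetical placeholders, and only SA501 to SA503 appear in the sequence listing above.

```python
from itertools import product

# Eight forward index primers (one per plate row); names beyond SA503 assumed.
forward = [f"SA50{i}" for i in range(1, 9)]
# Twelve reverse index primers (one per plate column); placeholder names.
reverse = [f"R{i:02d}" for i in range(1, 13)]

rows = "ABCDEFGH"
# Map each well (A1..H12) to its unique forward/reverse primer pair.
plate = {
    f"{rows[r]}{c + 1}": (forward[r], reverse[c])
    for r, c in product(range(8), range(12))
}

print(len(forward) + len(reverse), "primers ->", len(plate), "indexed samples")
```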
- the method is adapted from the publication "Development of a Dual-Index Sequencing Strategy and Curation Pipeline for Analyzing Amplicon Sequence Data on the MiSeq Illumina Sequencing Platform" (Kozich, J. J. et al., 2013, Appl. Environ. Microbiol. 79, 5112-5120) by designing and using different primer sequences.
- master mix plates can be stabilized at room temperature using ADN AmpligelMaster Mix plates (Biotools).
- 16S rDNA is a powerful phylogenetic marker commonly used for profiling diversity in microbial samples, yet its use is associated with known problems including biases introduced by copy-number variations, variability in amplification efficiency, inconsistencies when targeting different regions of the gene, and problems with accurately and consistently delineating prokaryotic species.
- 16S rDNA in combination with another single-copy marker gene. This results in prokaryotic species boundaries at higher resolution than 16S rDNA.
- the improved protocol is based on the following publication: Spencer et al., 2015, ISME J.
- Binding Buffer Mix by pipetting, sealing, vortexing, and spinning briefly.
- the pipeline is programmed to run on a custom made cloud-based computing platform such as Amazon Machine Image (AMI) on Amazon Web Services, Microsoft Azure Cloud Computing, or Compute Engine on Google Cloud Platform.
- the instance is able to connect directly to BaseSpace via Illumina's Basemount program.
- the pipeline can pick OTUs using two different algorithms: QIIME open reference, and minimum entropy decomposition (MED)
- GIS Geographical Information Systems
- GIS layers such as, for example, wine regions, geography, climate, weather, soil composition, and other similar GIS data layers. Some of the layers have been developed by us and others are open data.
- a Geo-map identifying the different wine regions and the microbiome profile, highlighting the presence of the Micro-Wine-Makers is in preparation. This map will also match different grape varieties and microbiome profile worldwide.
- This technology helps to identify and quantify all the fermentation species from bacteria and fungi kingdoms for different samples.
- Appendix C lists some species discovered in the different samples and their influence in wine.
- Data mining and big data techniques are used to query our databases and obtain information that helps to better understand the relevance of the microbiome profile in products such as wine.
- An interesting example of the outcomes of this process is the matching between the composition of the microbiome community in the wine and the organoleptic characteristics (flavours/taste) of the wine.
- The selected yeasts provide specific and desirable phenotypes with known fermentation characteristics and represent 80% of the world's commercial yeast.
- the objective of this work is to connect the known phenotype with the genotype of these strains to provide tools to:
- bio-based control tools designed to avoid possible problems in certain phases of the vinification process can be applied. For example, depending on our analysis of the soil microbiome, we can state whether that soil has organic properties and has been cultivated in an environmentally sustainable manner.
- the “Genetic Friendly Label” is our first labelling product and it is used for soil quality assessment at a certain moment.
Abstract
Monitoring, analysis and control of fermentation activities includes methods and corresponding systems directed toward agriculture, biofuels, and food production. Complex methods and corresponding systems are provided for classifying a microorganism; profiling a microbiome; sequencing multiple libraries in a single sequencing run; determining a microbiome profile in a sample; and analyzing a material from a location associated with a fermentation process. Additional implementations are directed to methods and corresponding systems for obtaining, deriving, predicting and evaluating microbiome information; control, analysis and direction of fermentation operations; and evaluating, analyzing and displaying microbiome related information in two and three dimensional plots. Yet additional methods and corresponding systems permit identification and analysis of microorganisms capable of imparting beneficial properties to phases of fermentation processes.
Description
- This application is a continuation of U.S. application Ser. No. 15/779,531 filed on 28 May 2018, which is a 371 National Stage Entry of PCT Application PCT/US16/64984 filed 5 Dec. 2016, which claims the benefit of U.S. Provisional Application No. 62/263,488 filed on 4 Dec. 2015, each of which is incorporated in its entirety herein by this reference.
- The embodiments described herein relate to novel and unique methods, systems and processes for identifying, analyzing, monitoring, and controlling fermentation activities. Fermentation activities entail a wide range of endeavors directed toward agriculture, manufacturing, and chemical processing.
- The herein described process includes systems and methods for determining and characterizing the microbiome of a fermentation operation or setting, obtaining microbiome information, converting such information such that it is useful for controlling, enhancing, monitoring, detecting deviations, and predicting performance of the fermentation process.
- Fermentation is a process in which an agent causes transformation of a raw material into a finished product. During fermentation organic matter is decomposed in the absence or presence of air (oxygen) producing an accumulation of resulting fermentation product. Some of these products (for example, alcohol and lactic acid) are of importance to humans, and fermentation has therefore been used for their manufacture on an industrial scale.
- Microorganisms like yeast, molds, and bacteria play an important role in the alcohol fermentation process for creating beer and wine, and the formation of acetic acid (vinegar). Lactic fermentation is driven by lactic-acid bacteria which break down monosaccharides into lactic acid. Lactic fermentation is used in the preparation of various sour milk products, yogurt, cheese, and bread. Many mold fungi (for example, Aspergillus niger) ferment sugar, resulting in the formation of citric acid. A large proportion of the citric acid used in the food processing industry is obtained by microbiological means. Ethanol fuel is produced from the fermentation by yeast of common crops such as sugar cane, potato, cassava and corn to produce ethanol which is further processed to become fuel. The production of butyl alcohol and acetone industrially is important for the paint and lacquer industries. In the process of sewage treatment, sewage is digested by enzymes secreted by bacteria, to produce liquid and solid fertilizers, and biogas. Fungi have been employed to break down cellulosic wastes to increase protein content and improve in vitro digestibility. A wide variety of agroindustrial waste products can be fermented to use as food for animals, especially ruminants.
- The processes described herein are useful for enhancing any fermentation process. The advantages of the herein described processes are shown for vinification, the process whereby fermentation changes grape juice into wine. However, it is understood that these methods can be applied for enhancement of other fermentation processes.
- Winemaking or vinification, is the production of wine by fermentation of raw material, and for grape wine, that starts with the grapes. Factors affecting grape quality, known as the grape's terroir, include the variety of grapes, the weather during growing season, soil, time of harvest, and methods of pruning.
- After harvesting the grapes, the fruit is crushed to produce juice, called must. The primary fermentation can be done with natural yeast normally already present on the grapes, visible as a powdery substance, or cultured yeast is added to the must. The sugar content of the grapes is monitored during fermentation and can be adjusted (by addition of sugar) since it affects both the taste and end product, as well as the speed of the fermentation.
- During or after the primary fermentation, a secondary, or malolactic fermentation can be initiated by inoculation of desired bacteria which convert malic acid into lactic acid. This fermentation step can improve the taste of wine. During this secondary fermentation and aging process, fermentation continues very slowly in either stainless steel vessels or oak barrels.
- Prior to bottling, the wine is usually filtered. Filtration results in clarification and microbial stabilization. In clarification, large particles that affect the visual appearance of the wine are removed. In microbial stabilization, the amounts of yeast and bacteria are adjusted to reduce the likelihood of refermentation or spoilage.
- As is evident from the winemaking steps described above, byproducts of fermentation by the microbial population or microbiome panel present in the soil, on the fruit, or during the winemaking process, contribute to the taste and quality of the wine.
- Therefore, understanding the microbiome, and how it changes along each stage of vinification or wine production, would be advantageous and necessary for influencing the quality of the wine at every level. Using the herein described novel and unique sequencing methods, it is now possible to generate a unique identity for the wine, a genetic footprint, based on its microbiome. Such a footprint would allow winemakers to differentiate wines according to the microbiome panel, and to use bio-based controls to detect and solve problems such as Brettanomyces contamination, refermentation, mousiness, ropiness, mannitol, geranium taint, and diacetyl level, to name a few. These problems can be solved by bioremediation and/or by changing the physical parameters of the vinification process, e.g. temperature, pH, and enzymes, thereby influencing the microbiome community.
- The present invention addresses the longstanding and unfulfilled need for better monitoring, analysis and control of fermentation activities, including, among others, those directed toward agriculture, biofuels, and food production.
- The terms microbiome, microbiome information, microbiome data, microbiome population, microbiome panel and similar terms are used in the broadest possible sense, unless expressly stated otherwise, and would include: a census of currently present microorganisms, both living and nonliving, which may have been present months, years, millennia or longer; a census of components of the microbiome other than bacteria and archaea, e.g. viruses and microbial eukaryotes; population studies and characterizations of microorganisms, genetic material, and biologic material; a census of any detectable biological material; and information that is derived or ascertained from genetic material, biomolecular makeup, fragments of genetic material, DNA, RNA, protein, carbohydrate, metabolite profile, fragment of biological materials and combinations and variations of these.
- As used herein, the terms real-time microbiome data or information includes microbiome information that is collected or obtained at a particular setting during the fermentation process, for example soil, plant/fruit samples taken during a planting or harvesting, must, sampling of wine during alcoholic fermentation (beginning, middle and end, or depending on parameters such as alcoholic graduation, amount of sugar, density), sampling during malolactic fermentation (beginning, middle and end, or depending on amount of malic and acetic acid), barrel (beginning, middle and end, or months) and bottling.
- As used herein, the terms derived microbiome information and derived microbiome data are to be given their broadest possible meaning, unless specified otherwise, and includes any real-time, microbiome information that has been computationally linked or used to create a relationship such as for example evaluating the microbiome of milk before, during, and after fermentation, or evaluating the microbiome between planting and harvesting of grapes. Thus, derived microbiome information provides information about the fermentation process setting or activity that may not be readily ascertained from nonderived information.
- As used herein, the terms predictive microbiome information and predictive microbiome data are to be given their broadest possible meaning, unless specified otherwise, and includes information that is based upon combinations and computational links or processing of historic, predictive, real-time, and derived microbiome information, data, and combinations, variations and derivatives of these, which information predicts, forecasts, directs, or anticipates a future occurrence, event, state, or condition in the industrial setting, or allows interpretation of a current or past occurrence. Thus, by way of example, predictive microbiome information would include: a determination and comparison of real-time microbiome information and the derived microbiome information of quality of wine, i.e. abundance of a specific microorganism in a sample and possible positive or negative effect on the fermentation process; a comparison of real-time microbiome information collected during the fermentation of cheese and the quality of cheese.
- Real time, derived, and predicted data can be collected and stored, and thus, become historic data for ongoing or future decision-making for a process, setting, or application.
- In one embodiment of the invention is provided a method of classifying a microorganism, comprising: obtaining a nucleic acid sequence of a 16S ribosomal subunit, an ITS, internal transcribed spacer, and optionally, a single copy marker gene, of a first microbe; and comparing said nucleic acid sequence of a first microbe to a reference; and identifying the first microbe at the strain level or sub-strain level based on the comparing.
- In another embodiment is provided a novel method of profiling a microbiome in a sample, comprising: obtaining nucleic acid sequences of a 16S ribosomal subunit, an ITS, and a marker gene, from at least one microorganism in a sample; analyzing said at least one microorganism within said sample based upon the nucleic acid sequences obtained; and determining a profile of the microbiome based on said analyzing. Using 16S rDNA in combination with another single-copy marker gene provides prokaryotic species boundaries at higher resolution and allows identification of microbial diversity at the strain level. The novelty of this method lies in the fact that, instead of combining the measurement of the 16S region with a functional gene as is currently taught and used in the art, we combine the 16S region with single-copy marker genes (described in Sunagawa et al., 2013, Nature Methods 10, 1196-1199). That methodology required sequencing all the DNA in a sample in order to get a high phylogenetic resolution level. The method described herein reduces the amount of sequencing data needed to identify species at high phylogenetic resolution because the 16S amplicons and the single-copy marker genes produce an alignment rate below 7% and a false discovery rate below 10%.
- In another embodiment is provided a novel method for sequencing two libraries in one sequencing run, by pooling the prepared 16S and ITS libraries, and providing appropriate primers for sequencing both 16S and ITS in a sequencing method.
- In some embodiments, determining a profile of the microbiome in said sample can be based on 50 or fewer microbes, 55 or fewer microbes, 60 or fewer microbes, 65 or fewer microbes, 70 or fewer microbes, 75 or fewer microbes, 80 or fewer microbes, 85 or fewer microbes, 90 or fewer microbes, 100 or fewer microbes, 200 or fewer microbes, 300 or fewer microbes, 400 or fewer microbes, 500 or fewer microbes, 600 or fewer microbes, 700 or fewer microbes, or 800 or fewer microbes. In some embodiments, determining a profile of the microbiome in said sample has an accuracy greater than 70% based on the measurements. In some embodiments, analyzing uses long read sequencing platforms.
- In yet another embodiment is provided a process including: analyzing a material from a location associated with a fermentation process; obtaining microbiome information, selected from real time microbiome information, derived microbiome information and predictive microbiome information; and performing an evaluation on the microbiome information, the evaluation including: a relationship based processing including a related genetic material component and a fermentation setting component; and a bioinformatics stage; whereby the evaluation provides information to direct the fermentation process.
- In a further embodiment is provided operations and methods having one or more of the following features: wherein the real time microbiome information is selected from material selected from the group consisting of soil samples, soil sample taken during a planting, soil sample taken during growth, soil sample taken during harvesting, fermentation sample taken at the beginning of a fermentation process, in the middle of a fermentation process, at the end of a fermentation process, any time during a fermentation process; wherein the bioinformatics stage has one or more of the following: submitting the raw DNA sequencing data to bioinformatics pipeline for performing microbiome analysis, including demultiplexing and quality filtering, OTU picking, taxonomic assignment, phylogenetic reconstruction, compiling metadata, diversity analysis, and visualization.
- Still in another embodiment is provided a method of controlling a fermentation operation including: analyzing a material from a location associated with a fermentation operation to provide a first microbiome information; associating the first microbiome information with a condition of the operation; obtaining a second microbiome information; associating the second microbiome information with the first microbiome information; and, evaluating the first microbiome information, the associated condition, and the second microbiome information, the evaluation including a bioinformatics pipeline for performing microbiome analysis including demultiplexing and quality filtering, OTU picking, taxonomic assignment, phylogenetic reconstruction, compiling metadata, diversity analysis, and visualization; whereby the evaluation identifies a characteristic of the operation; and, directing the fermentation operation based in part on the identified characteristic of the operation; whereby the fermentation operation is based upon the evaluation of microbiome information.
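- By way of illustration only, the diversity analysis stage of such a bioinformatics pipeline might be sketched as follows. The sketch assumes that OTU picking has already assigned one OTU label to each read; the function names, and the choice of the Shannon index and Bray-Curtis dissimilarity as diversity measures, are assumptions made for this example and are not prescribed by the method.

```python
import math
from collections import Counter


def relative_abundance(otu_assignments):
    """Convert a list of per-read OTU labels into relative abundances."""
    counts = Counter(otu_assignments)
    total = sum(counts.values())
    return {otu: n / total for otu, n in counts.items()}


def shannon_index(abundances):
    """Within-sample diversity: H = -sum(p * ln p) over OTUs with p > 0."""
    return -sum(p * math.log(p) for p in abundances.values() if p > 0)


def bray_curtis(profile_a, profile_b):
    """Between-sample dissimilarity of two relative-abundance profiles.

    0.0 means identical profiles; 1.0 means no shared OTUs.
    """
    otus = set(profile_a) | set(profile_b)
    shared = sum(min(profile_a.get(o, 0.0), profile_b.get(o, 0.0)) for o in otus)
    total = sum(profile_a.values()) + sum(profile_b.values())
    return 1.0 - 2.0 * shared / total


# Hypothetical reads from a first and a second sampling of one fermentation.
first = relative_abundance(["OTU_1", "OTU_1", "OTU_2", "OTU_2"])
second = relative_abundance(["OTU_1", "OTU_1", "OTU_1", "OTU_3"])
print(shannon_index(first), shannon_index(second), bray_curtis(first, second))
```

In this sketch, comparing the first and second microbiome information reduces to one within-sample value per sample and one between-sample value; a real pipeline would compute these after quality filtering and taxonomic assignment.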
- Yet still in another embodiment is provided a method for directing a fermentation operation including: analyzing a sample from a location associated with a fermentation operation; obtaining microbiome information; and, performing an evaluation on the microbiome information, whereby the evaluation provides information to direct the fermentation operation.
- In another embodiment is provided operations and methods having one or more of the following features: wherein, the microbiome information has real time microbiome information; wherein, the microbiome information has derived microbiome information; wherein, the microbiome information has predictive microbiome information; wherein the analysis has selection and sequencing of the material; wherein the analysis has extracting genetic material from the material; wherein the analysis has preparation of libraries; wherein the analysis has extracting material including genetic material selected from the group consisting of a rRNA gene 16S, Internal transcribed spacer (ITS); wherein the analysis has providing a phylogenetic tree; wherein the analysis has a correction step; wherein the analysis has an extraction procedure selected from the group consisting of beating, sonicating, freezing and thawing, and chemical disruption; wherein the analysis has amplification of at least a portion of the material; wherein the analysis has providing a genetic barcode to a sample of the material; wherein the microbiome information defines a phylogenetic tree; wherein the microbiome information has a UM; wherein the microbiome information defines an UM; wherein the microbiome information defines a biogeographical pattern; wherein the microbiome information has information obtained from the 16S rRNA and another marker gene; wherein the another marker gene is metal-dependent proteases with possible chaperone activity; wherein the evaluation has forming an n-dimensional plot, where n is selected from the group of integers consisting of 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, and 14; wherein the evaluation has measuring a change in gene sequences; wherein the evaluation has measuring a change in gene sequences and using the measured change as a molecular clock in the evaluation to determine the related nature of material; and wherein the material is selected from the group consisting of soil, 
agricultural material, material from dairy processing, a material from a fermentation operation.
- There is further provided systems, operations and methods having one or more of the following features: wherein at least a portion of the information resulting from the evaluation is displayed in a two dimensional plot; wherein at least a portion of the information resulting from the evaluation is displayed in a three dimensional plot; wherein at least a portion of the information resulting from the evaluation is displayed in a plot including colors associated with microbiome information; wherein at least a portion of the information resulting from the evaluation is displayed in a plot including colors associated with a type of information selected from the group consisting of microbiome information and non-genetic information; each type of information including a different color; wherein at least a portion of the information resulting from the evaluation is displayed in a plot including colors associated with a type of information selected from the group consisting of microbiome information and non-genetic information; each type of information including a different color; and the nongenetic information selected from the group consisting of temperature, geographical location, climate; wherein at least a portion of the information resulting from the evaluation is transmitted to a memory storage device; wherein at least a portion of the information resulting from the evaluation is communicated to a controller; wherein at least a portion of the information resulting from the evaluation is displayed in a two dimensional plot; and, wherein at least a portion of the information resulting from the evaluation is displayed in a three dimensional plot. In some embodiments, the system can further comprise a user interface configured to communicate or display a report to a user.
- In one aspect, the methods of the invention allow the identification of microorganisms capable of imparting one or more beneficial property to one or more phases of a fermentation process. The variability in the microbial populations present in the sample can be used to support a directed process of selection of one or more microorganisms for use in a phase of a fermentation process and for identifying particular combinations and abundances of microorganisms which are of benefit for a particular purpose, and which may never have been recognized using conventional techniques.
- The methods of the invention may be used as a part of a plant breeding program. The methods may allow for, or at least assist with, the selection of plants which have a particular genotype/phenotype which is influenced by the microbial flora, in addition to identifying microorganisms and/or compositions that are capable of imparting one or more property to one or more plants.
- In one aspect the invention relates to a method for the selection of one or more microorganism(s) which are capable of imparting one or more beneficial property to a plant to be used as raw material in a fermentation process. In other words, the process will allow for enrichment of suitable microorganisms within the plant microbiome. Such microorganism(s) may be contained within a plant, on a plant, and/or within the plant's growing soil or water. It should be appreciated that as referred to herein a “beneficial property to a plant” should be interpreted broadly to mean any property which is beneficial for any particular purpose including properties which may be beneficial to human beings, other animals, the environment, a habitat, an ecosystem, the economy, of commercial benefit, or of any other benefit to any entity or system. Accordingly, the term should be taken to include properties which may suppress, decrease or block one or more characteristic of a plant, including suppressing, decreasing or inhibiting the growth or growth rate of a plant. The invention may be described herein, by way of example only, in terms of identifying positive benefits to one or more plants or improving plants. However, it should be appreciated that the invention is equally applicable to identifying negative benefits that can be conferred to plants.
- Such beneficial properties include, but are not limited to, for example: improved growth, health and/or survival characteristics, suitability or quality of the plant for a particular purpose, structure, color, chemical composition or profile, taste, smell, improved quality. In other embodiments, beneficial properties include, but are not limited to, for example: decreasing, suppressing or inhibiting the growth of a plant; constraining the height and width of a plant to a desirable size; regulating production of and/or response to plant pheromones (resulting in increased tannin production in the surrounding plant community and decreased appeal to foraging species).
- As used herein, “improved” should be taken broadly to encompass improvement of a characteristic of a plant or a fermentation process which may already exist in a plant or process prior to application of the invention, or the presence of a characteristic which did not exist in a plant or process prior to application of the invention. By way of example, “improved” growth should be taken to include growth of a plant where the plant was not previously known to grow under the relevant conditions.
- As used herein, “inhibiting and suppressing” and like terms should be taken broadly and should not be construed to require complete inhibition or suppression, although this may be desired in some embodiments.
- The terms “microbes” and “microorganisms” as used herein should be taken broadly. They refer to any single-celled organisms, bacteria, archaea, protozoa, and unicellular fungi and protists. By way of example, the microorganisms may include Proteobacteria (such as Pseudomonas, Enterobacter, Stenotrophomonas, Burkholderia, Rhizobium, Herbaspirillum, Pantoea, Serratia, Rahnella, Azospirillum, Azorhizobium, Azotobacter, Duganella, Delftia, Bradyrhizobium, Sinorhizobium and Halomonas), Firmicutes (such as Bacillus, Paenibacillus, Lactobacillus, Mycoplasma, and Acetobacterium), Actinobacteria (such as Streptomyces, Rhodococcus, Microbacterium, and Curtobacterium), and the fungi Ascomycota (such as Trichoderma, Ampelomyces, Coniothyrium, Paecilomyces, Penicillium, Cladosporium, Hypocrea, Beauveria, Metarhizium, Verticillium, Cordyceps, Pichia, and Candida), Basidiomycota (such as Coprinus, Corticium, and Agaricus) and Oomycota (such as Pythium, Mucor, and Mortierella).
- In yet another embodiment, the present disclosure provides a method for detecting contamination in a fermentation sample, comprising determining the microbiome from a fermentation sample, wherein the method comprises detecting at least one marker of a microorganism and preferably two markers of a microorganism; and a computer system for determining a microbiome profile in a sample, the computer system comprising: a memory unit for receiving data comprising measurement of a microbiome panel from a sample; computer-executable instructions for analyzing the measurement data according to a method described herein; and computer-executable instructions for determining potential microbial contamination in the sample or fermentation process based upon said analyzing. In some embodiments, the computer system further comprises computer-executable instructions to generate a report of the presence or absence of the at least one contamination microorganism in the sample. In some embodiments, the computer system can further comprise a user interface configured to communicate or display said report to a user.
- The present disclosure provides a computer readable medium comprising: computer-executable instructions for analyzing data comprising measurement of a microbiome profile from a fermentation sample obtained from a fermentation process or environment, wherein the microbiome profile comprises at least one marker and preferably two markers selected from at least one microbe; and computer-executable instructions for determining a presence or absence of a contamination in the fermentation process based upon the analyzing.
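- By way of illustration only, computer-executable instructions for determining the presence or absence of a contamination might be sketched as follows. The organism names, marker names, and the `min_markers` parameter are hypothetical and introduced for this example; the default of two reflects the stated preference for two markers per microorganism.

```python
def detect_contaminants(detected_markers, marker_panel, min_markers=2):
    """Report, per organism, whether enough of its panel markers were detected.

    detected_markers: iterable of marker identifiers found in the sample.
    marker_panel: dict mapping organism name -> list of its marker identifiers.
    min_markers: number of distinct markers required to call the organism present.
    """
    detected = set(detected_markers)
    report = {}
    for organism, markers in marker_panel.items():
        hits = detected & set(markers)
        report[organism] = len(hits) >= min_markers
    return report


# Hypothetical panel: each organism is represented by two markers.
panel = {
    "organism_A": ["A_ITS", "A_marker_gene"],
    "organism_B": ["B_16S", "B_marker_gene"],
}
sample_markers = ["A_ITS", "A_marker_gene", "B_16S"]
print(detect_contaminants(sample_markers, panel))
```

Setting `min_markers=1` corresponds to the embodiment in which a single marker suffices; requiring two markers trades sensitivity for fewer calls based on one spurious detection.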
- Examples of machine learning algorithms that can be used include, but are not limited to: elastic networks, random forests, support vector machines, and logistic regression. The algorithms provided herein can aid in selection of important microbes and transform the underlying measurements into a score or probability relating to, for example, grape quality, wine quality, presence or absence of contamination, treatment response, and/or classification of organic soil status.
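- By way of illustration only, the transformation of underlying measurements into a score or probability might be sketched with a logistic regression scoring function as follows. The taxa, weights, and intercept shown are hypothetical placeholders; in practice they would be learned from labeled samples, and any of the other listed algorithms could be used instead.

```python
import math


def logistic_score(abundances, weights, intercept=0.0):
    """Map relative abundances to a probability with a logistic model.

    abundances: dict mapping taxon -> relative abundance in the sample.
    weights: dict mapping taxon -> model coefficient (taxa without a weight are ignored).
    """
    z = intercept + sum(weights.get(taxon, 0.0) * x for taxon, x in abundances.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients: a positive weight raises the contamination probability.
weights = {"taxon_spoilage": 8.0, "taxon_starter": -3.0}
sample = {"taxon_spoilage": 0.30, "taxon_starter": 0.60, "taxon_other": 0.10}
print(logistic_score(sample, weights, intercept=-1.0))
```

The magnitude of each coefficient indicates how strongly the corresponding microbe influences the score, which is one way such a model can aid in selecting important microbes.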
- The present disclosure provides a kit, comprising: one or more compositions for use in measuring a microbiome profile in a fermentation sample obtained from a fermentation process or environment thereof, wherein the microbiome profile comprises at least one marker and preferably two markers to at least one microbe; and instructions for performing any of the preceding methods. In some embodiments, a kit can further comprise a computer readable medium.
- Kit reagents may in one embodiment comprise at least one contiguous oligonucleotide that hybridizes to a fragment of the genome of a microorganism. In another embodiment, the kit comprises at least one pair of oligonucleotides that hybridizes to opposite strands of a genomic segment of a microorganism, wherein each oligonucleotide primer pair is designed to selectively amplify a fragment of the 16S, ITS, and/or marker gene of the organism present in the sample. In one embodiment, the oligonucleotide is completely complementary to the genome of the individual. In another embodiment, the kit further contains buffer and enzyme for amplifying said segment. In another embodiment, the reagents further comprise a label for detecting said fragment.
- Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating specific embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
- The foregoing and other features and advantages of the invention will be apparent from the following, more particular description of various exemplary embodiments including a preferred embodiment of the invention, as illustrated in the accompanying drawings.
- FIG. 1 is a 3-dimensional illustration providing comparative representations of microbiome profiles of bacteria for differing soil samples.
- FIG. 2 is a 3-dimensional illustration providing comparative representations of microbiome profiles of yeast species for differing soil samples.
- FIG. 3 is a bar chart illustration of the visual comparative representations of microbiome profiles of bacteria found in different soil samples.
- FIG. 4 is a bar chart illustration of the visual comparative representations of microbiome profiles of yeast species found in different soil samples.
- In the description that follows, a number of terms are used extensively. In order to provide a clear and consistent understanding of the specification and claims, including the scope to be given such terms, the following definitions are provided.
- The use of the word “a” or “an” when used in conjunction with the term “comprising” in the claims and/or the specification may mean “one,” but it is also consistent with the meaning of “one or more,” “at least one,” and “one or more than one.”
- Throughout this application, the term “about” is used to indicate that a value includes the standard deviation of error for the device or method being employed to determine the value.
- The use of the term “or” in the claims is used to mean “and/or” unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and “and/or.”
- As used in this specification and claim(s), the words “comprising” (and any form of comprising, such as “comprise” and “comprises”), “having” (and any form of having, such as “have” and “has”), “including” (and any form of including, such as “includes” and “include”) or “containing” (and any form of containing, such as “contains” and “contain”) are inclusive or open-ended and do not exclude additional, unrecited elements or method steps.
- It also is specifically understood that any numerical value recited herein includes all values from the lower value to the upper value, i.e., all possible combinations of numerical values between the lowest value and the highest value enumerated are to be considered to be expressly stated in this application. For example, if a range is stated as 1% to 50%, it is intended that values such as 2% to 40%, 10% to 30%, or 1% to 3%, etc., are expressly enumerated in this specification.
- “Contacting” refers to the process of bringing into contact at least two distinct species such that they can react. It should be appreciated, however, the resulting reaction product can be produced directly from a reaction between the added reagents or from an intermediate from one or more of the added reagent which can be produced in the reaction mixture.
- “Nucleic acid,” “oligonucleotide,” and “polynucleotide” refer to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. The term nucleic acid is used interchangeably with gene, cDNA, and mRNA encoded by a gene.
- The term “microbiome”, as used herein, refers to the ecological community of commensal, symbiotic, or pathogenic microorganisms in a sample.
- The term “genome” as used herein, refers to the entirety of an organism's hereditary information that is encoded in its primary DNA sequence. The genome includes both the genes and the non-coding sequences. For example, the genome may represent a microbial genome or a mammalian genome.
- Reference to “DNA region” should be understood as a reference to a specific section of genomic DNA. These DNA regions are specified either by reference to a gene name or a set of chromosomal coordinates. Both the gene names and the chromosomal coordinates would be well known to, and understood by, the person of skill in the art. In general, a gene can be routinely identified by reference to its name, via which both its sequences and chromosomal location can be routinely obtained, or by reference to its chromosomal coordinates, via which both the gene name and its sequence can also be routinely obtained.
- Reference to each of the genes/DNA regions detailed above should be understood as a reference to all forms of these molecules and to fragments or variants thereof. As would be appreciated by the person of skill in the art, some genes are known to exhibit allelic variation or single nucleotide polymorphisms. SNPs encompass insertions and deletions of varying size and simple sequence repeats, such as dinucleotide and trinucleotide repeats. Variants include nucleic acid sequences from the same region sharing at least 90%, 95%, 98%, 99% sequence identity i.e. having one or more deletions, additions, substitutions, inverted sequences etc. relative to the DNA regions described herein. Accordingly, the present invention should be understood to extend to such variants which, in terms of the present applications, achieve the same outcome despite the fact that minor genetic variations between the actual nucleic acid sequences may exist between different bacterial strains. The present invention should therefore be understood to extend to all forms of DNA which arise from any other mutation, polymorphic or allelic variation.
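- By way of illustration only, the sequence identity criterion for variants might be evaluated as follows. The sketch assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and counts a gap as a mismatch; the function names and the treatment of gaps are assumptions made for this example.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two pre-aligned sequences.

    Both sequences must have the same (non-zero) aligned length.
    A gap character '-' never counts as a match.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(
        1 for x, y in zip(seq_a.upper(), seq_b.upper()) if x == y and x != "-"
    )
    return 100.0 * matches / len(seq_a)


def is_variant(seq, reference, threshold=90.0):
    """True when seq shares at least `threshold` percent identity with reference."""
    return percent_identity(seq, reference) >= threshold


# One substitution in ten aligned positions.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```

The `threshold` argument can be set to 90, 95, 98, or 99 to match the identity levels recited above.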
- The term “sequencing” as used herein refers to sequencing methods for determining the order of the nucleotide bases (adenine, guanine, cytosine, and thymine) in a nucleic acid molecule (e.g., a DNA or RNA nucleic acid molecule).
- The term “barcode” as used herein, refers to any unique, non-naturally occurring, nucleic acid sequence that may be used to identify the originating genome of a nucleic acid fragment.
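- By way of illustration only, the use of barcodes to assign pooled reads back to their originating samples (demultiplexing) might be sketched as follows. The sketch assumes that each read begins with its barcode, that all barcodes have the same length, and that only exact matches are accepted; the barcode sequences and sample names are hypothetical.

```python
def demultiplex(reads, barcode_to_sample):
    """Assign reads to samples by exact match of the leading barcode.

    reads: iterable of read sequences, each expected to start with a barcode.
    barcode_to_sample: non-empty dict mapping barcode -> sample name
        (all barcodes are assumed to have the same length).
    Returns (by_sample, unassigned); the barcode is stripped from assigned reads.
    """
    barcode_len = len(next(iter(barcode_to_sample)))
    by_sample = {sample: [] for sample in barcode_to_sample.values()}
    unassigned = []
    for read in reads:
        sample = barcode_to_sample.get(read[:barcode_len])
        if sample is None:
            unassigned.append(read)
        else:
            by_sample[sample].append(read[barcode_len:])
    return by_sample, unassigned


# Hypothetical barcodes for two samples pooled in one sequencing run.
barcodes = {"ACGT": "soil_1", "TGCA": "must_1"}
reads = ["ACGTGGGG", "TGCACCCC", "NNNNAAAA"]
print(demultiplex(reads, barcodes))
```

Reads whose leading bases match no known barcode are returned separately rather than discarded, so that they can be inspected during quality filtering.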
- The term “biochip” or “array” can refer to a solid substrate having a generally planar surface to which an adsorbent is attached. A surface of the biochip can comprise a plurality of addressable locations, each of which may have the adsorbent bound there. Biochips can be adapted to engage a probe interface, and therefore, function as probes. Protein biochips are adapted for the capture of polypeptides and can comprise surfaces having chromatographic or biospecific adsorbents attached thereto at addressable locations. Microarray chips are generally used for DNA and RNA gene expression detection. Microbiome profiling can further comprise use of a biochip.
- Biochips can be used to screen a large number of macromolecules. Biochips can be designed with immobilized nucleic acid molecules, full-length proteins, antibodies, affibodies (small molecules engineered to mimic monoclonal antibodies), aptamers (nucleic acid-based ligands) or chemical compounds. A chip could be designed to detect multiple macromolecule types on one chip. For example, a chip could be designed to detect nucleic acid molecules, proteins and metabolites on one chip. The biochip can be designed and used to simultaneously analyze a panel of microbes in a single sample.
- A “computer-readable medium”, is an information storage medium that can be accessed by a computer using a commercially available or custom-made interface. Exemplary computer-readable media include memory (e.g., RAM, ROM, flash memory, etc.), optical storage media (e.g., CD-ROM), magnetic storage media (e.g., computer hard drives, floppy disks, etc.), punch cards, or other commercially available media. Information may be transferred between a system of interest and a medium, between computers, or between computers and the computer-readable medium for storage or access of stored information. Such transmission can be electrical, or by other available methods, such as IR links, wireless connections, etc.
- Any microbiome profile described herein can include one or more, but are not limited to the following microbes: Abiotrophia, Abiotrophia defectiva, Abiotrophia, Acetanaerobacterium, Acetanaerobacterium elongatum, Acetanaerobacterium, Acetivibrio, Acetivibrio bacterium, Acetivibrio, Acetobacterium, Acetobacterium, Acetobacterium woodii, Acholeplasma, Acholeplasma, Acidaminococcus, Acidaminococcus fermentans, Acidaminococcus, Acidianus, Acidianus brierleyi, Acidianus, Acidovorax, Acidovorax, Acinetobacter, Acinetobacter guillouiae, Acinetobacter junii, Acinetobacter, Actinobacillus, Actinobacillus MI933/96/1, Actinomyces, Actinomyces ICM34, Actinomyces ICM41, Actinomyces ICM54, Actinomyces lingnae, Actinomyces odontolyticus, Actinomyces oral, Actinomyces ph3, Actinomyces, Adlercreutzia, Adlercreutzia equolifaciens, Adlercreutzia intestinal, Adlercreutzia, Aerococcus, Aeromonas, Aeromonas 165C, Aeromonas hydrophila, Aeromonas RCSo, Aeromonas, Aeropyrum, Aeropyrum pernix, Aeropyrum, Aggregatibacter, Aggregatibacter, Agreia, Agreia bicolorata, Agreia, Agromonas, Agromonas CS30, Akkermansia, Akkermansia muciniphila, Akkermansia, Alistipes, Alistipes ANH, Alistipes AP11, Alistipes bacterium, Alistipes CCUG, Alistipes DJF_B185, Alistipes DSM, Alistipes EBA6-25c12, Alistipes finegoldii, Alistipes indistinctus, Alistipes JC136, Alistipes NML05A004, Alistipes onderdonkii, Alistipes putredinis, Alistipes RMA,
- Alistipes senegalensis, Alistipes shahii, Alistipes Smarlab, Alistipes, Alkalibaculum, Alkalibaculum, Alkaliflexus, Alkaliflexus, Allisonella, Allisonella histaminiformans, Allisonella, Alloscardovia, Alloscardovia omnicolens, Anaerofilum, Anaerofilum, Anaerofustis, Anaerofustis stercorihominis, Anaerofustis, Anaeroplasma, Anaeroplasma, Anaerostipes, Anaerostipes 08964, Anaerostipes ly-2, Anaerostipes 494a, Anaerostipes 5.sub.-1.sub.-63FAA, Anaerostipes AIP, Anaerostipes bacterium, Anaerostipes butyraticus, Anaerostipes caccae, Anaerostipes hadrum, Anaerostipes IE4, Anaerostipes indolis, Anaerostipes, Anaerotruncus, Anaerotruncus colihominis, Anaerotruncus NML, Anaerotruncus, Aquincola, Aquincola, Arcobacter, Arcobacter, Arthrobacter, Arthrobacter FV1-1, Asaccharobacter, Asaccharobacter celatus, Asaccharobacter, Asteroleplasma, Asteroleplasma, Atopobacter, Atopobacter phocae, Atopobium, Atopobium parvulum, Atopobium rimae, Atopobium, Bacteriovorax, Bacteriovorax, Bacteroides, Bacteroides 31SF18, Bacteroides 326-8, Bacteroides 35AE31, Bacteroides 35AE37, Bacteroides 35BE34, Bacteroides 4072, Bacteroides 7853, Bacteroides acidifaciens, Bacteroides APl, Bacteroides AR20, Bacteroides AR29, Bacteroides B2, Bacteroides bacterium, Bacteroides barnesiae, Bacteroides BLBE-6, Bacteroides BV-1, Bacteroides caccae, Bacteroides Canne1Catfish9, Bacteroides cellulosilyticus, Bacteroides chinchillae, Bacteroides CIP103040, Bacteroides clarus, Bacteroides coprocola, Bacteroides coprophilus, Bacteroides D8, Bacteroides DJF_B097, Bacteroides dnLKV2, Bacteroides dnLKV7, Bacteroides dnLKV9, Bacteroides dorei, Bacteroides EBAS-17, Bacteroides eggerthii, Bacteroides enrichment, Bacteroides F-4, Bacteroides faecichinchillae, Bacteroides faecis, Bacteroides fecal, Bacteroides finegoldii,
- Bacteroides fragilis, Bacteroides gallinarum, Bacteroides helcogenes, Bacteroides ic1292, Bacteroides intestinalis, Bacteroides massiliensis, Bacteroides mpnisolate, Bacteroides NB-8, Bacteroides new, Bacteroides nlaezlc13, Bacteroides nlaezlc158, Bacteroides nlaezlc159, Bacteroides nlaezlc161, Bacteroides nlaezlc163, Bacteroides nlaezlc167, Bacteroides nlaezlc172, Bacteroides nlaezlc18, Bacteroides nlaezlc182, Bacteroides nlaezlc190, Bacteroides nlaezlc198, Bacteroides nlaezlc204, Bacteroides nlaezlc205, Bacteroides nlaezlc206, Bacteroides nlaezlc207, Bacteroides nlaezlc211, Bacteroides nlaezlc218, Bacteroides nlaezlc257, Bacteroides nlaezlc260, Bacteroides nlaezlc261, Bacteroides nlaezlc263, Bacteroides nlaezlc308, Bacteroides nlaezlc315, Bacteroides nlaezlc322, Bacteroides nlaezlc324, Bacteroides nlaezlc331, Bacteroides nlaezlc339, Bacteroides nlaezlc36, Bacteroides nlaezlc367, Bacteroides nlaezlc375, Bacteroides nlaezlc376, Bacteroides nlaezlc380, Bacteroides nlaezlc391, Bacteroides nlaezlc459, Bacteroides nlaezlc484, Bacteroides nlaezlc501, Bacteroides nlaezlc504, Bacteroides nlaezlc515, Bacteroides nlaezlc519, Bacteroides nlaezlc532, Bacteroides nlaezlc557, Bacteroides nlaezlc57, Bacteroides nlaezlc574, Bacteroides nlaezlc592, Bacteroides nlaezlgloS, Bacteroides nlaezlg117, Bacteroides nlaezlg127, Bacteroides nlaezlg136, Bacteroides nlaezlg143, Bacteroides nlaezlg157, Bacteroides nlaezlg167, Bacteroides nlaezlg171, Bacteroides nlaezlg187, Bacteroides nlaezlg194, Bacteroides nlaezlg195, Bacteroides nlaezlg199, Bacteroides nlaezlg209, Bacteroides nlaezlg212, Bacteroides nlaezlg213, Bacteroides nlaezlg218, Bacteroides nlaezlg221, Bacteroides nlaezlg228, Bacteroides nlaezlg234, Bacteroides nlaezlg237, Bacteroides nlaezlg24, Bacteroides nlaezlg245, Bacteroides nlaezlg257, Bacteroides nlaezlg27, Bacteroides nlaezlg285, Bacteroides nlaezlg288, Bacteroides nlaezlg295, Bacteroides nlaezlg296, Bacteroides nlaezlg303, Bacteroides nlaezlg310, Bacteroides nlaezlg312, 
Bacteroides nlaezlg327, Bacteroides nlaezlg329, Bacteroides nlaezlg336, Bacteroides nlaezlg338, Bacteroides nlaezlg347, Bacteroides nlaezlg356, Bacteroides nlaezlg373, Bacteroides nlaezlg376, Bacteroides nlaezlg380, Bacteroides nlaezlg382, Bacteroides nlaezlg385, Bacteroides nlaezlg4, Bacteroides nlaezlg422, Bacteroides nlaezlg437, Bacteroides nlaezlg454, Bacteroides nlaezlg455, Bacteroides nlaezlg456, Bacteroides nlaezlg458, Bacteroides nlaezlg459, Bacteroides nlaezlg46, Bacteroides nlaezlg461, Bacteroides nlaezlg475, Bacteroides nlaezlg481, Bacteroides nlaezlg484, Bacteroides nlaezlgS, Bacteroides nlaezlg502, Bacteroides nlaezlg515, Bacteroides nlaezlg518, Bacteroides nlaezlg521, Bacteroides nlaezlg54, Bacteroides nlaezlg6, Bacteroides nlaezlg8, Bacteroides nlaezlg80, Bacteroides nlaezlg98, Bacteroides nlaezlh120, Bacteroides nlaezlhlS, Bacteroides nlaezlh162, Bacteroides nlaezlh17, Bacteroides nlaezlh174, Bacteroides nlaezlh18, Bacteroides nlaezlh188, Bacteroides nlaezlh192, Bacteroides nlaezlh194, Bacteroides nlaezlh195, Bacteroides nlaezlh207, Bacteroides nlaezlh22, Bacteroides nlaezlh250, Bacteroides nlaezlh251, Bacteroides nlaezlh28, Bacteroides nlaezlh313, Bacteroides nlaezlh319, Bacteroides nlaezlh321, Bacteroides nlaezlh328, Bacteroides nlaezlh334, Bacteroides nlaezlh390, Bacteroides nlaezlh391, Bacteroides nlaezlh414, Bacteroides nlaezlh416, Bacteroides nlaezlh419, Bacteroides nlaezlh429, Bacteroides nlaezlh439, Bacteroides nlaezlh444, Bacteroides nlaezlh45, Bacteroides nlaezlh46, Bacteroides nlaezlh462, Bacteroides nlaezlh463, Bacteroides nlaezlh465, Bacteroides nlaezlh468, Bacteroides nlaezlh471, Bacteroides nlaezlh472, Bacteroides nlaezlh474, Bacteroides nlaezlh479, Bacteroides nlaezlh482, Bacteroides nlaezlh49, Bacteroides nlaezlh493, Bacteroides nlaezlh496, Bacteroides nlaezlh497, Bacteroides nlaezlh499, Bacteroides nlaezlhSo, Bacteroides nlaezlh531, Bacteroides nlaezlh535, Bacteroides nlaezlh8, Bacteroides nlaezlp104, Bacteroides nlaezlploS, 
Bacteroides nlaezlp108, Bacteroides nlaezlp132, Bacteroides nlaezlp133, Bacteroides nlaezlp151, Bacteroides nlaezlp157, Bacteroides nlaezlp166, Bacteroides nlaezlp167, Bacteroides nlaezlp171, Bacteroides nlaezlp178, Bacteroides nlaezlp187, Bacteroides nlaezlp191, Bacteroides nlaezlp196, Bacteroides nlaezlp208, Bacteroides nlaezlp213, Bacteroides nlaezlp228, Bacteroides nlaezlp233, Bacteroides nlaezlp267, Bacteroides nlaezlp278, Bacteroides nlaezlp282, Bacteroides nlaezlp286, Bacteroides nlaezlp295, Bacteroides nlaezlp299, Bacteroides nlaezlp301, Bacteroides nlaezlp302, Bacteroides nlaezlp304, Bacteroides nlaezlp317, Bacteroides nlaezlp319, Bacteroides nlaezlp32, Bacteroides nlaezlp332, Bacteroides nlaezlp349, Bacteroides nlaezlp35, Bacteroides nlaezlp356, Bacteroides nlaezlp370, Bacteroides nlaezlp371, Bacteroides nlaezlp376, Bacteroides nlaezlp395, Bacteroides nlaezlp402, Bacteroides nlaezlp403, Bacteroides nlaezlp409, Bacteroides nlaezlp412, Bacteroides nlaezlp436, Bacteroides nlaezlp438, Bacteroides nlaezlp440, Bacteroides nlaezlp447, Bacteroides nlaezlp448, Bacteroides nlaezlp451, Bacteroides nlaezlp476, Bacteroides nlaezlp478, Bacteroides nlaezlp483, Bacteroides nlaezlp489, Bacteroides nlaezlp493, Bacteroides nlaezlp557, Bacteroides nlaezlp559, Bacteroides nlaezlp564, Bacteroides nlaezlp565, Bacteroides nlaezlp572, Bacteroides nlaezlp573, Bacteroides nlaezlp576, Bacteroides nlaezlp591, Bacteroides nlaezlp592, Bacteroides nlaezlp631, Bacteroides nlaezlp633, Bacteroides nlaezlp696, Bacteroides nlaezlp7, Bacteroides nlaezlp720, Bacteroides nlaezlp730, Bacteroides nlaezlp736, Bacteroides nlaezlp737, Bacteroides nlaezlp754, Bacteroides nlaezlp759, Bacteroides nlaezlp774, Bacteroides nlaezlp828, Bacteroides nlaezlp854, Bacteroides nlaezlp860, Bacteroides nlaezlp886, Bacteroides nlaezlp887, Bacteroides nlaezlp900, Bacteroides nlaezlp909, Bacteroides nlaezlp913, Bacteroides nlaezlp916, Bacteroides nlaezlp920, Bacteroides nlaezlp96, Bacteroides nordii, Bacteroides 
oleiciplenus, Bacteroides ovatus, Bacteroides paurosaccharolyticus, Bacteroides plebeius, Bacteroides R6, Bacteroides rodentium, Bacteroides S-17, Bacteroides S-18, Bacteroides salyersiae, Bacteroides SLCl-38, Bacteroides Smarlab, Bacteroides 'Smarlab, Bacteroides stercorirosoris, Bacteroides stercoris, Bacteroides str, Bacteroides thetaiotaomicron, Bacteroides TP-5, Bacteroides, Bacteroides uniformis, Bacteroides vulgatus, Bacteroides WAl, Bacteroides WH2, Bacteroides WH302, Bacteroides WH305, Bacteroides XBi2B, Bacteroides XB44A, Bacteroides X077B42, Bacteroides xylanisolvens, Barnesiella, Barnesiella intestinihominis, Barnesiella NSB1, Barnesiella, Barnesiella viscericola, Bavariicoccus, Bavariicoccus, Bdellovibrio, Bdellovibrio oral, Bergeriella, Bergeriella, Bifidobacterium, Bifidobacterium 103, Bifidobacterium 108, Bifidobacterium 113, Bifidobacterium 120, Bifidobacterium 138, Bifidobacterium 33, Bifidobacterium AcbbtoS, Bifidobacterium adolescentis, Bifidobacterium Amsbbt12, Bifidobacterium angulatum, Bifidobacterium animalis, Bifidobacterium bacterium, Bifidobacterium bifidum, Bifidobacterium Bisn6, Bifidobacterium Bma6, Bifidobacterium breve, Bifidobacterium catenulatum, Bifidobacterium choerinum, Bifidobacterium coryneforme, Bifidobacterium dentium, Bifidobacterium DJF_WC44, Bifidobacterium F-10, Bifidobacterium F-11, Bifidobacterium group, Bifidobacterium h12, Bifidobacterium HMLNl, Bifidobacterium HMLN12, Bifidobacterium HMLNS, Bifidobacterium iarfr2341d, Bifidobacterium iarfr642d48, Bifidobacterium ic1332, Bifidobacterium indicum, Bifidobacterium kashiwanohense, Bifidobacterium LISLUCIII-2, Bifidobacterium longum, Bifidobacterium M45, Bifidobacterium merycicum, Bifidobacterium minimum, Bifidobacterium MSXSB, Bifidobacterium oral, Bifidobacterium PG12A, Bifidobacterium PL1, Bifidobacterium pseudocatenulatum, Bifidobacterium pseudolongum, Bifidobacterium pullorum, Bifidobacterium ruminantium, Bifidobacterium S-10, Bifidobacterium saeculare, 
Bifidobacterium saguini, Bifidobacterium scardovii, Bifidobacterium simiae, Bifidobacterium SLPYG-1, Bifidobacterium stellenboschense, Bifidobacterium stercoris, Bifidobacterium TM-7, Bifidobacterium Trmg, Bifidobacterium, Bilophila, Bilophila nlaezlh528, Bilophila, Bilophila wadsworthia, Blautia, Blautia bacterium, Blautia CE2, Blautia CE6, Blautia coccoides, Blautia DJF_VR52, Blautia DJF_VR67, Blautia DJF_VR70kl, Blautia formate, Blautia glucerasea, Blautia hansenii, Blautia ic1272, Blautia IES, Blautia K-1, Blautia luti, Blautia M-1, Blautia mpnisolate, Blautia nlaezlc25, Blautia nlaezlc259, Blautia nlaezlc51, Blautia nlaezlc520, Blautia nlaezlc542, Blautia nlaezlc544, Blautia nlaezlh27, Blautia nlaezlh316, Blautia nlaezlh317, Blautia obeum, Blautia producta, Blautia productus, Blautia schinkii, Blautia sers, Blautia Ser8, Blautia, Blautia WAL, Blautia wexlerae, Blautia YHC-4, Brenneria, Brenneria, Brevibacterium, Brevibacterium, Brochothrix, Brochothrix thermosphacta, Buttiauxella, Buttiauxella 57916, Buttiauxella gaviniae, Butyricicoccus, Butyricicoccus bacterium, Butyricicoccus, Butyricimonas, Butyricimonas 180-3, Butyricimonas 214-4, Butyricimonas bacterium, Butyricimonas GD2, Butyricimonas synergistica, Butyricimonas, Butyricimonas virosa, Butyrivibrio, Butyrivibrio fibrisolvens, Butyrivibrio hungatei, Butyrivibrio, Caldimicrobium, Caldimicrobium, Caldisericum, Caldisericum, Campylobacter, Campylobacter coli, Campylobacter hominis, Campylobacter, Capnocytophaga, Capnocytophaga, Carnobacterium, Carnobacterium alterfunditum, Carnobacterium, Caryophanon, Caryophanon, Catenibacterium, Catenibacterium mitsuokai, Catenibacterium, Catonella, Catonella, Caulobacter, Caulobacter, Cellulophaga, Cellulophaga, Cellulosilyticum, Cellulosilyticum, Cetobacterium, Cetobacterium, Chelatococcus, Chelatococcus, Chlorobium, Chlorobium, Chryseobacterium, Chryseobacterium AlooS, Chryseobacterium KJ9C8, Chryseobacterium, Citrobacter, Citrobacter 1, Citrobacter agglomerans, 
Citrobacter amalonaticus, Citrobacter ascorbata, Citrobacter bacterium, Citrobacter BinzhouCLT, Citrobacter braakii, Citrobacter enrichment, Citrobacter F24, Citrobacter F96, Citrobacter farmeri, Citrobacter freundii, Citrobacter gillenii, Citrobacter HBKC_SRl, Citrobacter HD4.9, Citrobacter hormaechei, Citrobacter 191-3, Citrobacter ka55, Citrobacter lapagei, Citrobacter LAR-1, Citrobacter ludwigii, Citrobacter MEBS, Citrobacter MS36, Citrobacter murliniae, Citrobacter nlaezlc269, Citrobacter P014, Citrobacter PO42bN, Citrobacter PO46a, Citrobacter P073, Citrobacter SR3, Citrobacter Tl, Citrobacter tnt4, Citrobacter tntS, Citrobacter trout, Citrobacter TSA-1, Citrobacter, Citrobacter werkmanii, Cloacibacillus, Cloacibacillus adv66, Cloacibacillus nlaezlp702, Cloacibacillus NML05A017, Cloacibacillus, Cloacibacterium, Cloacibacterium, Collinsella, Collinsella A-1, Collinsella aerofaciens, Collinsella AUH-Julong21, Collinsella bacterium, Collinsella CCUG, Collinsella, Comamonas, Comamonas straminea, Comamonas testosteroni, Conexibacter, Conexibacter, Coprobacillus, Coprobacillus bacterium, Coprobacillus cateniformis, Coprobacillus TM-40, Coprobacillus, Coprococcus, Coprococcus 14505, Coprococcus bacterium, Coprococcus catus, Coprococcus comes, Coprococcus eutactus, Coprococcus nexile, Coprococcus, Coraliomargarita, Coraliomargarita fucoidanolyticus, Coraliomargarita marisflavi, Coraliomargarita, Corynebacterium, Corynebacterium amycolatum, Corynebacterium durum, Coxiella, Coxiella, Cronobacter, Cronobacter dublinensis, Cronobacter sakazakii, Cronobacter turicensis, Cryptobacterium, Cryptobacterium curtum, Cupriavidus, Cupriavidus eutropha, Dechloromonas, Dechloromonas HZ, Desulfobacterium, Desulfobacterium, Desulfobulbus, Desulfobulbus, Desulfopila, Desulfopila La4.1, Desulfovibrio, Desulfovibrio D4, Desulfovibrio desulfuricans, Desulfovibrio DSM12803, Desulfovibrio enrichment, Desulfovibrio fairfieldensis, Desulfovibrio LNBl, Desulfovibrio piger, Desulfovibrio, 
Dialister, Dialister E2_20, Dialister GBA27, Dialister invisus, Dialister oral, Dialister succinatiphilus, Dialister, Dorea, Dorea auhjulong64, Dorea bacterium, Dorea formicigenerans, Dorea longicatena, Dorea mpnisolate, Dorea, Dysgonomonas, Dysgonomonas gadei, Dysgonomonas, Edwardsiella, Edwardsiella tarda, Eggerthella, Eggerthella El, Eggerthella lenta, Eggerthella MLGO43, Eggerthella MVAl, Eggerthella S6-Cl, Eggerthella SDG-2, Eggerthella sinensis, Eggerthella str, Eggerthella, Enhydrobacter, Enhydrobacter, Enterobacter, Enterobacter 1050, Enterobacter 1122, Enterobacter 77000, Enterobacter 82353, Enterobacter 9C, Enterobacter ASC, Enterobacter adecarboxylata, Enterobacter aerogenes, Enterobacter agglomerans, Enterobacter AJAR-A2, Enterobacter amnigenus, Enterobacter asburiae, Enterobacter B1(2012), Enterobacter B363, Enterobacter B509, Enterobacter bacterium, Enterobacter Badong3, Enterobacter BEC441, Enterobacter C8, Enterobacter cancerogenus, Enterobacter cloacae, Enterobacter CO, Enterobacter core2, Enterobacter cowanii, Enterobacter dc6, Enterobacter DRSBII, Enterobacter enrichment, Enterobacter FL13-2-1, Enterobacter GIST-NKstlo, Enterobacter GIST-NKst9, Enterobacter GJl-11, Enterobacter gx-148, Enterobacter hormaechei, Enterobacter I-Bh20-21, Enterobacter ICB 113, Enterobacter kobei, Enterobacter KW 14, Enterobacter 112, Enterobacter ludwigii, Enterobacter Mlo_lB, Enterobacter M1R3, Enterobacter marine, Enterobacter NCCP-167, Enterobacter of, Enterobacter oryzae, Enterobacter oxytoca, Enterobacter Plol, Enterobacter S11, Enterobacter SEL2, Enterobacter SPh, Enterobacter SSASPS, Enterobacter terrigena, Enterobacter TNT3, Enterobacter TP2MC, Enterobacter TS4, Enterobacter TSSAS2-48, Enterobacter, Enterobacter ZYXCAl, Enterococcus, Enterococcus 020824/02-A, Enterococcus 1275b, Enterococcus 16C, Enterococcus 48, Enterococcus 6114, Enterococcus ABRIINW-H61, Enterococcus asini, Enterococcus avium, Enterococcus azikeevi, Enterococcus bacterium, 
Enterococcus BBDP57, Enterococcus BPH34, Enterococcus Bt, Enterococcus canis, Enterococcus casseliflavus, Enterococcus CmNA2, Enterococcus Da-20, Enterococcus devriesei, Enterococcus dispar, Enterococcus DJF_O30, Enterococcus DMB4, Enterococcus durans, Enterococcus enrichment, Enterococcus Fth, Enterococcus faecalis, Enterococcus faecium, Enterococcus fcc9, Enterococcus fecal, Enterococcus flavescens, Enterococcus fluvialis, Enterococcus FR-3, Enterococcus FUA3374, Enterococcus gallinarum, Enterococcus GHAPRB1, Enterococcus GSC-2, Enterococcus GYPBol, Enterococcus hermanniensis, Enterococcus hirae, Enterococcus lactis, Enterococcus malodoratus, Enterococcus manure, Enterococcus marine, Enterococcus MNC1, Enterococcus moraviensis, Enterococcus MS2, Enterococcus mundtii,
- Enterococcus NAB 15, Enterococcus NBRC, Enterococcus nlaezlc434, Enterococcus nlaezlg106, Enterococcus nlaezlg87, Enterococcus nlaezlh339, Enterococcus nlaezlh375, Enterococcus nlaezlh381, Enterococcus nlaezlh383, Enterococcus nlaezlh405, Enterococcus nlaezlp116, Enterococcus nlaezlp148, Enterococcus nlaezlp401, Enterococcus nlaezlp650, Enterococcus pseudoavium, Enterococcus R-25205, Enterococcus raffinosus, Enterococcus rottae, Enterococcus RU07, Enterococcus saccharolyticus, Enterococcus saccharominimus, Enterococcus sanguinicola, Enterococcus SCA16, Enterococcus SCA2, Enterococcus SEI38, Enterococcus SF-1, Enterococcus sulfureus, Enterococcus SV6, Enterococcus tela, Enterococcus te32a, Enterococcus te42a, Enterococcus te45r, Enterococcus te49a, Enterococcus teSla, Enterococcus te58r, Enterococcus te59r, Enterococcus te61r, Enterococcus te93r, Enterococcus te95a, Enterococcus, Enterorhabdus, Enterorhabdus caecimuris, Enterorhabdus, Erwinia, Erwinia agglomerans, Erwinia enterica, Erwinia rhapontici, Erwinia tasmaniensis, Erwinia, Erysipelotrichaceae_incertae_sedis, Erysipelotrichaceae_incertae_sedis aff, Erysipelotrichaceae_incertae_sedis bacterium, Erysipelotrichaceae_incertae_sedis biforme, Erysipelotrichaceae_incertae_sedis C-1, Erysipelotrichaceae_incertae_sedis cylindroides, Erysipelotrichaceae_incertae_sedis GIC12, Erysipelotrichaceae_incertae_sedis innocuum, Erysipelotrichaceae_incertae_sedis nlaezlc332, Erysipelotrichaceae_incertae_sedis nlaezlc340, Erysipelotrichaceae_incertae_sedis nlaezlg420, Erysipelotrichaceae_incertae_sedis nlaezlg425, Erysipelotrichaceae_incertae_sedis nlaezlg440, Erysipelotrichaceae_incertae_sedis nlaezlg463, Erysipelotrichaceae_incertae_sedis nlaezlh440, Erysipelotrichaceae_incertae_sedis nlaezlh354, Erysipelotrichaceae_incertae_sedis nlaezlh379, Erysipelotrichaceae_incertae_sedis nlaezlh380, Erysipelotrichaceae_incertae_sedis nlaezlh385, Erysipelotrichaceae_incertae_sedis nlaezlh410, Erysipelotrichaceae_incertae_sedis 
tortuosum, Erysipelotrichaceae_incertae_sedis, Escherichia/Shigella, Escherichia/Shigella 29(2010), Escherichia/Shigella 4091, Escherichia/Shigella 4104, Escherichia/Shigella 8gw18, Escherichia/Shigella A94, Escherichia/Shigella albertii, Escherichia/Shigella B-1012, Escherichia/Shigella B4, Escherichia/Shigella bacterium, Escherichia/Shigella BBDP15, Escherichia/Shigella BBDP80, Escherichia/Shigella boydii, Escherichia/Shigella carotovorum, Escherichia/Shigella CERAR, Escherichia/Shigella coli, Escherichia/Shigella DBC-1, Escherichia/Shigella dc262011, Escherichia/Shigella dysenteriae, Escherichia/Shigella enrichment, Escherichia/Shigella escherichia, Escherichia/Shigella fecal, Escherichia/Shigella fergusonii, Escherichia/Shigella flexneri, Escherichia/Shigella GDRoS, Escherichia/Shigella GDRo7, Escherichia/Shigella H7, Escherichia/Shigella marine, Escherichia/Shigella ML2-46, Escherichia/Shigella mpnisolate, Escherichia/Shigella NA, Escherichia/Shigella nlaezlg330, Escherichia/Shigella nlaezlg400, Escherichia/Shigella nlaezlg441, Escherichia/Shigella nlaezlg506, Escherichia/Shigella nlaezlh204, Escherichia/Shigella nlaezlh208, Escherichia/Shigella nlaezlh209, Escherichia/Shigella nlaezlh213, Escherichia/Shigella nlaezlh214, Escherichia/Shigella nlaezlh4, Escherichia/Shigella nlaezlh435, Escherichia/Shigella nlaezlh8i, Escherichia/Shigella nlaezlp126, Escherichia/Shigella nlaezlp198, Escherichia/Shigella nlaezlp21, Escherichia/Shigella nlaezlp235, Escherichia/Shigella nlaezlp237, Escherichia/Shigella nlaezlp239, Escherichia/Shigella nlaezlp25, Escherichia/Shigella nlaezlp252, Escherichia/Shigella nlaezlp275, Escherichia/Shigella nlaezlp280, Escherichia/Shigella nlaezlp51, Escherichia/Shigella nlaezlp53, Escherichia/Shigella nlaezlp669, Escherichia/Shigella nlaezlp676, Escherichia/Shigella nlaezlp717, Escherichia/Shigella nlaezlp731, Escherichia/Shigella nlaezlp826, Escherichia/Shigella nlaezlp877, Escherichia/Shigella nlaezlp884, Escherichia/Shigella NMU-ST2, 
Escherichia/Shigella oc182011, Escherichia/Shigella of, Escherichia/Shigella proteobacterium, Escherichia/Shigella Ql, Escherichia/Shigella sakazakii, Escherichia/Shigella SF6, Escherichia/Shigella sm1719, Escherichia/Shigella SOD-7317, Escherichia/Shigella sonnei, Escherichia/Shigella SW86, Escherichia/Shigella, Escherichia/Shigella vulneris, Ethanoligenens, Ethanoligenens harbinense, Ethanoligenens, Eubacterium, Eubacterium ARC-2, Eubacterium callanderi, Eubacterium E-1, Eubacterium G3(2011), Eubacterium infirmum, Eubacterium limosum, Eubacterium methylotrophicum, Eubacterium nlaezlp439, Eubacterium nlaezlp457, Eubacterium nlaezlp458, Eubacterium nlaezlp469, Eubacterium nlaezlp474, Eubacterium oral, Eubacterium saphenum, Eubacterium sulci, Eubacterium, Eubacterium WAL, Euglenida, Euglenida longa, Faecalibacterium, Faecalibacterium bacterium, Faecalibacterium canine, Faecalibacterium DJF_VR20, Faecalibacterium ic1379, Faecalibacterium prausnitzii, Faecalibacterium, Filibacter, Filibacter globispora, Flavobacterium, Flavobacterium SSL03, Flavobacterium, Flavonifractor, Flavonifractor AUH-JLC235, Flavonifractor enrichment, Flavonifractor nlaezlc354, Flavonifractor orbiscindens, Flavonifractor plautii, Flavonifractor, Francisella, Francisella piscicida, Fusobacterium, Fusobacterium nucleatum, Fusobacterium, Gardnerella, Gardnerella, Gardnerella vaginalis, Gemmiger, Gemmiger DJF_VR33k2, Gemmiger formicilis, Gemmiger, Geobacter, Geobacter, Gordonibacter, Gordonibacter bacterium, Gordonibacter intestinal, Gordonibacter pamelaeae, Gordonibacter, Gp2, Gp2, Gp21, Gp21, Gp4, Gp4, Gp6, Gp6, Granulicatella, Granulicatella adiacens, Granulicatella enrichment, Granulicatella oral, Granulicatella paraadiacens, Granulicatella, Haemophilus, Haemophilus, Hafnia, Hafnia 3-12(2010), Hafnia alvei, Hafnia CC16, Hafnia proteus, Hafnia, Haliea, Haliea, Hallella, Hallella seregens, Hallella, Herbaspirillum, Herbaspirillum 022S4-ll, Herbaspirillum seropedicae, Hespellia, Hespellia porcina, 
Hespellia stercorisuis, Hespellia, Holdemania, Holdemania AP2, Holdemania filiformis, Holdemania, Howardella, Howardella, Howardella ureilytica, Hydrogenoanaerobacterium, Hydrogenoanaerobacterium saccharovorans, Hydrogenophaga, Hydrogenophaga bacterium, Ilumatobacter, Ilumatobacter, Janthinobacterium, Janthinobacterium C30An7, Janthinobacterium, Jeotgalicoccus, Jeotgalicoccus, Klebsiella, Klebsiella aerogenes, Klebsiella bacterium, Klebsiella E1L1, Klebsiella EB2-THQ, Klebsiella enrichment, Klebsiella F83, Klebsiella G1-6, Klebsiella gg160e, Klebsiella granulomatis, Klebsiella HaNA20, Klebsiella HF2, Klebsiella ii_3_chl_1, Klebsiella KALAICIBA17, Klebsiella kpu, Klebsiella M3, Klebsiella MB45, Klebsiella milletis, Klebsiella NCCP-138, Klebsiella okl_1_9_S16, Klebsiella okl_1_9_S54, Klebsiella planticola, Klebsiella pneumoniae, Klebsiella poinarii, Klebsiella PSB26, Klebsiella RS, Klebsiella Se14, Klebsiella SRC_DSD12, Klebsiella td153s, Klebsiella TG-1, Klebsiella TPSS, Klebsiella, Klebsiella variicola, Klebsiella WB-2, Klebsiella Y9, Klebsiella zlmy, Kluyvera, Kluyvera AnS-1, Kluyvera cryocrescens, Kluyvera, Kocuria, Kocuria 2216.35.31, Kurthia, Kurthia, Lachnobacterium, Lachnobacterium Cl2b, Lachnobacterium, Lachnospiracea_incertae_sedis, Lachnospiracea_incertae_sedis bacterium, Lachnospiracea_incertae_sedis contortum, Lachnospiracea_incertae_sedis Egg, Lachnospiracea_incertae_sedis eligens, Lachnospiracea_incertae_sedis ethanolgignens, Lachnospiracea_incertae_sedis galacturonicus, 
Lachnospiracea_incertae_sedis gnavus, Lachnospiracea_incertae_sedis hallii, Lachnospiracea_incertae_sedis hydrogenotrophica, Lachnospiracea_incertae_sedis IDS, Lachnospiracea_incertae_sedis intestinal, Lachnospiracea_incertae_sedis mpnisolate, Lachnospiracea_incertae_sedis pectinoschiza, Lachnospiracea_incertae_sedis ramulus, Lachnospiracea_incertae_sedis rectale, Lachnospiracea_incertae_sedis RLBl, Lachnospiracea_incertae_sedis rumen, Lachnospiracea_incertae_sedis SY8519, Lachnospiracea_incertae_sedis torques, Lachnospiracea_incertae_sedis, Lachnospiracea_incertae_sedis uniforme, Lachnospiracea_incertae_sedis ventriosum, Lachnospiracea_incertae_sedis 
xylanophilum, Lachnospiracea_incertae_sedis ye62, Lactobacillus, Lactobacillus 5-1-2, Lactobacillus 66c, Lactobacillus acidophilus, Lactobacillus arizonensis, Lactobacillus B5406, Lactobacillus brevis, Lactobacillus casei, Lactobacillus crispatus, Lactobacillus curvatus, Lactobacillus delbrueckii, Lactobacillus fermentum, Lactobacillus gasseri, Lactobacillus helveticus, Lactobacillus hominis, Lactobacillus ID9203, Lactobacillus IDSAc, Lactobacillus intestinal, Lactobacillus johnsonii, Lactobacillus lactis, Lactobacillus manihotivorans, Lactobacillus mucosae, Lactobacillus NA, Lactobacillus oris, Lactobacillus P23, Lactobacillus P8, Lactobacillus paracasei, Lactobacillus paraplantarum, Lactobacillus pentosus, Lactobacillus plantarum, Lactobacillus pontis, Lactobacillus rennanqilfylo, Lactobacillus rennanqilfy14, Lactobacillus rennanqilyf9, Lactobacillus reuteri, Lactobacillus rhamnosus, Lactobacillus salivarius, Lactobacillus sanfranciscensis, Lactobacillus suntoryeus, Lactobacillus T3R1C1, Lactobacillus, Lactobacillus vaginalis, Lactobacillus zeae, Lactococcus, Lactococcus 56, Lactococcus CR-317S, Lactococcus CW-1, Lactococcus D8, Lactococcus Da-18, Lactococcus DAP39, Lactococcus delbrueckii, Lactococcus F116, Lactococcus fujiensis, Lactococcus G22, Lactococcus garvieae, Lactococcus lactis, Lactococcus manure, Lactococcus RTS, Lactococcus SXVIII1(2011), Lactococcus TP2MJ, Lactococcus TP2ML, Lactococcus TP2MN, Lactococcus US-1, Lactococcus, Lactonifactor, Lactonifactor bacterium, Lactonifactor longoviformis, Lactonifactor nlaezlc533, Lactonifactor, Leclercia, Leclercia, Lentisphaera, Lentisphaera, Leuconostoc, Leuconostoc carnosum, Leuconostoc citreum, Leuconostoc garlicum, Leuconostoc gasicomitatum, Leuconostoc gelidum, Leuconostoc inhae, Leuconostoc lactis, Leuconostoc MEBE2, Leuconostoc mesenteroides, Leuconostoc pseudomesenteroides, Leuconostoc, Limnobacter, Limnobacter spf3, Luteolibacter, Luteolibacter bacterium, Lutispora, Lutispora, Marinifilum, Marinifilum, 
Marinobacter, Marinobacter arcticus, Mariprofundus, Mariprofundus, Marvinbryantia, Megamonas, Megamonas, Megasphaera, Megasphaera, Melissococcus, Melissococcus faecalis, Methanobacterium, Methanobacterium subterraneum, Methanobrevibacter, Methanobrevibacter arboriphilus, Methanobrevibacter millerae, Methanobrevibacter olleyae, Methanobrevibacter oralis, Methanobrevibacter SM9, Methanobrevibacter smithii, Methanobrevibacter, Methanosphaera, Methanosphaera stadtmanae, Methanosphaera, Methylobacterium, Methylobacterium adhaesivum, Methylobacterium bacterium, Methylobacterium iEII3, Methylobacterium MP3, Methylobacterium oryzae, Methylobacterium PB132, Methylobacterium PB20, Methylobacterium PB280, Methylobacterium PDD-23b-14, Methylobacterium radiotolerans, Methylobacterium SKJH-1, Methylobacterium, Mitsuokella, Mitsuokella jalaludinii, Mitsuokella, Morganella, Morganella morganii, Morganella, Moritella, Moritella 2D2, Moryella, Moryella indoligenes, Moryella naviforme, Moryella, Mycobacterium, Mycobacterium tuberculosis, Mycobacterium, Negativicoccus, Negativicoccus, Nitrosomonas, Nitrosomonas eutropha, Novosphingobium, Novosphingobium, Odoribacter, Odoribacter laneus, Odoribacter splanchnicus, Odoribacter, Olsenella, Olsenella 1832, Olsenella Fo206, Olsenella, Orbus, Orbus gilliamella, Oribacterium, Oribacterium, Oscillibacter, Oscillibacter bacterium, Oscillibacter enrichment, Oscillibacter, Owenweeksia, Owenweeksia, Oxalobacter, Oxalobacter formigenes, Oxalobacter, Paludibacter, Paludibacter, Pantoea, Pantoea agglomerans, Pantoea eucalypti, Pantoea, Papillibacter, Papillibacter cinnamivorans, Papillibacter, Parabacteroides, Parabacteroides ASF519, Parabacteroides CR-34, Parabacteroides distasonis, Parabacteroides DJF_B084, Parabacteroides DJF_B086, Parabacteroides dnLKV8, Parabacteroides enrichment, Parabacteroides fecal, Parabacteroides goldsteinii, Parabacteroides gordonii, Parabacteroides johnsonii, Parabacteroides merdae, Parabacteroides mpnisolate, 
Parabacteroides nlaezlp340, Parabacteroides, Paraeggerthella, Paraeggerthella hongkongensis, Paraeggerthella nlaezlp797, Paraeggerthella nlaezlp896, Paraprevotella, Paraprevotella clara, Paraprevotella, Paraprevotella xylaniphila, Parasutterella, Parasutterella excrementihominis, Parasutterella, Pectobacterium, Pectobacterium carotovorum, Pectobacterium wasabiae, Pediococcus, Pediococcus te2r, Pediococcus, Pedobacter, Pedobacter b3Nlb-b5, Pedobacter daechungensis, Pedobacter, Peptostreptococcus, Peptostreptococcus anaerobius, Peptostreptococcus stomatis, Peptostreptococcus, Phascolarctobacterium, Phascolarctobacterium faecium, Phascolarctobacterium, Photobacterium, Photobacterium MIE, Pilibacter, Pilibacter, Planctomyces, Planctomyces, Planococcaceae_incertae_sedis, Planococcaceae_incertae_sedis, Planomicrobium, Planomicrobium, Plesiomonas, Plesiomonas, Porphyrobacter, Porphyrobacter KK348, Porphyromonas, Porphyromonas asaccharolytica, Porphyromonas bennonis, Porphyromonas canine, Porphyromonas somerae, Porphyromonas, Prevotella, Prevotella bacterium, Prevotella BI-42, Prevotella bivia, Prevotella buccalis, Prevotella copri, Prevotella DJF_B112, Prevotella mpnisolate, Prevotella oral, Prevotella, Propionibacterium, Propionibacterium acnes, Propionibacterium freudenreichii, Propionibacterium LG, Propionibacterium, Proteiniborus, Proteiniborus, Proteiniphilum, Proteiniphilum, Proteus, Proteus HS7514, Providencia, Providencia, Pseudobutyrivibrio, Pseudobutyrivibrio bacterium, Pseudobutyrivibrio fibrisolvens, Pseudobutyrivibrio ruminis, Pseudobutyrivibrio, Pseudochrobactrum, Pseudochrobactrum, Pseudoflavonifractor, Pseudoflavonifractor asfSOO, Pseudoflavonifractor bacterium, Pseudoflavonifractor capillosus, Pseudoflavonifractor NML, Pseudoflavonifractor, Pseudomonas, Pseudomonas 1043, Pseudomonas 10569, Pseudomonas 127(39-zx), Pseudomonas 12A_19, Pseudomonas 145(38zx), Pseudomonas 22010, Pseudomonas 32010, Pseudomonas 34t20, Pseudomonas 3C_10, Pseudomonas 
4-5(2010), Pseudomonas 4-9(2010), Pseudomonas 6-13.J, Pseudomonas 63596, Pseudomonas 82010, Pseudomonas a001-142L, Pseudomonas alOl-18-2, Pseudomonas alll-5, Pseudomonas aeruginosa, Pseudomonas agarici, Pseudomonas amspl, Pseudomonas AU2390, Pseudomonas AZ18Rl, Pseudomonas azotoformans, Pseudomonas B122, Pseudomonas B65(2012), Pseudomonas bacterium, Pseudomonas BJSX, Pseudomonas BLH-8D5, Pseudomonas BWDY-29, Pseudomonas CAM, Pseudomonas Cantasi2, Pseudomonas CB 11, Pseudomonas CBZ-4, Pseudomonas cedrina, Pseudomonas CGMCC, Pseudomonas CL16, Pseudomonas CNE, Pseudomonas corrugata, Pseudomonas cuatrocienegasensis, Pseudomonas CYEB-7, Pseudomonas DS, Pseudomonas DAP37, Pseudomonas DB48, Pseudomonas deceptionensis, Pseudomonas Den-OS, Pseudomonas DF7EH1, Pseudomonas DhA-91, Pseudomonas DVSI4a, Pseudomonas DYJK4-9, Pseudomonas DZQS, Pseudomonas Ell_ICE19B, Pseudomonas E2.2, Pseudomonas e2-CDC-TB4D2, Pseudomonas EM189, Pseudomonas enrichment, Pseudomonas extremorientalis, Pseudomonas FAIR/BE/F/GH37, Pseudomonas FAIR/BE/F/GH39, Pseudomonas FAIR/BE/F/GH94, Pseudomonas FLMOS-3, Pseudomonas fluorescens, Pseudomonas fragi, Pseudomonas 'FSL, Pseudomonas Glo13, Pseudomonas gingeri, Pseudomonas HC2-2, Pseudomonas HC2-4, Pseudomonas HC2-5, Pseudomonas HC4-8, Pseudomonas HC6-6, Pseudomonas H94-o6, Pseudomonas HLB8-2, Pseudomonas HLS12-1, Pseudomonas HSF20-13, Pseudomonas HWo8, Pseudomonas 11-44, Pseudomonas IpA-92, Pseudomonas IV, Pseudomonas JCM, Pseudomonas jessenii, Pseudomonas JSPBS, Pseudomonas K3R3.1A, Pseudomonas KB40, Pseudomonas KB42, Pseudomonas KB44, Pseudomonas KB63, Pseudomonas KB73, Pseudomonas KK-21-4, Pseudomonas KOPRI, Pseudomonas L1R3.5, Pseudomonas LAB-27, Pseudomonas LAB-44, Pseudomonas Lclo-2, Pseudomonas libanensis, Pseudomonas LnSC.7, Pseudomonas LS197, Pseudomonas lundensis, Pseudomonas marginalis, Pseudomonas MFY143, Pseudomonas MFY146, Pseudomonas MY1404, Pseudomonas MY1412, Pseudomonas MY1416, Pseudomonas MY1420, Pseudomonas N14zhy, Pseudomonas NBRC, 
Pseudomonas NCCP-506, Pseudomonas NFU20-14, Pseudomonas NJ-22, Pseudomonas NJ-24, Pseudomonas Nj-3, Pseudomonas Nj-55, Pseudomonas Nj-56, Pseudomonas Nj-59, Pseudomonas Nj-60, Pseudomonas Nj-62, Pseudomonas Nj-70, Pseudomonas NP41, Pseudomonas OCW4, Pseudomonas OW3-15-3-2, Pseudomonas P1(2010), Pseudomonas P2(2010), Pseudomonas P3(2010), Pseudomonas P4(2010), Pseudomonas PD, Pseudomonas PF1B4, Pseudomonas PF2M10, Pseudomonas PILI11, Pseudomonas poae, Pseudomonas proteobacterium, Pseudomonas ps4-12, Pseudomonas ps4-2, Pseudomonas ps4-28, Pseudomonas ps4-34, Pseudomonas ps4-4, Pseudomonas psychrophila, Pseudomonas putida, Pseudomonas R-35721, Pseudomonas R-37257, Pseudomonas R-37265, Pseudomonas R-37908, Pseudomonas RBEICD-48, Pseudomonas RBE2CD-42, Pseudomonas regd9, Pseudomonas RKS7-3, Pseudomonas S2, Pseudomonas seawater, Pseudomonas SGbo8, Pseudomonas SGb 120, Pseudomonas SGb396, Pseudomonas sgn, Pseudomonas 'Shk, Pseudomonas stutzeri, Pseudomonas syringae, Pseudomonas taetrolens, Pseudomonas tolaasii, Pseudomonas trivialis, Pseudomonas TUT1023, Pseudomonas, Pseudomonas W15Feb26, Pseudomonas W15Feb4, Pseudomonas W15Feb6, Pseudomonas WD-3, Pseudomonas WR4-13, Pseudomonas WR7 #2, Pseudomonas Yl000, Pseudomonas ZS29-8, Psychrobacter, Psychrobacter umb13d, Psychrobacter, Pyramidobacter, Pyramidobacter piscolens, Pyramidobacter, Rahnella, Rahnella aquatilis, Rahnella carotovorum, Rahnella GIST-WP4w 1, Rahnella LR113, Rahnella, Rahnella Z2-S 1, Ralstonia, Ralstonia bacterium, Ralstonia, Raoultella, Raoultella B 19, Raoultella enrichment, Raoultella planticola, Raoultella sv6xvii, Raoultella SZ015, Raoultella, Renibacterium, Renibacterium G20, Rhizobium, Rhizobium leguminosarum, Rhodococcus, Rhodococcus erythropolis, Rhodopirellula, Rhodopirellula, Riemerella, Riemerella anatipestifer, Rikenella, Rikenella, Robinsoniella, Robinsoniella peoriensis, Robinsoniella, Roseburia, Roseburia 11SE37, Roseburia bacterium, Roseburia cecicola, Roseburia DJF_VR77, Roseburia faecis, 
Roseburia fibrisolvens, Roseburia hominis, Roseburia intestinalis, Roseburia inulinivorans, Roseburia, Roseibacillus, Roseibacillus, Rothia, Rothia, Rubritalea, Rubritalea, Ruminococcus, Ruminococcus 25F6, Ruminococcus albus, Ruminococcus bacterium, Ruminococcus bromii, Ruminococcus callidus, Ruminococcus champanellensis, Ruminococcus DJF_VR87, Ruminococcus flavefaciens, Ruminococcus gauvreauii, Ruminococcus lactaris, Ruminococcus NK3A76, Ruminococcus, Ruminococcus YE71, Saccharofermentans, Saccharofermentans, Salinicoccus, Salinicoccus, Salinimicrobium, Salinimicrobium, Salmonella, Salmonella agglomerans, Salmonella bacterium, Salmonella enterica, Salmonella freundii, Salmonella hermannii, Salmonella paratyphi, Salmonella SL0604, Salmonella subterranea, Salmonella, Scardovia, Scardovia oral, Schwartzia, Schwartzia, Sedimenticola, Sedimenticola, Sediminibacter, Sediminibacter, Selenomonas, Selenomonas fecal, Selenomonas, Serpens, Serpens, Serratia, Serratia 1135, Serratia 136-2, Serratia 5.1R, Serratia AC-CS-1B, Serratia AC-CS-B2, Serratia aquatilis, Serratia bacterium, Serratia BS26, Serratia carotovorum, Serratia DAP6, Serratia enrichment, Serratia F2, Serratia ficaria, Serratia fonticola, Serratia grimesii, Serratia J145, Serratia J111983, Serratia liquefaciens, Serratia marcescens, Serratia plymuthica, Serratia proteamaculans, Serratia proteolyticus, Serratia ptz-16s, Serratia quinivorans, Serratia SBS, Serratia SS22, Serratia trout, Serratia UA-G004, Serratia, Serratia White, Serratia yellow, Shewanella, Shewanella baltica, Shewanella, Slackia, Slackia intestinal, Slackia isoflavoniconvertens, Slackia NATTS, Slackia, Solibacillus, Solibacillus, Solobacterium, Solobacterium moorei, Solobacterium, Spartobacteria_genera_incertae_sedis, Spartobacteria_genera_incertae_sedis, Sphingobium, Sphingobium, Sphingomonas, Sphingomonas, Sporacetigenium, Sporacetigenium, Sporobacter, Sporobacter, Sporobacterium, Sporobacterium olearium, Staphylococcus, Staphylococcus 
epidermidis, Staphylococcus PCA17, Staphylococcus, Stenotrophomonas, Stenotrophomonas, Streptococcus, Streptococcus 1606-2B, Streptococcus agalactiae, Streptococcus alactolyticus, Streptococcus anginosus, Streptococcus bacterium, Streptococcus bovis, Streptococcus ChDC, Streptococcus constellatus, Streptococcus CR-314S, Streptococcus criceti, Streptococcus cristatus, Streptococcus downei, Streptococcus dysgalactiae, Streptococcus enrichment, Streptococcus equi, Streptococcus equinus, Streptococcus ES11, Streptococcus eubacterium, Streptococcus fecal, Streptococcus gallinaceus, Streptococcus gallolyticus, Streptococcus gastrococcus, Streptococcus genomosp, Streptococcus gordonii, Streptococcus IS, Streptococcus infantarius, Streptococcus intermedius, Streptococcus Je2, Streptococcus JS-CD2, Streptococcus LRC, Streptococcus luteciae, Streptococcus lutetiensis, Streptococcus M09-11185, Streptococcus mitis, Streptococcus mutans, Streptococcus NA, Streptococcus nlaezlc353, Streptococcus nlaezlp68, Streptococcus nlaezlp758, Streptococcus nlaezlp807, Streptococcus oral, Streptococcus oralis, Streptococcus parasanguinis, Streptococcus phocae, Streptococcus pneumoniae, Streptococcus porcinus, Streptococcus pyogenes, Streptococcus S 16-o8, Streptococcus salivarius, Streptococcus sanguinis, Streptococcus sobrinus, Streptococcus suis, Streptococcus symbiont, Streptococcus thermophilus, Streptococcus TW1, Streptococcus, Streptococcus vestibularis, Streptococcus warneri, Streptococcus XJ-RY-3, Streptomyces, Streptomyces malaysiensis, Streptomyces MVCS6, Streptophyta, Streptophyta cordifolium, Streptophyta ginseng, Streptophyta hirsutum, Streptophyta oleracea, Streptophyta sativa, Streptophyta sativum, Streptophyta sativus, Streptophyta tabacum, Streptophyta, Subdivision3_genera_incertae_sedis, Subdivision3_genera_incertae_sedis, Subdoligranulum, Subdoligranulum bacterium, Subdoligranulum ic1393, Subdoligranulum ic1395, Subdoligranulum, Subdoligranulum variabile, 
Succiniclasticum, Succiniclasticum, Sulfuricella, Sulfuricella, Sulfurospirillum, Sulfurospirillum, Sutterella, Sutterella, Sutterella wadsworthensis, Syntrophococcus, Syntrophococcus, Syntrophomonas, Syntrophomonas bryantii, Syntrophomonas, Syntrophus, Syntrophus, Tannerella, Tannerella, Tatumella, Tatumella, Thermofilum, Thermofilum, Thermogymnomonas, Thermogymnomonas, Thermovirga, Thermovirga, Thiomonas, Thiomonas MLl-46, Thorsellia, Thorsellia carsonella, TM7_genera_incertae_sedis, TM7_genera_incertae_sedis, Trichococcus, Trichococcus, Turicibacter, Turicibacter sanguinis, Turicibacter, Vagococcus, Vagococcus bfsll-15, Vagococcus, Vampirovibrio, Vampirovibrio, Varibaculum, Varibaculum, Variovorax, Variovorax KS2D-23, Veillonella, Veillonella dispar, Veillonella MSA12, Veillonella OK8, Veillonella oral, Veillonella parvula, Veillonella tobetsuensis, Veillonella, Vibrio, Vibrio 3C1, Vibrio, Victivallis, Victivallis, Victivallis vadensis, Vitellibacter, Vitellibacter, Wandonia, Wandonia haliotis, Weissella, Weissella cibaria, Weissella confusa, Weissella oryzae, Weissella, Yersinia, Yersinia ggw38, Yersinia A125, Yersinia aldovae, Yersinia aleksiciae, Yersinia b702011, Yersinia bacterium, Yersinia bercovieri, Yersinia enterocolitica, Yersinia entomophaga, Yersinia frederiksenii, Yersinia intermedia, Yersinia kristensenii, Yersinia MAC, Yersinia massiliensis, Yersinia mollaretii, Yersinia nurmii, Yersinia pekkanenii, Yersinia pestis, Yersinia pseudotuberculosis, Yersinia rohdei, Yersinia ruckeri, Yersinia s1Ofe31, Yersinia s17fe31, Yersinia s4fe31, Yersinia, Yersinia YEM17B.
- Additional microbes are listed in Appendix A and Appendix B herein below. 3D images description:
-
FIGS. 1 and 2 are 3-dimensional illustrations providing comparative representations of microbiome profiles. These microbiomes were found in differing soil samples coming from exemplary vineyards in California, United States, and Spain, in accordance with certain embodiments. FIG. 1 is the profile for bacteria, whereas FIG. 2 is the profile for yeast species. Each winery is represented by a greyscale color on the respective legends as shown. The legends provide the number of samples for each winery, along with a code assigned to each winery.
- It was found that samples coming from the same winery have greater similarities among themselves than to other samples. Additionally, samples coming from wineries in the same region have greater similarities than samples coming from other wine regions. The samples illustrate clustering, for both bacteria and yeast species, demonstrating that applying the methodologies herein provides a science-based identity to the terroir concept in winemaking and validates certain assumptions concerning the existence of bio-wine regions upon observation of microbiome profiles of soil.
-
FIGS. 3 and 4 are bar charts providing visual comparative representations of the microbiome profiles found in different soil samples. FIG. 3 is a bar chart profile for bacteria, whereas FIG. 4 is a bar chart profile for yeast species. For each of these charts, the x-axis provides sample identification codes, namely codes assigned to the different soil samples from vineyards. In the study, there were 83 samples in the bacteria chart of FIG. 3 and 41 samples in the yeast chart of FIG. 4. The y-axis provides the respective abundances of the microbial species for each given vineyard sample, with each greyscale color representing a different microbial species.
- Accordingly, illustrated in FIGS. 3 and 4 are visual comparative representations of the respective microbiome profiles found in the differing soil samples, with one bar profile per sample, derived from the exemplary vineyards. The vertical ordering of these species, shown in greyscale, is the same across the samples to allow visual comparison of similarities among the microbiome profiles of the samples.
- This representation, for both bacteria and yeast species, demonstrates that microbiome profiles of samples can be generated and compared by applying the methodology described herein, and serves to validate the assumption that large microbial diversity exists for both yeast and bacteria in the vineyard samples.
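- By way of non-limiting illustration, the per-sample bar profiles described for FIGS. 3 and 4 can be sketched in Python as follows. The function name `relative_abundance_table` and the read counts are hypothetical and are not taken from the study; the sketch only shows how a single shared taxon order can be applied to every sample so that the stacked bars are visually comparable.

```python
from typing import Dict, List, Tuple


def relative_abundance_table(
    counts_by_sample: Dict[str, Dict[str, int]],
) -> Tuple[List[str], Dict[str, List[float]]]:
    """Return a shared taxon order and per-sample relative abundances."""
    # One sorted taxon order is used for every sample so bars line up.
    taxa = sorted({t for counts in counts_by_sample.values() for t in counts})
    table: Dict[str, List[float]] = {}
    for sample, counts in counts_by_sample.items():
        total = sum(counts.values())
        if total == 0:
            raise ValueError(f"sample {sample!r} has no reads")
        table[sample] = [counts.get(t, 0) / total for t in taxa]
    return taxa, table


# Hypothetical read counts for two vineyard soil samples (illustration only).
example_counts = {
    "V01": {"Pseudomonas": 60, "Serratia": 30, "Weissella": 10},
    "V02": {"Pseudomonas": 20, "Streptococcus": 80},
}
taxa, profiles = relative_abundance_table(example_counts)
```

Each list in `profiles` sums to one and follows the order in `taxa`, so it can be passed directly to a stacked bar plotting routine.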
- The methods provided herein can classify one or more microbes in a sample at the genus, species, or sub-strain level with an accuracy of greater than 1%, 20%, 30%, 40%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 99.2%, 99.5%, 99.7%, or 99.9%. The methods provided herein can quantify one or more microbes in a sample at the genus, species, or sub-strain level with an accuracy of greater than 1%, 20%, 30%, 40%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 99.2%, 99.5%, 99.7%, or 99.9%.
- In general, the present inventions further relate to systems and methods for determining and characterizing the microbiomes of fermentation settings, and in particular for doing so through relationship-based processing, which includes custom and unique analytics tools and algorithms, data management, cleansing, filtering, and quality control, and which in turn provides information about the fermentation setting. Such characterized information, for example, can have, and be used for, predictive, historical, analytic, development, control, and monitoring purposes.
- This information, data, processing algorithms, and support software, such as human machine interface (HMI) programs and graphic programs, and databases, may be cloud-based, locally-based, hosted on remote systems other than cloud-based systems, and combinations and variations of these.
- The current disclosure provides computer systems for implementing any of the methods described herein. A computer system may be used to implement one or more steps including sample collection, sample processing, detecting and quantifying one or more microbes, generating profile data, comparing said data to a reference, generating a subject-specific microbiome profile, comparing the sample-specific profile to a reference profile, receiving sample-related data, receiving and storing data obtained by one or more methods described herein, analyzing said data, generating a report, and reporting results to a receiver.
- Thus, real-time, derived, and predicted data may be collected and stored and thus become historic data for an ongoing process, setting, or application. In this manner, the collection, use, and computational links can create a real-time situation in which machine learning can be applied to further enhance and refine the fermentation activities or processes. Further, real-time, derived, predictive, and historic data can be, and preferably is, associated with other data and information. Thus, the microbiome information can be associated with GPS data; location data, e.g., particular components and subsystems in a fermentation process, such as, for example, a particular barrel type for wine storage; processing stage or step, such as filtration of fermentation broth; geological parameters including formation permeability and porosity; soil moisture, nutrient, and rainfall conditions in agricultural processes; and chemicals in wine, for example, sulfur acid.
- Thus, real-time, derived, historic, and predictive microbiome information may be further combined or processed with these other sources of information and data regarding the fermentation setting or process to provide combined, derived, and predictive information. In this manner, the microbiome information is used in combination with other data and information to provide for unique and novel ways to conduct fermentation operations, to develop or plan fermentation operations, to refine and enhance existing fermentation operations and combinations of these and other activities.
- Preferably, these various types of information and data are combined where one or more may become metadata for the other. In this manner, information may be linked in a manner that provides for rapid, efficient, and accurate processing to provide useful information relating to the fermentation setting. Thus, for example, in an agricultural setting, the soil moisture content and the GPS location, down to the square yard of a large farm, may be linked as metadata to the real-time microbiome information during planting and compared with similarly linked metadata obtained during harvesting, along with crop yield for that acre, to refine and enhance the agricultural processing of the field in which the acre is located.
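- By way of non-limiting illustration, the linking of metadata to microbiome information at planting and at harvest can be sketched as follows. The record layout, the `plot_id` key (standing in for a GPS-located plot), and all values are hypothetical assumptions introduced for this example only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SampleRecord:
    plot_id: str               # hypothetical key standing in for a GPS-located plot
    stage: str                 # "planting" or "harvest"
    soil_moisture: float       # metadata linked to the microbiome profile
    profile: Dict[str, float]  # taxon -> relative abundance


def pair_by_plot(
    records: List[SampleRecord],
) -> Dict[str, Tuple[SampleRecord, SampleRecord]]:
    """Pair planting and harvest records that share the same plot_id."""
    planting = {r.plot_id: r for r in records if r.stage == "planting"}
    harvest = {r.plot_id: r for r in records if r.stage == "harvest"}
    # Keep only plots that were sampled at both stages.
    return {p: (planting[p], harvest[p]) for p in planting if p in harvest}


# Illustrative records; none of these values come from the disclosure.
records = [
    SampleRecord("plot-A1", "planting", 0.22, {"Pseudomonas": 0.7, "Rhizobium": 0.3}),
    SampleRecord("plot-A1", "harvest", 0.15, {"Pseudomonas": 0.5, "Rhizobium": 0.5}),
    SampleRecord("plot-B2", "planting", 0.30, {"Pseudomonas": 1.0}),
]
pairs = pair_by_plot(records)
```

Here only `plot-A1` is returned, because `plot-B2` has no harvest record to compare against.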
- In general, historic microbiome data may be obtained from known databases or it may be obtained from conducting population studies or censuses of the microbiome for the particular fermentation setting. Thus samples of biological materials are collected and characterized. This characterized information is then processed and stored. Preferably, the data is processed and stored in a manner that provides for ready and efficient access and utilization in subsequent steps, often using auxiliary data structures such as indexes or hashes.
- In general, real-time microbiome data may be obtained from conducting population studies or censuses of the microbiome as it exists at a particular point in time, or over a timeseries, for the particular fermentation setting. Thus samples of biological materials are collected and characterized. This characterized information is then processed and stored. Preferably, the data is processed and utilized in subsequent steps or may be stored as historic data in a manner that provides for ready and efficient access and utilization in subsequent steps.
- Generally, microbiome information may be contained in any type of data file that is utilized by current sequencing systems or that is a universal data format, such as, for example, FASTQ (including quality scores), FASTA (omitting quality scores), GFF (for feature tables), etc. This data or these files may then be combined, using various software and computational techniques, with identifiers or other data. Examples of such software and identifiers for combining the various types of this information include the BIOM file format and the MIxS family of standards developed by the Genomic Standards Consortium. Additionally, by way of example, in agricultural settings, data from a harvesting combine regarding yield, microbiome information, and commodities price information may be displayed, stored, or used for further processing. The combination and communication of these various systems can be implemented by various data processing techniques, conversions of files, compression techniques, data transfer techniques, and other techniques for the efficient and accurate combination, signal processing, and overlay of large data streams and packets.
- In general, real-time, historic, and combinations and variations of this microbiome information is analyzed to provide a census or population distribution of various microbes. Unlike conventional identification of a particular species that is present, the analysis of the present invention determines in an n-dimensional space (a mathematical construct having 2, 3, 5, 12, 1000, or more dimensions) the interrelationship of the various microbes present in the system, and potentially also the interrelationship of their genes, transcripts, proteins, and/or metabolites. The embodiments of the present invention provide further analysis of this n-dimensional space information, which analysis renders this information into a format that is more readily usable, processable, and understandable. Thus, for example, by using the techniques of the present invention, the n-dimensional space information is analyzed and studied for patterns of significance pertinent to a particular fermentation setting and then converted to more readily usable data, such as, for example, a 2-dimensional color-coded plot for presentation through an HMI (Human-Machine Interface).
- Additionally, the n-dimensional space information may be related to, e.g., transformed or correlated with, physical, environmental, or other data, such as the conditions under which a particular plant was grown, either by projection into the same spatial coordinates, by relation of the coordinate systems themselves, or by feature extraction or other machine learning or multivariate statistical techniques. This related n-dimensional space information may then be further processed into a more readily usable format such as a 2-dimensional representation. Further, this 2-dimensional representation and processing may, for example, be based upon particular factors or features that are of significance in a particular fermentation setting. The 2-dimensional information may also be further viewed and analyzed for determining particular factors or features of significance for a system. Yet further, either of these types of 2-dimensional information may be still further processed using, for example, mathematical transformation functions to return them to an n-dimensional space, which mathematical functions may be based upon known or computationally determined factors or features.
- Thus the present inventions provide for derived and predicted information that can be based upon the computational distillation of complex n-dimensional space microbiome information, which may be further combined with other data. This computationally distilled data or information may then be displayed and used for operational purposes in the fermentation setting; it may be combined with additional data and displayed and used for operational purposes in the fermentation setting; it may be, alone or in combination with additional information, subjected to trend analysis to determine features or factors of significance; and it may be used for planning and operational purposes, in combinations and variations of these and other utilizations.
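- By way of non-limiting illustration, the relationship between the FASTQ and FASTA formats mentioned above (the latter omitting quality scores) can be sketched as follows. The sketch assumes the common four-line FASTQ record layout; the function names and the two reads are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.strip()
        if not header:
            continue  # tolerate blank lines between records
        try:
            seq = next(it).strip()
            plus = next(it).strip()
            qual = next(it).strip()
        except StopIteration:
            raise ValueError("truncated FASTQ record") from None
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], seq, qual


def to_fasta(records: Iterable[Tuple[str, str, str]]) -> str:
    """Render records as FASTA text, dropping the quality scores."""
    return "".join(f">{read_id}\n{seq}\n" for read_id, seq, _ in records)


# Two made-up reads, for illustration only.
fastq_text = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!II\n"
fasta_text = to_fasta(parse_fastq(fastq_text.splitlines()))
```

The conversion is lossy by design: the quality string is validated for length and then discarded.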
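- By way of non-limiting illustration, one way to reduce n-dimensional microbiome information to a 2-dimensional representation is principal component analysis. The disclosure does not name a specific technique at this point, so the use of principal component analysis, the function name `project_to_2d`, and the abundance values below are all assumptions made for this sketch, which relies on the NumPy library.

```python
import numpy as np


def project_to_2d(abundance: np.ndarray) -> np.ndarray:
    """Project samples (rows) described by n taxa (columns) onto two principal axes."""
    if abundance.ndim != 2 or min(abundance.shape) < 2:
        raise ValueError("need at least 2 samples and 2 taxa")
    centered = abundance - abundance.mean(axis=0)
    # Rows of vt are orthonormal principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Hypothetical relative abundances: 4 samples x 5 taxa (illustration only).
X = np.array([
    [0.50, 0.20, 0.10, 0.10, 0.10],
    [0.45, 0.25, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.40, 0.30, 0.10],
    [0.10, 0.15, 0.35, 0.30, 0.10],
])
coords = project_to_2d(X)  # one 2-D point per sample
```

In this made-up data the first two samples land close together and far from the last two, which is the kind of clustering a 2-dimensional plot is meant to expose.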
- Generally and for example, in ascertaining microbiome information the selection and sequencing of particular regions or portions of genetic materials may be used, including for example, the SSU rRNA gene (16S or 18S), the LSU rRNA gene (23S or 28S), the ITS in the rRNA operon, cpn60, gene marker regions such as metal-dependent proteases with possible chaperone activity, and various other segments consisting of base pairs, peptides or polysaccharides for use in characterizing the microbial community and the relationships among its constituents.
- In general, an embodiment of a method of the present invention may include one or more of the following steps, which may be conducted in various orders: sample preparation, including obtaining the sample at the designated location and manipulating the sample; extraction of the genetic material and other biomolecules from the microbial communities in the sample; preparation of libraries of the material, such as DNA libraries, metabolite libraries, and protein libraries, with identifiers such as an appropriate barcode; sequence elucidation of the material (including, for example, DNA, RNA, and protein) of the microbial communities in the sample; processing and analysis of the sequencing and potentially other molecular data; and exploitation of the information for fermentation uses.
- For example, sampling may be from agricultural, food, surface, or water sources. The samples can include, for example, solid samples such as soil, sediment, rock, and food. The samples can include, for example, liquid samples such as surface water, subsurface water, and other liquids to be fermented or in a certain stage of fermentation, such as must, barrel fermented wine, and yogurt, to name a few. Once the sample is obtained, the genetic material, which can be, for example, DNA, RNA, proteins, and fragments of these, is isolated or obtained from the sample.
- The accuracy of these analyses depends strongly on the choice of primers. Primers can be prepared by a variety of methods including, but not limited to, cloning of appropriate sequences and direct chemical synthesis using methods well known in the art (Narang et al., Methods Enzymol. 68:90 (1979); Brown et al., Methods Enzymol. 68:109 (1979)). Primers can also be obtained from commercial sources such as Integrated DNA Technologies, Operon Technologies, Amersham Pharmacia Biotech, Sigma, and Life Technologies. In addition, computer programs can also be used to design primers, including but not limited to Array Designer Software (Arrayit Inc.), Oligonucleotide Probe Sequence Design Software for Genetic Analysis (Olympus Optical Co.), NetPrimer, and DNAsis from Hitachi Software Engineering. Primers that can be used to analyze the 16S ribosomal RNA gene include but are not limited to those described in the Examples below.
- Microbial diversity can be further described by approaches analyzing the intergenic region between 16S ribosomal RNA and 23S ribosomal RNA. Primers can be designed to specifically amplify any identified variable regions in a microbe or similar distinguishing genetic element.
- Primers or probes described herein can also include polynucleotides having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% homology to any of the nucleic acid sequences described herein.
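- By way of non-limiting illustration, a percentage of the kind recited above can be computed as a simple percent identity between two sequences. The sketch assumes the sequences are already aligned and of equal length, which is only one possible reading of "homology"; the function name and the two sequences are hypothetical and are not primers from the Examples.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    # Comparison is case-insensitive; every mismatch counts equally.
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Made-up 10-base sequences differing at the last position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```

A candidate polynucleotide could then be compared against a chosen threshold, for example `identity >= 90.0`.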
- A library is prepared from the genetic material. In this stage of the process, the library can be prepared by use of amplification, shotgun, or whole molecule techniques, among others. Additionally, amplification to add adapters for sequencing, and barcoding of sequences, can be performed. Shotgun fragmentation by sonication or enzymatic cleavage may be performed. Whole molecules can also be used to sequence all DNA in a sample.
- Sequencing is performed. Preferably, the sequencing is with a high-throughput system, such as, for example, 454, Illumina, PacBio, IonTorrent, or Nanopore, to name a few.
- Sequence analysis is performed. This analysis preferably can be performed using tools such as the QIIME analysis pipeline, machine learning, and UniFrac. Preferably, each sequence is assigned to its sample via barcode for, among other things, quality control of sequence data.
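- By way of non-limiting illustration, assigning sequences to samples via barcode can be sketched as follows. The sketch assumes an exact-match barcode of fixed length at the start of each read; the barcodes, sample names, reads, and the function name `demultiplex` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def demultiplex(
    reads: Iterable[Tuple[str, str]],
    barcode_to_sample: Dict[str, str],
) -> Dict[str, List[Tuple[str, str]]]:
    """Group (read_id, sequence) pairs by the sample whose barcode prefixes the read."""
    lengths = {len(b) for b in barcode_to_sample}
    if len(lengths) != 1:
        raise ValueError("all barcodes must share one length")
    n = lengths.pop()
    bins: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for read_id, seq in reads:
        sample = barcode_to_sample.get(seq[:n], "unassigned")
        # Strip the barcode from assigned reads; keep unassigned reads whole.
        trimmed = seq[n:] if sample != "unassigned" else seq
        bins[sample].append((read_id, trimmed))
    return dict(bins)


# Made-up barcodes and reads, for illustration only.
barcodes = {"ACGT": "sample_1", "TTAG": "sample_2"}
reads = [("r1", "ACGTGGGCCC"), ("r2", "TTAGAAATTT"), ("r3", "CCCCGGGGAA")]
binned = demultiplex(reads, barcodes)
```

Reads whose prefix matches no barcode are collected under `"unassigned"`, which gives a simple quality-control count of reads that could not be attributed to any sample.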
- The analysis is utilized in a fermentation application. The applications can include for example, cheese production, alcoholic and non-alcoholic beverage production, biofuel production, and alternative energy.
- Thus as explained in greater detail below, generally, the processing and analysis further involves matching the sequences to the samples, aligning the sequences to each other, and using the aligned sequences to build a phylogenetic tree, further distilling the data to form an n-dimensional plot and then a two or three dimensional plot or other graphical displays, including displays of the results of machine learning and multivariate statistical routines, and using the two or three-dimensional plot or other graphical displays to visualize patterns of the microbial communities in a particular sample over time and geographic space.
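- By way of non-limiting illustration, the step of comparing microbial communities across samples before plotting can be sketched with a pairwise dissimilarity matrix. The sketch uses the Bray-Curtis dissimilarity, which is swapped in here because it needs only abundance counts; it is not the phylogeny-based approach described above, which requires a tree. All names and counts are hypothetical.

```python
from typing import Dict, List, Tuple


def bray_curtis(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Bray-Curtis dissimilarity between two taxon -> abundance profiles."""
    taxa = set(u) | set(v)
    numerator = sum(abs(u.get(t, 0) - v.get(t, 0)) for t in taxa)
    denominator = sum(u.get(t, 0) + v.get(t, 0) for t in taxa)
    if denominator == 0:
        raise ValueError("both profiles are empty")
    return numerator / denominator


def distance_matrix(
    profiles: Dict[str, Dict[str, float]],
) -> Tuple[List[str], List[List[float]]]:
    """Return sorted sample names and the symmetric pairwise dissimilarity matrix."""
    names = sorted(profiles)
    matrix = [[bray_curtis(profiles[a], profiles[b]) for b in names] for a in names]
    return names, matrix


# Made-up read counts for three samples.
example_profiles = {
    "S1": {"Pseudomonas": 6, "Serratia": 4},
    "S2": {"Pseudomonas": 2, "Serratia": 8},
    "S3": {"Weissella": 10},
}
names, dist = distance_matrix(example_profiles)
```

A value of 0 means identical profiles and 1 means no shared taxa; such a matrix is a typical input to an ordination routine that produces a two or three dimensional plot.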
- Although HMI-type presentation of this information is presently preferred, it should be understood that such plots may be communicated directly to a computational means such as a large computer or computing cluster for performing further analysis to provide predictive information. Thus, the matched sequence samples would be an example of real-time or historic microbiome information, the phylogenetic tree would be an example of derived microbiome information, and portions of the graphical displays which have derived microbial information combined with other data would be an example of predictive microbiome information.
- Generally, a phylum is a group of organisms at the formal taxonomic level of Phylum based on sequence identity, physiology, and other such characteristics. There are approximately fifty bacterial phyla, which include Actinobacteria, Proteobacteria, and Firmicutes. Phylum is the classification that is a level below Kingdom, in terms of classifications of organisms. For example, for E. coli the taxonomy string is Kingdom: Bacteria; Phylum: Proteobacteria; Class: Gammaproteobacteria; Order: Enterobacteriales; Family: Enterobacteriaceae; Genus: Escherichia; and Species: coli.
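- By way of non-limiting illustration, a taxonomy string of the form given above for E. coli can be parsed into its ranks as follows. The `Rank: Name` fields separated by semicolons follow the example in the text; the function names are hypothetical.

```python
from typing import Dict, Tuple

RANKS = ("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species")


def parse_taxonomy(text: str) -> Dict[str, str]:
    """Parse 'Rank: Name; Rank: Name; ...' into a rank -> name mapping."""
    result: Dict[str, str] = {}
    for field in text.split(";"):
        field = field.strip()
        if not field:
            continue
        rank, sep, name = field.partition(":")
        rank = rank.strip()
        if not sep or rank not in RANKS:
            raise ValueError(f"unrecognized taxonomy field: {field!r}")
        result[rank] = name.strip()
    return result


def lowest_rank(taxonomy: Dict[str, str]) -> Tuple[str, str]:
    """Return the most specific rank that has a name assigned."""
    for rank in reversed(RANKS):
        if taxonomy.get(rank):
            return rank, taxonomy[rank]
    raise ValueError("no ranks assigned")


ecoli = parse_taxonomy(
    "Kingdom: Bacteria; Phylum: Proteobacteria; Class: Gammaproteobacteria; "
    "Order: Enterobacteriales; Family: Enterobacteriaceae; Genus: Escherichia; "
    "Species: coli"
)
```

`lowest_rank` is useful when a read can only be resolved to, say, the genus level, in which case the `Species` entry is simply absent.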
- Generally, phylogeny refers to the evolutionary relationship between a set of organisms. This relationship can be based on morphology, biochemical features, and/or nucleic acid (DNA or RNA) sequence. One can measure the changes in gene sequences and use that as a molecular clock to determine how closely or distantly the sequences, and hence the organisms that contain them, are related.
- Generally, phylotype (also referred to as operational taxonomic unit (“OTU”)) is analogous to “species”, although phylotypes can also be defined at other taxonomic levels and these other levels are sometimes critical for identifying microbial community features relevant to a specific analysis. Because short DNA, RNA or protein sequences (“reads”) can be used, these sequences may not accurately identify many organisms to the level of species, or even strain (the most detailed level of phylogenetic resolution, which is sometimes important because different strains can have different molecular functions). In cases where a “phylotype” matches a sequence or group of sequences from a known organism in the databases, it can be used to say that a particular sequence is from an organism like, for example, E. coli.
- Generally, a taxon is a group of organisms at any level of taxonomic classification. Here, taxon (plural: taxa) is a catchall term used in order to obviate the usage of the organism names repeatedly and to provide generality across taxonomic levels.
- Microbial community diversity and composition may vary considerably across fermentation environments and settings, and the embodiments of the present invention link these changes to biotic or abiotic factors and other factors and conditions in the fermentation environment to create derived and predictive information. Thus, these patterns of microbial communities, for example geological patterns of microbial communities or patterns of microbial communities in a fermentation system (microbiosystem metrics), which are determined by the present invention, can give rise to predictive information for use in the fermentation setting.
- Examinations of microbial populations, e.g., a census, may provide insights into the physiologies, environmental tolerances, and ecological strategies of microbial taxa, particularly those taxa which are difficult to culture and that often dominate in natural environments. Thus, this type of derived data is utilized in combination with other data in order to form predictive information.
- Microbes are diverse, ubiquitous, and abundant, yet, prior to the present inventions, their population patterns and the factors driving these patterns were not readily understood in fermentation settings and thus, it is believed, were never effectively used for the purpose of ascertaining predictive information. Microorganisms, just like macroorganisms (i.e., plants and animals), exhibit no single shared population pattern. The specific population patterns shown by microorganisms are variable and depend on a number of factors, including the degree of phylogenetic resolution at which the communities are examined (e.g., Escherichia), the taxonomic group in question, the specific genes and metabolic capabilities that characterize the taxon, and the taxon's interactions with members of other taxa. Thus, such population patterns can be determined in fermentation settings and utilized as derived data for the purposes of ascertaining predictive information.
- However, for certain environments, common patterns may emerge if the biogeography (e.g., microbial populations, for example as determined from a census) of that particular environment is specifically examined. In particular, the structure and diversity of soil bacterial communities have been found to be closely related to soil environmental characteristics such as soil pH. A comprehensive assessment of the biogeographical patterns of, for example, soil bacterial communities requires 1) surveying individual communities at a reasonable level of phylogenetic detail (depth), and 2) examining a sufficiently large number of samples to assess spatial patterns (breadth). The study of biogeographical patterns is not limited to soil, and will be extended to other environments, including but not limited to, any part of a living organism, bodies of water, ice, the atmosphere, energy sources, factories, laboratories, farms, processing plants, hospitals, and other locations, systems and areas.
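- By way of non-limiting illustration, the diversity of a sampled community can be summarized by a single number per sample, which can then be compared against an environmental characteristic such as soil pH. The text above does not name a diversity metric, so the use of the Shannon index here is an assumption, and the counts are hypothetical.

```python
import math
from typing import Iterable


def shannon_index(counts: Iterable[int]) -> float:
    """Shannon diversity H = -sum(p * ln p) over taxa with nonzero counts."""
    positive = [c for c in counts if c > 0]
    total = sum(positive)
    if total == 0:
        raise ValueError("no reads in sample")
    return -sum((c / total) * math.log(c / total) for c in positive)


# Made-up counts: an even community and one dominated by a single taxon.
even_community = shannon_index([10, 10, 10, 10])
skewed_community = shannon_index([37, 1, 1, 1])
```

The even community scores higher than the skewed one although both contain four taxa, which is why such an index captures more than a simple count of the taxa present.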
- Generally, samples will be collected in a manner ensuring that microbes from the target source are the most numerous in the samples while minimizing the contamination of the sample by the storage container, sample collection device, the sample collector, other target or other non-target sources that may introduce microbes into the sample from the target source. Further, samples will be collected in a manner to ensure the target source is accurately represented by single or multiple samples at an appropriate depth (if applicable) to meet the needs of the microbiome analysis, or with known reference controls for possible sources of contamination that can be subtracted by computational analysis. Precautions should be taken to minimize sample degradation during shipping by using commercially available liquids, dry ice or other freezing methods for the duration of transit.
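- By way of non-limiting illustration, the computational subtraction of a known reference control can be sketched as follows. The sketch uses a naive per-taxon count subtraction, which is an assumption made for simplicity and may not reflect the analysis actually used; the function name and all counts are hypothetical.

```python
from typing import Dict


def subtract_control(
    sample_counts: Dict[str, int],
    control_counts: Dict[str, int],
) -> Dict[str, int]:
    """Subtract control read counts per taxon, dropping taxa that fall to zero or below."""
    cleaned: Dict[str, int] = {}
    for taxon, count in sample_counts.items():
        remaining = count - control_counts.get(taxon, 0)
        if remaining > 0:
            cleaned[taxon] = remaining
    return cleaned


# Made-up counts: the control (e.g., an empty container) shows a low background.
sample = {"Pseudomonas": 120, "Staphylococcus": 8, "Weissella": 40}
control = {"Staphylococcus": 10, "Pseudomonas": 5}
cleaned = subtract_control(sample, control)
```

Taxa seen only at background levels, such as the `Staphylococcus` reads in this example, are removed entirely, while taxa well above background are kept with reduced counts.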
- For example, samples can be collected in sterile, DNA/DNase/RNA/RNase-free primary containers with leak resistant caps or lids and placed in a second leak resistant vessel to limit any leakage during transport. Appropriate primary containers can include any plastic container with a tight fitting lid or cap that is suitable for work in microbiology or molecular biology and that is, at minimum, considered to be sterile and free of microbial DNA (or to have as little as possible). (However, it should be noted that human DNA contamination, depending upon the markers or specific type of microbe that is being looked at, may not present a problem.) The primary container can also be comprised of metal, clay, earthenware, fabric, wood, etc. So long as the container may be sterilized and tested to ensure that it is ideally DNA/DNase/RNA/RNase-free (or at least contains levels of nucleic acid much lower than the biomass to be studied, and a low enough concentration of nuclease that the nucleic acids collected are not degraded) and can be closed with a tight-fitting and leak resistant lid, cap or top, then it can be used as a primary container.
- The primary container with the sample can then be placed into a secondary container, if appropriate. Appropriate secondary containers can include plastic screw top vessels with tight fitting lids or caps and plastic bags such as freezer-grade zip-top type bags. The secondary container can also be comprised of metal, clay, earthenware, fabric, wood, etc. So long as the container can be closed or sealed with a tight-fitting and leak resistant lid, cap or top, then it can be used as a secondary container. The secondary container can also form a seal on itself or it can be fastened shut for leak resistance.
- The samples should generally be collected with minimal contact between the target sample and the sample collector to minimize contamination. The sample collector, if human, should generally collect the target sample using gloves or other barrier methods to reduce contamination of the samples with microbes from the skin. The sample can also be collected with instruments that have been cleaned. The sample collector, if machine, should be cleaned and sterilized with UV light and/or by chemical means prior to each sample collection. If the machine sample collector requires any maintenance from a human or another machine, the machine sample collector must be additionally subjected to cleaning prior to collecting any samples.
- After the sample is collected and placed in a primary and secondary container, the samples will be preserved. One method of preservation is by freezing on dry ice or liquid nitrogen to between 4° C. and −80° C. Another method of preservation is the addition of preservatives such as RNAstable™, LifeGuard™ or another commercial preservative, and following the respective instructions. So long as the preservation method will allow for the microbial nucleic acid to remain stable upon storage and upon later usage, then the method can be used.
- The samples will be shipped in an expedient method to the testing facility. In another embodiment, the testing of the sample can be done on location. The sample testing should be performed within a time period before there is substantial degradation of the microbial material within the sample. So long as the sample remains preserved and there is no substantial degradation of the microbial material, any method of transport in a reasonable period of time is sufficient.
- Tracers will be added to the inflow of a sampling catchment to identify the organisms present in the system that are not from the target source. The tracer can be microorganisms or anything that will allow for analysis of the flow path. For example, in an oil setting, a tracer can be used to calibrate the effectiveness of a flooding operation (water, CO2, chemical, steam, etc.). The tracer will be used to determine factors such as the amount of injection fluid flowing through each zone at the production wellbore and the path of the injection fluid flow from the injection site to the production bore.
- The extraction of genetic material will be performed using methods with the ability to separate nucleic acids from other, unwanted cellular and sample matter in a way to make the genetic material suitable for library construction. For example, this can be done with methods including one or more of the following, but not limited to, mechanical disruption such as bead beating, sonicating, freezing and thawing cycles; chemical disruption by detergents, acids, bases, and enzymes; other organic or inorganic chemicals. Isolation of the genetic material can be done through methods including one or more of the following, but not limited to, binding and elution from silica matrices, washing and precipitation by organic or inorganic chemicals, electroelution or electrophoresis or other methods capable of isolating genetic material.
- Extractions will be done in an environment suitable to exclude microbes residing in the air or on other surfaces in the work area where the extraction is taking place. Care will be taken to ensure that all work surfaces and instruments are cleaned to remove unwanted microbes, nucleases and genetic material. Cleaning work surfaces and instruments can include, but is not limited to, spraying and/or wiping surfaces with a chlorine bleach solution, commercially available liquids such as DNAse AWAY™ or RNase AWAY™ or similar substances that are acceptable in routine decontamination of molecular biology work areas. Furthermore, aerosol barrier pipette tips used in manual, semi-automated or automated extraction process will be used to limit transfer of genetic material between instruments and samples.
- Controls for reagents for extractions and/or primary containers (when appropriate) will be tested to ensure they are free of genetic material. Testing of the reagents includes, but is not limited to, performing extraction “blanks” where only the reagents are used in the extraction procedure. When necessary, primary collection containers may also be tested for the presence of genetic material, serving as one type of “negative control” in PCR of the genetic material of the sample. In either case, testing the blank or negative control may be accomplished by, but is not limited to, spectrophotometric, fluorometric, electrophoretic, PCR or other assays capable of detecting genetic material.
- The methods described in more detail below allow identification of bacteria and fungi present in the fermentation sample. Different biomarkers are used for each kingdom: 16S for bacteria, ITS for fungi. One improvement in building a library is the use of an additional single-copy marker gene, allowing a more precise definition of bacterial strains in the sample.
- Genetic material from the samples will be subjected to polymerase chain reaction (PCR) to amplify the gene of interest and encode each copy with a barcode unique to the sample. Generally, PCR amplifies a single or a few copies of a piece of DNA across several orders of magnitude, generating thousands to millions, or more, of copies of a particular DNA sequence using a thermostable DNA polymerase. PCR will be used to amplify a portion of a specific gene from the genome of the microbes present in the sample. Any method which can amplify genetic material quickly and accurately can be used for library preparation.
- The PCR primer will be designed carefully to meet the goals of the sequencing method. The PCR primer will contain a length of nucleotides specific to the target gene, may contain an adapter that will allow the amplicon, also known as the PCR product, to bind and be sequenced on a high-throughput sequencing platform, and additional nucleotides to facilitate sequencing. The portion of the gene with adapters, barcode and necessary additional nucleotides is known as the “amplicon.” It being understood that future systems may not use, or need, adaptors. In one embodiment, forward and reverse primers as shown in the examples are used.
- The microbial ribosome is made up of component proteins and non-coding RNA molecules, one of which is referred to as the 16S ribosomal RNA (or 16S rRNA). The 16S rRNA is a component of the small subunit (SSU) of bacterial and archaeal ribosomes. It is 1.542 kb (or 1542 nucleotides) in length. The gene encoding the 16S rRNA is referred to as the 16S rRNA gene. The 16S rRNA gene is used for reconstructing phylogenies because it is highly conserved between different species of bacteria and archaea, meaning that all of these organisms encode it in their genomes and it can be easily identified in genomic sequences, but it additionally contains regions that are highly variable, so there is a phylogenetic signature in the sequence of the gene. As a result of these same properties, batch sequencing of all of the 16S rRNA gene sequences in a sample containing many microbial taxa is informative about which microbial taxa are present. These studies are made possible by the remarkable observation that a small fragment of the 16S rRNA gene is sufficient as a proxy for the full-length sequence for many community analyses, including those based on a phylogenetic tree. However, such trees should, at most, be used as a guide to community comparisons and not for inferring true phylogenetic relationships among reads. Advances in sequencing technology will also improve the accuracy and productivity of existing methods for building phylogenetic trees and classifying functions of metagenomic reads. Such advances include the availability of 400-base reads with the Titanium™ kit from Roche; the Illumina™ platforms, which can produce 450 Gb per day and, in the course of a 10.8 day run, 1.6 billion 100-base paired-end reads (HiSeq2000) or, for single-day experiments, can generate 1.5 Gb per day from 5 million 150-base paired-end reads (MiSeq™); and, in the future, the availability of instruments providing 1500-base single-molecule reads, as reported by Pacific Biosciences™.
- Although metagenomics and other alternative techniques provide insight into all of the genes (and potentially gene functions) present in a given community, 16S rRNA-based studies are extremely valuable given that they can be used to discover and record unexplored biodiversity and the ecological characteristics of either whole communities or individual microbial taxa. 16S rRNA phylogenies tend to correspond well to trends in overall gene content. Therefore the ability to relate trends at the species level to host or environmental parameters has proven immensely powerful to understanding the relationships between the microbes and the world.
- Alternative microbiome measurement techniques provide important information that is complementary to 16S rRNA or other marker-gene data: shotgun metagenomics provides genome content for the entire microbiome; transcriptomics measures gene expression by microbes, indicating which genes are actually being used by the microbes; proteomics measures actual production of enzymes and other functional proteins in the microbiome; metabolomics directly measures metabolite content in a sample.
- Generally, analysis of ribosomal genes (SSU, LSU, ITS) will be used for the determination and characterization of microbes in industrial settings where the only requirement for choosing the particular gene for amplification is that the gene is at least somewhat conserved between different species of microbes. For instance, the amplification, sequencing and analysis of the small subunit (“SSU”) of the ribosomal gene (16S rRNA gene) would be used for bacteria and archaea, while analysis of microeukaryotes such as nematodes, ciliates and amoebae would use the small subunit ribosomal gene (18S rRNA gene) common in these organisms. Further, LSU, ITS and mitochondrial markers such as Cytb or cox1 may also be used and could provide enhanced performance. We have found that using 16S rRNA in combination with other single-copy marker genes provided prokaryotic species boundaries at higher resolution than 16S rRNA alone. Fungal populations may also be characterized by the internal transcribed spacer region (“ITS gene”) in addition to the 18S rRNA gene or other single gene markers. Furthermore, the large subunit ribosomal gene (“LSU”) could be analyzed alone or in combination with portions of the SSU in a single amplicon. The genetic material for any analysis could be derived from DNA or cDNA (i.e., complementary DNA) produced from the reverse transcription of RNA isolated from the target sample or samples.
- Complete marker genes generally cannot, because of their length, be sequenced using short-read high-throughput methods, although PacBio, Nanopores, or Moleculo can provide the ability to obtain such a complete sequence. Therefore, a shorter region of the marker gene sequence must usually be selected to act as a proxy. Currently, there is no consensus on a single best region, and consequently different groups are sequencing different or multiple regions. This diversity of methods hinders direct comparisons among studies. Standardization on a single region would be helpful on this front. Of the nine variable regions in the 16S rRNA gene, several of the more popular regions include the regions surrounding V2, V4, and V6. Generally, a combination of variable and moderately conserved regions appears to be optimal for performing analyses at different phylogenetic depths. Both the choice of region and the design of the primers are crucial, and poor design of primers can lead to radically different experimental conclusions. Additionally, primer bias due to differential annealing leads to the over- or underrepresentation of specific taxa, and can lead to some groups being missed entirely if they match the consensus sequence poorly. Comparisons of relative abundance among different studies should thus be treated with caution. However, meta-analysis of presence/absence data from different studies is particularly useful for revealing broad trends, even when different studies use different primers.
- As more sequence data and better taxonomic assignments become available, improved primer sets, with better coverage (including primers for archaea and eukaryotes), will likely provide a substantial advantage over present degenerate primer techniques. Specifically, 16S rRNA and 18S rRNA reads from metagenomic studies provide a source of sequences that is not subject to PCR primer bias (although other biases are present) and therefore covers taxa that are missed by existing but popular primer sets, although in practice exploiting this information has been quite challenging. Another promising approach is the use of miniprimers, which, together with an engineered DNA polymerase, may allow greater coverage of desired groups.
- Furthermore, improvements in the ability to produce high quantities of primers (e.g. millions of individual primers) will enable amplification of high quantities of regions (e.g. millions of individual regions), which may be distinct to each microbe or targeted at multiple sites obtained from existing databases or from shotgun sequencing. Such an application could be used to improve discrimination and/or prediction for a particular environment and target parameter.
- The primers designed for amplification will be well-suited for the phylogenetic analysis of sequencing reads. Thus, the primer design will be based on the system of sequencing, e.g., chain termination (Sanger) sequencing or high-throughput sequencing. Within the system, there are also many options on the method. For example, for high-throughput sequencing, the sequencing can be performed by, but is not limited to, 454 Life Sciences™ Genome Sequencer FLX (Roche) machine or the Illumina™ platforms (MiSeq™ or HiSeq™), IonTorrent, Nanopores or PacBio. These will be described more in the Sequencing section below.
- High-throughput sequencing, described below, has revolutionized many sequencing efforts, including studies of microbial community diversity. High-throughput sequencing is advantageous because it eliminates the labor-intensive step of producing clone libraries and generates hundreds of thousands of sequences in a single run. However, two primary factors limit culture-independent marker gene-based analysis of microbial community diversity through high-throughput sequencing: 1) each individual run is high in cost, and 2) separating a single plate across multiple runs is difficult.
- A solution to these limitations is barcoding. A double index barcoding protocol is used in the examples below. For barcoding, a unique tag will be added to each primer before PCR amplification. Because each sample will be amplified with a known tagged (barcoded) primer, an equimolar mixture of PCR-amplified DNA can be sequenced from each sample and sequences can be assigned to samples based on these unique barcodes. The presence of these assigned barcodes allows independent samples to be combined for sequencing, with subsequent bioinformatic separation of the sequencer output. By not relying on physical separators, this procedure maximizes sequence space and multiplexing capabilities. This technique will be used to process many samples (e.g., 25, 200, 1000, and above) in a single high-throughput sequencing run. This number will be increased depending on advances in high-throughput sequencing technology, without limit to the number of samples to be sequenced in a single high-throughput sequencing run.
- Barcodes, or unique DNA sequence identifiers, have traditionally been used in different experimental contexts, such as sequence-tagged mutagenesis (STM) screens where a sequence barcode acts as an identifier or type specifier in a heterogeneous cell-pool or organism-pool. However, STM barcodes are usually 20-60 bases (or nt) long, are pre-selected or follow ambiguity codes, and exist as one unit or split into pairs. Such long barcodes are not particularly compatible with available high-throughput sequencing platforms because of restrictions on read length.
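- The bioinformatic separation described above can be illustrated with a minimal sketch. It assumes that the barcode occupies the first 8 bases of each read and that exact matching is sufficient; the barcode sequences, sample names and the `demultiplex` helper are hypothetical and are not taken from the examples.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample map; the sequences are illustrative only.
BARCODES = {
    "ACGTACGT": "sample_A",
    "TGCATGCA": "sample_B",
}
BARCODE_LEN = 8  # assumed barcode length in bases


def demultiplex(reads, barcodes=BARCODES, barcode_len=BARCODE_LEN):
    """Group pooled reads by the barcode at their 5' end.

    The barcode is stripped from each read; reads whose barcode is not in
    the map are collected under the key "unassigned".
    """
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:barcode_len], read[barcode_len:]
        sample = barcodes.get(tag, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Reads from several samples pooled into one sequencing run.
pooled = ["ACGTACGTGGGTTT", "TGCATGCACCCAAA", "ACGTACGTGGGTAA", "NNNNNNNNAAAA"]
result = demultiplex(pooled)
for sample, inserts in sorted(result.items()):
    print(sample, inserts)
```

- Exact matching discards any read with a sequencing error in its barcode, which is one motivation for error-correcting barcode designs.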
- Although very short (2- or 4-nt) barcodes can be used with high-throughput sequencing platforms, a more definitive assignment of samples and/or for enhanced multiplexing capabilities can be accomplished by lengthening the barcodes or variations in the fixed forward and reverse linkers used to generate the initial cDNA libraries. Shorter barcodes also have a steeper trade-off between number of possible barcodes and the minimum number of nucleotide variations between individual barcodes.
- Existing barcoding methods have limits both in the number of unique barcodes used and in their ability to detect sequencing errors that change sample assignments (this robustness is especially important for sample assignment because the 5′ end of the read (sequence for one strand of nucleic acid in a sample) is somewhat more error-prone). Barcodes based on error-correcting codes, which are widely used in other technologies such as telecommunications and electronics, will be applied for high-throughput sequencing barcoding purposes. One class of error-correcting codes, called Hamming codes, uses a minimum amount of redundancy and is simple to implement using standard linear algebra techniques. Hamming codes, like all error-correcting codes, employ the principle of redundancy and add redundant parity bits to transmit data over a noisy medium. Sample identifiers will be encoded with redundant parity bits. Then the sample identifiers will be “transmitted” as codewords. Each base (A, T, G, C) will be encoded using 2 bits, with 8 bases used for each codeword. Therefore, 16-bit codewords will be transmitted. The number of bits and bases per codeword is not limited to these values, as codewords of any length can be designed by a person of ordinary skill in the art. The design of the barcode is based on the goals of the method. Hamming codes are unique in that they use only a subset of the possible codewords, particularly those that lie at the center of multidimensional spheres (hyperspheres) in a binary subspace. Single bit errors fall within hyperspheres associated with each codeword, and thus they can be corrected. Double bit errors do not fall within hyperspheres associated with each codeword, and thus they can be detected but not corrected.
- Other encoding schemes, such as Golay codes, will also be used for barcoding. Golay codes of 12 bases can correct all triple-bit errors and detect all quadruple-bit errors. The extended binary Golay code encodes 12 bits of data in a 24-bit word in such a way that any 3-bit errors can be corrected or any 7-bit errors can be detected. The perfect binary Golay code has codewords of length 23 and is obtained from the extended binary Golay code by deleting one coordinate position (conversely, the extended binary Golay code is obtained from the perfect binary Golay code by adding a parity bit). In standard code notation the codes have parameters [24, 12, 8] and [23, 12, 7], corresponding to the length of the codewords, the dimension of the code, and the minimum Hamming distance between two codewords, respectively.
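- The 16-bit, 8-base codewords described above can be sketched as follows. The sketch assumes an extended Hamming(16,11) layout (11 data bits, four parity bits at positions 1, 2, 4 and 8, and an overall parity bit at position 0) and an arbitrary base-to-bit mapping; neither detail is specified in this description, and the function names are hypothetical.

```python
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11
# Positions 1, 2, 4, 8 hold Hamming parity bits; position 0 holds overall parity.
PARITY_POSITIONS = (1, 2, 4, 8)
DATA_POSITIONS = [p for p in range(1, 16) if p not in PARITY_POSITIONS]


def bits_to_bases(bits):
    """Pack 16 bits into 8 bases, 2 bits per base."""
    return "".join(BASES[2 * bits[i] + bits[i + 1]] for i in range(0, 16, 2))


def bases_to_bits(barcode):
    """Unpack an 8-base barcode into 16 bits."""
    if len(barcode) != 8:
        raise ValueError("barcode must be 8 bases long")
    bits = []
    for base in barcode:
        value = BASES.index(base)
        bits.extend([value >> 1, value & 1])
    return bits


def encode(sample_id):
    """Encode an 11-bit sample identifier as an 8-base barcode."""
    if not 0 <= sample_id < 2 ** 11:
        raise ValueError("sample_id must fit in 11 bits")
    bits = [0] * 16
    for i, pos in enumerate(DATA_POSITIONS):
        bits[pos] = (sample_id >> i) & 1
    for parity in PARITY_POSITIONS:
        # Each parity bit makes the positions sharing its index bit even.
        bits[parity] = sum(bits[p] for p in range(1, 16) if p & parity) % 2
    bits[0] = sum(bits) % 2  # overall parity enables double-error detection
    return bits_to_bases(bits)


def decode(barcode):
    """Return (sample_id, status); sample_id is None if uncorrectable."""
    bits = bases_to_bits(barcode)
    syndrome = 0
    for pos in range(1, 16):
        if bits[pos]:
            syndrome ^= pos
    odd_parity = sum(bits) % 2 == 1
    if syndrome == 0 and not odd_parity:
        status = "ok"
    elif odd_parity:
        # Single-bit error at position `syndrome` (0 = overall parity bit).
        bits[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return None, "uncorrectable"
    sample_id = sum(bits[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return sample_id, status


barcode = encode(1234)
noisy = bases_to_bits(barcode)
noisy[5] ^= 1  # simulate a single-bit sequencing error
print(barcode, decode(barcode), decode(bits_to_bases(noisy)))
```

- Under the assumed mapping, a base substitution such as A to T changes both bits of that base at once, so some single-base errors are detected rather than corrected.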
- In general, design for barcoded primers for high-throughput sequencing is as follows. The primer will be designed to include nucleotides specific for the sequencing platform; nucleotides specific for the gene of interest; nucleotides for the barcode chosen; and the nucleotides of the gene. Upon amplification, one contiguous string of nucleotides known as the “forward” primer will be formed from the platform specific sequencing adaptors and the gene specific primer and linker. Additionally formed upon amplification will be one contiguous string of nucleotides known as the “reverse” primer formed from the platform specific sequencing adaptors, the gene specific primer and linker, and the barcode. In general PCR using barcoded primers is known in the art. Other error-correcting codes may be utilized such as Gray codes, low-density parity check codes, etc.
- The barcoded high-throughput sequencing technique provides a robust description of the changes in bacterial community structure across the sample set. A high-throughput sequencing run is expensive, and the large number of custom primers required only adds to this cost. However, the barcoding technique allows for thousands of samples to be analyzed simultaneously, with each community analyzed in considerable detail. Although the phylogenetic structure and composition of the surveyed communities can be determined with a high degree of accuracy, the barcoded high-throughput sequencing method may not allow for the identification of bacterial taxa at the finest levels of taxonomic resolution. However, with increasing read lengths in sequencing, this constraint will gradually become less relevant.
- The vast majority of life on earth is microbial, and the vast majority of these microbial species have not been, and cannot easily be, cultured in the laboratory. Consequently, our primary source of information about most microbial species consists of fragments of their DNA sequences. Sequencing a DNA library will be done on a platform capable of producing many sequences for each sample contained in the library. High-throughput sequencing technologies have allowed for new horizons in microbial community analysis by providing a cost-effective method of identifying the microbial OTUs that are present in samples. These studies have drastically changed our understanding of the microbial communities in the human body and on the planet. This development in sequencing technology, combined with more advanced computational tools that employ metadata to relate hundreds of samples to one another in ways that reveal clear biological patterns, has reinvigorated studies of the 16S rRNA and other marker genes. Studies of 16S rRNA genes provide a view of which microbial taxa are present in a given sample because these genes provide an excellent phylogenetic marker. Although alternative techniques, such as metagenomics, provide insight into all of the genes (and potentially gene functions) present in a given community, 16S rRNA-based surveys are extraordinarily valuable given that they can be used to document unexplored biodiversity and the ecological characteristics of either whole communities or individual microbial taxa. Perhaps because 16S rRNA phylogenies tend to correspond well to trends in overall gene content, the ability to relate trends at the species level to host or environmental parameters has proven immensely powerful.
The DNA encoding the 16S rRNA gene has been widely used to specify bacterial taxa, since the region can be amplified using PCR primers that bind to conserved sites in most or all species, and large databases are available relating 16S rRNA sequences to bacterial phylogenies. However, as previously discussed, other genes can be used to specify the taxa, such as 18S, LSU, ITS, and SSU (e.g., 16S). For the purposes of bacteria, cpn60 or ftsZ, or other markers, may also be utilized.
- New technologies have led to extraordinary decreases in sequencing costs. This rapid increase in sequencing capacity has led to a process in which newer sequencing platforms generate datasets of unprecedented scale that break existing software tools: new software is then developed that exploits these massive datasets to produce new biological insight, but in turn the availability of these software tools prompts new experiments that could not previously have been considered, which lead to the production of the next generation of datasets, starting the process again.
- With the advent of high-throughput sequencing, characterization of the nucleic acid world is proceeding at an accelerated pace. Three major high-throughput sequencing platforms are in use today: 1) the Genome Sequencers from Roche/454 Life Sciences™ [GS-20 or GS-FLX]; 2) the 1G Analyzer from Illumina™/Solexa™ which includes the MiSeq™ and the HiSeq™; and 3) the SOLiD™ System from Applied Biosystems™. Comparison across the three platforms reveals a trade-off between average sequence read length and the number of DNA molecules that are sequenced. The Illumina™/Solexa™ and SOLiD™ systems provide many more sequence reads, but render much shorter read lengths than the 454™/Roche Genome Sequencers. This makes the 454™/Roche platform appealing for use with barcoding technology, as the enhanced read length facilitates the unambiguous identification of both complex barcodes and sequences of interest. However, even reads of less than 100 bases can be used to classify the particular microbe in phylogenetic analysis. Any platform, for example, Illumina™, providing many reads and read lengths of a predetermined necessary length, for example, 150 base pairs or 100 base pairs, is acceptable for this method. Because the accuracy of phylogenetic reconstruction depends sensitively on the number of informative sites, and tends to be much worse below a few hundred base pairs, the short sequence reads produced from high-throughput sequencing, which are 100 base pairs on average for the GS 20 (Genome Sequencer 20 DNA Sequencing System, 454 Life Sciences™), may be unsuitable for performing phylogenetically based community analysis.
- However, this limitation can be at least partially overcome by using a reference tree based on full-length sequences, such as the tree from the Greengenes 16S rRNA ARB Database, and then using an algorithm such as parsimony insertion to add the short sequence reads to this reference tree. These procedures are necessarily approximate, and may lead to errors in phylogenetic reconstruction that could affect later conclusions about which communities are more similar or different. One substantial concern is that because different regions of the rRNA sequence differ in variability, conclusions drawn about the similarities between communities from different studies might be affected more by the region of the 16S rRNA that was chosen for sequencing than by the underlying biological reality.
- The increase in number of sequences per run from parallel high-throughput sequencing technologies such as the Roche 454 GS FLX™ to Illumina GAIIx™ is on the order of 1,000-fold and greater than the increase in the number of sequences per run from Sanger to 454™. The transition from Sanger sequencing to 454™ sequencing has opened new frontiers in microbial community analysis by making it possible to collect hundreds of thousands of sequences spanning hundreds of samples. A transition to the Illumina™ platform allows for more extensive sequencing than has previously been feasible, with the possibility of detecting even OTUs that are very rare. By using a variant of the barcoding strategy used for 454™ with the Illumina™ platform, thousands of samples could be analyzed in a single run, with each of the samples analyzed in unprecedented depth.
- A few sequencing runs using 454™/Roche's pyrosequencing platform can generate sufficient coverage for assembling entire microbial genomes, for the discovery, identification and quantitation of small RNAs, and for the detection of rare variations in cancers, among many other applications. However, as the analytical technology becomes more advanced, the coverage provided by this system becomes unnecessary for phylogenetic classification. For analysis of multiple libraries, the 454™/Roche pyrosequencers can accommodate a maximum of only 16 independent samples, which have to be physically separated using manifolds on the sequencing medium, drastically limiting their utility in the effort to elucidate the diverse microbial communities in each sample. Relatively speaking, the Illumina™ platforms are experiencing the most growth. However, with the constant improvements in sequencing systems, the different platforms that will be used will change over time. Generally, the method described herein will be used with any high-throughput sequencing platform that is currently available or that will be available in the future. For example, the method described herein will be applied to a sequencing method wherein the genetic material will be sequenced without barcoding by simply placing the DNA or RNA directly into a sequencing machine.
- In general, high-throughput sequencing technology allows for the characterization of microbial communities orders of magnitude faster and more cheaply than has previously been possible. In addition, the ability to barcode amplicons from individual samples means that hundreds of samples can be sequenced in parallel, further reducing costs and increasing the number of samples that can be analyzed. Though high-throughput sequencing reads tend to be short compared to those produced by the Sanger method, the sequencing effort is best focused on gathering more short sequences (less than 150 base pairs or less than 100 base pairs) rather than fewer longer ones as much of the diversity of microbial communities lies within the “rare biosphere,” also known as the “long tail,” that traditional culturing and sequencing technologies are slow to detect due to the limited amount of data generated from these techniques.
- The length of the read of a sequence describes the number of nucleotides in a row that the sequencer is able to obtain in one read. This length can determine the type of OTU obtained (e.g., family, genus or species). For example, a read length of approximately 300 base pairs will probably provide family information but not a species determination. Depth of coverage in DNA sequencing refers to the number of times a nucleotide is read during the sequencing process. On a genome basis, it means that, on average, each base has been sequenced a certain number of times (10×, 20×, . . . ). For a specific nucleotide, it represents the number of sequences that added information about that nucleotide. Coverage is the average number of reads representing a given nucleotide in the reconstructed sequence. Depth can be calculated from the length of the original genome (G), the number of reads (N), and the average read length (L) as N×L/G. For example, a hypothetical genome with 2,000 base pairs reconstructed from 8 reads with an average length of 500 nucleotides will have 2× redundancy. This parameter also enables estimation of other quantities, such as the percentage of the genome covered by reads (coverage). Sometimes a distinction is made between sequence coverage and physical coverage. Sequence coverage is the average number of times a base is read. Physical coverage is the average number of times a base is read or spanned by mate paired reads.
- Organisms of lower abundance rank can be detected if more sequence reads are collected. To verify that these sequences are present, a higher read depth (i.e., more sequences) must be obtained. Analyzing the rare biosphere is attainable because the sequencing depth provided by high-throughput sequencing allows for the detection of microbes that would otherwise be detected only occasionally, by chance, with traditional techniques. Thus high-throughput sequencing will allow, within a time-frame feasible for industrial settings, for the analysis of the rarer members (low-abundance organisms) of any environment, which may play a critical role in a fermentation process important in food production, agriculture and other industries where microbes are present.
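- The depth calculation N×L/G can be checked with a short sketch that reproduces the worked example above. The estimate of the fraction of the genome covered uses the Lander-Waterman approximation (reads placed uniformly at random), which is an assumption brought in for illustration and is not stated in this description; the function names are hypothetical.

```python
import math


def sequencing_depth(num_reads, avg_read_length, genome_length):
    """Average depth of coverage, N x L / G."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return num_reads * avg_read_length / genome_length


def expected_fraction_covered(depth):
    """Lander-Waterman estimate (an assumed model, not from the source):
    with randomly placed reads, a base is left uncovered with probability
    of roughly exp(-depth)."""
    return 1.0 - math.exp(-depth)


# Worked example from the text: 8 reads of 500 nt over a 2,000 bp genome.
depth = sequencing_depth(num_reads=8, avg_read_length=500, genome_length=2000)
fraction = expected_fraction_covered(depth)
print(f"depth = {depth:.1f}x, expected fraction covered = {fraction:.3f}")
```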
- One type of high-throughput sequencing is known as pyrosequencing. Pyrosequencing, based on the “sequencing by synthesis” principle, is a method of DNA sequencing widely used in microbial sequencing studies. Pyrosequencing involves taking a single strand of the DNA to be sequenced and then synthesizing its complementary strand enzymatically. The pyrosequencing method is based on observing the activity of DNA polymerase, which is a DNA synthesizing enzyme, with another chemiluminescent enzyme. The single stranded DNA template is hybridized to a sequencing primer and incubated with the enzymes DNA polymerase, ATP sulfurylase, luciferase and apyrase, and with the substrates adenosine 5′ phosphosulfate (APS) and luciferin. Synthesis of the complementary strand along the template DNA allows for sequencing of a single strand of DNA, one base pair at a time, by the detection of which base was actually added at each step.
- The template DNA is immobile, and solutions of A, C, G, and T nucleotides are sequentially added and removed from the reaction. The templates for pyrosequencing can be made both by solid phase template preparation (streptavidin-coated magnetic beads) and enzymatic template preparation (apyrase+exonuclease). Specifically, the addition of one of the four deoxynucleoside triphosphates (dNTPs) (dATPalphaS, which is not a substrate for luciferase, is added instead of dATP) initiates the next step. DNA polymerase incorporates the correct, complementary dNTPs onto the template. This base incorporation releases pyrophosphate (PPi) stoichiometrically. Then, ATP sulfurylase quantitatively converts PPi to ATP in the presence of adenosine 5′ phosphosulfate. This ATP drives the luciferase-mediated conversion of luciferin to oxyluciferin, which generates visible light in amounts that are proportional to the amount of ATP. Light is produced only when the nucleotide solution complements the particular unpaired base of the template. The light output in the luciferase-catalyzed reaction is detected by a camera and analyzed in a program. The sequence of solutions which produce chemiluminescent signals allows the sequence determination of the template. Unincorporated nucleotides and ATP are degraded by the apyrase, and the reaction can restart with another nucleotide.
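- The cycle of nucleotide additions and light signals can be illustrated with an idealized simulation. The sketch assumes a noise-free signal exactly proportional to the number of bases incorporated per addition, and a cyclic T, A, C, G dispensation order; the function names and the example template are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def synthesized_strand(template):
    """Strand built by the polymerase: the reverse complement of the template."""
    return template.translate(COMPLEMENT)[::-1]


def flowgram(template, flow_order="TACG", max_flows=400):
    """Idealized light signal for each nucleotide addition.

    Each entry is (nucleotide, n), where n is the number of bases
    incorporated, and so the relative light output, for that addition.
    """
    target = synthesized_strand(template)
    signals = []
    pos = 0
    flow = 0
    while pos < len(target) and flow < max_flows:
        nucleotide = flow_order[flow % len(flow_order)]
        run = 0
        # A homopolymer run is incorporated in one addition, giving more light.
        while pos < len(target) and target[pos] == nucleotide:
            run += 1
            pos += 1
        signals.append((nucleotide, run))
        flow += 1
    return signals


def call_bases(signals):
    """Reconstruct the synthesized strand from the flowgram."""
    return "".join(nucleotide * count for nucleotide, count in signals)


signals = flowgram("GCTAA")
print(signals)
print(call_bases(signals))
```

- In real data the signal for long homopolymer runs is not cleanly proportional to the run length, which this idealized sketch ignores.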
- Illumina's™ sequencing by synthesis (SBS) technology with TruSeq technology supports massively parallel sequencing using a proprietary reversible terminator-based method that enables detection of single bases as they are incorporated into growing DNA strands.
- A fluorescently labeled terminator is imaged as each dNTP is added and then cleaved to allow incorporation of the next base. Since all four reversible terminator-bound dNTPs are present during each sequencing cycle, natural competition minimizes incorporation bias. The end result is true base-by-base sequencing. Although this is similar to pyrosequencing, the differences between the platforms are noteworthy. The method described herein can be applied to any high-throughput sequencing technology, past, present or future. Pyrosequencing and SBS are merely examples and do not limit the application of the method in terms of sequencing.
- Generally, as the expense of sequencing decreases, the methods for comparing different communities based on the sequences they contain become increasingly important, and are often the bottleneck in obtaining insight from the data. Sequence data can be analyzed in a manner in which sequences are identified and labeled as being from a specific sample using the unique barcode introduced during library preparation, if barcodes are used, or sample identifiers will be associated with each run directly if barcodes are not used. Once sequences have been identified as belonging to a specific sample, the relationship between each pair of samples will be determined based on the distance between the collection of microbes present in each sample. In particular, techniques that allow for the comparison of many microbial samples in terms of the phylogeny of the microbes that live in them (“phylogenetic techniques”) are often necessary. Such methods are particularly valuable as the gradients that affect microbial distribution are analyzed, and where there is a need to characterize many communities in an efficient and cost-effective fashion. Gradients of interest include different physical or chemical gradients in natural environments, such as temperature or nutrient gradients in certain industrial settings.
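- The pairwise comparison of samples can be sketched with a simple abundance-based distance. The sketch uses Bray-Curtis dissimilarity in place of a phylogenetic distance because it needs no reference tree; that substitution, the OTU counts and the function names are assumptions made for illustration only.

```python
from itertools import combinations


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two OTU count tables.

    Returns 0.0 for identical communities and 1.0 for communities that
    share no OTUs.
    """
    taxa = set(counts_a) | set(counts_b)
    shared = sum(min(counts_a.get(t, 0), counts_b.get(t, 0)) for t in taxa)
    total = sum(counts_a.values()) + sum(counts_b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    return 1.0 - 2.0 * shared / total


def pairwise_distances(samples):
    """Distance for every unordered pair of samples, keyed by sample names."""
    return {
        (a, b): bray_curtis(samples[a], samples[b])
        for a, b in combinations(sorted(samples), 2)
    }


# Hypothetical per-sample OTU counts obtained after demultiplexing.
samples = {
    "s1": {"OTU_1": 6, "OTU_2": 4},
    "s2": {"OTU_1": 3, "OTU_2": 7},
    "s3": {"OTU_3": 10},
}
distances = pairwise_distances(samples)
for pair, distance in distances.items():
    print(pair, round(distance, 3))
```

- The resulting distance matrix is the kind of input used by ordination methods to place samples in a 2-D or 3-D plot.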
- When comparing microbial communities, researchers often begin by determining whether groups of similar community types are significantly different. However, to gain a broad understanding of how and why communities differ, it is essential to move beyond pairwise significance tests. For example, determining whether differences between communities stem primarily from particular lineages of the phylogenetic tree, or whether there are environmental factors (such as temperature, salinity, or acidity) that group multiple communities together is pivotal to an analysis. The analysis systems described herein are merely examples and are not limiting. Any methods which will distill massive data sets from raw sequences to human-interpretable formats, for example, 2-D or 3-D ordination plots, supervised learning for predictive modeling, or more traditional statistical significance testing, allowing for pattern elucidation and recognition, will be used.
- After DNA sequence data is obtained the bioinformatics stages begin. This includes barcode decoding, sequence quality control, “upstream” analysis steps (including clustering of closely related sequences and phylogenetic tree construction), and “downstream” diversity analyses, visualization, and statistics. All of these steps are currently facilitated by the Quantitative Insights Into Microbial Ecology (QIIME) open source software package, which is the most widely used software for the analysis of microbial community data generated on high-throughput sequencing platforms. QIIME was initially designed to support the analysis of marker gene sequence data, but is also generally applicable to “comparative -omics” data (including but not limited to metabolomics, metatranscriptomics, and comparative human genomics).
- QIIME is designed to take users from raw sequencing data (for example, as generated on the Illumina™ and 454™ platforms) through the processing steps mentioned above, leading to quality statistics and visualizations used for interpretation of the data. Because QIIME scales to billions of sequences and runs on systems ranging from laptops to high-performance computer clusters, it will continue to keep pace with advances in sequencing technologies to facilitate characterization of microbial community patterns ranging from normal variations to pathological disturbances in many human, animal and environmental ecosystems.
- For microbiome data analysis, the following steps will be taken. Unless otherwise noted, the steps will be performed with QIIME. However, other such systems may be used and the scope of protection afforded to the present inventions is not in any way limited to, or dependent upon, the use of QIIME.
- The first step in the bioinformatics stage of a microbial community analysis study is to consolidate the sample metadata in a spreadsheet. The sample metadata is all per-sample information, including technical information such as the barcode assigned to each sample, and “environmental” metadata. This environmental metadata will differ depending on the types of samples that are being analyzed. If, for example, the study is of microbial communities in soils, the pH and latitude where the soil was collected will be environment metadata categories. Alternatively, if the samples are of the wine microbiome, environmental metadata may include barrel and/or bottling identifiers and collection times. This spreadsheet will be referred to as the sample metadata mapping file in the following sections.
- Next, in a combined analysis step, sequence barcodes will be read to identify the source sample of each sequence, poor quality regions of sequence reads will be trimmed, and poor quality reads will be discarded. These steps will be combined for computational efficiency. The features included in quality filtering include whether the barcode will unambiguously be mapped to a sample barcode, per-base quality scores, and the number of ambiguous (N) base calls. The default settings for all quality control parameters in QIIME will be determined by benchmarking combinations of these parameters on artificial (i.e., “mock”) community data, where microbial communities were created in the lab from known concentrations of cultured microbes, and the composition of the communities is thus known in advance.
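- The combined demultiplexing and quality-filtering step described above can be sketched as follows. This is a minimal illustration, not QIIME's implementation: the thresholds `MIN_QUAL`, `MAX_N` and `MIN_LENGTH`, the function name `demultiplex_and_filter`, and the rule of truncating a read at its first low-quality base are all assumptions made for the example.

```python
# Hypothetical thresholds; QIIME's actual defaults are not reproduced here.
MIN_QUAL = 20    # assumed per-base Phred quality cutoff
MAX_N = 0        # assumed maximum number of ambiguous (N) base calls
MIN_LENGTH = 4   # assumed minimum read length after trimming


def demultiplex_and_filter(reads, barcode_to_sample):
    """Assign reads to samples by barcode, trim poor-quality tails, drop bad reads.

    reads: iterable of (barcode, sequence, qualities) with integer Phred scores.
    Returns a dict mapping sample id -> list of trimmed sequences.
    """
    per_sample = {}
    for barcode, seq, quals in reads:
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            continue  # barcode cannot be mapped to a sample barcode
        # Truncate the read at the first base below the quality cutoff.
        keep = len(seq)
        for i, q in enumerate(quals):
            if q < MIN_QUAL:
                keep = i
                break
        trimmed = seq[:keep]
        if len(trimmed) < MIN_LENGTH or trimmed.count("N") > MAX_N:
            continue  # read is too short or too ambiguous to keep
        per_sample.setdefault(sample, []).append(trimmed)
    return per_sample


reads = [
    ("AAGT", "ACGTACGT", [30, 30, 30, 30, 30, 30, 10, 10]),  # low-quality tail
    ("AAGT", "ACNTACGT", [30] * 8),                          # contains an N
    ("CCTA", "GGGTTTAA", [30] * 8),                          # clean read
    ("TTTT", "ACGTACGT", [30] * 8),                          # unknown barcode
    ("CCTA", "GGGTTTAA", [30, 5, 30, 30, 30, 30, 30, 30]),   # too short once trimmed
]
barcodes = {"AAGT": "soil_1", "CCTA": "soil_2"}
result = demultiplex_and_filter(reads, barcodes)
print(result)
```

Doing all three checks in one pass over the reads mirrors the reason given above for combining them: each read is touched only once.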
- After mapping sequence reads to samples and performing quality control, sequences will be clustered into OTUs (Operational Taxonomic Units). This is typically the most computationally expensive step in microbiome data analysis, and will be performed to reduce the computational complexity at subsequent steps. The assumption made at this stage is that organisms that are closely related, as determined by the similarity of their marker gene sequences, are functionally similar. Highly similar sequences (e.g., those that are greater than 97% identical to one another) will be clustered, the count of sequences that are contained in each cluster will be retained, and then a single representative sequence from that cluster for use in downstream analysis steps such as taxonomic assignment and phylogenetic tree construction will be chosen. This process of clustering sequences is referred to as OTU picking, where the OTUs (i.e., the clusters of sequences) are considered to represent taxonomic units such as species. SILVA, a comprehensive on-line resource for quality checked and aligned ribosomal RNA sequence data, provides regularly updated datasets of aligned small (16S/18S, SSU) and large subunit (23S/28S, LSU) ribosomal RNA (rRNA) sequences for all three domains of life (Bacteria, Archaea and Eukarya).
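- As a rough illustration of similarity-threshold clustering, the sketch below groups reads greedily, seeding clusters with the most abundant unique sequences first. It is not uclust: the `identity` function assumes equal-length, pre-aligned reads and simply counts matching positions, and the names `identity` and `pick_otus` are hypothetical.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions; assumes equal-length, pre-aligned reads."""
    if len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def pick_otus(seqs, threshold=0.97):
    """Greedy clustering: the most abundant unique sequences seed clusters first.

    Returns a list of (representative_sequence, read_count) pairs.
    """
    counts = Counter(seqs)
    otus = []  # each entry is [representative, count]
    for seq, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for otu in otus:
            if identity(seq, otu[0]) >= threshold:
                otu[1] += n  # read joins an existing OTU
                break
        else:
            otus.append([seq, n])  # read seeds a new OTU
    return [(rep, n) for rep, n in otus]


base = "ACGT" * 25            # 100-base sequence
variant = "TT" + base[2:]     # 2 mismatches -> 98% identical to base
other = "T" * 100             # only 25% identical to base
otus = pick_otus([base] * 5 + [variant] * 2 + [other] * 3)
for rep, n in otus:
    print(rep[:12] + "...", n)
```

Only the representative sequence and the count per cluster are carried forward, which is what reduces the work in later steps.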
- There are three high-level strategies for OTU picking, each of which is implemented in QIIME. In a de novo OTU picking process, reads will be clustered against one another without any external reference sequence collection. pick_de_novo_otus.py is the primary interface for de novo OTU picking in QIIME, and includes taxonomy assignment, sequence alignment, and tree-building steps. A benefit of de novo OTU picking is that all reads are clustered. A drawback is that there is no existing support for running this in parallel, so it can be too slow to apply to large datasets (e.g., more than 10 million reads). De novo OTU picking must be used if there is no reference sequence collection to cluster against, for example because an infrequently used marker gene is being used. De novo OTU picking cannot be used if the comparison is between non-overlapping amplicons, such as the V2 and the V4 regions of the 16S rRNA gene, and is impractical for very large data sets, like a full HiSeq™ 2000 run: although de novo OTU picking can technically be applied to such data sets, the program would take too long to run.
- In a closed-reference OTU picking process, reads will be clustered against a reference sequence collection and any reads that do not hit a sequence in the reference sequence collection are excluded from downstream analyses. pick_closed_reference_otus.py is the primary interface for closed-reference OTU picking in QIIME. If the user provides taxonomic assignments for sequences in the reference database, those are assigned to OTUs. Closed-reference OTU picking must be used if non-overlapping amplicons, such as the V2 and the V4 regions of the 16S rRNA, will be compared to each other. The reference sequences must span both of the regions being sequenced. Closed-reference OTU picking cannot be used if there is no reference sequence collection to cluster against, for example because an infrequently used marker gene is being used. A benefit of closed-reference OTU picking is speed in that the picking is fully parallelizable, and therefore useful for extremely large data sets. Another benefit is that because all OTUs are already defined in the reference sequence collection, a trusted tree and taxonomy for those OTUs may already exist. There is the option of using those, or building a tree and taxonomy from the sequence data. A drawback to reference-based OTU picking is that there is an inability to detect novel diversity with respect to the reference sequence collection. Because reads that do not hit the reference sequence collection are discarded, the analyses only focus on the diversity that is already known. Also, depending on how well-characterized the environment is, a small fraction of the reads (e.g., discarding 1-10% of the reads is common for 16S-based human microbiome studies, where databases like Greengenes cover most of the organisms that are typically present) or a large fraction of the reads (e.g., discarding 50-80% of the reads has been observed for “unusual” environments like the Guerrero Negro microbial mats) may be discarded.
- In an open-reference OTU picking process, reads will be clustered against a reference sequence collection and any reads which do not hit the reference sequence collection are subsequently clustered de novo. pick_open_reference_otus.py is the primary interface for open-reference OTU picking in QIIME, and includes taxonomy assignment, sequence alignment, and tree-building steps. Open-reference OTU picking with pick_open_reference_otus.py is the preferred strategy for OTU picking. Open-reference OTU picking cannot be used for comparing non-overlapping amplicons, such as the V2 and the V4 regions of the 16S rRNA, or when there is no reference sequence collection to cluster against, for example because an infrequently used marker gene is being used. A benefit of open-reference OTU picking is that all reads are clustered. Another benefit is speed. Open-reference OTU picking is partially run in parallel. In particular, the subsampled open reference OTU picking process implemented in pick_open_reference_otus.py is much faster than pick_de_novo_otus.py as some strategies are applied to run several pieces of the workflow in parallel. However, a drawback of open-reference OTU picking is also speed. Some steps of this workflow run serially. For data sets with a lot of novel diversity with respect to the reference sequence collection, this can still take days to run.
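- The two-stage logic of open-reference picking can be sketched as below: reads are first matched against a reference collection, and the reads that fail are then clustered among themselves. This is an illustrative simplification rather than the pick_open_reference_otus.py workflow; the position-wise `identity` measure, the `denovoN` naming of new clusters, and the function name `open_reference_otus` are assumptions.

```python
def identity(a, b):
    """Fraction of matching positions; assumes equal-length, pre-aligned reads."""
    if len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def open_reference_otus(reads, reference, threshold=0.97):
    """reference: dict of reference id -> sequence.

    Returns a dict mapping OTU id -> list of reads assigned to that OTU.
    """
    otus = {}
    unmatched = []
    # Stage 1 (closed-reference): assign each read to its best reference hit.
    for read in reads:
        best_id, best_score = None, 0.0
        for ref_id, ref_seq in reference.items():
            score = identity(read, ref_seq)
            if score > best_score:
                best_id, best_score = ref_id, score
        if best_id is not None and best_score >= threshold:
            otus.setdefault(best_id, []).append(read)
        else:
            unmatched.append(read)  # a closed-reference run would discard these
    # Stage 2 (de novo): cluster the reads that failed to hit the reference.
    seeds = []  # (otu_id, seed_sequence)
    for read in unmatched:
        for otu_id, seed in seeds:
            if identity(read, seed) >= threshold:
                otus[otu_id].append(read)
                break
        else:
            otu_id = f"denovo{len(seeds)}"
            seeds.append((otu_id, read))
            otus[otu_id] = [read]
    return otus


reference = {"ref_A": "AAAAAAAAAA", "ref_B": "CCCCCCCCCC"}
reads = ["AAAAAAAAAA", "AAAAAAAAAT", "GGGGGGGGGG", "GGGGGGGGGT", "CCCCCCCCCC"]
otus = open_reference_otus(reads, reference, threshold=0.9)
print(otus)
```

In this sketch the first stage is independent per read and could be split across workers, while the second stage depends on the order in which seeds are created, which is one way to see why only part of the workflow parallelizes.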
- Generally, uclust is the preferred method for performing OTU picking. QIIME's uclust-based open reference OTU picking protocol will be used when circumstances allow (i.e., when none of the cases above, where open reference OTU picking is not possible, apply).
- The OTU-picking protocol described above is used for processing taxonomic marker gene sequences such as those from the 16S rRNA, ITS and LSU genes as well as other marker genes. In that case, the sequences themselves are not used to identify biological functions performed by members of the microbial community; they are instead used to identify which kinds of organisms are present. In the case of shotgun metagenomic sequencing, the data obtained are random fragments of all genomic DNA present in a given microbiome. These can be compared to reference genomes to identify the types of organisms present in a manner similar to marker gene sequences, but they may also be used to infer biological functions encoded by the genomes of microbes in the community. Typically this is done by comparing them to reference genomes and/or individual genes or genetic fragments that have been annotated for functional content. In the case of shotgun metatranscriptomic sequencing, the data obtained are similar to those for shotgun metagenomic sequencing except that RNA rather than DNA is used, and physical or chemical steps to deplete particular classes of sequence such as eukaryotic messenger RNA or ribosomal RNA are often used prior to library construction for sequencing. In the case of shotgun metaproteomics, protein fragments are obtained and matched to reference databases. In the case of shotgun metabolomics, metabolites are obtained by biophysical methods including nuclear magnetic resonance or mass spectrometry. In all of these cases, some type of coarse-graining of the original data equivalent to OTU picking to identify biologically relevant features is employed, and a biological observation matrix as described in relating either the raw or coarse-grained observations to samples is obtained.
The steps downstream from the Biological Observation Matrix, including the construction of distance matrices, taxon or functional tables, and industry-specific, actionable models from such data, are conceptually equivalent for each of these datatypes and are within the scope of the present invention.
- Next, the centroid sequence in each OTU will be selected as the representative sequence for that OTU. The centroid sequence will be chosen so that all sequences are within the similarity threshold to their representative sequence, and the centroid sequences are specifically chosen to be the most abundant sequence in each OTU.
- The OTU representative sequences will next be aligned using an alignment algorithm such as the PyNAST software package. PyNAST is a reference-based alignment approach, and is chosen because it achieves similar quality alignments to non-reference-based alignment approaches (e.g., muscle), where quality is defined as the effect of the alignment algorithm choice on the results of phylogenetic diversity analyses, but is easily run in parallel, which is not the case for non-reference-based alignment algorithms.
- Once a PyNAST alignment is obtained, positions that mostly contain gaps, or too high or too low variability, will be stripped to create a position-filtered alignment. This position-filtered alignment will be used to construct a phylogenetic tree using FastTree. This tree relates the OTUs to one another, will be used in phylogenetic diversity calculations (discussed below), and is referred to below as the OTU phylogenetic tree.
- In addition to being aligned, all OTU representative sequences will have taxonomy assigned to them. This can be performed using a variety of techniques, though our currently preferred approach is the uclust-based consensus taxonomy assigner implemented in QIIME. Here, all representative sequences (the “query” sequences) are queried against a reference database (e.g., Greengenes, which contains near-full length 16S rRNA gene sequences with human-curated taxonomic assignments; UNITE database for ITS; SILVA for 18S rRNA) with uclust. The taxonomy assignments of the three best database hits for each query sequences are then compared, and a consensus of those assignments is assigned to the query sequence.
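- A consensus over the best database hits can be sketched as follows. The rule used here, keeping each successive rank only while a majority of the hits share the same lineage down to that rank, is an assumption chosen for illustration and may differ from the assigner implemented in QIIME; `consensus_taxonomy` and `min_fraction` are hypothetical names.

```python
from collections import Counter


def consensus_taxonomy(hit_lineages, min_fraction=0.51):
    """Return the deepest lineage shared by at least min_fraction of the hits.

    hit_lineages: list of lineages, each a list of labels from the broadest
    rank to the most specific one.
    """
    if not hit_lineages:
        return []
    consensus = []
    depth = min(len(lineage) for lineage in hit_lineages)
    for rank in range(depth):
        # Count whole lineage prefixes so the result is always a real path.
        prefixes = Counter(tuple(lineage[: rank + 1]) for lineage in hit_lineages)
        prefix, count = prefixes.most_common(1)[0]
        if count / len(hit_lineages) < min_fraction:
            break  # no sufficiently supported label at this rank
        consensus = list(prefix)
    return consensus


three_best_hits = [
    ["Bacteria", "Proteobacteria", "Pseudomonas", "Pseudomonas fluorescens"],
    ["Bacteria", "Proteobacteria", "Pseudomonas", "Pseudomonas putida"],
    ["Bacteria", "Proteobacteria", "Pseudomonas", "Pseudomonas aeruginosa"],
]
assigned = consensus_taxonomy(three_best_hits)
print(assigned)
```

When the three hits disagree at the species level, as in this example, the query is assigned only down to the genus on which they agree.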
- The last of the “upstream” processing steps is to create a Biological Observation Matrix (BIOM) table, which contains counts of OTUs on a per-sample basis and the taxonomic assignment for each OTU. This table, which will be referred to as the BIOM table, the OTU phylogenetic tree constructed above, and the sample metadata mapping file will be the data required for computing phylogenetic diversity metrics in the next steps, and for doing visual and statistical analysis based on these diversity metrics. Although the BIOM is a specific file format for the table with OTU counts on a per-sample basis, other file formats, e.g. xls, txt, or csv are also possible.
- Once a BIOM table, an OTU phylogenetic tree, and a sample metadata mapping file (n-dimensional plot) are compiled, the microbial communities present in each sample will be analyzed and compared. These analyses include, but are not limited to, summarizing the taxonomic composition of the samples, understanding the “richness” and “evenness” of samples (defined below), understanding the relative similarity of communities, and identifying organisms or groups of organisms that are significantly different across community types. The different types of analysis on soil microbial community data will be illustrated in the Examples below.
- The taxonomic composition of samples is often something that researchers are most immediately interested in. This can be studied at various taxonomic levels (e.g., phylum, class, species) by collapsing OTUs in the BIOM table based on their taxonomic assignments. The abundance of each taxon on a per-sample basis is then typically presented in bar charts, area charts or pie charts, though this list is not comprehensive.
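- Collapsing an OTU table to a chosen taxonomic level amounts to summing the counts of OTUs that share a lineage down to that level. The sketch below uses plain dictionaries rather than the BIOM file format; the table layout and the function names `collapse_by_rank` and `relative_abundance` are assumptions made for the example.

```python
def collapse_by_rank(otu_table, taxonomy, rank):
    """Sum OTU counts per sample for OTUs sharing a lineage down to `rank`.

    otu_table: dict of OTU id -> dict of sample id -> count.
    taxonomy: dict of OTU id -> lineage (list of labels, broadest first).
    rank: zero-based index of the deepest level to keep.
    """
    collapsed = {}
    for otu_id, sample_counts in otu_table.items():
        taxon = "; ".join(taxonomy[otu_id][: rank + 1])
        row = collapsed.setdefault(taxon, {})
        for sample, count in sample_counts.items():
            row[sample] = row.get(sample, 0) + count
    return collapsed


def relative_abundance(table):
    """Convert per-sample counts to fractions of each sample's total."""
    totals = {}
    for row in table.values():
        for sample, count in row.items():
            totals[sample] = totals.get(sample, 0) + count
    return {
        taxon: {s: (c / totals[s] if totals[s] else 0.0) for s, c in row.items()}
        for taxon, row in table.items()
    }


otu_table = {
    "otu1": {"soil_1": 10, "soil_2": 0},
    "otu2": {"soil_1": 30, "soil_2": 5},
    "otu3": {"soil_1": 60, "soil_2": 15},
}
taxonomy = {
    "otu1": ["Bacteria", "Proteobacteria"],
    "otu2": ["Bacteria", "Proteobacteria"],
    "otu3": ["Bacteria", "Firmicutes"],
}
phylum_counts = collapse_by_rank(otu_table, taxonomy, rank=1)
phylum_fractions = relative_abundance(phylum_counts)
print(phylum_fractions)
```

The resulting per-sample fractions are the values that would be drawn in a bar, area or pie chart.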
- Alpha diversity refers to diversity of single samples (i.e., within-sample diversity), including features such as taxonomic richness and evenness. The species richness is a measure of the number of different species of microbes in a given sample. Species evenness refers to how close in numbers the abundance of each species in an environment is.
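- Richness and evenness can be computed directly from the per-sample counts. The sketch below uses the number of observed taxa for richness and Pielou's index (Shannon entropy divided by the logarithm of richness) for evenness; the choice of these particular indices is an assumption of the example, since the description does not fix a formula.

```python
import math


def richness(counts):
    """Number of taxa observed at least once in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon entropy (natural log) of the taxon proportions."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def evenness(counts):
    """Pielou's evenness: 1.0 when all observed taxa are equally abundant."""
    s = richness(counts)
    if s < 2:
        return 0.0  # undefined for fewer than two taxa; reported as 0.0 here
    return shannon(counts) / math.log(s)


even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]
print(richness(even_sample), round(evenness(even_sample), 3))
print(richness(skewed_sample), round(evenness(skewed_sample), 3))
```

Both example samples have the same richness, so the difference between them shows up only in evenness.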
- Measures of alpha diversity (or, within-sample diversity) have a long history in ecology. Alpha diversity scores have been shown to differ in different types of communities, for example, from different human body habitats. For instance, skin-surface bacterial communities have been found to be significantly more rich (i.e., containing more species) in females than in males, and at dry sites rather than sebaceous sites, and the gut microbiomes of lean individuals have been found to be significantly more rich than those of obese individuals. One way of viewing alpha diversity is in the context of environmental metadata; for example, the degree of phylogenetic diversity in a sample (a phylogeny-aware measure of richness) changes with soil pH, ranging from pH around 6.5 through 9.5, with a peak in richness around neutral pH of 7. In some cases alpha diversity scores will be useful input features for building predictive models via supervised classifiers.
- Generally the primary question of interest when beginning a survey of new microbial community types is what environmental features are associated with differences in the composition of microbial communities? This is a question of between-sample (or “beta”) diversity. Beta diversity metrics provide a measure of community dissimilarity, allowing investigators to determine the relative similarity of microbial communities. Metrics of beta diversity are pairwise, operating on two samples at a time.
- The difference in overall community composition between each pair of samples can be determined using the phylogenetically-aware UniFrac distance metric, which allows researchers to address many of these broader questions about the composition of microbial communities. UniFrac calculates the fraction of branch length unique to a sample across a phylogenetic tree constructed from each pair of samples. In other words, the UniFrac metric measures the distance between communities as the percentage of branch length that leads to descendants from only one of a pair of samples represented in a single phylogenetic tree, or the fraction of evolution that is unique to one of the microbial communities. Phylogenetic techniques for comparing microbial communities, such as UniFrac, avoid some of the pitfalls associated with comparing communities at only a single level of taxonomic resolution and provide a more robust index of community distances than traditional taxon-based methods, such as the Jaccard and Sorenson indices. Unlike phylogenetic techniques, species-based methods that measure the distance between communities based solely on the number of shared taxa do not consider the amount of evolutionary divergence between taxa, which can vary widely in diverse microbial populations. Among the first applications of phylogenetic information to comparisons of microbial communities were the Phylogenetic (P)-test and the Fst test. Pairwise significance tests are limited because they cannot be used to relate many samples simultaneously. Although phylogenetically-aware techniques such as UniFrac offer significant benefits, techniques lacking phylogenetic awareness can also be implemented with success: after an alternative distance metric (e.g. Bray-Curtis, Jensen-Shannon divergence) has been applied, the resulting inter-sample distance matrix is processed in the same way as a UniFrac distance matrix as described below.
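- The unweighted form of the UniFrac calculation can be sketched on a small tree as follows. The tree is represented as a flat list of branches, each with a length and the set of tip OTUs that descend from it; this representation and the function name `unweighted_unifrac` are assumptions made to keep the example short, not the data structures used by QIIME.

```python
def unweighted_unifrac(branches, sample_a, sample_b):
    """Fraction of observed branch length that leads to only one of two samples.

    branches: list of (length, set of tip OTU ids descending from the branch).
    sample_a, sample_b: sets of OTU ids present in each sample.
    """
    unique = 0.0
    total = 0.0
    for length, tips in branches:
        in_a = bool(tips & sample_a)
        in_b = bool(tips & sample_b)
        if in_a or in_b:
            total += length          # branch is observed in at least one sample
            if in_a != in_b:
                unique += length     # branch leads to exactly one sample
    return unique / total if total else 0.0


# Tree ((otu1:1, otu2:1):1, (otu3:1, otu4:1):1) written as a branch list.
branches = [
    (1.0, {"otu1"}),
    (1.0, {"otu2"}),
    (1.0, {"otu1", "otu2"}),
    (1.0, {"otu3"}),
    (1.0, {"otu4"}),
    (1.0, {"otu3", "otu4"}),
]
d = unweighted_unifrac(branches, {"otu1", "otu2"}, {"otu2", "otu3"})
print(d)
```

In this example three of the five observed branch-length units lead to only one sample, so the distance is 3/5. Two samples with no shared lineage give 1.0 and identical samples give 0.0.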
- QIIME implements the UniFrac metric and uses multivariate statistical techniques to determine whether groups of microbial communities are significantly different. When studying a set of n microbial communities, the UniFrac distances between all pairs of communities are computed to derive a distance matrix (using UniFrac or other distances) for all samples. This will be an n×n matrix, which is symmetric (because the distance between sample A and sample B is always equal to the distance between sample B and sample A) and will have zeros on the diagonal (because the distance between any sample and itself is always zero). For any reasonably large value of n (e.g., n>5) it becomes difficult to interpret patterns of beta diversity from a distance matrix directly.
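- The structure of the inter-sample distance matrix can be illustrated with a short sketch. Bray-Curtis is used here as the pairwise metric only because it needs no tree; any pairwise distance function could be passed in its place. The function names are hypothetical.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def distance_matrix(counts, metric):
    """Build the symmetric n x n matrix of pairwise distances between samples."""
    n = len(counts)
    matrix = [[0.0] * n for _ in range(n)]  # zeros on the diagonal
    for i in range(n):
        for j in range(i + 1, n):
            d = metric(counts[i], counts[j])
            matrix[i][j] = d  # fill the upper triangle ...
            matrix[j][i] = d  # ... and mirror it, since d(A, B) == d(B, A)
    return matrix


counts = [
    [10, 0, 5],
    [10, 0, 5],
    [0, 15, 0],
    [5, 5, 5],
]
dm = distance_matrix(counts, bray_curtis)
for row in dm:
    print([round(x, 3) for x in row])
```

Because of the symmetry, only n(n-1)/2 of the n² entries need to be computed.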
- Ordination techniques, such as principal coordinates analysis (PCoA) and non-metric multidimensional scaling (NMDS), together with approximations to these techniques that reduce computational cost or improve parallelism, will be used to summarize these patterns in two or three dimensional scatter plots. The patterns can also be represented in two dimensions using, for example, line graph, bar graphs, pie charts, Venn diagrams, etc. This is a non-exhaustive list. The patterns can also be represented in three dimensions using, for example, wire frame, ball and stick models, 3-D monitors, etc. This list is also non-exhaustive and does not limit the 2-D or 3-D forms by which the data can be represented.
- PCoA is a multivariate statistical technique for finding the most important orthogonal axes along which samples vary. Distances are converted into points in a space with a number of dimensions one less than the number of samples. The principal components, in descending order, describe how much of the variation (technically, the inertia) each of the axes in this new space explains. The first principal component separates the data as much as possible; the second principal component provides the next most separation along an orthogonal axis, and so forth. QIIME returns information on all principal component axes in a data table. It also allows easy visualization of that data in interactive scatter plots that allow users to choose which principal components to display. The points (each representing a single sample) are typically marked with colored symbols, and users can interactively change the colors of the points to detect associations between sample microbial composition and sample metadata. PCoA often reveals patterns of similarity that are difficult to see in a distance matrix, and the axes along which variation occurs can sometimes be correlated with environmental variables such as pH or temperature. Industrial variables, or control data, can include presence of oil, pressure, viscosity, etc. These control data can be filtered or removed in order to observe other control data factors to visualize possible patterns.
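- The core computation of PCoA can be sketched without external libraries for the first axis only. The squared distance matrix is double-centered, and the leading eigenvector is found by power iteration and scaled by the square root of its eigenvalue. This sketch assumes the distances behave like Euclidean distances, so that the largest-magnitude eigenvalue is positive; a production implementation would use a full eigendecomposition. The name `pcoa_first_axis` is hypothetical.

```python
import math


def pcoa_first_axis(dist, iterations=200):
    """Return the coordinate of each sample along the first PCoA axis."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double-centering turns squared distances into an inner-product matrix.
    b = [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean) for j in range(n)]
        for i in range(n)
    ]
    # Power iteration for the leading eigenvector of b.
    v = [float(i + 1) for i in range(n)]  # non-constant starting vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n  # all samples coincide
        v = [x / norm for x in w]
        eigenvalue = norm
    return [x * math.sqrt(eigenvalue) for x in v]


# Three samples lying on a line at positions 0, 1 and 3.
positions = [0.0, 1.0, 3.0]
dist = [[abs(a - b) for b in positions] for a in positions]
coords = pcoa_first_axis(dist)
print([round(c, 3) for c in coords])
```

For samples that truly lie on one line, as here, the first axis alone reproduces every pairwise distance; the sign of the axis is arbitrary.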
- New ways of exploring and visualizing results and identifying meaningful patterns are increasingly important as the size and complexity of microbial datasets rapidly increase. QIIME 1.8.0 (released in December 2013) introduces several powerful tools to assist in visualizations of the results of PCoA, primarily the Emperor 3D scatter plot viewer. This includes (i) the ability to color large collections of samples using different user-defined subcategories (for example, coloring environmental samples according to temperature or pH), (ii) automatic scaled/unscaled views, which accentuate dimensions that explain more variance, (iii) the ability to interactively explore tens of thousands of points (and user-configurable labels) in 3D, and (iv) parallel coordinates displays that allow the dimensions that separate particular groups of environments to be readily identified.
- The significance of patterns identified in PCoA can be tested with a variety of methods. The significance of the clusters identified by UniFrac can be established using Monte Carlo based t-tests, where samples are grouped into categories based on their metadata, and distributions of distances within and between categories are compared. For example, if microbial communities are being compared between soils from a vineyard and soils unassociated with a vineyard, the distribution of UniFrac distances between soils from the same group can be compared to those between soils from different groups by computing a t-score (the actual t-score). The sample labels (vineyard, non-vineyard) can then be randomly shuffled 10,000 times, and a t-score calculated for each of these randomized data sets (the randomized t-scores). If the vineyard soils and non-vineyard soils are significantly different from one another in composition, the actual t-score should be higher than the vast majority of the randomized t-scores. A p-value will be computed by dividing the number of randomized t-scores that are better than the actual t-score by 9999. The Monte Carlo simulations described here will be run in parallel, and are not limited to pairs of sample categories, so they support analysis of many different sample types.
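- The label-shuffling test can be sketched as follows. Several details are assumptions of this example rather than part of the description: a Welch-style t statistic is used, the example runs 999 shuffles to stay fast, a randomized score counts as "better" when it is at least as large as the actual score, and the p-value denominator is the number of shuffles actually performed.

```python
import math
import random
import statistics


def t_score(within, between):
    """Welch-style t statistic; positive when between-group distances are larger."""
    diff = statistics.mean(between) - statistics.mean(within)
    se = math.sqrt(
        statistics.variance(between) / len(between)
        + statistics.variance(within) / len(within)
    )
    if se == 0.0:
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / se


def split_distances(dist, labels):
    """Separate pairwise distances into within-group and between-group lists."""
    within, between = [], []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            (within if labels[i] == labels[j] else between).append(dist[i][j])
    return within, between


def monte_carlo_p_value(dist, labels, n_permutations=999, seed=0):
    """Return (actual t-score, fraction of shuffles scoring at least as high)."""
    actual = t_score(*split_distances(dist, labels))
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if t_score(*split_distances(dist, shuffled)) >= actual:
            hits += 1
    return actual, hits / n_permutations


# Toy distances: four vineyard and four non-vineyard samples placed on a line.
positions = [0.0, 0.1, 0.25, 0.4, 5.0, 5.2, 5.3, 5.55]
dist = [[abs(a - b) for b in positions] for a in positions]
labels = ["vineyard"] * 4 + ["non_vineyard"] * 4
actual_t, p_value = monte_carlo_p_value(dist, labels)
print(actual_t > 0, p_value)
```

Each shuffle is independent of the others, which is what allows the simulations to be distributed across workers.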
- If the samples fall along a gradient that is correlated with some environmental metadata (e.g., pH, salinity), rather than clustering into discrete groups (as described above), there are alternative approaches to testing for statistical significance. For example, if pH appears to be correlated with the principal coordinate 1 (PC1) values in a PCoA plot, a Monte Carlo-based Pearson or Spearman correlation test will be performed. Here, pH and PC1 will be tested to, for example, compute a Spearman rho value. The labels of the samples will again be shuffled 10,000 times and rho computed for each randomized data set. The p-value for the pH versus PC1 correlation will then be the number of randomized rho values that are higher than the actual rho value divided by 9999.
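- A permutation test for a gradient can be sketched in the same way, using Spearman's rho between a metadata variable and the PC1 coordinates. The example runs 999 shuffles and divides by the number of shuffles performed; both choices, along with the toy pH and PC1 values and the function names, are assumptions of this sketch.

```python
import math
import random


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx and sy else 0.0


def permutation_p_value(x, y, n_permutations=999, seed=0):
    """Return (actual rho, fraction of shuffles with rho at least as high)."""
    actual = spearman_rho(x, y)
    rng = random.Random(seed)
    shuffled = list(x)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the pairing between x and y
        if spearman_rho(shuffled, y) >= actual:
            hits += 1
    return actual, hits / n_permutations


ph = [4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0]
pc1 = [-0.42, -0.30, -0.25, -0.05, 0.02, 0.18, 0.27, 0.40]
rho, p_value = permutation_p_value(ph, pc1)
print(round(rho, 3), p_value)
```

Because the comparison is one-sided, as in the text above, this version detects positive associations only; a negative gradient would need the comparison reversed or absolute values used.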
- Identifying Features that are Predictive of Environment Characteristics (i.e., Sample Metadata)
- Supervised classification is a machine learning approach for developing predictive models from training data. Each training data point consists of a set of input features, for example, the relative abundance of taxa, and a qualitative dependent variable giving the correct classification of that data point. In microbiome analysis, such classifications might include soil nutrients, predominant weather patterns, disease states, therapeutic results, or forensic identification. The goal of supervised classification is to derive some function from the training data that can be used to assign the correct class or category labels to novel inputs (e.g. new samples), and to learn which features, for example, taxa, discriminate between classes. Common applications of supervised learning include text classification, microarray analysis, and other bioinformatics analyses. For example, when microbiologists use the Ribosomal Database Project website to classify 16S rRNA gene sequences taxonomically, a form of supervised classification is used.
- The primary goal of supervised learning is to build a model from a set of categorized data points that can predict the appropriate category membership of unlabeled future data. The category labels can be any type of important metadata, such as sugar content, viscosity, pH or temperature. The ability to classify unlabeled data is useful whenever alternative methods for obtaining data labels are difficult or expensive.
- This goal of building predictive models is very different from the traditional goal of fitting an explanatory model to one's data set. The concern is less with how well the model fits our particular set of training data, but rather with how well it will generalize to novel input data. Hence, there is a problem of model selection. A model that is too simple or general is undesirable because it will fail to capture subtle, but important information about the independent variables (underfitting). A model that is too complex or specific is also undesirable because it will incorporate idiosyncrasies that are specific only to the particular training data (overfitting). The expected prediction error (EPE) of the model on future data must be optimized.
- When the labels for the data are easily obtained, a predictive model is unnecessary. In these cases, supervised learning will still be useful for building descriptive models of the data, especially in data sets where the number of independent variables or the complexity of their interactions diminishes the usefulness of classical univariate hypothesis testing. Examples of this type of model can be seen in the various applications of supervised classification to microarray data, in which the goal is to identify a small, but highly predictive subset of the thousands of genes profiled in an experiment for further investigation. In microbial ecology, the analogous goal is to identify a subset of predictive taxa. In these descriptive models, accurate estimation of the EPE is still important to ensure that the association of the selected taxa with the class labels is not just happenstance or spurious. This process of finding small but predictive subsets of features, called feature selection, is increasingly important as the size and dimensionality of microbial community analyses continue to grow.
- A common way to estimate the EPE of a particular model is to fit the model to a subset (e.g., 90%) of the data and then test its predictive accuracy on the other 10% of the data. This can provide an idea of how well the model would perform on future data sets if the goal is to fit it to the entire current data set. To improve the estimate of the EPE, this process will be repeated ten times so that each data point is part of the held-out validation data once. This procedure, known as cross-validation, will allow for the comparison of models that use very different inner machinery or different subsets of input features. Of course, if many different models are tried and the one that provides the lowest cross-validation error for the entire data set is selected, it is likely that the reported EPE will be too optimistic. This is similar to the problem of making multiple comparisons in statistical inference; some models are bound to fortuitously match a particular data set. Hence, whenever possible, an entirely separate test set will be held out for estimating the EPE of the final model, after performing model selection.
- Even if the method for selecting the best parameters or degree of complexity for a particular kind of model is determined, there is still a general challenge of picking what general class of models is most appropriate for a particular data set. The core aspect of choosing the right models for microbiome classification is to combine the knowledge of the most relevant constraints (e.g., data sparseness) inherent in the data with the understanding of the strengths and weaknesses of various approaches to supervised classification. If it is understood what structures will be inherent in the data, then models that take advantage of those structures will be chosen. For example, in the classification of microbiomes, methods that can model nonlinear effects and complex interactions between organisms will be desired. In another example, given the highly diverse nature of many microbial communities on the human body, models designed specifically to perform aggressive feature selection when faced with high-dimensional data will be most appropriate. Specialized generative models will be designed to incorporate prior knowledge about the data as well as the level of certainty about that prior knowledge. Instead of learning to predict class labels based on input features, a generative model will learn to predict the input features themselves. In other words, a generative model will learn what the data “looks like,” regardless of the class labels. One potential benefit of generative models such as topic models and deep-layered belief nets will be that they can extract useful information even when the data are unlabeled. The ability to use data from related experiments to help build classifiers for one's own labeled data will be important as the number of publicly available microbial community data sets continues to grow.
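- Ten-fold cross-validation can be sketched as follows. A 1-nearest-neighbor classifier is used purely as a stand-in model, and the toy two-feature data set and all function names are assumptions of this example.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Split indices 0..n-1 into k folds so each index is held out exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nearest_neighbor_predict(train_x, train_y, x):
    """Label of the training point closest to x (squared Euclidean distance)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[best]


def cross_validated_error(features, labels, k=10, seed=0):
    """Fraction of samples misclassified when each fold is held out in turn."""
    errors = 0
    for held_out in k_fold_indices(len(features), k, seed):
        held = set(held_out)
        train_x = [features[i] for i in range(len(features)) if i not in held]
        train_y = [labels[i] for i in range(len(labels)) if i not in held]
        for i in held_out:
            if nearest_neighbor_predict(train_x, train_y, features[i]) != labels[i]:
                errors += 1
    return errors / len(features)


# Toy relative abundances of two taxa in 10 vineyard and 10 non-vineyard soils.
features = [[0.8 + 0.01 * i, 0.2 - 0.01 * i] for i in range(10)]
features += [[0.2 + 0.01 * i, 0.8 - 0.01 * i] for i in range(10)]
labels = ["vineyard"] * 10 + ["non_vineyard"] * 10
error = cross_validated_error(features, labels, k=10)
print(error)
```

The model is never evaluated on a sample it was fitted to, which is what makes the averaged error an estimate of performance on new data rather than a measure of fit.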
- Machine learning classification techniques will be applied to many types of microbial community data, for example, to the analysis of soil samples. For the soil samples, the samples will be classified according to environment type using support vector machines (SVMs) and k-nearest neighbors (KNN). Supervised learning will be used extensively in other classification domains with high-dimensional data, such as macroscopic ecology, microarray analysis, and text classification.
- The goal of feature selection will be to find the combination of the model parameters and the feature subset that provides the lowest expected error on novel input data. Feature selection will be of utmost importance in the realm of microbiome classification due to the generally large number of features (i.e., constituent species-level taxa, or genes, or transcripts, or metabolites, or some combination of these): in addition to improving predictive accuracy, reducing the number of features leads to the production of more interpretable models. Approaches to feature selection are known to those skilled in the art and are typically divided into three categories: filter methods, wrapper methods, and embedded methods.
- As the simplest form of feature selection, filter methods are completely agnostic to the choice of learning algorithm being used; that is, they treat the classifier as a black box. Filter methods use a two-step process. First, a univariate test (e.g., t-test) or multivariate test (e.g., a linear classifier built with each unique pair of features) will be performed to estimate the relevance of each feature, and either (1) all features whose scores exceed a predetermined threshold will be selected or (2) the best n features will be selected for inclusion in the model. Second, a classifier will be run on the reduced feature set. The choice of n can be determined using a validation data set or cross-validation on the training set.
- Filter methods have several benefits, including their low computational complexity, their ease of implementation, and their potential, in the case of multivariate filters, to identify important interactions between features. The fact that the filter has no knowledge about the classifier is advantageous in that it provides modularity, but it can also be disadvantageous, as there is no guarantee that the filter and the classifier will have the same optimal feature subsets. For example, a linear filter (e.g., correlation-based) is unlikely to choose an optimal feature subset for a nonlinear classifier such as an SVM or a random forest (RF).
- The purpose of a filter will be to identify features that are generally predictive of the response variable, or to remove features that are noisy or uninformative. Common filters include, but are not limited to, the between-class chi-squared test; information gain (decrease in entropy when the feature is removed); various standard classification performance measures such as precision, recall, and the F-measure; the accuracy of a univariate classifier; and the bi-normal separation (BNS). BNS treats the univariate true-positive rate and false-positive rate (tpr, fpr, based on document presence/absence in text classification) as though they were cumulative probabilities from the standard normal cumulative distribution function, and the difference between their respective z-scores, $F^{-1}(\mathrm{tpr}) - F^{-1}(\mathrm{fpr})$, where $F^{-1}$ is the inverse of that distribution function, will be used as a measure of that variable's relevance to the classification task.
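As an illustration of a univariate filter, the following sketch scores presence/absence features by bi-normal separation and keeps the best n. The clipping constant `eps`, the use of the absolute value for ranking, the function names, and the small example matrix are assumptions made for this sketch.

```python
from statistics import NormalDist


def bns_score(tp, fn, fp, tn, eps=0.0005):
    """Difference of z-scores of the true-positive and false-positive rates.

    Rates are clipped to [eps, 1 - eps] (an assumption) so that the inverse
    normal CDF stays finite when a rate is exactly 0 or 1.
    """
    tpr = min(max(tp / (tp + fn), eps), 1 - eps)
    fpr = min(max(fp / (fp + tn), eps), 1 - eps)
    inv_cdf = NormalDist().inv_cdf  # inverse standard normal CDF
    return inv_cdf(tpr) - inv_cdf(fpr)


def top_n_features(presence, labels, n):
    """Rank features by |BNS| and return the indices of the best n.

    presence: rows of 0/1 values (one row per sample, one column per taxon).
    labels:   booleans, True for the positive class.
    """
    scored = []
    for j in range(len(presence[0])):
        tp = sum(1 for row, pos in zip(presence, labels) if pos and row[j])
        fn = sum(1 for row, pos in zip(presence, labels) if pos and not row[j])
        fp = sum(1 for row, pos in zip(presence, labels) if not pos and row[j])
        tn = sum(1 for row, pos in zip(presence, labels) if not pos and not row[j])
        scored.append((abs(bns_score(tp, fn, fp, tn)), j))
    scored.sort(reverse=True)
    return [j for _, j in scored[:n]]


# Hypothetical data: feature 0 is present only in the positive class.
presence_demo = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
labels_demo = [True, True, True, False, False, False]
print(top_n_features(presence_demo, labels_demo, n=1))
```

The selected columns would then be passed to whatever classifier is used, with n chosen on a validation set or by cross-validation on the training set.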
- Wrapper methods are usually the most computationally intensive and perhaps the least elegant of the feature selection methods. A wrapper method, like a filter method, will treat the classifier as a black box, but instead of using a simple univariate or multivariate test to determine which features are important, a wrapper will use the classifier itself to evaluate subsets of features. This leads to a computationally intensive search: an ideal wrapper will retrain the classifier for all feature subsets, and will choose the one with the lowest validation error. Were this search tractable, wrappers would be superior to filters because they would be able to find the optimal combination of features and classifier parameters. The search will not be tractable for high-dimensional data sets; hence, the wrapper will use heuristics during the search to find the optimal feature subset. The use of a heuristic will limit the wrapper's ability to interact with the classifier for two reasons: the inherent lack of optimality of the search heuristic, and the compounded lack of optimality in cases where the wrapper's optimal feature set differs from that of the classifier. In many cases the main benefit of using wrappers instead of filters, namely that the wrapper can interact with the underlying classifier, is shared by embedded methods, and the additional computational cost incurred by wrappers therefore makes such methods unattractive.
- Embedded approaches to feature selection will perform an integrated search over the joint space of model parameters and feature subsets so that feature selection becomes an integral part of the learning process. Embedded feature selection will have the advantage over filters that it has the opportunity to search for the globally optimal parameter-feature combination. This is because feature selection will be performed with knowledge of the parameter selection process, whereas filter and wrapper methods treat the classifier as a “black box.” As discussed above, performing the search over the whole joint parameter-feature space is generally intractable, but embedded methods will use knowledge of the classifier structure to inform the search process, while in the other methods the classifier must be built from scratch for every feature set.
- The method described herein will be useful in a plethora of industrial settings. The scope of the information obtained can vary, based on the type of goal to be obtained. For example, the method can be applied on a macro scale, for example, sampling and analysis from all vineyards throughout the world. The method can also be applied on a regional scale, for example, sampling and analysis of vineyards in a region of the United States. Further, the method can be applied on a local scale, for example, sampling and analysis in a vineyard in Virginia. Next, the method can be applied on a run-based scale, for example, sampling and analysis of different harvests in one winery.
- Vintners rely heavily on the soil for the growth of their vineyards. With microbiome analysis of particular soil that yielded a successful harvest generally or that was especially resistant to climatic variation, a vintner will use this information to predict a number of things. First, the vintner will use the microbiome information from a successful harvest of the previous season and compare it with the current soil of his vineyard to see if the soil is likely to yield a successful harvest this season. Second, if the soil microbiome is much different, he will use that information to plant a different grape variety that will flourish in the soil. This data will be obtained from previous years' soil analysis. Third, if the vintner is looking to expand his vineyard or purchase a different vineyard, the soil microbiome of the prospective vineyard will be tested to see which grape varieties have growth potential in that particular soil. If the vintner desires to plant a specific grape variety, the analysis of the soil may steer him away from the new land if the microbiome of the soil is more likely to yield a successful season of a different variety. Fourth, a particular high-end variety that the vintner is interested in cultivating may only grow in certain soil conditions. An analysis of the soil (including the microbiome) where the particular crop has thrived compared to the vintner's current soil will inform the vintner of the feasibility of the new crop. Precision oenology is one of the advantages of the embodiments of this invention: the information related to the fermentation species identified in the soil is used to provide advice to vintners and winemakers to improve the organoleptic properties of the wine. With the soil being the repository of most of the fermentation species, the value of the soil/harvest could fluctuate depending on a Micro-Wine-Makers index identifying the percentage of fermentation species relevant for the specific winemaking process. The index would provide information on the optimal microbiome community needed in the soil to launch the fermentation process.
- In another embodiment the first set of one or more microorganisms are obtained from a source likely to favor the selection of appropriate microorganisms. By way of example, the source may be a particular environment in which it is desirable for other plants to grow, or which is thought to be associated with terroir. In another example, the source may be a plant having one or more desirable traits, for example a plant which naturally grows in a particular environment or under certain conditions of interest. By way of example, a certain plant may naturally grow in sandy soil or sand of high salinity, or under extreme temperatures, or with little water, or it may be resistant to certain pests or disease present in the environment, and it may be desirable for a commercial crop to be grown in such conditions, particularly if they are, for example, the only conditions available in a particular geographic location. By way of further example, the microorganisms may be collected from commercial crops grown in such environments, or more specifically from individual crop plants best displaying a trait of interest amongst a crop grown in any specific environment: for example the fastest-growing plants amongst a crop grown in saline-limiting soils, or the least damaged plants in crops exposed to severe insect damage or disease epidemic, or plants having desired quantities of certain metabolites and other compounds, including fibre content, oil content, and the like, or plants displaying desirable colours, taste or smell. The microorganisms may be collected from a plant of interest or any material occurring in the environment of interest, including fungi and other animal and plant biota, soil, water, sediments, and other elements of the environment as referred to previously.
- While the invention obviates the need for pre-existing knowledge about a microorganism's desirable properties with respect to a particular plant species, in one embodiment a microorganism or a combination of microorganisms of use in the methods of the invention may be selected from a pre-existing collection of individual microbial species or strains based on some knowledge of their likely or predicted benefit to a plant. For example, the microorganism may be predicted to: improve nitrogen fixation; release phosphate from the soil organic matter; release phosphate from the inorganic forms of phosphate (e.g. rock phosphate); “fix carbon” in the root microsphere; live in the rhizosphere of the plant thereby assisting the plant in absorbing nutrients from the surrounding soil and then providing these more readily to the plant; increase the number of nodules on the plant roots and thereby increase the number of symbiotic nitrogen fixing bacteria (e.g. Rhizobium species) per plant and the amount of nitrogen fixed by the plant; elicit plant defensive responses such as ISR (induced systemic resistance) or SAR (systemic acquired resistance) which help the plant resist the invasion and spread of pathogenic microorganisms; compete with microorganisms deleterious to plant growth or health by antagonism, or competitive utilization of resources such as nutrients or space; change the color of one or more part of the plant, or change the chemical profile of the plant, its smell, taste or one or more other quality.
- As used herein, “individual isolates” should be taken to mean a composition or culture comprising a predominance of a single genera, species or strain of microorganism, following separation from one or more other microorganisms. The phrase should not be taken to indicate the extent to which the microorganism has been isolated or purified. However, “individual isolates” preferably comprise substantially only one genus, species, or strain of microorganism.
- The microorganisms can be isolated from a plant or plant material, surface or growth media associated with a selected plant using any appropriate techniques known in the art, including but not limited to those techniques described herein. For example, a whole plant could be obtained and optionally processed, such as mulched or crushed. Alternatively, individual tissues or parts of selected plants (such as leaves, stems, roots, and seeds) may be processed.
- The following is a list of non-limiting examples of the types of plants the methods of the invention may be applied to:
-
- Crops grown for the production of non-alcoholic beverages and stimulants (coffee, black and green teas, cocoa, tobacco);
- Plants grown for conversion to energy, or for biological transformation during the production of biofuels, industrial solvents or chemical products, e.g. ethanol or butanol, propane diols, or other fuel or industrial material, including sugar crops (e.g. beet, sugar cane), starch producing crops (e.g. C3 and C4 cereal crops and tuberous crops), cellulosic crops such as forest trees (e.g. Pines, Eucalypts) and Graminaceous and Poaceous plants such as bamboo, switch grass, miscanthus; crops used in energy, biofuel or industrial chemical production via gasification and/or microbial or catalytic conversion of the gas to biofuels or other industrial raw materials such as solvents or plastics, with or without the production of biochar (e.g. biomass crops such as coniferous, eucalypt, tropical or broadleaf forest trees, graminaceous and poaceous crops such as bamboo, switch grass, miscanthus, sugar cane, or hemp or softwoods such as poplars, willows); and biomass crops used in the production of biochar;
- The present invention also provides kits which are useful for carrying out the present invention. The present kits comprise one or more container means containing the above-described assay components. The kit also comprises other container means containing solutions necessary or convenient for carrying out the invention. The container means can be made of glass, plastic or foil and can be a vial, bottle, pouch, tube, bag, etc. The kit may also contain written information, such as procedures for carrying out the present invention or analytical information, such as the amount of reagent contained in the first container means. The container means may be in another container means, e.g. a box or a bag, along with the written information.
- The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventors and thought to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.
- All documents cited herein are hereby incorporated in their entirety by reference thereto.
- Sample Reception/Order Management
- Samples must be transported from their point of origin to the genetic testing laboratory where they are processed. We have created a full system to manage orders automatically using internet-based tools similar to e-commerce. Through this system we receive basic client data, such as identification or shipping address.
- Sample Collection
- We have developed a specific methodology to collect samples, specifically from the winemaking process. We can distinguish among seven different stages:
-
- 1. Soil,
- 2. Grape/Fruit,
- 3. Must,
- 4. Alcoholic fermentation (beginning, middle and end), depending on parameters such as alcohol content, amount of sugar, and density.
- 5. Malolactic fermentation (beginning, middle and end), depending on the amount of malic and acetic acid.
- 6. Barrel (beginning, middle and end), measured in months.
- 7. Bottle
- To test the soil, it is enough to collect 200 mg of soil from what we call a soil unit. In the case of vineyards, a unit is defined as a parcel of land with the same grape variety, type of soil, culture techniques, and climate characteristics. If the vineyard is on the side of a hill, it should be divided into different independent units and different sampling kits used.
- To capture most of the fermentative species, samples should be taken at a distance of 30 cm (12 in) from the vine trunk and at a depth of 5-10 cm (2-4 in).
- We have developed specific forms and questionnaires to collect the additional data which will allow the understanding of the influence of microorganisms in the fermentation processes and data comparison.
- Most of the forms have been translated to information technology (IT) language and tools. For example, through a mobile application it is possible to register a soil sample by providing: tube ID, grape variety, planting year of the grape, and a picture including: an image of the soil, sampling date and timing, coordinates of the sample (location). With this information, especially the coordinates, it is possible to gather additional information from external databases regarding soil composition, climate, or weather conditions to be included in the sample assessment and evaluation.
- Each kind of sample should be shipped in a different way. Freezing the samples is usually the standard method of stabilizing the microbial community included in a sample.
- Soil: After some experiments with different preservative buffers, we have determined that the best way to ship soil samples is at room temperature. The microbiome remains consistent and does not change significantly for at least 14 days.
- Liquid samples: We are developing tests with different preservative buffers to identify the ideal additive to inactivate microbial activity in a sample. The ideal buffer should be in powder form rather than liquid, as it is easier to preserve and easier to deliver.
- Each sample should be identified with a unique ID so that it can be treated as unique during the workflow. We have designed our own database architecture according to the requirements and optimal functionality of the data that we request, use, and collect. This structure includes tables and fields which create relationships among parameters and data, including some evolutionary fields to be able to track each sample in real time.
- The sample ID has been conceived as a combination of six alphanumeric characters. The first three characters identify the client and the last three characters identify the sample number. With this unique code it is possible to create almost 50,000 sample IDs per client. If a client runs out of sample IDs, a new client ID can be assigned to the same client.
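The "almost 50,000" figure is consistent with three characters drawn from a 36-symbol alphabet (digits 0-9 and letters A-Z), since 36 × 36 × 36 = 46,656. The sketch below illustrates this under that assumption; the alphabet, the function names, and the zero-padded encoding are hypothetical choices, not the scheme actually used.

```python
import string

# Assumed alphabet: 10 digits + 26 uppercase letters = 36 symbols.
ALPHABET = string.digits + string.ascii_uppercase
FIELD_WIDTH = 3
IDS_PER_FIELD = len(ALPHABET) ** FIELD_WIDTH  # 36**3 = 46656


def encode_field(n, width=FIELD_WIDTH):
    """Encode a non-negative integer as a fixed-width base-36 string."""
    if not 0 <= n < len(ALPHABET) ** width:
        raise ValueError("value does not fit in the field")
    chars = []
    for _ in range(width):
        n, remainder = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def make_sample_id(client_index, sample_index):
    """Client field (first three characters) + sample field (last three)."""
    return encode_field(client_index) + encode_field(sample_index)


def split_sample_id(sample_id):
    """Return the (client, sample) parts of a six-character ID."""
    if len(sample_id) != 2 * FIELD_WIDTH:
        raise ValueError("expected a six-character ID")
    return sample_id[:FIELD_WIDTH], sample_id[FIELD_WIDTH:]


print(IDS_PER_FIELD)                 # 46656
print(make_sample_id(1, 46655))      # 001ZZZ
```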
- Samples can pass through the following traceability steps:
-
- 1. Order: pending shipping
- 2. Shipped
- 3. Received in the lab
- 4. DNA extraction
- 5. Quality Control 1
- 6. Library building
- 7. Quality Control 2
- 8. DNA Sequencing
- 9. Bioinformatics processing
- 10. See results
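The traceability steps listed above can be modeled as an ordered status field on each sample record. The following is a minimal sketch; the enum names and the rule that a sample advances exactly one step at a time are assumptions made for illustration.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Traceability steps, numbered as in the list above."""
    ORDER_PENDING_SHIPPING = 1
    SHIPPED = 2
    RECEIVED_IN_LAB = 3
    DNA_EXTRACTION = 4
    QUALITY_CONTROL_1 = 5
    LIBRARY_BUILDING = 6
    QUALITY_CONTROL_2 = 7
    DNA_SEQUENCING = 8
    BIOINFORMATICS_PROCESSING = 9
    SEE_RESULTS = 10


def advance(stage):
    """Return the next stage; the final stage cannot be advanced."""
    if stage is Stage.SEE_RESULTS:
        raise ValueError("sample is already at the final stage")
    return Stage(stage + 1)


stage = Stage.ORDER_PENDING_SHIPPING
stage = advance(stage)
print(stage.name)  # SHIPPED
```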
- a) DNA Extraction
- When a sample arrives at our genetic facilities, the first step is to extract the DNA by breaking open the cells, releasing the DNA and concentrating it. We apply an improved metagenomic approach.
- We are using the RNA PowerSoil® Total RNA Isolation Kit (MO BIO Laboratories, Inc., Carlsbad, CA) for the metatranscriptome analysis. From 50 ml of wine, must, or alcoholic or malolactic fermentation sample, centrifuge at 4000×g for 15 minutes in a 50 ml Falcon tube.
- 1. Discard the supernatant.
- 2. Wash step: Dilute the pellet using 1.5 ml of PBS and transfer to a 1.5 ml Eppendorf tube.
- 3. Centrifuge at maximum speed for 3 minutes.
- 4. Repeat the wash and centrifugation steps twice. Note: In this step you have to be aware of the pellet quantity, so if you get little pellet avoid repeating the wash step and proceed directly to diluting the pellet. If you are processing must, avoid the wash step.
- 5. Dilute the pellet using the liquid.
- 6. The samples that we are dealing with are soil, liquids, and fruit. In the following lines we will describe the steps that we have identified as optimal. To do this we use some commercial DNA extraction kits adapted to our needs.
- Based on the PowerLyzer® PowerSoil® (MO BIO Laboratories, Inc. Carlsbad, CA) DNA Isolation Kit
- 1. To the PowerLyzer® Glass Bead Tube, 0.1 mm provided, add 0.2 grams of soil sample.
- 2. Add 750 ul of Guanidine thiocyanate solution to the Glass Bead Tube. Gently vortex to mix.
- 3. Add 60 ul of surfactant and invert several times or vortex briefly.
- 4. After adding surfactant solution, incubate 10 minutes at 70° C.
- 5. Secure PowerBead Tubes into the Precellys device (bead-beating homogenization, Bertin Technologies, Montigny-le-Bretonneux, France). Vortex at 5500 rpm for 90 seconds. You will have to set up the program at 3 cycles.
- 6. Make sure the PowerBead Tubes rotate freely in your centrifuge without rubbing. Centrifuge tubes at 10,000×g for 30 seconds at room temperature.
- 7. Transfer the supernatant to a clean 2 ml Collection Tube (provided).
- 8. Add 250 ul of Solution protein precipitant and vortex for 5 seconds. Incubate at 4° C. for 5 minutes.
- 9. Centrifuge the tubes at room temperature for 1 minute at 10,000×g.
- 10. Avoiding the pellet, transfer up to, but no more than, 600 ul of supernatant to a clean 2 ml Collection Tube (provided).
- 11. Add 200 ul of Inhibitor removal compound and vortex briefly. Incubate at 4° C. for 5 minutes.
- 12. Centrifuge the tubes at room temperature for 1 minute at 10,000×g.
- 13. Avoiding the pellet, transfer up to, but no more than, 750 ul of supernatant into a clean 2 ml Collection Tube (provided).
- 14. Shake to mix chaotropic agent before use. Add 1200 ul of Solution C4 to the supernatant and vortex for 5 seconds.
- 15. Load approximately 675 ul onto a Spin Filter and centrifuge at 10,000×g for 1 minute at room temperature. Discard the flow through and add an additional 675 ul of supernatant to the Spin Filter and centrifuge at 10,000×g for 1 minute at room temperature. Load the remaining supernatant onto the Spin Filter and centrifuge at 10,000×g for 1 minute at room temperature. Note: A total of 3-4 loads for each sample processed are required.
- 16. Add 500 ul of Solution Ethanol 60% and centrifuge at room temperature for 30 seconds at 10,000×g.
- 17. Discard the flow through.
- 18. Centrifuge again at room temperature for 1 minute at 10,000×g
- 19. Carefully place spin filter in a clean 2 ml Collection Tube (provided). Avoid splashing any Solution C5 onto the Spin Filter.
- 20. Add 100 ul of 1,3-Propanediol, 2-amino-2-(hydroxymethyl)-, hydrochloride mix with Tris HCl 2-Amino-2-(hydroxymethyl)-1,3-propanediol to the center of the white filter membrane. Stand the tube for at least 1 minute.
- 21. Centrifuge at room temperature for 30 seconds at 10,000×g.
- 22. Discard the Spin Filter. The DNA in the tube is now ready for 16S-ITS library preparation.
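The number of Spin Filter loads in the binding step follows from the volumes given in the protocol above: up to 750 ul of supernatant plus 1200 ul of Solution C4, loaded about 675 ul at a time. A minimal sketch of that arithmetic, with a hypothetical function name, is:

```python
import math


def spin_filter_loads(supernatant_ul, binding_solution_ul, load_ul=675.0):
    """Number of loads needed to pass the whole mixture through the filter."""
    total_ul = supernatant_ul + binding_solution_ul
    return math.ceil(total_ul / load_ul)


# Maximum volumes from the protocol: 750 ul supernatant + 1200 ul Solution C4.
print(spin_filter_loads(750, 1200))  # 1950 ul in 675 ul loads
```

At the stated maximum volumes and load size this gives three loads; smaller individual loads would require a fourth, in line with the 3-4 loads noted in the protocol.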
- 1. Add 20 units of grapes previously frozen at −80° C. to a 50 ml falcon tube.
- 2. Add 20 ml of purified water.
- 3. Vortex 5 minutes without breaking the
- 4. Collect all liquid
- 5. Centrifuge at 4000×g for 15 minutes in a 50 ml Falcon tube.
- 6. Discard the supernatant.
- 7. Wash step: Dilute the pellet using 1.5 ml of PBS and transfer to a 1.5 ml Eppendorf tube.
- 8. Centrifuge at maximum speed for 3 minutes.
- 9. Repeat the wash and centrifugation steps twice. Note: In this step you have to be aware of the pellet quantity, so if you get little pellet avoid repeating the wash step and proceed directly to diluting the pellet. If you are processing must, avoid the wash step.
- 10. Dilute the pellet by adding 750 ul of Guanidine thiocyanate solution to the Glass Bead Tube. Gently vortex to mix.
- 11. Add 60 ul of surfactant Solution and invert several times or vortex briefly.
- 12. After adding Solution surfactant, incubate 10 minutes at 70° C.
- 13. Secure PowerBead Tubes into the Precellys device (bead-beating homogenization, Bertin Technologies, Montigny-le-Bretonneux, France). Vortex at 5500 rpm for 90 seconds. You will have to set up the program at 3 cycles.
- 14. Make sure the PowerBead Tubes rotate freely in your centrifuge without rubbing. Centrifuge tubes at 10,000×g for 30 seconds at room temperature.
- 15. Transfer the supernatant to a clean 2 ml Collection Tube (provided).
- 16. Add 250 ul of protein precipitant Solution and vortex for 5 seconds. Incubate at 4° C. for 5 minutes.
- 17. Centrifuge the tubes at room temperature for 1 minute at 10,000×g.
- 18. Avoiding the pellet, transfer up to, but no more than, 600 ul of supernatant to a clean 2 ml Collection Tube.
- 19. Add 200 ul of Inhibitor removal compound Solution and vortex briefly. Incubate at 4° C. for 5 minutes.
- 20. Centrifuge the tubes at room temperature for 1 minute at 10,000×g.
- 21. Avoiding the pellet, transfer up to, but no more than, 750 ul of supernatant into a clean 2 ml Collection Tube (provided).
- 22. Shake to mix chaotropic agent Solution before use. Add 1200 ul of Solution C4 to the supernatant and vortex for 5 seconds.
- 23. Load approximately 675 ul onto a Spin Filter and centrifuge at 10,000×g for 1 minute at room temperature. Discard the flow through and add an additional 675 ul of supernatant to the Spin Filter and centrifuge at 10,000×g for 1 minute at room temperature. Load the remaining supernatant onto the Spin Filter and centrifuge at 10,000×g for 1 minute at room temperature. Note: A total of 3-4 loads for each sample processed are required.
- 24. Add 500 ul of Solution Ethanol 60% and centrifuge at room temperature for 30 seconds at 10,000×g.
- 25. Discard the flow through.
- 26. Centrifuge again at room temperature for 1 minute at 10,000×g.
- 27. Carefully place spin filter in a clean 2 ml Collection Tube (provided). Avoid splashing any Solution C5 onto the Spin Filter.
- 28. Add 100 ul of 1,3-Propanediol, 2-amino-2-(hydroxymethyl)-, hydrochloride mix with Tris HCl 2-Amino-2-(hydroxymethyl)-1,3-propanediol to the center of the white filter membrane. Stand the tube for at least 1 minute.
- 29. Centrifuge at room temperature for 30 seconds at 10,000×g.
- 30. Discard the Spin Filter. The DNA in the tube is now ready for 16S-ITS library preparation.
- DNA Extraction from Wine (Liquid)
- Based on the PowerLyzer® PowerSoil® DNA Isolation Kit
- 1. From 50 ml of wine, must, or alcoholic or malolactic fermentation sample, centrifuge at 4000×g for 15 minutes in a 50 ml Falcon tube.
- 2. Discard the supernatant.
- 3. Wash step: Dilute the pellet using 1.5 ml of PBS and transfer to a 1.5 ml Eppendorf tube.
- 4. Centrifuge at maximum speed for 3 minutes.
- 5. Repeat steps 3-4 twice. Note: In this step you have to be aware of the pellet quantity, so if you get little pellet avoid repeating the wash step and proceed to step 6. If you are processing must, avoid the wash step.
- 6. Dilute the pellet by adding 750 ul of Guanidine thiocyanate solution to the Glass Bead Tube. Gently vortex to mix.
- 7. Add 60 ul of surfactant Solution and invert several times or vortex briefly.
- 8. After adding Solution surfactant, incubate 10 minutes at 70° C.
- 9. Secure PowerBead Tubes into the Precellys device (bead-beating homogenization, Bertin Technologies, Montigny-le-Bretonneux, France). Vortex at 5500 rpm for 90 seconds. You will have to set up the program at 3 cycles.
- 10. Make sure the PowerBead Tubes rotate freely in your centrifuge without rubbing. Centrifuge tubes at 10,000×g for 30 seconds at room temperature.
- 11. Transfer the supernatant to a clean 2 ml Collection Tube (provided).
- 12. Add 250 ul of protein precipitant Solution and vortex for 5 seconds. Incubate at 4° C. for 5 minutes.
- 13. Centrifuge the tubes at room temperature for 1 minute at 10,000×g.
- 14. Avoiding the pellet, transfer up to, but no more than, 600 ul of supernatant to a clean 2 ml Collection Tube.
- 15. Add 200 ul of Inhibitor removal compound Solution and vortex briefly. Incubate at 4° C. for 5 minutes.
- 16. Centrifuge the tubes at room temperature for 1 minute at 10,000×g.
- 17. Avoiding the pellet, transfer up to, but no more than, 750 ul of supernatant into a clean 2 ml Collection Tube (provided).
- 18. Shake to mix chaotropic agent Solution before use. Add 1200 ul of Solution C4 to the supernatant and vortex for 5 seconds.
- 19. Load approximately 675 ul onto a Spin Filter and centrifuge at 10,000×g for 1 minute at room temperature. Discard the flow through and add an additional 675 ul of supernatant to the Spin Filter and centrifuge at 10,000×g for 1 minute at room temperature. Load the remaining supernatant onto the Spin Filter and centrifuge at 10,000×g for 1 minute at room temperature. Note: A total of 3-4 loads for each sample processed are required.
- 20. Add 500 ul of Solution Ethanol 60% and centrifuge at room temperature for 30 seconds at 10,000×g.
- 21. Discard the flow through.
- 22. Centrifuge again at room temperature for 1 minute at 10,000×g
- 23. Carefully place spin filter in a clean 2 ml Collection Tube (provided). Avoid splashing any Solution C5 onto the Spin Filter.
- 24. Add 100 ul of 1,3-Propanediol, 2-amino-2-(hydroxymethyl)-, hydrochloride mix with Tris HCl 2-Amino-2-(hydroxymethyl)-1,3-propanediol to the center of the white filter membrane. Stand the tube for at least 1 minute.
- 25. Centrifuge at room temperature for 30 seconds at 10,000×g.
- 26. Discard the Spin Filter. The DNA in the tube is now ready for 16S-ITS library preparation.
- Once we have extracted the DNA it is necessary to build the library of genome regions that we want to read.
- Our technology identifies the bacteria and the fungi kingdoms present in a biological sample. We use different biomarkers for each kingdom and in the following lines we explain in detail the methodologies to build libraries for:
-
- Bacteria: 16S gene
- Fungi: ITS gene
- Complex samples (including plant species such as grape)
- Shotgun for samples collected from bottled wine.
- 1. Prepare a 96-well plate with DNA samples previously diluted 1:50.
- 2. Prepare 8 different mixes, one for each of the 8 forward (FW) primers, combining the primer with Five Prime Hot Master Mix (MM): (0.5 ul of primer × 12 wells) + (10 ul of Five Prime Hot Master Mix × 12 wells).
- 3. Add each mix to a different well in column 1 of the 96-well plate.
- 4. Distribute 10.5 ul per well horizontally across the plate.
- 5. Prepare 12 different mixes, one for each of the 12 reverse (RV) primers, combining the primer with Milli-Q water: (0.5 ul of primer × 8 wells) + (13 ul of Milli-Q water × 8 wells).
- 6. Distribute 13.5 ul per well vertically down the plate.
- 7. With a multichannel pipette, distribute 1 ul of DNA into each well in the horizontal direction.
- Put the plate in the thermocycler.
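- The plate layout in the steps above can be sketched in code. This is a minimal illustration that assumes each plate row receives one forward-primer mix and each plate column receives one reverse-primer mix, which is one reading of the "horizontal" and "vertical" distribution steps. The index names are those listed in Tables 1 and 2; the function name `build_plate_map` is hypothetical.

```python
from itertools import product

# Index names as listed in Tables 1 and 2 (assumed row/column assignment).
FW_PRIMERS = [f"SA50{i}" for i in range(1, 9)]      # SA501..SA508, one per row
RV_PRIMERS = [f"SA7{i:02d}" for i in range(1, 13)]  # SA701..SA712, one per column
ROWS = "ABCDEFGH"
COLUMNS = range(1, 13)


def build_plate_map():
    """Return {well: (forward_primer, reverse_primer)} for a 96-well plate."""
    plate = {}
    for (r, row), (c, col) in product(enumerate(ROWS), enumerate(COLUMNS)):
        plate[f"{row}{col}"] = (FW_PRIMERS[r], RV_PRIMERS[c])
    return plate


plate_map = build_plate_map()

# Per-well volume from steps 4, 6 and 7: FW mix + RV mix + template DNA.
well_volume_ul = 10.5 + 13.5 + 1.0

print(len(plate_map), "wells,", len(set(plate_map.values())), "unique primer pairs")
print("A1:", plate_map["A1"], "H12:", plate_map["H12"], "volume:", well_volume_ul)
```

- Under this assumption, 8 + 12 = 20 primers give 8 × 12 = 96 distinct index pairs, one per well, and the per-well volume matches the 25 μL total reaction volume.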
- Complete reagent recipe (master mix) for 1×PCR reaction
- PCR Grade H2O (note 1, below) 13.0 μL
- 5 Prime Hot MM (note 2) 10.0 μL
- Forward primer (5 μM) 0.5 μL
- Reverse primer (5 μM) 0.5 μL
- Template DNA 1.0 μL
- Total reaction volume 25.0 μL
- 1. Five Prime Hot Master Mix (5 prime: Item #2200410)
- 2. Final primer concentration of master mix: 0.2 μM
- Thermocycler Conditions for 96 Well Thermocyclers:
-
- 1. 94° C. 3 minutes
- 2. 94° C. 20 seconds
- 3. 50° C. 20 seconds
- 4. 72° C. 40 seconds
- 5. Repeat steps 2-4 35 times
- 6. 72° C. 10 minutes
- 7. 4° C. HOLD
TABLE 1 16S Primers FW
Primer | SEQ ID NO. | Index | Sequence |
---|---|---|---|
515f | SEQ ID NO. 1 | SA501 | AATGATACGGCGACCACCGAGATCTACACATCGTACGGAATAGTTGGGAGTGYCAGCMGCCGCGGTAA |
515f | SEQ ID NO. 2 | SA502 | AATGATACGGCGACCACCGAGATCTACACACTATCTGGAATAGTTGGGAGTGYCAGCMGCCGCGGTAA |
515f | SEQ ID NO. 3 | SA503 | AATGATACGGCGACCACCGAGATCTACACTAGCGAGTGAATAGTTGGGAGTGYCAGCMGCCGCGGTAA |
515f | SEQ ID NO. 4 | SA504 | AATGATACGGCGACCACCGAGATCTACACCTGCGTGTGAATAGTTGGGAGTGYCAGCMGCCGCGGTAA |
515f | SEQ ID NO. 5 | SA505 | AATGATACGGCGACCACCGAGATCTACACTCATCGAGGAATAGTTGGGAGTGYCAGCMGCCGCGGTAA |
515f | SEQ ID NO. 6 | SA506 | AATGATACGGCGACCACCGAGATCTACACCGTGAGTGGAATAGTTGGGAGTGYCAGCMGCCGCGGTAA |
515f | SEQ ID NO. 7 | SA507 | AATGATACGGCGACCACCGAGATCTACACGGATATCTGAATAGTTGGGAGTGYCAGCMGCCGCGGTAA |
515f | SEQ ID NO. 8 | SA508 | AATGATACGGCGACCACCGAGATCTACACGACACCGTGAATAGTTGGGAGTGYCAGCMGCCGCGGTAA |
TABLE 2 16S Primers RV
Primer | SEQ ID NO. | Index | Sequence |
---|---|---|---|
806r | SEQ ID NO. 9 | SA701 | CAAGCAGAAGACGGCATACGAGATAACTCTCGCGCCAGTCAGCCGGACTACHVGGGTWTCTAAT |
806r | SEQ ID NO. 10 | SA702 | CAAGCAGAAGACGGCATACGAGATACTATGTCCGCCAGTCAGCCGGACTACHVGGGTWTCTAAT |
806r | SEQ ID NO. 11 | SA703 | CAAGCAGAAGACGGCATACGAGATAGTAGCGTCGCCAGTCAGCCGGACTACHVGGGTWTCTAAT |
806r | SEQ ID NO. 12 | SA704 | CAAGCAGAAGACGGCATACGAGATCAGTGAGTCGCCAGTCAGCCGGACTACHVGGGTWTCTAAT |
806r | SEQ ID NO. 13 | SA705 | CAAGCAGAAGACGGCATACGAGATCGTACTCACGCCAGTCAGCCGGACTACHVGGGTWTCTAAT |
806r | SEQ ID NO. 14 | SA706 | CAAGCAGAAGACGGCATACGAGATCTACGCAGCGCCAGTCAGCCGGACTACHVGGGTWTCTAAT |
806r | SEQ ID NO. 15 | SA707 | CAAGCAGAAGACGGCATACGAGATGGAGACTACGCCAGTCAGCCGGACTACHVGGGTWTCTAAT |
806r | SEQ ID NO. 16 | SA708 | CAAGCAGAAGACGGCATACGAGATGTCGCTCGCGCCAGTCAGCCGGACTACHVGGGTWTCTAAT |
806r | SEQ ID NO. 17 | SA709 | CAAGCAGAAGACGGCATACGAGATGTCGTAGTCGCCAGTCAGCCGGACTACHVGGGTWTCTAAT |
806r | SEQ ID NO. 18 | SA710 | CAAGCAGAAGACGGCATACGAGATTAGCAGACCGCCAGTCAGCCGGACTACHVGGGTWTCTAAT |
806r | SEQ ID NO. 19 | SA711 | CAAGCAGAAGACGGCATACGAGATTCATAGACCGCCAGTCAGCCGGACTACHVGGGTWTCTAAT |
806r | SEQ ID NO. 20 | SA712 | CAAGCAGAAGACGGCATACGAGATTCGCTATACGCCAGTCAGCCGGACTACHVGGGTWTCTAAT |
- Note: Non-soil samples require a modification of the complete reagent recipe (master mix) for 1×PCR reaction. It is necessary to add mPNA (SEQ ID NO. 21: ggcaagtgttcttcgga) to block mitochondrial contamination, and pPNA (SEQ ID NO. 22: ggctcaaccctggacag) to block chloroplast contamination.
- PCR Grade H2O (note 1, below) 11.0 μL
- 5 Prime Hot MM (note 2) 10.0 μL
- Forward primer (5 μM) 0.5 μL
- Reverse primer (5 μM) 0.5 μL
- Template DNA 1.0 μL
- mPNA blocker (5 μM stock) 1.0 μL
- pPNA blocker (5 μM stock) 1.0 μL
- Total reaction volume 25.0 μL
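- As a quick check of the recipe above, the per-reaction volumes can be summed and the components shared by every well scaled to a batch. This is a minimal sketch: the function name `scale_shared_components` and the 10% pipetting overage are hypothetical and are not part of the protocol.

```python
# Per-reaction volumes in microlitres, copied from the recipe with PNA blockers.
RECIPE_UL = {
    "PCR grade H2O": 11.0,
    "Five Prime Hot Master Mix": 10.0,
    "Forward primer (5 uM)": 0.5,
    "Reverse primer (5 uM)": 0.5,
    "Template DNA": 1.0,
    "mPNA blocker (5 uM stock)": 1.0,
    "pPNA blocker (5 uM stock)": 1.0,
}

# Components that are identical in every well (primers and template differ per well).
SHARED = (
    "PCR grade H2O",
    "Five Prime Hot Master Mix",
    "mPNA blocker (5 uM stock)",
    "pPNA blocker (5 uM stock)",
)


def scale_shared_components(recipe, reactions, overage=0.10):
    """Scale shared components to a batch; `overage` is an assumed allowance."""
    factor = reactions * (1.0 + overage)
    return {name: round(recipe[name] * factor, 2) for name in SHARED}


total_ul = sum(RECIPE_UL.values())
batch_no_overage = scale_shared_components(RECIPE_UL, reactions=96, overage=0.0)

print("Total per reaction:", total_ul, "uL")
print("Shared components for 96 reactions:", batch_no_overage)
```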
- 1. Prepare a 96-well plate with DNA samples.
- 2. Prepare 8 different mixes, one for each of the 8 forward (FW) primers, combining the primer with Milli-Q water: (0.5 ul of primer × 12 wells) + (6 ul of Milli-Q water × 12 wells).
- 3. Add each mix to a different well in column 1 of the 96-well plate.
- 4. Distribute 6.5 ul per well horizontally across the plate.
- 5. Prepare 12 different mixes, one for each of the 12 reverse (RV) primers, combining the primer with Milli-Q water: (0.5 ul of primer × 8 wells) + (7 ul of Milli-Q water × 8 wells).
- 6. Distribute 7.5 ul per well vertically down the plate.
- 7. With a multichannel pipette, distribute 1 ul of DNA into each well in the horizontal direction.
- 8. Put the plate in the thermocycler and start the program.
- 9. Five minutes after the first cycle starts, open the thermocycler lid and, without removing the plate, add 10 ul of Five Prime Hot Master Mix per well.
- Complete reagent recipe (master mix) for 1×PCR reaction.
- PCR Grade H2O (note 1, below) 13.0 μL
- 5 Prime Hot MM (note 2) 10.0 μL
- Forward primer (5 μM) 0.5 μL
- Reverse primer (5 μM) 0.5 μL
- Template DNA 1.0 μL
- Total reaction volume 25.0 μL
- 1. Five Prime Hot Master Mix (5 prime)
- 2. Final primer concentration of master mix: 0.2 μM
- Thermocycler Conditions for 96 well thermocyclers
- 1. 94° C. 7 minutes
- 2. 94° C. 20 seconds
- 3. 55° C. 20 seconds
- 4. 72° C. 40 seconds
- 5. Repeat steps 2-4 40 times
- 6. 72° C. 10 minutes
- 7. 4° C. HOLD
-
TABLE 3 ITS primers FW
Primer | SEQ ID NO. | Index | Sequence |
---|---|---|---|
ITSf | SEQ ID NO. 23 | SC501 | AATGATACGGCGACCACCGAGATCTACACACGACGTGACTCAGGCAAACACCTGCGGARGGATCA |
ITSf | SEQ ID NO. 24 | SC502 | AATGATACGGCGACCACCGAGATCTACACATATACACACTCAGGCAAACACCTGCGGARGGATCA |
ITSf | SEQ ID NO. 25 | SC503 | AATGATACGGCGACCACCGAGATCTACACCGTCGCTAACTCAGGCAAACACCTGCGGARGGATCA |
ITSf | SEQ ID NO. 26 | SC504 | AATGATACGGCGACCACCGAGATCTACACCTAGAGCTACTCAGGCAAACACCTGCGGARGGATCA |
ITSf | SEQ ID NO. 27 | SC505 | AATGATACGGCGACCACCGAGATCTACACGCTCTAGTACTCAGGCAAACACCTGCGGARGGATCA |
ITSf | SEQ ID NO. 28 | SC506 | AATGATACGGCGACCACCGAGATCTACACGACACTGAACTCAGGCAAACACCTGCGGARGGATCA |
ITSf | SEQ ID NO. 29 | SC507 | AATGATACGGCGACCACCGAGATCTACACTGCGTACGACTCAGGCAAACACCTGCGGARGGATCA |
ITSf | SEQ ID NO. 30 | SC508 | AATGATACGGCGACCACCGAGATCTACACTAGTGTAGACTCAGGCAAACACCTGCGGARGGATCA |
TABLE 4 ITS primers RV
Primer | SEQ ID NO. | Sequence |
---|---|---|
58S3R_SC701 | SEQ ID NO. 31 | CAAGCAGAAGACGGCATACGAGATACCTACTGCCATCCCCGGCTGAGATCCRTTGYTRAAAGTT |
58S3R_SC702 | SEQ ID NO. 32 | CAAGCAGAAGACGGCATACGAGATAGCGCTATCCATCCCCGGCTGAGATCCRTTGYTRAAAGTT |
58S3R_SC703 | SEQ ID NO. 33 | CAAGCAGAAGACGGCATACGAGATAGTCTAGACCATCCCCGGCTGAGATCCRTTGYTRAAAGTT |
58S3R_SC704 | SEQ ID NO. 34 | CAAGCAGAAGACGGCATACGAGATCATGAGGACCATCCCCGGCTGAGATCCRTTGYTRAAAGTT |
58S3R_SC705 | SEQ ID NO. 35 | CAAGCAGAAGACGGCATACGAGATCTAGCTCGCCATCCCCGGCTGAGATCCRTTGYTRAAAGTT |
58S3R_SC706 | SEQ ID NO. 36 | CAAGCAGAAGACGGCATACGAGATCTCTAGAGCCATCCCCGGCTGAGATCCRTTGYTRAAAGTT |
58S3R_SC707 | SEQ ID NO. 37 | CAAGCAGAAGACGGCATACGAGATGAGCTCATCCATCCCCGGCTGAGATCCRTTGYTRAAAGTT |
58S3R_SC708 | SEQ ID NO. 38 | CAAGCAGAAGACGGCATACGAGATGGTATGCTCCATCCCCGGCTGAGATCCRTTGYTRAAAGTT |
58S3R_SC709 | SEQ ID NO. 39 | CAAGCAGAAGACGGCATACGAGATGTATGACGCCATCCCCGGCTGAGATCCRTTGYTRAAAGTT |
58S3R_SC710 | SEQ ID NO. 40 | CAAGCAGAAGACGGCATACGAGATTAGACTGACCATCCCCGGCTGAGATCCRTTGYTRAAAGTT |
58S3R_SC711 | SEQ ID NO. 41 | CAAGCAGAAGACGGCATACGAGATTCACGATGCCATCCCCGGCTGAGATCCRTTGYTRAAAGTT |
58S3R_SC712 | SEQ ID NO. 42 | CAAGCAGAAGACGGCATACGAGATTCGAGCTCCCATCCCCGGCTGAGATCCRTTGYTRAAAGTT |
- Complex samples are samples that contain PCR inhibitors. Wine contains many phenols, which interfere with the PCR procedure to a degree that depends on their concentration.
- Step 1.
- 1. Prepare a 96 well plate format with DNA
- 2. Prepare master mix with primers.
-
TABLE 5 ITS primers
Primer | SEQ ID NO. | Sequence |
---|---|---|
ITS1Fw | SEQ ID NO. 43 | TCCGTAGGTGAACCTGCGG |
ITS4Rv | SEQ ID NO. 44 | TCCTCCGCTTATTGATATGC |
- 1. Distribute 24 ul per well.
- 2. With a multichannel distribute 1 ul of DNA in each well
- Put the plate in the thermocycler and start
- Complete reagent recipe (master mix) for 1×PCR reaction
- PCR Grade H2O (note 1, below) 13.0 μL
- 5 Prime Hot MM (note 2) 10.0 μL
- Forward primer (5 μM) 0.5 μL
- Reverse primer (5 μM) 0.5 μL
- Template DNA 1.0 μL
- Total reaction volume 25.0 μL
- 1. Five Prime Hot Master Mix (5 prime: Item #2200410)
- 2. Final primer concentration of master mix: 0.2 μM
- Thermocycler Conditions for 96 well thermocyclers:
- 1. 94° C. 3 minutes
- 2. 94° C. 20 seconds
- 3. 55° C. 20 seconds
- 4. 72° C. 60 seconds
- 5. Repeat steps 2-4 35 times
- 6. 72° C. 10 minutes
- 7. 4° C. HOLD
- Step 2.
- 1. Prepare 8 different mixes, one for each of the 8 forward (FW) primers, combining the primer with Milli-Q water: (0.5 ul of primer × 12 wells) + (6 ul of Milli-Q water × 12 wells).
- 2. Add each mix to a different well in column 1 of the 96-well plate.
- 3. Distribute 6.5 ul per well horizontally across the plate.
- 4. Prepare 12 different mixes, one for each of the 12 reverse (RV) primers, combining the primer with Milli-Q water: (0.5 ul of primer × 8 wells) + (7 ul of Milli-Q water × 8 wells).
- 5. Distribute 7.5 ul per well vertically down the plate.
- 6. With a multichannel pipette, distribute 1 ul of the PCR product from the first step into each well in the horizontal direction.
- 7. Put the plate in the thermocycler and start the program.
- 8. Five minutes after the first cycle starts, open the thermocycler lid and, without removing the plate, add 10 ul of Five Prime Hot Master Mix per well.
- 1. Isolate DNA according to the protocol for DNA extraction from wine (liquid) samples.
- 2. Use the TruePrime™ Single Cell WGA kit (Illumina Inc., San Diego, CA) according to the manufacturer's instructions.
- 3. Use the Nextera XT DNA Library Preparation Kit (Illumina, San Diego, CA) according to the manufacturer's instructions.
- Note 1: The 16S and ITS protocols are dual-index PCR protocols; with only 20 different primers (8 forward and 12 reverse) it is possible to sequence 96 samples. The method is adapted from the publication "Development of a Dual-Index Sequencing Strategy and Curation Pipeline for Analyzing Amplicon Sequence Data on the MiSeq Illumina Sequencing Platform" (Kozich, J. J. et al., 2013, Appl. Environ. Microbiol. 79, 5112-5120) by designing and using different primer sequences.
- Note 2: Master mix plates can be stabilized at room temperature using ADN AmpligelMaster Mix plates (Biotools).
- In addition to the previous library building methodologies, we have designed and developed a new methodology to build an improved library to detect bacteria and fungi more accurately. We call it “Precision metagenomic protocol applying dual phylogenetic markers with single cell epicPCR (Emulsion, Paired Isolation and Concatenation PCR)”
- 16S rDNA is a powerful phylogenetic marker commonly used for profiling diversity in microbial samples, yet its use is associated with known problems including biases introduced by copy-number variations, variability in amplification efficiency, inconsistencies when targeting different regions of the gene, and problems with accurately and consistently delineating prokaryotic species. To solve these problems we use 16S rDNA in combination with another single-copy marker gene. This results in prokaryotic species boundaries at higher resolution than 16S rDNA.
- Use of both markers guarantees identification of microbial diversity at the strain level. It is a new and powerful tool which can be applied to describe microbial communities in any sample.
- The improved protocol is based on the following publication: Spencer et al., 2015, ISME J.
- Most importantly, however, we do not combine the 16S region with a functional gene; instead, we combine it with one of the marker gene regions described in Sunagawa, S. et al., 2013, Nat Methods 10: 1196-1199.
- A selection of the genes we are currently testing is shown in Table 6.
-
TABLE 6 Phylogenetic markers to combine with 16S gene marker
Name | COG | Mean length in 3496 genomes |
---|---|---|
Predicted GTPase, probable translation factor | COG0012 | 099 |
Phenylalanyl-tRNA synthetase alpha subunit | COG0016 | 058 |
Arginyl-tRNA synthetase | COG0018 | 721 |
Seryl-tRNA synthetase | COG0172 | 285 |
Cysteinyl-tRNA synthetase | COG0215 | 415 |
Leucyl-tRNA synthetase | COG0495 | 571 |
Valyl-tRNA synthetase | COG0525 | 722 |
Metal-dependent proteases with possible chaperone activity | COG0533 | 054 |
Signal recognition particle GTPase (Ffh) | COG0541 | 415 |
Signal recognition particle GTPase (FtsY) | COG0552 | 189 |
RNA polymerase subunit gene (rpoB) | | |
- Most of the described libraries are prepared for sequencing with the following steps:
- Cleanup, Normalization, and Pooling 16S and ITS libraries.
- Use the SequalPrep Normalization Plate Kit (Thermo Fisher Scientific, Waltham, MA).
- 1. Transfer 20 μl of PCR product from the PCR plate to the corresponding well on the normalization plate.
- 2. Add 120 μl of Binding Buffer. Mix by pipetting, sealing, vortexing, and spinning briefly.
- 3. Incubate at room temperature for 60 minutes. Note: the incubation can run overnight if needed; extra time does not improve results.
- 4. Aspirate the liquid from the wells. Do not scrape the sides.
- 5. Add 5 μl of Wash Buffer and pipette up and down twice, then aspirate immediately. Ensure there is no residual wash buffer in any wells.
- 6. Add 20 μl of Elution Buffer. Mix by pipetting up and down 5 times. Seal, vortex, and spin briefly.
- 7. Incubate at room temperature for 5 minutes.
- 8. Create a pool from each plate. Take 10 μl from each well to pool.
- 9. Concentrate the pool in a SpeedVac
- 10. Freeze the remaining sample for later use.
- For shotgun metagenomics, the protocol to prepare for sequencing changes.
- Normalization, and Pooling
- 1. Measure the samples in a fragment analyzer or Bioanalyzer machine.
- 2. Dilute the samples to 2 nM concentration.
- 3. Pool the samples in an equimolar concentration.
- 4. Sequence according to the MiSeq protocol.
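- The dilution to 2 nM in the steps above depends on converting a measured mass concentration into molarity. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the function names, the example measurement (3.3 ng/ul, 500 bp mean fragment size) and the 20 ul final volume are hypothetical values chosen for illustration only.

```python
def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA concentration in ng/ul to nM for a given mean fragment size."""
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_fragment_bp)


def dilution_to_target(stock_nm, target_nm=2.0, final_volume_ul=20.0):
    """Return (library_ul, diluent_ul) needed to reach target_nm in final_volume_ul."""
    if stock_nm < target_nm:
        raise ValueError("stock is already more dilute than the target")
    library_ul = final_volume_ul * target_nm / stock_nm
    return library_ul, final_volume_ul - library_ul


# Hypothetical measurement from a fragment analyzer.
stock_nm = library_molarity_nm(conc_ng_per_ul=3.3, mean_fragment_bp=500)
library_ul, diluent_ul = dilution_to_target(stock_nm)

print(f"stock: {stock_nm:.2f} nM; mix {library_ul:.2f} ul library "
      f"with {diluent_ul:.2f} ul diluent")
```

- Once every library is at the same molarity, pooling equal volumes gives the equimolar pool called for in step 3.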
- Sequencing can be done with any available technology; the only requirement is to add the adaptor sequence specific to that sequencing technology to the original gene marker primer and index sequence.
- Here we describe the use of the technique with the Illumina MiSeq, following the sequencing instructions of the custom protocol.
- 1. Place 100 μl of the Read 1 (10 uM) Sequencing Primer(s) into a clean PCR tube. Repeat in separate tubes for the Index Primer(s) and Read 2 Sequencing Primer(s).
-
TABLE 7 Sequencing 16S primers
Primer | SEQ ID NO. | Sequence |
---|---|---|
Read1.515f | SEQ ID NO. 45 | GAATAGTTGGGAGTGYCAGCMGCCGCGGTAA |
Read2.806r | SEQ ID NO. 46 | CGCCAGTCAGCCGGACTACHVGGGTWTCTAAT |
IndexRead.806r | SEQ ID NO. 47 | ATTAGAWACCCBDGTAGTCCGGCTGACTGGCG |
TABLE 8 Sequencing ITS primers
Primer | SEQ ID NO. | Sequence |
---|---|---|
Read1B ITS | SEQ ID NO. 48 | ACTCAGGCAAACACCTGCGGARGGATCA |
Read2.B5S3r | SEQ ID NO. 49 | CCATCCCCGGCTGAGATCCRTTGYTRAAAGTT |
IndexRead.B58SRr | SEQ ID NO. 50 | AACTTTYARCAAYGGATCTCAGCCGGGGATGG |
- 2. Using a 1000 μl pipette tip, break the foil over wells 12, 13, 14, and 17.
- 3. Use an extra-long 100 μl tip with the pipettor set to 75 μl to transfer the 30 μl of Read 1 Sequencing Primer to the bottom of well 12 and pipette 10× to mix. Repeat this process, spiking the Index Primer into well 13 and the Read 2 Sequencing Primer into well 14.
- 4. Prepare a fresh dilution of 0.2N NaOH.
- 5. To a 1.5 ml tube add 5 μl of 2 nM library and 5 μl of 0.2N NaOH. Vortex and wait 5 minutes. Add 990 ul of Hybridization Buffer and 200 ul of PhiX-based adapter-ligated control library, previously denatured with 0.2N NaOH, to a 20% final concentration of PhiX. Add 600 ul to the sample well.
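- In Tables 7 and 8, each index-read primer is the reverse complement of the corresponding Read 2 primer. The short check below confirms this for the listed sequences, including the degenerate IUPAC bases; the function name `reverse_complement` is an illustrative helper, not part of the protocol.

```python
# IUPAC complement table, covering the degenerate codes used in the primers.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence with IUPAC codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Sequences copied from Table 7 (16S) and Table 8 (ITS).
READ2_16S = "CGCCAGTCAGCCGGACTACHVGGGTWTCTAAT"   # SEQ ID NO. 46
INDEX_16S = "ATTAGAWACCCBDGTAGTCCGGCTGACTGGCG"   # SEQ ID NO. 47
READ2_ITS = "CCATCCCCGGCTGAGATCCRTTGYTRAAAGTT"   # SEQ ID NO. 49
INDEX_ITS = "AACTTTYARCAAYGGATCTCAGCCGGGGATGG"   # SEQ ID NO. 50

print(reverse_complement(READ2_16S) == INDEX_16S)
print(reverse_complement(READ2_ITS) == INDEX_ITS)
```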
- One of our greatest discoveries is that it is possible to mix different libraries in the same run of the NGS sequencer. The following steps describe how this is done on one of the most common sequencing platforms, Illumina's MiSeq; the procedure can, however, be adapted to other sequencing platforms.
- Sequencing 16S and ITS libraries in the same Miseq Run.
- 1. Pool equimolar (nM) 16S and ITS libraries
- 2. Place 100 μl of the Read 1 (10 uM) Sequencing Primer(s) into a clean PCR tube. Repeat in separate tubes for the Index Primer(s) and Read 2 Sequencing Primer(s).
- 3. Mix 30 ul of the read 1 16S primer (10 uM) with 30 ul of the read 1 ITS primer (10 uM)
- 4. Mix 30 ul of the read 2 16S primer (10 uM) with 30 ul of the read 2 ITS primer (10 uM)
- 5. Mix 30 ul of the Index 16S primer (10 uM) with 30 ul of the Index ITS primer (10 uM)
- 6. Using a 1000 μl pipette tip, break the foil over wells 12, 13, 14, and 17.
- 7. Use an extra-long 100 μl tip with the pipette set to 75 μl to transfer the 60 μl of mixed 16S and ITS Read 1 Sequencing Primer to the bottom of well 12 and pipette 10× to mix. Repeat this process, spiking the Index Primer mix into well 13 and the Read 2 Sequencing Primer mix into well 14.
- 8. Prepare a fresh dilution of 0.2N NaOH.
- 9. To a 1.5 ml tube add 5 μl of 2 nM library and 5 μl of 0.2N NaOH. Vortex and wait 5 minutes. Add 990 ul of HT1 and 200 ul of PhiX, previously denatured with 0.2N NaOH, to a 20% final concentration of PhiX. Add 600 ul to the sample well.
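- The library concentration reached in the denaturation and dilution step follows from C1·V1 = C2·V2. The sketch below covers only the library itself, before the PhiX control is added; the function name is hypothetical.

```python
def diluted_concentration_pm(stock_nm, stock_ul, final_volume_ul):
    """Apply C1*V1 = C2*V2 and return the diluted concentration in picomolar."""
    return stock_nm * 1000.0 * stock_ul / final_volume_ul


# 5 ul of 2 nM library plus 5 ul of 0.2N NaOH.
after_naoh_pm = diluted_concentration_pm(2.0, 5.0, 5.0 + 5.0)

# The same 5 ul of library after adding 990 ul of buffer (1000 ul total).
after_buffer_pm = diluted_concentration_pm(2.0, 5.0, 5.0 + 5.0 + 990.0)

print(after_naoh_pm, "pM after denaturation;", after_buffer_pm, "pM after dilution")
```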
- The pipeline is programmed to run on a custom-made cloud-based computing platform, such as an Amazon Machine Image (AMI) on Amazon Web Services, Microsoft Azure Cloud Computing, or Compute Engine on Google Cloud Platform. The instance can connect directly to BaseSpace via Illumina's BaseMount program.
- The pipeline is a bash script that wraps the free programs named in the steps below, along with custom Unix commands.
- We have developed this improved tool to ensure that all the microbiological information is generated under the same standard and is easily comparable.
- The following paragraphs describe the steps this pipeline performs to process all the genetic information generated by NGS.
- 1. Remove any reads that align to PhiX with Bowtie2
- 2. Remove primers and Illumina adapters from reads with Cutadapt
- 3. Quality filter reads based on Q-scores with QIIME's split_libraries_fastq.py script
- Cut each read at the first three bases in which the average Q-score is less than 20. If the chopped sequence is at least 75% as long as the original sequence, then keep that read.
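- The truncation rule above can be sketched as follows. This is one possible reading of the rule (a sliding window of three consecutive bases whose mean Q-score falls below 20), written in plain Python for illustration; it is not a reproduction of QIIME's split_libraries_fastq.py, and the function names are hypothetical.

```python
def truncation_point(quals, window=3, min_mean_q=20.0):
    """Return how many leading bases to keep before the first low-quality window."""
    for start in range(len(quals) - window + 1):
        chunk = quals[start:start + window]
        if sum(chunk) / window < min_mean_q:
            return start
    return len(quals)


def quality_filter(seq, quals, min_fraction=0.75):
    """Truncate a read; return it only if enough of the original length survives."""
    keep = truncation_point(quals)
    if keep >= min_fraction * len(seq):
        return seq[:keep]
    return None  # read is discarded


read = "ACGTACGTACGT"
kept = quality_filter(read, [35] * 10 + [10, 10])    # low quality only at the end
dropped = quality_filter(read, [35] * 4 + [10] * 8)  # low quality starts early

print(kept, dropped)
```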
- It is possible to analyse the reads with or without merging the paired ends, and we have developed analyses that do not require merging. When paired-end reads are merged, we use PEAR.
- The pipeline can pick OTUs using two different algorithms: QIIME open reference, and minimum entropy decomposition (MED)
- For QIIME open reference OTU picking with 16S sequences, the initial reference alignment step is against the SILVA database. Taxonomy is assigned to representative sequences for each OTU (QIIME and MED) according to SILVA. For ITS sequences, UNITE database is used.
- The pipeline produces two main tables: one of OTU abundances by sample, and one with the corresponding taxonomy for each OTU.
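- The relationship between the two tables can be illustrated with a small sketch. The sample names, OTU identifiers and lineages below are hypothetical placeholders, and the data structures are plain dictionaries rather than the pipeline's actual output format.

```python
from collections import Counter, defaultdict


def build_otu_table(assignments):
    """Count reads per OTU and sample from (sample_id, otu_id) pairs."""
    table = defaultdict(Counter)
    for sample_id, otu_id in assignments:
        table[otu_id][sample_id] += 1
    return {otu: dict(counts) for otu, counts in table.items()}


def relative_abundance(otu_table, sample_id):
    """Return each OTU's share of the reads in one sample."""
    total = sum(counts.get(sample_id, 0) for counts in otu_table.values())
    if total == 0:
        return {}
    return {otu: counts.get(sample_id, 0) / total for otu, counts in otu_table.items()}


# Hypothetical per-read assignments and taxonomy table.
reads = [("soil_1", "OTU_1"), ("soil_1", "OTU_1"),
         ("soil_1", "OTU_2"), ("must_1", "OTU_2")]
taxonomy = {"OTU_1": "Bacteria;Proteobacteria", "OTU_2": "Fungi;Ascomycota"}

otu_table = build_otu_table(reads)
soil_profile = relative_abundance(otu_table, "soil_1")

for otu, share in soil_profile.items():
    print(otu, taxonomy[otu], f"{share:.2f}")
```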
- All the data are stored on servers according to a database structure designed by us. All fields are related to one another, which makes it possible to develop big data mining techniques. Our knowledge stack is based on different databases/tables:
-
- DNA sequences coming from the different samples: raw DNA data obtained from the NGS technology.
- Filtered and processed genetic information: mainly the phylogenetic track and abundances of the different microbial species found in each sample, regardless of the kind of sample (soil, fruit, or liquid).
- Metatranscriptomic information for each sample: RNA information to identify gene expression.
- Client database: information related to client and users.
- Sample metadata: non-genetic information related to the different samples, such as location, grape variety, sampling date and hour, chemical conditions, additives or any other information that provides useful inputs for comparison and data understanding.
- Auxiliary data: auxiliary information processed and stored digitally, which increases the value of the data generated by NGS and facilitates understanding of the information. Different groups have been developed here:
- a. Geographical Information System (GIS): for example wine regions, geography, climate, weather, soil composition, and other similar GIS data layers.
- b. Microorganisms' profiles: specific information on the effect of each microbial species and strain on the winemaking process. This information includes an assessment (positive/negative) and the abundance threshold of the effect in the wine.
- c. Microorganisms' genomes: whole genome database for each of the fermentation species. We are building this specific database to improve species identification (Database matching, letter d of this section) and increase the understanding of the influence of specific species/strains on wine and other food products.
- This technology produces a large amount of heterogeneous data that can provide useful inputs for viticulturists and winemakers. We have developed different visualization tools for the generated data, especially data linked to soil samples.
- We developed and coded some of the visualization tools specifically for the wine industry. Their main features are interactivity, utility and design.
- Because we have geographical information for the samples, we have designed specific tools that use Geographical Information Systems (GIS) to generate understandable knowledge.
- These tools use different GIS layers, for example wine regions, geography, climate, weather, soil composition, and other similar GIS data layers. Some of the layers have been developed by us and others are open data.
- For instance, for the Wine Region GIS layer, we have gathered geographical information on wine regions worldwide. At this stage we have information for the USA, France, Spain, Italy, and Portugal. We plan to start parameterizing the wine regions in other European countries, as well as the rest of the world. So far we have identified more than 1,500 wine regions worldwide.
- A Geo-map identifying the different wine regions and the microbiome profile, highlighting the presence of the Micro-Wine-Makers is in preparation. This map will also match different grape varieties and microbiome profile worldwide.
- This technology helps to identify and quantify all the fermentation species from bacteria and fungi kingdoms for different samples.
- In the winemaking process some of these species are completely new or previously unseen. For this reason we have generated knowledge about the real fermentation species in winemaking, the Micro-Wine-Makers, in the form of species profiles that include information about each species' origin, a picture, and its influence on wine.
- Presently, we have collected information for more than 200 species. Appendix C lists some of the species discovered in the different samples and their influence on wine.
- We have also developed a methodology to assess whether the abundance of a specific species in any kind of sample is appropriate or indicative of a warning/alert.
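- The assessment methodology itself is not detailed here. Purely as an illustration of how a species profile with an assessment (positive/negative) and an abundance threshold could be turned into a flag, the sketch below uses a hypothetical profile structure, hypothetical labels and made-up threshold values.

```python
def assess_species(relative_abundance, profile):
    """Flag a species from its relative abundance and a hypothetical profile.

    profile: {"effect": "positive" or "negative", "threshold": float}
    """
    if relative_abundance < profile["threshold"]:
        return "no effect expected"
    return "alert" if profile["effect"] == "negative" else "favourable"


# Made-up profiles for two placeholder species.
spoilage_profile = {"effect": "negative", "threshold": 0.05}
fermenter_profile = {"effect": "positive", "threshold": 0.10}

print(assess_species(0.08, spoilage_profile))
print(assess_species(0.02, spoilage_profile))
print(assess_species(0.30, fermenter_profile))
```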
- We have designed a digital report including information structured in different sections which are accessible through a session in our proprietary portal:
-
- Dashboard: lists all the client's data, including a general overview of their status and basic comparison information among all the client's sample data, with a special focus on microbiome findings.
- Sample information: specific sample information screened in different ways, focusing on the microbiome findings in the soil samples and assessing the threshold to determine whether the microbiome proportions raise any alerts.
- Microbiome profiles: specific fermentation species information, including a picture and descriptions of its influence on wine.
- Client profile: basic user and client information such as name, address, contact details, company, type of business and other similar information used to identify the client.
- Data mining and big data techniques are used to query our databases and obtain useful information that helps to better understand the relevance of the microbiome profile in products such as wine. An interesting example of the outcomes of this process is the matching between the composition of the microbiome community in the wine and the organoleptic characteristics (flavours/taste) of the wine.
- This allows us to provide prescriptions/recommendations to industry (precision enology) and consumers (personalized product prescription).
- Our users can communicate and create a social network once they log into our client portal. This will be a new network built around the microorganisms of the wine industry.
- Whole genome high-throughput sequencing and annotation can be used to identify genes and single nucleotide polymorphisms (SNPs) between Saccharomyces cerevisiae strains and other non-Saccharomyces species involved in the wine fermentation process.
- The selected yeasts provide specific, desirable phenotypes with known fermentation characteristics and represent 80% of the world's commercial yeast.
- The objective of this work is to connect the known phenotype with the genotype of these strains to provide tools to:
-
- Evaluate the potential fermentation characteristics of wild yeast without the use of fermentation experiments.
- Quality Control of organic wineries.
- Provide tools to prevent fraudulent use of commercial yeasts.
- Detect Grape Variety in Wine Samples Prior to Bottling
- Using the same protocol for library building described for analyzing the bacteria kingdom (Bacteria Kingdom: 16S Prep Workflow), we can detect chloroplast and mitochondrial DNA from the plant to define the type of grape (variety). Primers similar to those described for the bacteria protocol above are used.
- We use the minimum entropy decomposition analysis protocol to differentiate these reads at the SNP level. With this we can group chloroplast and mitochondrial DNA reads and differentiate the type of grape in the sample by comparing the reads with our chloroplast and mitochondrial DNA database.
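- The core idea of separating reads at the SNP level by entropy can be sketched as follows: compute the Shannon entropy of each aligned position and split the reads on the most variable one. This is a single illustrative split on made-up reads with hypothetical function names; the full minimum entropy decomposition procedure repeats this kind of split iteratively and applies additional criteria not shown here.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (in bits) of the bases observed at one aligned position."""
    total = len(column)
    return -sum((n / total) * log2(n / total) for n in Counter(column).values())


def split_on_max_entropy(reads):
    """Split equal-length aligned reads on their highest-entropy position."""
    entropies = [column_entropy(col) for col in zip(*reads)]
    position = max(range(len(entropies)), key=entropies.__getitem__)
    groups = {}
    for read in reads:
        groups.setdefault(read[position], []).append(read)
    return position, groups


# Made-up aligned reads that differ at a single position.
reads = ["ACGTA", "ACGTA", "ACTTA", "ACTTA"]
position, groups = split_on_max_entropy(reads)

print(position, groups)
```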
- A genomic soil test identifies all the bacteria and yeast unique to a specific terroir. The result is presented in a digital report accessible through a private session on the proprietary portal.
- Users will unveil the wine-related fermentation species of bacteria & yeast, and will detect potential biological contamination.
- The benefits of this service are:
-
- Identify the native Micro Wine Makers (MWM) or fermentation species in the soil which make your wine unique.
- Compare different areas of a vineyard or different vineyards to characterize local scale differences in the microbial terroir.
- Compare a soil microbiome to other regions
- Estimate the organoleptic potential of a wine
- Assess necessity of inoculums and sulfur doses
- Anticipate contamination due to unwanted microorganisms
- This service will allow collection of data from vineyard soils in different parts of the world, increasing the amount of data and empowering a geo-map.
- This methodology defines the Genome of the wine, a genuine genetic DNA footprint, which could be included as a label on the bottle and will provide a new and innovative tool to identify and differentiate wines. The DNA of wine can be used to target consumers and rank wines. It creates a microbiological fingerprint of the wine along the winemaking process, from soil to bottle, creating a unique identity of the wine which can be labelled as the Wine's Genome.
- As we better understand the influence of the microbiome on wine, conclusions can be drawn, for example that specific species are present in quality vineyards or in a specific wine region. Specific bio-fertilizers that replicate the conditions of a quality vineyard can then be produced and used.
- Also, bio-based control tools designed to avoid possible problems in certain phases of the vinification process can be applied. For example, depending on our analysis of the soil microbiome, we can state whether that soil has organic properties and has been cultivated in an environmentally sustainable way. The “Genetic Friendly Label” is our first labelling product and is used for soil quality assessment at a given moment.
- The specification further incorporates by reference a Sequence Listing including sequences described in paragraph [00362] (at Table 1), paragraph [00363] (at Table 2), paragraph [00399] (at Table 3), paragraph [00400] (at Table 4), paragraph [00405] (at Table 5), paragraph [00468] (at Table 7), and paragraph [00469] (at Table 8). The Sequence Listing .xml file, identified as BMKR-P00-U52, is 46,919 bytes in size and was created on Mar. 7, 2023.
Claims (20)
1. A method comprising:
receiving a set of samples comprising agricultural material;
extracting nucleic acid material from each of the set of samples, with use of a bead-beating homogenization process;
barcoding said nucleic acid material with a barcoding process configured to correct sequencing errors and detect multi-bit errors and increase sequencing depth performance;
amplifying said nucleic acid material in coordination with said barcoding, in order to generate a 16S library and an internal transcribed spacer (ITS) library from said nucleic acid material;
pooling material of the 16S library with material of the ITS library to generate a pooled library;
sequencing the pooled library within a single run of a high-throughput sequencer, thereby obtaining a set of nucleic acid sequence reads of 16S and ITS genes of microorganisms represented in the set of samples; and
generating one or more clusters upon clustering reads of the set of nucleic acid sequence reads; and
selecting a representative sequence from each of the one or more clusters to return a characterization of bacterial and fungal microorganism abundances for each of the set of samples.
2. The method of claim 1, further comprising returning a report characterizing a microbiome profile from the characterization.
3. The method of claim 1, further comprising returning a personalized product prescription from the characterization, the personalized product prescription configured to improve production of an agricultural product associated with the set of samples.
4. The method of claim 3, wherein the personalized product prescription is configured to improve characteristics of at least one of nitrogen fixation and carbon fixation in an environment associated with the set of samples.
5. The method of claim 3, wherein the agricultural product comprises at least one of: a sugar crop and a starch-producing crop.
6. The method of claim 1, wherein said agricultural material comprises soil.
7. The method of claim 1, wherein said agricultural material comprises a plant part comprising at least one of: a leaf part, a stem part, a root part, and a seed.
8. The method of claim 1, wherein said agricultural material comprises material of a food production process.
9. The method of claim 1, wherein said characterization of bacterial and fungal microorganism abundances comprises characterizing microorganisms from the group consisting essentially of: a single-celled organism, a bacteria, an archaea, a protozoan, a unicellular fungus and a protist.
10. The method of claim 1, wherein clustering reads of the set of nucleic acid sequence reads comprises clustering sequences exhibiting a threshold level of similarity, and selecting a representative sequence for each cluster for taxonomic assignment.
11. The method of claim 1, wherein said sequencing comprises employing a long-read sequencing platform.
12. The method of claim 1, wherein receiving the set of samples comprises providing a kit comprising containers for sample reception, reagents for sample reception, and instructions for method performance executable by way of a computer-readable medium.
13. The method of claim 1, wherein returning the characterization comprises returning the characterization based upon the representative sequences and auxiliary data comprising: geographical information and climate.
14. The method of claim 1, wherein the barcoding process comprises a double-index barcoding process implementing tagging with a first class of Hamming codes and a second class of Golay codes.
15. A method comprising:
a) receiving a set of samples comprising agricultural material;
b) extracting nucleic acid material from each of the set of samples, with use of a homogenization process;
c) barcoding said nucleic acid material with a barcoding process configured to correct sequencing errors and detect multi-bit errors and increase sequencing depth performance;
d) amplifying said extracted and barcoded nucleic acid material to generate a 16S library and an internal transcribed spacer (ITS) library from said nucleic acid material;
e) pooling the material of the 16S library with the material of the ITS library to generate a pooled library;
f) sequencing the pooled library within a single run of a high-throughput sequencer, thereby obtaining a set of nucleic acid sequence reads of 16S and ITS genes of microorganisms represented in the set of samples; and
g) applying the set of sequence reads representing microorganisms in each sample to a trained machine learning model, wherein the trained machine learning model comprises network architecture and regression architecture and is trained on 16S and ITS sequence data associated with soil status, agriculture product quality, contamination, and treatment response,
the output of the trained machine learning model providing characterization of bacterial and fungal microorganism abundances for each of the set of samples.
16. The method of claim 15 , wherein said agricultural material comprises at least one of: soil, a leaf part, a stem part, a root part, and a seed.
17. The method of claim 15 , wherein the trained machine learning model comprises classification architecture for classification of soil nutrients associated with the set of samples.
18. The method of claim 15 , wherein the trained machine learning model comprises classification architecture for classification of crop disease states associated with the set of samples.
19. The method of claim 15 , wherein the trained machine learning model comprises at least one of support vector machine architecture and deep-layered belief nets architecture.
20. The method of claim 15 , wherein the output characterizes a microbiome profile for the set of samples and provides a personalized product prescription configured to improve production of an agricultural product associated with the set of samples.
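Illustrative note on the error-correcting barcodes recited in claims 14 and 15(c): the claims name Hamming and Golay codes but do not specify an encoding. The following is a minimal sketch, assuming an extended Hamming(8,4) code and a hypothetical 2-bit-per-nucleotide mapping (A=00, C=01, G=10, T=11). The function names `encode`, `decode`, `to_dna`, and `from_dna` are invented for illustration and are not taken from the application. The sketch shows how one flipped bit in a barcode can be corrected while two flipped bits can be detected and flagged.

```python
# Sketch only: extended Hamming(8,4) barcodes (single-error correction,
# double-error detection). The nucleotide mapping and all function names
# are illustrative assumptions, not the patented barcoding process.

NUC = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def encode(data):
    """Encode 4 data bits into an 8-bit extended Hamming codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4          # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # parity over positions 4, 5, 6, 7
    word = [p1, p2, d1, p3, d2, d3, d4]
    overall = 0
    for bit in word:
        overall ^= bit         # extra bit makes the total parity even
    return word + [overall]


def decode(word):
    """Return (data_bits, status): 'ok', 'corrected', or 'uncorrectable'."""
    w = list(word)
    syndrome = 0
    for pos in range(1, 8):    # XOR of the 1-based positions that hold a 1
        if w[pos - 1]:
            syndrome ^= pos
    parity = 0
    for bit in w:
        parity ^= bit          # 0 when the 8-bit word has even parity
    if syndrome == 0 and parity == 0:
        status = "ok"
    elif parity == 1:
        # Odd parity means a single flipped bit; the syndrome gives its position
        # (a zero syndrome means the overall parity bit itself was flipped).
        if syndrome:
            w[syndrome - 1] ^= 1
        status = "corrected"
    else:
        # Even parity with a nonzero syndrome means two flipped bits.
        return None, "uncorrectable"
    return [w[2], w[4], w[5], w[6]], status


def to_dna(word):
    """Map an 8-bit codeword to a 4-nucleotide barcode."""
    return "".join(NUC[2 * word[i] + word[i + 1]] for i in range(0, 8, 2))


def from_dna(seq):
    """Map a 4-nucleotide barcode back to 8 bits."""
    bits = []
    for ch in seq:
        value = NUC.index(ch)
        bits += [value >> 1, value & 1]
    return bits
```

Under this assumed mapping, a nucleotide substitution may flip one bit (correctable) or two bits (detected only), so a read whose barcode decodes as `'uncorrectable'` would be discarded rather than assigned to a sample. A practical barcode set would use longer codewords, and a Golay code for the second index as claim 14 recites.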
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/823,435 US20230407409A1 (en) | 2015-12-04 | 2022-08-30 | Microbiome based identification, monitoring and enhancement of fermentation processes and products |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201562263488P | 2015-12-04 | 2015-12-04 | |
PCT/US2016/064984 WO2017096385A1 (en) | 2015-12-04 | 2016-12-05 | Microbiome based identification, monitoring and enhancement of fermentation processes and products |
US201815779531A | 2018-05-28 | 2018-05-28 | |
US17/823,435 US20230407409A1 (en) | 2015-12-04 | 2022-08-30 | Microbiome based identification, monitoring and enhancement of fermentation processes and products |
Related Parent Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/779,531 Continuation US11492672B2 (en) | 2015-12-04 | 2016-12-05 | Microbiome based identification, monitoring and enhancement of fermentation processes and products |
PCT/US2016/064984 Continuation WO2017096385A1 (en) | 2015-12-04 | 2016-12-05 | Microbiome based identification, monitoring and enhancement of fermentation processes and products |
Publications (1)
Publication Number | Publication Date |
---|---|
US20230407409A1 (en) | 2023-12-21 |
Family
ID=58798033
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/779,531 Active 2040-03-19 US11492672B2 (en) | 2015-12-04 | 2016-12-05 | Microbiome based identification, monitoring and enhancement of fermentation processes and products |
US17/823,435 Pending US20230407409A1 (en) | 2015-12-04 | 2022-08-30 | Microbiome based identification, monitoring and enhancement of fermentation processes and products |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/779,531 Active 2040-03-19 US11492672B2 (en) | 2015-12-04 | 2016-12-05 | Microbiome based identification, monitoring and enhancement of fermentation processes and products |
Country Status (3)
Country | Link |
---|---|
US (2) | US11492672B2 (en) |
EP (1) | EP3384025A4 (en) |
WO (1) | WO2017096385A1 (en) |
Families Citing this family (40)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9526480B2 (en) * | 2013-11-27 | 2016-12-27 | Elwha Llc | Devices and methods for profiling microbiota of skin |
US11015154B2 (en) | 2016-11-09 | 2021-05-25 | The Regents Of The University Of California | Methods for identifying interactions amongst microorganisms |
KR20230152172A (en) * | 2017-03-19 | 2023-11-02 | 오펙-에슈콜롯 리서치 앤드 디벨롭먼트 엘티디 | System and method for generating filters for k-mismatch search |
US10943182B2 (en) * | 2017-03-27 | 2021-03-09 | International Business Machines Corporation | Cognitive screening of EOR additives |
WO2019067558A1 (en) * | 2017-09-28 | 2019-04-04 | Precision Fermentation, Inc. | Methods, devices and computer program products for yeast performance monitoring in fermentation systems |
WO2019126824A1 (en) | 2017-12-22 | 2019-06-27 | Trace Genomics, Inc. | Metagenomics for microbiomes |
AU2019212700B2 (en) * | 2018-01-25 | 2022-06-16 | Trace Genomics, Inc. | Soil health indicators using microbial composition |
CN108627272B (en) * | 2018-03-22 | 2020-04-24 | 北京航空航天大学 | Two-dimensional temperature distribution reconstruction method based on four-angle laser absorption spectrum |
US10719782B2 (en) | 2018-05-09 | 2020-07-21 | International Business Machines Corporation | Chemical EOR materials database architecture and method for screening EOR materials |
EP3847651A1 (en) * | 2018-09-07 | 2021-07-14 | Advanced Biological Marketing Inc. | Microbiome-based tracking system and methods relating thereto |
CN109593865A (en) * | 2018-10-25 | 2019-04-09 | 华中科技大学鄂州工业技术研究院 | The analysis of marine coral Bacterial community, gene excavating method and equipment |
CN109686406A (en) * | 2018-11-12 | 2019-04-26 | 山东省医学科学院基础医学研究所 | A kind of phylogenetic tree figure production method and system |
CN109266769A (en) * | 2018-11-19 | 2019-01-25 | 南京工业大学 | A kind of primer and its application for identifying Azospirillum sp.TSA2S bacterial strain |
EP3861136A1 (en) * | 2018-11-30 | 2021-08-11 | Orvinum AG | Method for providing an identifier for a product |
EP3763828A1 (en) * | 2019-07-08 | 2021-01-13 | Nemri, Adnane | Method for monitoring fermentation processes, apparatus, and system therefore |
US11692989B2 (en) * | 2019-07-11 | 2023-07-04 | Locus Solutions Ipco, Llc | Use of soil and other environmental data to recommend customized agronomic programs |
US10783559B1 (en) | 2019-10-06 | 2020-09-22 | Bao Tran | Mobile information display platforms |
CN110827917B (en) * | 2019-11-06 | 2023-10-20 | 华中科技大学鄂州工业技术研究院 | SNP-based method for identifying individual intestinal flora type |
US11933775B2 (en) | 2019-12-12 | 2024-03-19 | Biome Makers Inc. | Methods and systems for evaluating ecological disturbance of an agricultural microbiome based upon network properties of organism communities |
CN111117910B (en) * | 2019-12-27 | 2021-08-31 | 江西农业大学 | Enterobacter ludwigii PN6 and application thereof |
CN111378779A (en) * | 2020-04-09 | 2020-07-07 | 中国科学院微生物研究所 | Verticillium polygene pedigree screening method |
CN111933218B (en) * | 2020-07-01 | 2022-03-29 | 广州基迪奥生物科技有限公司 | Optimized metagenome binding method for analyzing microbial community |
CA3199664A1 (en) * | 2020-12-03 | 2022-06-09 | Marc Rodriguez | Compostions and methods for biological production and harvest of precious metals, platinum group elements, and rare earth elements |
AU2021397247A1 (en) * | 2020-12-08 | 2023-06-22 | Pluton Biosciences, Inc. | Producing functional microbial consortia |
US20240104947A1 (en) * | 2020-12-14 | 2024-03-28 | Mars, Incorporated | Systems and methods for classifying food products |
US11158417B1 (en) * | 2020-12-29 | 2021-10-26 | Kpn Innovations, Llc. | System and method for generating a digestive disease nourishment program |
US20220208375A1 (en) * | 2020-12-29 | 2022-06-30 | Kpn Innovations, Llc. | System and method for generating a digestive disease functional program |
CN112735530A (en) * | 2021-01-22 | 2021-04-30 | 中国科学院北京基因组研究所(国家生物信息中心) | Method for tracing sample based on flora structure |
US20220240432A1 (en) * | 2021-01-29 | 2022-08-04 | Biome Makers Inc. | Methods and systems for predicting crop features and evaluating inputs and practices |
EP4298200A1 (en) | 2021-02-24 | 2024-01-03 | Precision Fermentation, Inc. | Devices and methods for monitoring |
WO2022187818A1 (en) * | 2021-03-03 | 2022-09-09 | Lanzatech, Inc. | System for control and analysis of gas fermentation processes |
CN112961807B (en) * | 2021-03-30 | 2023-01-20 | 中国科学院成都生物研究所 | Microbial composition and application thereof in promoting germination and growth of highland barley seeds |
WO2022212156A1 (en) * | 2021-03-31 | 2022-10-06 | Biome Makers Inc. | Methods and systems for assessing agriculture practices and inputs with time and location factors |
CN113284560B (en) * | 2021-04-28 | 2022-05-17 | 广州微远基因科技有限公司 | Pathogenic detection background microorganism judgment method and application |
CN117813655A (en) * | 2021-08-17 | 2024-04-02 | 玛氏公司 | Metagenomic filtration and authentication of food raw materials using microbiological characteristics |
CN113782098B (en) * | 2021-09-30 | 2023-10-13 | 天津科技大学 | Edible vinegar fermentation artificial flora construction method and application |
CN113718047B (en) * | 2021-11-04 | 2022-02-18 | 艾德范思(北京)医学检验实验室有限公司 | Kit for detecting 10 bacteria in human breast milk by fluorescence quantitative method and application thereof |
CN114003752B (en) * | 2021-11-24 | 2022-11-15 | 重庆邮电大学 | Database simplification method and system based on particle ball face clustering image quality evaluation |
CN116612820B (en) * | 2023-07-20 | 2023-09-19 | 山东省滨州畜牧兽医研究院 | Dairy product production intelligent management platform based on data analysis |
CN117497065B (en) * | 2023-12-28 | 2024-04-02 | 中国农业大学 | Method for screening microorganism species for promoting regeneration of perennial grass, apparatus therefor and computer-readable storage medium |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6180339B1 (en) * | 1995-01-13 | 2001-01-30 | Bayer Corporation | Nucleic acid probes for the detection and identification of fungi |
EP1948783A1 (en) * | 2005-11-10 | 2008-07-30 | Intervet International BV | Novel bacterium and vaccine |
EP1985712A1 (en) | 2007-04-17 | 2008-10-29 | Vereniging voor christelijk hoger onderwijs, wetenschappelijk onderzoek en patiëntenzorg | Microbial population analysis |
WO2009037575A2 (en) * | 2007-04-19 | 2009-03-26 | Uti Limited Partnership | Multiplex pcr assay for identification of usa300 and usa400 community-associated methicillin resistant staphylococcal aureus strains |
AU2014232370B2 (en) * | 2013-03-15 | 2018-11-01 | Seres Therapeutics, Inc. | Network-based microbial compositions and methods |
WO2015103165A1 (en) * | 2013-12-31 | 2015-07-09 | Biota Technology, Inc. | Microbiome based systems, apparatus and methods for monitoring and controlling industrial processes and systems |
EP3140424B1 (en) * | 2014-05-06 | 2020-04-29 | IS Diagnostics LTD | Microbial population analysis |
JP6839666B2 (en) * | 2015-06-25 | 2021-03-10 | ネイティブ マイクロビアルズ, インコーポレイテッド | Methods, devices, and systems for the analysis of complex heterogeneous microbial strains, the prediction and identification of functional relationships and their interactions, and the selection and synthesis of microbial ensembles based on them. |
2016
- 2016-12-05 EP EP16871723.9A patent/EP3384025A4/en active Pending
- 2016-12-05 US US15/779,531 patent/US11492672B2/en active Active
- 2016-12-05 WO PCT/US2016/064984 patent/WO2017096385A1/en active Application Filing
2022
- 2022-08-30 US US17/823,435 patent/US20230407409A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
US20180363031A1 (en) | 2018-12-20 |
US11492672B2 (en) | 2022-11-08 |
EP3384025A1 (en) | 2018-10-10 |
WO2017096385A1 (en) | 2017-06-08 |
EP3384025A4 (en) | 2019-07-03 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US20230407409A1 (en) | Microbiome based identification, monitoring and enhancement of fermentation processes and products | |
US20210371938A1 (en) | Microbiome based systems, apparatus and methods for monitoring and controlling industrial processes and systems | |
Das et al. | Microbial diversity in the genomic era | |
Nilsson et al. | Mycobiome diversity: high-throughput sequencing and identification of fungi | |
Rusch et al. | The Sorcerer II global ocean sampling expedition: northwest Atlantic through eastern tropical Pacific | |
WO2015103165A1 (en) | Microbiome based systems, apparatus and methods for monitoring and controlling industrial processes and systems | |
Nilsson et al. | Intraspecific ITS variability in the kingdom Fungi as expressed in the international sequence databases and its implications for molecular species identification | |
Sarma | Handbook of cyanobacteria | |
Singh et al. | Advances in cyanobacterial biology | |
Medlin et al. | Methods to estimate the diversity in the marine photosynthetic protist community with illustrations from case studies: a review | |
Nguyen et al. | Foliar fungi of Betula pendula: Impact of tree species mixtures and assessment methods | |
Zaneveld et al. | Combined phylogenetic and genomic approaches for the high-throughput study of microbial habitat adaptation | |
Williamson et al. | From bacterial to microbial ecosystems (metagenomics) | |
Kowalska et al. | DNA barcoding–A new device in phycologist's toolbox | |
Ma et al. | Leaf‐associated fungal and viral communities of wild plant populations differ between cultivated and natural ecosystems | |
Abdelrhman et al. | Exploring the bacterial gut microbiota of supralittoral talitrid amphipods | |
Feghali et al. | Genetic and phenotypic characterisation of a Saccharomyces cerevisiae population of ‘Merwah’ white wine | |
Čadež et al. | Hanseniaspora smithiae sp. nov., a novel apiculate yeast species from Patagonian forests that lacks the typical genomic domestication signatures for fermentative environments | |
Combrink et al. | Best practice for wildlife gut microbiome research: A comprehensive review of methodology for 16S rRNA gene investigations | |
Liu et al. | Population diversity and genetic structure reveal patterns of host association and anthropogenic impact for the globally important fungal tree pathogen Ceratocystis manginecans | |
Kumari et al. | Cyanobacterial diversity: molecular insights under multifarious environmental conditions | |
Owen | Bacterial taxonomics: finding the wood through the phylogenetic trees | |
Zuzolo et al. | The rootstock shape microbial diversity and functionality in the rhizosphere of Vitis vinifera L. cultivar Falanghina | |
Rodrigues et al. | Molecular Diversity of Environmental Prokaryotes | |
Campos-Guillén et al. | The use of big data in the modern biology: the case of agriculture |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: BIOME MAKERS INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: BECARES, ALBERTO A.; FERNANDEZ, ADRIAN F.; REEL/FRAME: 061024/0032; Effective date: 20220901 |
| STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |