CN117897502A - Compositions and methods for detecting genetic features
- Publication number
- CN117897502A (application CN202280058627.5A)
- Authority
- CN
- China
- Prior art keywords
- gene
- fusion
- primer
- nucleic acid
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
- 238000000034 method Methods 0.000 title claims abstract description 200
- 239000000203 mixture Substances 0.000 title abstract description 24
- 230000002068 genetic effect Effects 0.000 title abstract description 11
- 125000003729 nucleotide group Chemical group 0.000 claims description 390
- 239000002773 nucleotide Substances 0.000 claims description 349
- 150000007523 nucleic acids Chemical class 0.000 claims description 313
- 102000040430 polynucleotide Human genes 0.000 claims description 293
- 108091033319 polynucleotide Proteins 0.000 claims description 293
- 239000002157 polynucleotide Substances 0.000 claims description 293
- 230000004927 fusion Effects 0.000 claims description 291
- 102000039446 nucleic acids Human genes 0.000 claims description 286
- 108020004707 nucleic acids Proteins 0.000 claims description 286
- 230000003321 amplification Effects 0.000 claims description 199
- 238000003199 nucleic acid amplification method Methods 0.000 claims description 199
- 108090000623 proteins and genes Proteins 0.000 claims description 195
- 230000000903 blocking effect Effects 0.000 claims description 166
- 238000012163 sequencing technique Methods 0.000 claims description 152
- 239000000523 sample Substances 0.000 claims description 107
- 230000000295 complement effect Effects 0.000 claims description 105
- 108020004414 DNA Proteins 0.000 claims description 59
- 108091034117 Oligonucleotide Proteins 0.000 claims description 58
- 230000027455 binding Effects 0.000 claims description 45
- 108091032973 (ribonucleotides)n+m Proteins 0.000 claims description 34
- 102000016266 T-Cell Antigen Receptors Human genes 0.000 claims description 31
- 102000004169 proteins and genes Human genes 0.000 claims description 23
- 239000002299 complementary DNA Substances 0.000 claims description 14
- 238000006073 displacement reaction Methods 0.000 claims description 13
- 238000011144 upstream manufacturing Methods 0.000 claims description 13
- 239000003795 chemical substances by application Substances 0.000 claims description 11
- 210000003719 b-lymphocyte Anatomy 0.000 claims description 10
- 230000009320 intrachromosomal translocation Effects 0.000 claims description 9
- 108091035707 Consensus sequence Proteins 0.000 claims description 8
- 238000012217 deletion Methods 0.000 claims description 5
- 230000037430 deletion Effects 0.000 claims description 5
- 108010012919 B-Cell Antigen Receptors Proteins 0.000 claims description 4
- 102000019260 B-Cell Antigen Receptors Human genes 0.000 claims description 4
- 108010092262 T-Cell Antigen Receptors Proteins 0.000 claims description 4
- 238000003780 insertion Methods 0.000 claims description 4
- 230000037431 insertion Effects 0.000 claims description 4
- 230000007547 defect Effects 0.000 claims description 2
- 230000004075 alteration Effects 0.000 abstract description 2
- 239000013615 primer Substances 0.000 description 286
- 239000000047 product Substances 0.000 description 103
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 95
- 201000010099 disease Diseases 0.000 description 92
- 201000009030 Carcinoma Diseases 0.000 description 61
- 210000004027 cell Anatomy 0.000 description 58
- 238000006243 chemical reaction Methods 0.000 description 41
- 230000002441 reversible effect Effects 0.000 description 38
- 208000011580 syndromic disease Diseases 0.000 description 37
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 36
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 36
- 238000009396 hybridization Methods 0.000 description 32
- 206010028980 Neoplasm Diseases 0.000 description 30
- 208000032839 leukemia Diseases 0.000 description 30
- 239000007787 solid Substances 0.000 description 30
- 125000005647 linker group Chemical group 0.000 description 29
- -1 DNA polymerase) Chemical class 0.000 description 28
- 108091008874 T cell receptors Proteins 0.000 description 28
- 239000000872 buffer Substances 0.000 description 27
- 208000015181 infectious disease Diseases 0.000 description 25
- 102000053602 DNA Human genes 0.000 description 24
- 102000012410 DNA Ligases Human genes 0.000 description 24
- 108010061982 DNA Ligases Proteins 0.000 description 24
- 108091028043 Nucleic acid sequence Proteins 0.000 description 24
- 230000008707 rearrangement Effects 0.000 description 24
- 239000000126 substance Substances 0.000 description 24
- 206010039491 Sarcoma Diseases 0.000 description 23
- 239000003153 chemical reaction reagent Substances 0.000 description 22
- 238000003752 polymerase chain reaction Methods 0.000 description 22
- 201000011510 cancer Diseases 0.000 description 21
- 238000001514 detection method Methods 0.000 description 20
- 230000000694 effects Effects 0.000 description 20
- 102000004190 Enzymes Human genes 0.000 description 17
- 108090000790 Enzymes Proteins 0.000 description 17
- 229940088598 enzyme Drugs 0.000 description 17
- 108091008915 immune receptors Proteins 0.000 description 16
- 102000027596 immune receptors Human genes 0.000 description 16
- 239000000463 material Substances 0.000 description 16
- 210000004369 blood Anatomy 0.000 description 15
- 239000008280 blood Substances 0.000 description 15
- 210000001519 tissue Anatomy 0.000 description 15
- JLCPHMBAVCMARE-UHFFFAOYSA-N (systematic name and SMILES string omitted) Polymers 0.000 description 14
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical group NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 14
- 201000001441 melanoma Diseases 0.000 description 14
- 108091093088 Amplicon Proteins 0.000 description 13
- 102000003960 Ligases Human genes 0.000 description 13
- 108090000364 Ligases Proteins 0.000 description 13
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical group O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 13
- 150000001875 compounds Chemical class 0.000 description 13
- 239000000975 dye Substances 0.000 description 13
- 101100112922 Candida albicans CDR3 gene Proteins 0.000 description 12
- IAZDPXIOMUYVGZ-UHFFFAOYSA-N Dimethylsulphoxide Chemical compound CS(C)=O IAZDPXIOMUYVGZ-UHFFFAOYSA-N 0.000 description 12
- 230000015572 biosynthetic process Effects 0.000 description 12
- 239000012530 fluid Substances 0.000 description 12
- 238000002560 therapeutic procedure Methods 0.000 description 12
- 238000004925 denaturation Methods 0.000 description 11
- 230000036425 denaturation Effects 0.000 description 11
- 239000007850 fluorescent dye Substances 0.000 description 11
- 239000012634 fragment Substances 0.000 description 11
- 229910019142 PO4 Inorganic materials 0.000 description 10
- 210000001744 T-lymphocyte Anatomy 0.000 description 10
- 125000003275 alpha amino acid group Chemical group 0.000 description 10
- 238000004458 analytical method Methods 0.000 description 10
- 230000001363 autoimmune Effects 0.000 description 10
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 10
- 238000012986 modification Methods 0.000 description 10
- 230000008569 process Effects 0.000 description 10
- 238000005096 rolling process Methods 0.000 description 10
- 230000007017 scission Effects 0.000 description 10
- 238000003786 synthesis reaction Methods 0.000 description 10
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 10
- 208000023275 Autoimmune disease Diseases 0.000 description 9
- 208000031981 Thrombocytopenic Idiopathic Purpura Diseases 0.000 description 9
- 230000001684 chronic effect Effects 0.000 description 9
- 239000003398 denaturant Substances 0.000 description 9
- 238000010348 incorporation Methods 0.000 description 9
- 230000003993 interaction Effects 0.000 description 9
- 230000004048 modification Effects 0.000 description 9
- 235000021317 phosphate Nutrition 0.000 description 9
- 239000010452 phosphate Substances 0.000 description 9
- 239000000758 substrate Substances 0.000 description 9
- 230000005945 translocation Effects 0.000 description 9
- 238000011282 treatment Methods 0.000 description 9
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 8
- 108060002716 Exonuclease Proteins 0.000 description 8
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 8
- 108010017842 Telomerase Proteins 0.000 description 8
- 201000003710 autoimmune thrombocytopenic purpura Diseases 0.000 description 8
- 238000003776 cleavage reaction Methods 0.000 description 8
- 102000013165 exonuclease Human genes 0.000 description 8
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 8
- ANRHNWWPFJCPAZ-UHFFFAOYSA-M thionine Chemical compound [Cl-].C1=CC(N)=CC2=[S+]C3=CC(N)=CC=C3N=C21 ANRHNWWPFJCPAZ-UHFFFAOYSA-M 0.000 description 8
- 108091008875 B cell receptors Proteins 0.000 description 7
- 208000035473 Communicable disease Diseases 0.000 description 7
- LYCAIKOWRPUZTN-UHFFFAOYSA-N Ethylene glycol Chemical compound OCCO LYCAIKOWRPUZTN-UHFFFAOYSA-N 0.000 description 7
- 108091093037 Peptide nucleic acid Chemical group 0.000 description 7
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 7
- 230000002159 abnormal effect Effects 0.000 description 7
- 239000000090 biomarker Substances 0.000 description 7
- 230000007812 deficiency Effects 0.000 description 7
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 7
- 210000004698 lymphocyte Anatomy 0.000 description 7
- 238000005259 measurement Methods 0.000 description 7
- 238000012175 pyrosequencing Methods 0.000 description 7
- 239000011541 reaction mixture Substances 0.000 description 7
- 239000007790 solid phase Substances 0.000 description 7
- 241000894007 species Species 0.000 description 7
- KDCGOANMDULRCW-UHFFFAOYSA-N 7H-purine Chemical compound N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 6
- 208000003407 Creutzfeldt-Jakob Syndrome Diseases 0.000 description 6
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 6
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical group O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 6
- 206010067584 Type 1 diabetes mellitus Diseases 0.000 description 6
- 208000009956 adenocarcinoma Diseases 0.000 description 6
- 230000001413 cellular effect Effects 0.000 description 6
- 238000007796 conventional method Methods 0.000 description 6
- 125000004122 cyclic group Chemical group 0.000 description 6
- 229940104302 cytosine Drugs 0.000 description 6
- 230000002255 enzymatic effect Effects 0.000 description 6
- 238000002474 experimental method Methods 0.000 description 6
- KWIUHFFTVRNATP-UHFFFAOYSA-N glycine betaine Chemical group C[N+](C)(C)CC([O-])=O KWIUHFFTVRNATP-UHFFFAOYSA-N 0.000 description 6
- 210000000056 organ Anatomy 0.000 description 6
- 238000011002 quantification Methods 0.000 description 6
- 239000000243 solution Substances 0.000 description 6
- 230000008685 targeting Effects 0.000 description 6
- 229940035893 uracil Drugs 0.000 description 6
- 108091026890 Coding region Proteins 0.000 description 5
- 208000020406 Creutzfeldt Jacob disease Diseases 0.000 description 5
- 208000010859 Creutzfeldt-Jakob disease Diseases 0.000 description 5
- 102000008158 DNA Ligase ATP Human genes 0.000 description 5
- 108010060248 DNA Ligase ATP Proteins 0.000 description 5
- 208000007465 Giant cell arteritis Diseases 0.000 description 5
- 241000124008 Mammalia Species 0.000 description 5
- 206010034277 Pemphigoid Diseases 0.000 description 5
- 101710086015 RNA ligase Proteins 0.000 description 5
- 206010047115 Vasculitis Diseases 0.000 description 5
- 239000000654 additive Substances 0.000 description 5
- 230000000692 anti-sense effect Effects 0.000 description 5
- 239000011324 bead Substances 0.000 description 5
- 230000000875 corresponding effect Effects 0.000 description 5
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 5
- 238000013467 fragmentation Methods 0.000 description 5
- 238000006062 fragmentation reaction Methods 0.000 description 5
- 230000006870 function Effects 0.000 description 5
- 208000028454 lice infestation Diseases 0.000 description 5
- 238000007403 mPCR Methods 0.000 description 5
- 230000003211 malignant effect Effects 0.000 description 5
- 208000023356 medullary thyroid gland carcinoma Diseases 0.000 description 5
- 201000010879 mucinous adenocarcinoma Diseases 0.000 description 5
- 201000006417 multiple sclerosis Diseases 0.000 description 5
- 210000002381 plasma Anatomy 0.000 description 5
- 229920000642 polymer Polymers 0.000 description 5
- 239000002987 primer (paints) Substances 0.000 description 5
- 230000000306 recurrent effect Effects 0.000 description 5
- 238000011160 research Methods 0.000 description 5
- 230000004044 response Effects 0.000 description 5
- 201000000306 sarcoidosis Diseases 0.000 description 5
- 238000007841 sequencing by ligation Methods 0.000 description 5
- 206010041823 squamous cell carcinoma Diseases 0.000 description 5
- 206010043207 temporal arteritis Diseases 0.000 description 5
- 230000009385 viral infection Effects 0.000 description 5
- 125000003903 2-propenyl group Chemical group [H]C([*])([H])C([H])=C([H])[H] 0.000 description 4
- 208000026872 Addison Disease Diseases 0.000 description 4
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 4
- 201000003076 Angiosarcoma Diseases 0.000 description 4
- 208000024172 Cardiovascular disease Diseases 0.000 description 4
- 108020004635 Complementary DNA Proteins 0.000 description 4
- 208000011231 Crohn disease Diseases 0.000 description 4
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 4
- OAKJQQAXSVQMHS-UHFFFAOYSA-N Hydrazine Chemical compound NN OAKJQQAXSVQMHS-UHFFFAOYSA-N 0.000 description 4
- 208000026350 Inborn Genetic disease Diseases 0.000 description 4
- 208000005615 Interstitial Cystitis Diseases 0.000 description 4
- 201000001779 Leukocyte adhesion deficiency Diseases 0.000 description 4
- 208000025174 PANDAS Diseases 0.000 description 4
- 238000012408 PCR amplification Methods 0.000 description 4
- 208000021155 Paediatric autoimmune neuropsychiatric disorders associated with streptococcal infection Diseases 0.000 description 4
- 208000029082 Pelvic Inflammatory Disease Diseases 0.000 description 4
- 108020004682 Single-Stranded DNA Proteins 0.000 description 4
- 208000021386 Sjogren Syndrome Diseases 0.000 description 4
- 208000018756 Variant Creutzfeldt-Jakob disease Diseases 0.000 description 4
- 239000002253 acid Substances 0.000 description 4
- 208000002552 acute disseminated encephalomyelitis Diseases 0.000 description 4
- 230000000996 additive effect Effects 0.000 description 4
- 208000002517 adenoid cystic carcinoma Diseases 0.000 description 4
- OIRDTQYFTABQOQ-KQYNXXCUSA-N adenosine Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O OIRDTQYFTABQOQ-KQYNXXCUSA-N 0.000 description 4
- 238000000137 annealing Methods 0.000 description 4
- 125000004429 atom Chemical group 0.000 description 4
- 208000027625 autoimmune inner ear disease Diseases 0.000 description 4
- 239000011616 biotin Substances 0.000 description 4
- 229960002685 biotin Drugs 0.000 description 4
- 235000020958 biotin Nutrition 0.000 description 4
- 230000008859 change Effects 0.000 description 4
- 210000000349 chromosome Anatomy 0.000 description 4
- 208000025302 chronic primary adrenal insufficiency Diseases 0.000 description 4
- 201000001981 dermatomyositis Diseases 0.000 description 4
- 230000029087 digestion Effects 0.000 description 4
- 206010014599 encephalitis Diseases 0.000 description 4
- 208000016361 genetic disease Diseases 0.000 description 4
- 239000011521 glass Substances 0.000 description 4
- 230000028993 immune response Effects 0.000 description 4
- 238000002372 labelling Methods 0.000 description 4
- 210000000265 leukocyte Anatomy 0.000 description 4
- 230000000670 limiting effect Effects 0.000 description 4
- 206010025135 lupus erythematosus Diseases 0.000 description 4
- 230000001404 mediated effect Effects 0.000 description 4
- 108020004999 messenger RNA Proteins 0.000 description 4
- 201000003631 narcolepsy Diseases 0.000 description 4
- 208000008795 neuromyelitis optica Diseases 0.000 description 4
- 230000036961 partial effect Effects 0.000 description 4
- ZJAOAACCNHFJAH-UHFFFAOYSA-N phosphonoformic acid Chemical class OC(=O)P(O)(O)=O ZJAOAACCNHFJAH-UHFFFAOYSA-N 0.000 description 4
- 208000028529 primary immunodeficiency disease Diseases 0.000 description 4
- 210000003491 skin Anatomy 0.000 description 4
- 235000000346 sugar Nutrition 0.000 description 4
- 229940113082 thymine Drugs 0.000 description 4
- 238000013518 transcription Methods 0.000 description 4
- 230000035897 transcription Effects 0.000 description 4
- 238000005406 washing Methods 0.000 description 4
- 208000030507 AIDS Diseases 0.000 description 3
- QTBSBXVTEAMEQO-UHFFFAOYSA-N Acetic acid Chemical compound CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 3
- 208000002874 Acne Vulgaris Diseases 0.000 description 3
- 208000024827 Alzheimer disease Diseases 0.000 description 3
- 206010001935 American trypanosomiasis Diseases 0.000 description 3
- 206010002556 Ankylosing Spondylitis Diseases 0.000 description 3
- 208000003343 Antiphospholipid Syndrome Diseases 0.000 description 3
- 201000001320 Atherosclerosis Diseases 0.000 description 3
- 208000031212 Autoimmune polyendocrinopathy Diseases 0.000 description 3
- 208000009137 Behcet syndrome Diseases 0.000 description 3
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 3
- 241000283690 Bos taurus Species 0.000 description 3
- 208000003170 Bronchiolo-Alveolar Adenocarcinoma Diseases 0.000 description 3
- 206010007134 Candida infections Diseases 0.000 description 3
- 208000024699 Chagas disease Diseases 0.000 description 3
- 208000017667 Chronic Disease Diseases 0.000 description 3
- 208000030939 Chronic inflammatory demyelinating polyneuropathy Diseases 0.000 description 3
- 208000016192 Demyelinating disease Diseases 0.000 description 3
- 201000010374 Down Syndrome Diseases 0.000 description 3
- 208000021866 Dressler syndrome Diseases 0.000 description 3
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 3
- 108700024394 Exon Proteins 0.000 description 3
- 206010018364 Glomerulonephritis Diseases 0.000 description 3
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 3
- 206010061192 Haemorrhagic fever Diseases 0.000 description 3
- 208000030836 Hashimoto thyroiditis Diseases 0.000 description 3
- 208000001258 Hemangiosarcoma Diseases 0.000 description 3
- 241000701044 Human gammaherpesvirus 4 Species 0.000 description 3
- 206010021245 Idiopathic thrombocytopenic purpura Diseases 0.000 description 3
- 108060003951 Immunoglobulin Proteins 0.000 description 3
- 208000022559 Inflammatory bowel disease Diseases 0.000 description 3
- 208000003456 Juvenile Arthritis Diseases 0.000 description 3
- 206010059176 Juvenile idiopathic arthritis Diseases 0.000 description 3
- 201000010743 Lambert-Eaton myasthenic syndrome Diseases 0.000 description 3
- 208000016604 Lyme disease Diseases 0.000 description 3
- 208000002569 Machado-Joseph Disease Diseases 0.000 description 3
- 208000007054 Medullary Carcinoma Diseases 0.000 description 3
- 208000003250 Mixed connective tissue disease Diseases 0.000 description 3
- 201000002481 Myositis Diseases 0.000 description 3
- 206010071579 Neuronal neuropathy Diseases 0.000 description 3
- 208000015914 Non-Hodgkin lymphomas Diseases 0.000 description 3
- 206010061902 Pancreatic neoplasm Diseases 0.000 description 3
- 241000721454 Pemphigus Species 0.000 description 3
- 208000031845 Pernicious anaemia Diseases 0.000 description 3
- DNIAPMSPPWPWGF-UHFFFAOYSA-N Propylene glycol Chemical compound CC(O)CO DNIAPMSPPWPWGF-UHFFFAOYSA-N 0.000 description 3
- 201000004681 Psoriasis Diseases 0.000 description 3
- 201000001263 Psoriatic Arthritis Diseases 0.000 description 3
- 208000036824 Psoriatic arthropathy Diseases 0.000 description 3
- 206010037660 Pyrexia Diseases 0.000 description 3
- 208000012322 Raynaud phenomenon Diseases 0.000 description 3
- 108091028664 Ribonucleotide Proteins 0.000 description 3
- 108010090804 Streptavidin Proteins 0.000 description 3
- RYYWUUFWQRZTIU-UHFFFAOYSA-N Thiophosphoric acid Chemical group OP(O)(S)=O RYYWUUFWQRZTIU-UHFFFAOYSA-N 0.000 description 3
- 208000002474 Tinea Diseases 0.000 description 3
- 201000005485 Toxoplasmosis Diseases 0.000 description 3
- 206010044688 Trisomy 21 Diseases 0.000 description 3
- 241000223109 Trypanosoma cruzi Species 0.000 description 3
- 208000026928 Turner syndrome Diseases 0.000 description 3
- 208000037386 Typhoid Diseases 0.000 description 3
- 208000025865 Ulcer Diseases 0.000 description 3
- 108010072685 Uracil-DNA Glycosidase Proteins 0.000 description 3
- 102100037111 Uracil-DNA glycosylase Human genes 0.000 description 3
- 101150117115 V gene Proteins 0.000 description 3
- 241000700605 Viruses Species 0.000 description 3
- 206010000496 acne Diseases 0.000 description 3
- 230000001154 acute effect Effects 0.000 description 3
- 229960000643 adenine Drugs 0.000 description 3
- 208000006673 asthma Diseases 0.000 description 3
- IQFYYKKMVGJFEH-UHFFFAOYSA-N beta-L-thymidine Natural products O=C1NC(=O)C(C)=CN1C1OC(CO)C(O)C1 IQFYYKKMVGJFEH-UHFFFAOYSA-N 0.000 description 3
- 229960003237 betaine Drugs 0.000 description 3
- 238000004422 calculation algorithm Methods 0.000 description 3
- 201000003984 candidiasis Diseases 0.000 description 3
- 208000037516 chromosome inversion disease Diseases 0.000 description 3
- 201000005795 chronic inflammatory demyelinating polyneuritis Diseases 0.000 description 3
- 230000001351 cycling effect Effects 0.000 description 3
- RGWHQCVHVJXOKC-SHYZEUOFSA-N dCTP Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](CO[P@](O)(=O)O[P@](O)(=O)OP(O)(O)=O)[C@@H](O)C1 RGWHQCVHVJXOKC-SHYZEUOFSA-N 0.000 description 3
- HAAZLUGHYHWQIW-KVQBGUIXSA-N dGTP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 HAAZLUGHYHWQIW-KVQBGUIXSA-N 0.000 description 3
- 230000007850 degeneration Effects 0.000 description 3
- 239000005547 deoxyribonucleotide Substances 0.000 description 3
- 208000035475 disorder Diseases 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 210000001508 eye Anatomy 0.000 description 3
- 230000014509 gene expression Effects 0.000 description 3
- 208000005017 glioblastoma Diseases 0.000 description 3
- 210000004209 hair Anatomy 0.000 description 3
- 239000001257 hydrogen Substances 0.000 description 3
- 229910052739 hydrogen Inorganic materials 0.000 description 3
- 210000000987 immune system Anatomy 0.000 description 3
- 102000018358 immunoglobulin Human genes 0.000 description 3
- 208000033065 inborn errors of immunity Diseases 0.000 description 3
- 201000008319 inclusion body myositis Diseases 0.000 description 3
- 206010022000 influenza Diseases 0.000 description 3
- 239000000543 intermediate Substances 0.000 description 3
- 230000002427 irreversible effect Effects 0.000 description 3
- 208000003747 lymphoid leukemia Diseases 0.000 description 3
- 230000036210 malignancy Effects 0.000 description 3
- 230000007246 mechanism Effects 0.000 description 3
- 229910052751 metal Inorganic materials 0.000 description 3
- 239000002184 metal Substances 0.000 description 3
- 125000001434 methanylylidene group Chemical group [H]C#[*] 0.000 description 3
- 125000004184 methoxymethyl group Chemical group [H]C([H])([H])OC([H])([H])* 0.000 description 3
- 238000012544 monitoring process Methods 0.000 description 3
- 206010028417 myasthenia gravis Diseases 0.000 description 3
- 208000025113 myeloid leukemia Diseases 0.000 description 3
- 201000009240 nasopharyngitis Diseases 0.000 description 3
- 230000004770 neurodegeneration Effects 0.000 description 3
- 208000015122 neurodegenerative disease Diseases 0.000 description 3
- 238000007481 next generation sequencing Methods 0.000 description 3
- 239000001301 oxygen Substances 0.000 description 3
- 229910052760 oxygen Inorganic materials 0.000 description 3
- 201000002528 pancreatic cancer Diseases 0.000 description 3
- 239000002245 particle Substances 0.000 description 3
- 210000005259 peripheral blood Anatomy 0.000 description 3
- 239000011886 peripheral blood Substances 0.000 description 3
- 208000033808 peripheral neuropathy Diseases 0.000 description 3
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 3
- 150000004713 phosphodiesters Chemical class 0.000 description 3
- 239000004033 plastic Substances 0.000 description 3
- 229920003023 plastic Polymers 0.000 description 3
- 238000006116 polymerization reaction Methods 0.000 description 3
- 208000005987 polymyositis Diseases 0.000 description 3
- 229920001184 polypeptide Polymers 0.000 description 3
- 230000035935 pregnancy Effects 0.000 description 3
- 108090000765 processed proteins & peptides Proteins 0.000 description 3
- 102000004196 processed proteins & peptides Human genes 0.000 description 3
- 150000003230 pyrimidines Chemical class 0.000 description 3
- 230000005855 radiation Effects 0.000 description 3
- 102000005962 receptors Human genes 0.000 description 3
- 108020003175 receptors Proteins 0.000 description 3
- 230000002829 reductive effect Effects 0.000 description 3
- 230000001105 regulatory effect Effects 0.000 description 3
- 230000004043 responsiveness Effects 0.000 description 3
- 108091008146 restriction endonucleases Proteins 0.000 description 3
- 239000002336 ribonucleotide Substances 0.000 description 3
- 125000002652 ribonucleotide group Chemical group 0.000 description 3
- 238000007363 ring formation reaction Methods 0.000 description 3
- 201000007416 salivary gland adenoid cystic carcinoma Diseases 0.000 description 3
- 230000035945 sensitivity Effects 0.000 description 3
- 238000000926 separation method Methods 0.000 description 3
- 239000000377 silicon dioxide Substances 0.000 description 3
- 201000000849 skin cancer Diseases 0.000 description 3
- 201000008261 skin carcinoma Diseases 0.000 description 3
- 239000011734 sodium Substances 0.000 description 3
- 230000000392 somatic effect Effects 0.000 description 3
- 230000009870 specific binding Effects 0.000 description 3
- 238000006467 substitution reaction Methods 0.000 description 3
- 208000006379 syphilis Diseases 0.000 description 3
- 201000000596 systemic lupus erythematosus Diseases 0.000 description 3
- 229940104230 thymidine Drugs 0.000 description 3
- 238000012546 transfer Methods 0.000 description 3
- 125000000025 triisopropylsilyl group Chemical group C(C)(C)[Si](C(C)C)(C(C)C)* 0.000 description 3
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 3
- 201000008297 typhoid fever Diseases 0.000 description 3
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 3
- UHDGCWIWMRVCDJ-UHFFFAOYSA-N 1-beta-D-Xylofuranosyl-NH-Cytosine Natural products O=C1N=C(N)C=CN1C1C(O)C(O)C(CO)O1 UHDGCWIWMRVCDJ-UHFFFAOYSA-N 0.000 description 2
- JKMHFZQWWAIEOD-UHFFFAOYSA-N 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 2
- UXFQFBNBSPQBJW-UHFFFAOYSA-N 2-amino-2-methylpropane-1,3-diol Chemical compound OCC(N)(C)CO UXFQFBNBSPQBJW-UHFFFAOYSA-N 0.000 description 2
- ACERFIHBIWMFOR-UHFFFAOYSA-N 2-hydroxy-3-[(1-hydroxy-2-methylpropan-2-yl)azaniumyl]propane-1-sulfonate Chemical compound OCC(C)(C)NCC(O)CS(O)(=O)=O ACERFIHBIWMFOR-UHFFFAOYSA-N 0.000 description 2
- DVLFYONBTKHTER-UHFFFAOYSA-N 3-(N-morpholino)propanesulfonic acid Chemical compound OS(=O)(=O)CCCN1CCOCC1 DVLFYONBTKHTER-UHFFFAOYSA-N 0.000 description 2
- XNPKNHHFCKSMRV-UHFFFAOYSA-N 4-(cyclohexylamino)butane-1-sulfonic acid Chemical compound OS(=O)(=O)CCCCNC1CCCCC1 XNPKNHHFCKSMRV-UHFFFAOYSA-N 0.000 description 2
- PEHVGBZKEYRQSX-UHFFFAOYSA-N 7-deaza-adenine Chemical compound NC1=NC=NC2=C1C=CN2 PEHVGBZKEYRQSX-UHFFFAOYSA-N 0.000 description 2
- 206010069754 Acquired gene mutation Diseases 0.000 description 2
- 208000030090 Acute Disease Diseases 0.000 description 2
- 208000036762 Acute promyelocytic leukaemia Diseases 0.000 description 2
- 229930024421 Adenine Natural products 0.000 description 2
- 208000008190 Agammaglobulinemia Diseases 0.000 description 2
- 201000010000 Agranulocytosis Diseases 0.000 description 2
- 102100034452 Alternative prion protein Human genes 0.000 description 2
- 208000031277 Amaurotic familial idiocy Diseases 0.000 description 2
- 241000224489 Amoeba Species 0.000 description 2
- 102100021569 Apoptosis regulator Bcl-2 Human genes 0.000 description 2
- 206010003827 Autoimmune hepatitis Diseases 0.000 description 2
- 208000000659 Autoimmune lymphoproliferative syndrome Diseases 0.000 description 2
- 206010064539 Autoimmune myocarditis Diseases 0.000 description 2
- 206010069002 Autoimmune pancreatitis Diseases 0.000 description 2
- 208000022106 Autoimmune polyendocrinopathy type 2 Diseases 0.000 description 2
- 102100024222 B-lymphocyte antigen CD19 Human genes 0.000 description 2
- 102100022005 B-lymphocyte antigen CD20 Human genes 0.000 description 2
- 108091012583 BCL2 Proteins 0.000 description 2
- 208000032791 BCR-ABL1 positive chronic myelogenous leukemia Diseases 0.000 description 2
- 208000004429 Bacillary Dysentery Diseases 0.000 description 2
- 241000894006 Bacteria Species 0.000 description 2
- 208000009299 Benign Mucous Membrane Pemphigoid Diseases 0.000 description 2
- 208000008439 Biliary Liver Cirrhosis Diseases 0.000 description 2
- 208000033222 Biliary cirrhosis primary Diseases 0.000 description 2
- 206010005098 Blastomycosis Diseases 0.000 description 2
- 206010005913 Body tinea Diseases 0.000 description 2
- BTBUEUYNUDRHOZ-UHFFFAOYSA-N Borate Chemical compound [O-]B([O-])[O-] BTBUEUYNUDRHOZ-UHFFFAOYSA-N 0.000 description 2
- 208000003508 Botulism Diseases 0.000 description 2
- 208000003174 Brain Neoplasms Diseases 0.000 description 2
- 206010058354 Bronchioloalveolar carcinoma Diseases 0.000 description 2
- 239000002126 C01EB10 - Adenosine Substances 0.000 description 2
- 102000017420 CD3 protein, epsilon/gamma/delta subunit Human genes 0.000 description 2
- 108050005493 CD3 protein, epsilon/gamma/delta subunit Proteins 0.000 description 2
- 238000010356 CRISPR-Cas9 genome editing Methods 0.000 description 2
- 241000222122 Candida albicans Species 0.000 description 2
- 208000009458 Carcinoma in Situ Diseases 0.000 description 2
- 208000005024 Castleman disease Diseases 0.000 description 2
- 108091006146 Channels Proteins 0.000 description 2
- 206010008609 Cholangitis sclerosing Diseases 0.000 description 2
- 208000006332 Choriocarcinoma Diseases 0.000 description 2
- 208000010833 Chronic myeloid leukaemia Diseases 0.000 description 2
- 201000000724 Chronic recurrent multifocal osteomyelitis Diseases 0.000 description 2
- 108020004638 Circular DNA Proteins 0.000 description 2
- 208000015943 Coeliac disease Diseases 0.000 description 2
- 206010009900 Colitis ulcerative Diseases 0.000 description 2
- 206010009944 Colon cancer Diseases 0.000 description 2
- 208000009802 Colorado tick fever Diseases 0.000 description 2
- 201000003874 Common Variable Immunodeficiency Diseases 0.000 description 2
- 206010010741 Conjunctivitis Diseases 0.000 description 2
- 208000001528 Coronaviridae Infections Diseases 0.000 description 2
- 208000008953 Cryptosporidiosis Diseases 0.000 description 2
- 206010011502 Cryptosporidiosis infection Diseases 0.000 description 2
- UHDGCWIWMRVCDJ-PSQAKQOGSA-N Cytidine Natural products O=C1N=C(N)C=CN1[C@@H]1[C@@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-PSQAKQOGSA-N 0.000 description 2
- 230000006820 DNA synthesis Effects 0.000 description 2
- 102000052510 DNA-Binding Proteins Human genes 0.000 description 2
- 108700020911 DNA-Binding Proteins Proteins 0.000 description 2
- AHCYMLUZIRLXAA-SHYZEUOFSA-N Deoxyuridine 5'-triphosphate Chemical compound O1[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C[C@@H]1N1C(=O)NC(=O)C=C1 AHCYMLUZIRLXAA-SHYZEUOFSA-N 0.000 description 2
- 206010012468 Dermatitis herpetiformis Diseases 0.000 description 2
- 201000009273 Endometriosis Diseases 0.000 description 2
- 102100031780 Endonuclease Human genes 0.000 description 2
- 206010014954 Eosinophilic fasciitis Diseases 0.000 description 2
- 208000018428 Eosinophilic granulomatosis with polyangiitis Diseases 0.000 description 2
- 206010064212 Eosinophilic oesophagitis Diseases 0.000 description 2
- 206010014979 Epidemic typhus Diseases 0.000 description 2
- 206010015150 Erythema Diseases 0.000 description 2
- 208000007985 Erythema Infectiosum Diseases 0.000 description 2
- 206010015226 Erythema nodosum Diseases 0.000 description 2
- 241000588724 Escherichia coli Species 0.000 description 2
- 101000939283 Escherichia coli (strain K12) Protein UmuC Proteins 0.000 description 2
- 101000939288 Escherichia coli (strain K12) Protein UmuD Proteins 0.000 description 2
- 208000004332 Evans syndrome Diseases 0.000 description 2
- 208000006168 Ewing Sarcoma Diseases 0.000 description 2
- 201000005866 Exanthema Subitum Diseases 0.000 description 2
- 201000006353 Filariasis Diseases 0.000 description 2
- 201000011240 Frontotemporal dementia Diseases 0.000 description 2
- 206010017533 Fungal infection Diseases 0.000 description 2
- 206010017711 Gangrene Diseases 0.000 description 2
- 206010018687 Granulocytopenia Diseases 0.000 description 2
- ZRALSGWEFCBTJO-UHFFFAOYSA-N Guanidine Chemical compound NC(N)=N ZRALSGWEFCBTJO-UHFFFAOYSA-N 0.000 description 2
- 108020005004 Guide RNA Proteins 0.000 description 2
- 208000020061 Hand, Foot and Mouth Disease Diseases 0.000 description 2
- 208000025713 Hand-foot-and-mouth disease Diseases 0.000 description 2
- 206010019143 Hantavirus pulmonary infection Diseases 0.000 description 2
- 206010019263 Heart block congenital Diseases 0.000 description 2
- 206010019280 Heart failures Diseases 0.000 description 2
- 208000032759 Hemolytic-Uremic Syndrome Diseases 0.000 description 2
- 208000032982 Hemorrhagic Fever with Renal Syndrome Diseases 0.000 description 2
- 208000010747 Hodgkins lymphoma Diseases 0.000 description 2
- 101000980825 Homo sapiens B-lymphocyte antigen CD19 Proteins 0.000 description 2
- 101000897405 Homo sapiens B-lymphocyte antigen CD20 Proteins 0.000 description 2
- 101000599852 Homo sapiens Intercellular adhesion molecule 1 Proteins 0.000 description 2
- 101001057504 Homo sapiens Interferon-stimulated gene 20 kDa protein Proteins 0.000 description 2
- 101001055144 Homo sapiens Interleukin-2 receptor subunit alpha Proteins 0.000 description 2
- 101001018097 Homo sapiens L-selectin Proteins 0.000 description 2
- 101000946889 Homo sapiens Monocyte differentiation antigen CD14 Proteins 0.000 description 2
- 101000934346 Homo sapiens T-cell surface antigen CD2 Proteins 0.000 description 2
- 101000716102 Homo sapiens T-cell surface glycoprotein CD4 Proteins 0.000 description 2
- 101000946843 Homo sapiens T-cell surface glycoprotein CD8 alpha chain Proteins 0.000 description 2
- 101000914514 Homo sapiens T-cell-specific surface glycoprotein CD28 Proteins 0.000 description 2
- 101000851370 Homo sapiens Tumor necrosis factor receptor superfamily member 9 Proteins 0.000 description 2
- 241000701806 Human papillomavirus Species 0.000 description 2
- 208000023105 Huntington disease Diseases 0.000 description 2
- VEXZGXHMUGYJMC-UHFFFAOYSA-N Hydrochloric acid Chemical compound Cl VEXZGXHMUGYJMC-UHFFFAOYSA-N 0.000 description 2
- 206010020772 Hypertension Diseases 0.000 description 2
- 206010020983 Hypogammaglobulinaemia Diseases 0.000 description 2
- XQFRJNBWHJMXHO-RRKCRQDMSA-N IDUR Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C(I)=C1 XQFRJNBWHJMXHO-RRKCRQDMSA-N 0.000 description 2
- 208000010159 IgA glomerulonephritis Diseases 0.000 description 2
- 206010021263 IgA nephropathy Diseases 0.000 description 2
- 206010053574 Immunoblastic lymphoma Diseases 0.000 description 2
- 206010061598 Immunodeficiency Diseases 0.000 description 2
- 208000029462 Immunodeficiency disease Diseases 0.000 description 2
- 208000004187 Immunoglobulin G4-Related Disease Diseases 0.000 description 2
- 206010061218 Inflammation Diseases 0.000 description 2
- 102100037877 Intercellular adhesion molecule 1 Human genes 0.000 description 2
- 108010074328 Interferon-gamma Proteins 0.000 description 2
- 102000008070 Interferon-gamma Human genes 0.000 description 2
- 102100027268 Interferon-stimulated gene 20 kDa protein Human genes 0.000 description 2
- 208000007766 Kaposi sarcoma Diseases 0.000 description 2
- 208000011200 Kawasaki disease Diseases 0.000 description 2
- 102100033467 L-selectin Human genes 0.000 description 2
- 241000589248 Legionella Species 0.000 description 2
- 208000007764 Legionnaires' Disease Diseases 0.000 description 2
- 208000009829 Lewy Body Disease Diseases 0.000 description 2
- 201000002832 Lewy body dementia Diseases 0.000 description 2
- 206010024434 Lichen sclerosus Diseases 0.000 description 2
- 208000012309 Linear IgA disease Diseases 0.000 description 2
- 108060001084 Luciferase Proteins 0.000 description 2
- 239000005089 Luciferase Substances 0.000 description 2
- 206010025323 Lymphomas Diseases 0.000 description 2
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 2
- 208000000932 Marburg Virus Disease Diseases 0.000 description 2
- 201000011013 Marburg hemorrhagic fever Diseases 0.000 description 2
- 201000005505 Measles Diseases 0.000 description 2
- XUMBMVFBXHLACL-UHFFFAOYSA-N Melanin Chemical compound O=C1C(=O)C(C2=CNC3=C(C(C(=O)C4=C32)=O)C)=C2C4=CNC2=C1C XUMBMVFBXHLACL-UHFFFAOYSA-N 0.000 description 2
- 208000027530 Meniere disease Diseases 0.000 description 2
- 208000025370 Middle East respiratory syndrome Diseases 0.000 description 2
- 102100035877 Monocyte differentiation antigen CD14 Human genes 0.000 description 2
- 208000024599 Mooren ulcer Diseases 0.000 description 2
- 208000034578 Multiple myelomas Diseases 0.000 description 2
- 208000031888 Mycoses Diseases 0.000 description 2
- 208000033761 Myelogenous Chronic BCR-ABL Positive Leukemia Diseases 0.000 description 2
- 208000033776 Myeloid Acute Leukemia Diseases 0.000 description 2
- MKWKNSIESPFAQN-UHFFFAOYSA-N N-cyclohexyl-2-aminoethanesulfonic acid Chemical compound OS(=O)(=O)CCNC1CCCCC1 MKWKNSIESPFAQN-UHFFFAOYSA-N 0.000 description 2
- LFTLOKWAGJYHHR-UHFFFAOYSA-N N-methylmorpholine N-oxide Chemical compound CN1(=O)CCOCC1 LFTLOKWAGJYHHR-UHFFFAOYSA-N 0.000 description 2
- 206010028851 Necrosis Diseases 0.000 description 2
- 206010029260 Neuroblastoma Diseases 0.000 description 2
- 208000002537 Neuronal Ceroid-Lipofuscinoses Diseases 0.000 description 2
- 241001263478 Norovirus Species 0.000 description 2
- 108091005461 Nucleic proteins Proteins 0.000 description 2
- 241000243985 Onchocerca volvulus Species 0.000 description 2
- 208000010195 Onychomycosis Diseases 0.000 description 2
- 208000003435 Optic Neuritis Diseases 0.000 description 2
- 206010033128 Ovarian cancer Diseases 0.000 description 2
- 206010061535 Ovarian neoplasm Diseases 0.000 description 2
- 206010053869 POEMS syndrome Diseases 0.000 description 2
- 241000517324 Pediculidae Species 0.000 description 2
- 241000517307 Pediculus humanus Species 0.000 description 2
- 108091005804 Peptidases Proteins 0.000 description 2
- 102000035195 Peptidases Human genes 0.000 description 2
- 108091000080 Phosphotransferase Proteins 0.000 description 2
- 206010035226 Plasma cell myeloma Diseases 0.000 description 2
- 206010035664 Pneumonia Diseases 0.000 description 2
- 239000002202 Polyethylene glycol Substances 0.000 description 2
- 108010059820 Polygalacturonase Proteins 0.000 description 2
- 239000004743 Polypropylene Substances 0.000 description 2
- WCUXLLCKKVVCTQ-UHFFFAOYSA-M Potassium chloride Chemical compound [Cl-].[K+] WCUXLLCKKVVCTQ-UHFFFAOYSA-M 0.000 description 2
- 208000012654 Primary biliary cholangitis Diseases 0.000 description 2
- 208000031951 Primary immunodeficiency Diseases 0.000 description 2
- 108091000054 Prion Proteins 0.000 description 2
- RJKFOVLPORLFTN-LEKSSAKUSA-N Progesterone Chemical compound C1CC2=CC(=O)CC[C@]2(C)[C@@H]2[C@@H]1[C@@H]1CC[C@H](C(=O)C)[C@@]1(C)CC2 RJKFOVLPORLFTN-LEKSSAKUSA-N 0.000 description 2
- 208000037534 Progressive hemifacial atrophy Diseases 0.000 description 2
- 208000006311 Pyoderma Diseases 0.000 description 2
- 101710188535 RNA ligase 2 Proteins 0.000 description 2
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 2
- 101710204104 RNA-editing ligase 2, mitochondrial Proteins 0.000 description 2
- 206010063837 Reperfusion injury Diseases 0.000 description 2
- 208000000705 Rift Valley Fever Diseases 0.000 description 2
- 108010017324 STAT3 Transcription Factor Proteins 0.000 description 2
- 206010039710 Scleroderma Diseases 0.000 description 2
- 206010040070 Septic Shock Diseases 0.000 description 2
- 201000003176 Severe Acute Respiratory Syndrome Diseases 0.000 description 2
- 102100024040 Signal transducer and activator of transcription 3 Human genes 0.000 description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 2
- 208000036834 Spinocerebellar ataxia type 3 Diseases 0.000 description 2
- 206010061372 Streptococcal infection Diseases 0.000 description 2
- 208000006011 Stroke Diseases 0.000 description 2
- PPBRXRYQALVLMV-UHFFFAOYSA-N Styrene Chemical compound C=CC1=CC=CC=C1 PPBRXRYQALVLMV-UHFFFAOYSA-N 0.000 description 2
- 206010042276 Subacute endocarditis Diseases 0.000 description 2
- 102100025237 T-cell surface antigen CD2 Human genes 0.000 description 2
- 102100036011 T-cell surface glycoprotein CD4 Human genes 0.000 description 2
- 102100034922 T-cell surface glycoprotein CD8 alpha chain Human genes 0.000 description 2
- 102100027213 T-cell-specific surface glycoprotein CD28 Human genes 0.000 description 2
- 108091012456 T4 RNA ligase 1 Proteins 0.000 description 2
- PZBFGYYEXUXCOF-UHFFFAOYSA-N TCEP Chemical compound OC(=O)CCP(CCC(O)=O)CCC(O)=O PZBFGYYEXUXCOF-UHFFFAOYSA-N 0.000 description 2
- 206010043561 Thrombocytopenic purpura Diseases 0.000 description 2
- 208000024770 Thyroid neoplasm Diseases 0.000 description 2
- 208000007712 Tinea Versicolor Diseases 0.000 description 2
- 206010043866 Tinea capitis Diseases 0.000 description 2
- 206010056131 Tinea versicolour Diseases 0.000 description 2
- 206010044248 Toxic shock syndrome Diseases 0.000 description 2
- 231100000650 Toxic shock syndrome Toxicity 0.000 description 2
- DTQVDTLACAAQTR-UHFFFAOYSA-N Trifluoroacetic acid Chemical compound OC(=O)C(F)(F)F DTQVDTLACAAQTR-UHFFFAOYSA-N 0.000 description 2
- 239000007983 Tris buffer Substances 0.000 description 2
- 102100036856 Tumor necrosis factor receptor superfamily member 9 Human genes 0.000 description 2
- 201000006704 Ulcerative Colitis Diseases 0.000 description 2
- 206010064996 Ulcerative keratitis Diseases 0.000 description 2
- 208000025851 Undifferentiated connective tissue disease Diseases 0.000 description 2
- 208000017379 Undifferentiated connective tissue syndrome Diseases 0.000 description 2
- XSQUKJJJFZCRTK-UHFFFAOYSA-N Urea Chemical compound NC(N)=O XSQUKJJJFZCRTK-UHFFFAOYSA-N 0.000 description 2
- 208000036142 Viral infection Diseases 0.000 description 2
- 206010047642 Vitiligo Diseases 0.000 description 2
- 208000000260 Warts Diseases 0.000 description 2
- 230000005856 abnormality Effects 0.000 description 2
- 230000002378 acidificating effect Effects 0.000 description 2
- 208000036676 acute undifferentiated leukemia Diseases 0.000 description 2
- 229960005305 adenosine Drugs 0.000 description 2
- 150000003838 adenosines Chemical class 0.000 description 2
- 150000001345 alkine derivatives Chemical class 0.000 description 2
- 208000004631 alopecia areata Diseases 0.000 description 2
- 150000001413 amino acids Chemical class 0.000 description 2
- 206010002022 amyloidosis Diseases 0.000 description 2
- 239000000427 antigen Substances 0.000 description 2
- 108091007433 antigens Proteins 0.000 description 2
- 102000036639 antigens Human genes 0.000 description 2
- 238000013459 approach Methods 0.000 description 2
- 239000007864 aqueous solution Substances 0.000 description 2
- 206010003246 arthritis Diseases 0.000 description 2
- 201000009361 ascariasis Diseases 0.000 description 2
- 238000003556 assay Methods 0.000 description 2
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 2
- 201000009780 autoimmune polyendocrine syndrome type 2 Diseases 0.000 description 2
- 206010071578 autoimmune retinopathy Diseases 0.000 description 2
- 210000000467 autonomic pathway Anatomy 0.000 description 2
- 150000001540 azides Chemical class 0.000 description 2
- 230000008827 biological function Effects 0.000 description 2
- 238000001574 biopsy Methods 0.000 description 2
- 210000001124 body fluid Anatomy 0.000 description 2
- 238000010504 bond cleavage reaction Methods 0.000 description 2
- 208000005881 bovine spongiform encephalopathy Diseases 0.000 description 2
- 210000004556 brain Anatomy 0.000 description 2
- 208000000594 bullous pemphigoid Diseases 0.000 description 2
- 150000001720 carbohydrates Chemical class 0.000 description 2
- 235000014633 carbohydrates Nutrition 0.000 description 2
- 229910052799 carbon Inorganic materials 0.000 description 2
- 108091092259 cell-free RNA Proteins 0.000 description 2
- 239000000919 ceramic Substances 0.000 description 2
- 238000007385 chemical modification Methods 0.000 description 2
- 239000007795 chemical reaction product Substances 0.000 description 2
- 239000003638 chemical reducing agent Substances 0.000 description 2
- 238000000576 coating method Methods 0.000 description 2
- 201000003486 coccidioidomycosis Diseases 0.000 description 2
- 238000004891 communication Methods 0.000 description 2
- 230000001447 compensatory effect Effects 0.000 description 2
- 201000004395 congenital heart block Diseases 0.000 description 2
- 230000021615 conjugation Effects 0.000 description 2
- 208000029078 coronary artery disease Diseases 0.000 description 2
- UHDGCWIWMRVCDJ-ZAKLUEHWSA-N cytidine Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-ZAKLUEHWSA-N 0.000 description 2
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 2
- NHVNXKFIZYSCEB-XLPZGREQSA-N dTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 NHVNXKFIZYSCEB-XLPZGREQSA-N 0.000 description 2
- 230000001419 dependent effect Effects 0.000 description 2
- 239000005546 dideoxynucleotide Substances 0.000 description 2
- BTVWZWFKMIUSGS-UHFFFAOYSA-N dimethylethyleneglycol Natural products CC(C)(O)CO BTVWZWFKMIUSGS-UHFFFAOYSA-N 0.000 description 2
- XPPKVPWEQAFLFU-UHFFFAOYSA-J diphosphate(4-) Chemical compound [O-]P([O-])(=O)OP([O-])([O-])=O XPPKVPWEQAFLFU-UHFFFAOYSA-J 0.000 description 2
- 235000011180 diphosphates Nutrition 0.000 description 2
- LOKCTEFSRHRXRJ-UHFFFAOYSA-I dipotassium trisodium dihydrogen phosphate hydrogen phosphate dichloride Chemical compound P(=O)(O)(O)[O-].[K+].P(=O)(O)([O-])[O-].[Na+].[Na+].[Cl-].[K+].[Cl-].[Na+] LOKCTEFSRHRXRJ-UHFFFAOYSA-I 0.000 description 2
- 239000003814 drug Substances 0.000 description 2
- 201000002491 encephalomyelitis Diseases 0.000 description 2
- 206010014881 enterobiasis Diseases 0.000 description 2
- 201000000708 eosinophilic esophagitis Diseases 0.000 description 2
- 208000028104 epidemic louse-borne typhus Diseases 0.000 description 2
- 210000002919 epithelial cell Anatomy 0.000 description 2
- 231100000321 erythema Toxicity 0.000 description 2
- 230000005284 excitation Effects 0.000 description 2
- 208000030533 eye disease Diseases 0.000 description 2
- 208000002980 facial hemiatrophy Diseases 0.000 description 2
- 201000006061 fatal familial insomnia Diseases 0.000 description 2
- 230000001605 fetal effect Effects 0.000 description 2
- 239000000835 fiber Substances 0.000 description 2
- 238000002866 fluorescence resonance energy transfer Methods 0.000 description 2
- 238000001943 fluorescence-activated cell sorting Methods 0.000 description 2
- 125000000524 functional group Chemical group 0.000 description 2
- 230000007614 genetic variation Effects 0.000 description 2
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical group [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 2
- 239000010931 gold Substances 0.000 description 2
- 229910052737 gold Inorganic materials 0.000 description 2
- 230000012010 growth Effects 0.000 description 2
- 201000005648 hantavirus pulmonary syndrome Diseases 0.000 description 2
- 230000036541 health Effects 0.000 description 2
- 208000006454 hepatitis Diseases 0.000 description 2
- 231100000283 hepatitis Toxicity 0.000 description 2
- 208000002672 hepatitis B Diseases 0.000 description 2
- 206010073071 hepatocellular carcinoma Diseases 0.000 description 2
- 231100000844 hepatocellular carcinoma Toxicity 0.000 description 2
- 208000002557 hidradenitis Diseases 0.000 description 2
- 201000007162 hidradenitis suppurativa Diseases 0.000 description 2
- 239000000017 hydrogel Substances 0.000 description 2
- 206010021198 ichthyosis Diseases 0.000 description 2
- 230000007813 immunodeficiency Effects 0.000 description 2
- 208000015446 immunoglobulin a vasculitis Diseases 0.000 description 2
- 238000009169 immunotherapy Methods 0.000 description 2
- 201000004933 in situ carcinoma Diseases 0.000 description 2
- 230000002458 infectious effect Effects 0.000 description 2
- 201000006747 infectious mononucleosis Diseases 0.000 description 2
- 230000002757 inflammatory effect Effects 0.000 description 2
- 230000004054 inflammatory process Effects 0.000 description 2
- 230000002452 interceptive effect Effects 0.000 description 2
- 230000009319 interchromosomal translocation Effects 0.000 description 2
- 229960003130 interferon gamma Drugs 0.000 description 2
- 238000007852 inverse PCR Methods 0.000 description 2
- 238000005304 joining Methods 0.000 description 2
- 208000017476 juvenile neuronal ceroid lipofuscinosis Diseases 0.000 description 2
- 206010023497 kuru Diseases 0.000 description 2
- 201000011486 lichen planus Diseases 0.000 description 2
- 238000007169 ligase reaction Methods 0.000 description 2
- 210000004072 lung Anatomy 0.000 description 2
- 208000025036 lymphosarcoma Diseases 0.000 description 2
- 208000015486 malignant pancreatic neoplasm Diseases 0.000 description 2
- 210000001161 mammalian embryo Anatomy 0.000 description 2
- 238000013507 mapping Methods 0.000 description 2
- 238000002844 melting Methods 0.000 description 2
- 230000008018 melting Effects 0.000 description 2
- 230000002503 metabolic effect Effects 0.000 description 2
- 239000004005 microsphere Substances 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 208000008588 molluscum contagiosum Diseases 0.000 description 2
- 208000001725 mucocutaneous lymph node syndrome Diseases 0.000 description 2
- 206010065579 multifocal motor neuropathy Diseases 0.000 description 2
- 230000035772 mutation Effects 0.000 description 2
- 208000009091 myxoma Diseases 0.000 description 2
- 230000017074 necrotic cell death Effects 0.000 description 2
- 201000008383 nephritis Diseases 0.000 description 2
- 201000007607 neuronal ceroid lipofuscinosis 3 Diseases 0.000 description 2
- 208000004235 neutropenia Diseases 0.000 description 2
- QJGQUHMNIGDVPM-UHFFFAOYSA-N nitrogen group Chemical group [N] QJGQUHMNIGDVPM-UHFFFAOYSA-N 0.000 description 2
- 230000000422 nocturnal effect Effects 0.000 description 2
- 238000001668 nucleic acid synthesis Methods 0.000 description 2
- 239000002777 nucleoside Substances 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 239000007800 oxidant agent Substances 0.000 description 2
- 208000008443 pancreatic carcinoma Diseases 0.000 description 2
- 244000052769 pathogen Species 0.000 description 2
- 230000037361 pathway Effects 0.000 description 2
- 239000012071 phase Substances 0.000 description 2
- 239000002953 phosphate buffered saline Substances 0.000 description 2
- XYFCBTPGUUZFHI-UHFFFAOYSA-N phosphine group Chemical group P XYFCBTPGUUZFHI-UHFFFAOYSA-N 0.000 description 2
- XUYJLQHKOGNDPB-UHFFFAOYSA-N phosphonoacetic acid Chemical compound OC(=O)CP(O)(O)=O XUYJLQHKOGNDPB-UHFFFAOYSA-N 0.000 description 2
- PTMHPRAIXMAOOB-UHFFFAOYSA-L phosphoramidate Chemical compound NP([O-])([O-])=O PTMHPRAIXMAOOB-UHFFFAOYSA-L 0.000 description 2
- 102000020233 phosphotransferase Human genes 0.000 description 2
- 208000031223 plasma cell leukemia Diseases 0.000 description 2
- 201000006292 polyarteritis nodosa Diseases 0.000 description 2
- 229920001223 polyethylene glycol Polymers 0.000 description 2
- 229920001155 polypropylene Polymers 0.000 description 2
- 238000002360 preparation method Methods 0.000 description 2
- 201000000742 primary sclerosing cholangitis Diseases 0.000 description 2
- 150000003212 purines Chemical class 0.000 description 2
- 125000000561 purinyl group Chemical group N1=C(N=C2N=CNC2=C1)* 0.000 description 2
- 206010037844 rash Diseases 0.000 description 2
- 208000002574 reactive arthritis Diseases 0.000 description 2
- 238000003753 real-time PCR Methods 0.000 description 2
- 230000011514 reflex Effects 0.000 description 2
- 206010039073 rheumatoid arthritis Diseases 0.000 description 2
- 239000001022 rhodamine dye Substances 0.000 description 2
- 208000010157 sclerosing cholangitis Diseases 0.000 description 2
- 210000002966 serum Anatomy 0.000 description 2
- 201000005113 shigellosis Diseases 0.000 description 2
- 108700014590 single-stranded DNA binding proteins Proteins 0.000 description 2
- 201000010153 skin papilloma Diseases 0.000 description 2
- 208000000649 small cell carcinoma Diseases 0.000 description 2
- 235000002639 sodium chloride Nutrition 0.000 description 2
- 230000037439 somatic mutation Effects 0.000 description 2
- PFNFFQXMRSDOHW-UHFFFAOYSA-N spermine Chemical compound NCCCNCCCCNCCCN PFNFFQXMRSDOHW-UHFFFAOYSA-N 0.000 description 2
- 208000008467 subacute bacterial endocarditis Diseases 0.000 description 2
- 208000004441 taeniasis Diseases 0.000 description 2
- ILMRJRBKQSSXGY-UHFFFAOYSA-N tert-butyl(dimethyl)silicon Chemical compound C[Si](C)C(C)(C)C ILMRJRBKQSSXGY-UHFFFAOYSA-N 0.000 description 2
- 230000001225 therapeutic effect Effects 0.000 description 2
- 201000002510 thyroid cancer Diseases 0.000 description 2
- 201000003875 tinea corporis Diseases 0.000 description 2
- 201000005882 tinea unguium Diseases 0.000 description 2
- 208000009920 trichuriasis Diseases 0.000 description 2
- 231100000397 ulcer Toxicity 0.000 description 2
- 210000002700 urine Anatomy 0.000 description 2
- 230000000304 vasodilatating effect Effects 0.000 description 2
- 206010047470 viral myocarditis Diseases 0.000 description 2
- 230000003612 virological effect Effects 0.000 description 2
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 1
- 101710194665 1-aminocyclopropane-1-carboxylate synthase Proteins 0.000 description 1
- 102100026210 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-2 Human genes 0.000 description 1
- ZIIUUSVHCHPIQD-UHFFFAOYSA-N 2,4,6-trimethyl-N-[3-(trifluoromethyl)phenyl]benzenesulfonamide Chemical compound CC1=CC(C)=CC(C)=C1S(=O)(=O)NC1=CC=CC(C(F)(F)F)=C1 ZIIUUSVHCHPIQD-UHFFFAOYSA-N 0.000 description 1
- 229940058020 2-amino-2-methyl-1-propanol Drugs 0.000 description 1
- ASJSAQIRZKANQN-CRCLSJGQSA-N 2-deoxy-D-ribose Chemical group OC[C@@H](O)[C@@H](O)CC=O ASJSAQIRZKANQN-CRCLSJGQSA-N 0.000 description 1
- KQRBOLDPRPDVIM-UHFFFAOYSA-N 2-prop-1-ynylpyrimidine Chemical compound CC#CC1=NC=CC=N1 KQRBOLDPRPDVIM-UHFFFAOYSA-N 0.000 description 1
- INEWUCPYEUEQTN-UHFFFAOYSA-N 3-(cyclohexylamino)-2-hydroxy-1-propanesulfonic acid Chemical compound OS(=O)(=O)CC(O)CNC1CCCCC1 INEWUCPYEUEQTN-UHFFFAOYSA-N 0.000 description 1
- YICAEXQYKBMDNH-UHFFFAOYSA-N 3-[bis(3-hydroxypropyl)phosphanyl]propan-1-ol Chemical compound OCCCP(CCCO)CCCO YICAEXQYKBMDNH-UHFFFAOYSA-N 0.000 description 1
- QOXOZONBQWIKDA-UHFFFAOYSA-N 3-hydroxypropyl Chemical group [CH2]CCO QOXOZONBQWIKDA-UHFFFAOYSA-N 0.000 description 1
- ZAYHVCMSTBRABG-UHFFFAOYSA-N 5-Methylcytidine Natural products O=C1N=C(N)C(C)=CN1C1C(O)C(O)C(CO)O1 ZAYHVCMSTBRABG-UHFFFAOYSA-N 0.000 description 1
- LUCHPKXVUGJYGU-XLPZGREQSA-N 5-methyl-2'-deoxycytidine Chemical compound O=C1N=C(N)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 LUCHPKXVUGJYGU-XLPZGREQSA-N 0.000 description 1
- ZAYHVCMSTBRABG-JXOAFFINSA-N 5-methylcytidine Chemical group O=C1N=C(N)C(C)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 ZAYHVCMSTBRABG-JXOAFFINSA-N 0.000 description 1
- 108010011619 6-Phytase Proteins 0.000 description 1
- LOSIULRWFAEMFL-UHFFFAOYSA-N 7-deazaguanine Chemical compound O=C1NC(N)=NC2=C1CC=N2 LOSIULRWFAEMFL-UHFFFAOYSA-N 0.000 description 1
- 239000007991 ACES buffer Substances 0.000 description 1
- 125000003345 AMP group Chemical group 0.000 description 1
- ZKHQWZAMYRWXGA-KQYNXXCUSA-J ATP(4-) Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)[C@H]1O ZKHQWZAMYRWXGA-KQYNXXCUSA-J 0.000 description 1
- 206010063409 Acarodermatitis Diseases 0.000 description 1
- 208000016557 Acute basophilic leukemia Diseases 0.000 description 1
- 208000032194 Acute haemorrhagic leukoencephalitis Diseases 0.000 description 1
- 208000024893 Acute lymphoblastic leukemia Diseases 0.000 description 1
- 206010000871 Acute monocytic leukaemia Diseases 0.000 description 1
- 208000031261 Acute myeloid leukaemia Diseases 0.000 description 1
- 241000321096 Adenoides Species 0.000 description 1
- 102100034540 Adenomatous polyposis coli protein Human genes 0.000 description 1
- ZKHQWZAMYRWXGA-UHFFFAOYSA-N Adenosine triphosphate Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)C(O)C1O ZKHQWZAMYRWXGA-UHFFFAOYSA-N 0.000 description 1
- 102100040149 Adenylyl-sulfate kinase Human genes 0.000 description 1
- 208000006468 Adrenal Cortex Neoplasms Diseases 0.000 description 1
- 208000009746 Adult T-Cell Leukemia-Lymphoma Diseases 0.000 description 1
- 208000016683 Adult T-cell leukemia/lymphoma Diseases 0.000 description 1
- 244000058084 Aegle marmelos Species 0.000 description 1
- 235000003930 Aegle marmelos Nutrition 0.000 description 1
- 208000000230 African Trypanosomiasis Diseases 0.000 description 1
- 108010021809 Alcohol dehydrogenase Proteins 0.000 description 1
- 102000007698 Alcohol dehydrogenase Human genes 0.000 description 1
- 201000010053 Alcoholic Cardiomyopathy Diseases 0.000 description 1
- 208000035805 Aleukaemic leukaemia Diseases 0.000 description 1
- 239000012099 Alexa Fluor family Substances 0.000 description 1
- 208000011403 Alexander disease Diseases 0.000 description 1
- 108700028369 Alleles Proteins 0.000 description 1
- 208000032671 Allergic granulomatous angiitis Diseases 0.000 description 1
- 208000004881 Amebiasis Diseases 0.000 description 1
- 201000000736 Amenorrhea Diseases 0.000 description 1
- 206010001928 Amenorrhoea Diseases 0.000 description 1
- 206010001980 Amoebiasis Diseases 0.000 description 1
- 239000004382 Amylase Substances 0.000 description 1
- 108010065511 Amylases Proteins 0.000 description 1
- 102000013142 Amylases Human genes 0.000 description 1
- 201000002045 Ancylostomiasis Diseases 0.000 description 1
- 208000028185 Angioedema Diseases 0.000 description 1
- 208000033211 Ankylostomiasis Diseases 0.000 description 1
- 208000032467 Aplastic anaemia Diseases 0.000 description 1
- 241000203069 Archaea Species 0.000 description 1
- 208000002150 Arrhythmogenic Right Ventricular Dysplasia Diseases 0.000 description 1
- 201000006058 Arrhythmogenic right ventricular cardiomyopathy Diseases 0.000 description 1
- 206010003267 Arthritis reactive Diseases 0.000 description 1
- 201000002909 Aspergillosis Diseases 0.000 description 1
- 208000036641 Aspergillus infections Diseases 0.000 description 1
- 101100460704 Aspergillus sp. (strain MF297-2) notI gene Proteins 0.000 description 1
- 206010003594 Ataxia telangiectasia Diseases 0.000 description 1
- 102000007371 Ataxin-3 Human genes 0.000 description 1
- 206010003805 Autism Diseases 0.000 description 1
- 208000020706 Autistic disease Diseases 0.000 description 1
- 208000032116 Autoimmune Experimental Encephalomyelitis Diseases 0.000 description 1
- 206010071576 Autoimmune aplastic anaemia Diseases 0.000 description 1
- 206010071577 Autoimmune hyperlipidaemia Diseases 0.000 description 1
- 206010003840 Autonomic nervous system imbalance Diseases 0.000 description 1
- 208000010839 B-cell chronic lymphocytic leukemia Diseases 0.000 description 1
- 102100021631 B-cell lymphoma 6 protein Human genes 0.000 description 1
- 108010074708 B7-H1 Antigen Proteins 0.000 description 1
- 208000017392 BENTA disease Diseases 0.000 description 1
- 206010055181 BK virus infection Diseases 0.000 description 1
- 108700020463 BRCA1 Proteins 0.000 description 1
- 101150072950 BRCA1 gene Proteins 0.000 description 1
- 108700020462 BRCA2 Proteins 0.000 description 1
- 102000052609 BRCA2 Human genes 0.000 description 1
- 241000223836 Babesia Species 0.000 description 1
- 241000193738 Bacillus anthracis Species 0.000 description 1
- 206010060976 Bacillus infection Diseases 0.000 description 1
- 201000001178 Bacterial Pneumonia Diseases 0.000 description 1
- 208000004926 Bacterial Vaginosis Diseases 0.000 description 1
- 208000035143 Bacterial infection Diseases 0.000 description 1
- 208000034974 Bacteroides Infections Diseases 0.000 description 1
- 208000023328 Basedow disease Diseases 0.000 description 1
- 102100026189 Beta-galactosidase Human genes 0.000 description 1
- 102100022548 Beta-hexosaminidase subunit alpha Human genes 0.000 description 1
- 208000020925 Bipolar disease Diseases 0.000 description 1
- 206010005003 Bladder cancer Diseases 0.000 description 1
- 241000255789 Bombyx mori Species 0.000 description 1
- 101150008921 Brca2 gene Proteins 0.000 description 1
- 206010006187 Breast cancer Diseases 0.000 description 1
- 102100025401 Breast cancer type 1 susceptibility protein Human genes 0.000 description 1
- 208000026310 Breast neoplasm Diseases 0.000 description 1
- 206010006500 Brucellosis Diseases 0.000 description 1
- 206010068597 Bulbospinal muscular atrophy congenital Diseases 0.000 description 1
- 206010073031 Burkholderia infection Diseases 0.000 description 1
- 239000008000 CHES buffer Substances 0.000 description 1
- 201000002829 CREST Syndrome Diseases 0.000 description 1
- 108091033409 CRISPR Proteins 0.000 description 1
- 238000010354 CRISPR gene editing Methods 0.000 description 1
- 102100038700 Calcium-responsive transactivator Human genes 0.000 description 1
- 208000022526 Canavan disease Diseases 0.000 description 1
- 241000282472 Canis lupus familiaris Species 0.000 description 1
- 241000283707 Capra Species 0.000 description 1
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 1
- 108090000489 Carboxy-Lyases Proteins 0.000 description 1
- 102000004031 Carboxy-Lyases Human genes 0.000 description 1
- 206010007572 Cardiac hypertrophy Diseases 0.000 description 1
- 208000006029 Cardiomegaly Diseases 0.000 description 1
- 208000031229 Cardiomyopathies Diseases 0.000 description 1
- 206010007637 Cardiomyopathy alcoholic Diseases 0.000 description 1
- 102100026089 Caspase recruitment domain-containing protein 9 Human genes 0.000 description 1
- 108010076667 Caspases Proteins 0.000 description 1
- 102000011727 Caspases Human genes 0.000 description 1
- 208000003732 Cat-scratch disease Diseases 0.000 description 1
- 102100035882 Catalase Human genes 0.000 description 1
- 108010053835 Catalase Proteins 0.000 description 1
- 108090000994 Catalytic RNA Proteins 0.000 description 1
- 102000053642 Catalytic RNA Human genes 0.000 description 1
- 102100025064 Cellular tumor antigen p53 Human genes 0.000 description 1
- 108010059892 Cellulase Proteins 0.000 description 1
- 206010007882 Cellulitis Diseases 0.000 description 1
- 241000282693 Cercopithecidae Species 0.000 description 1
- 241000282994 Cervidae Species 0.000 description 1
- 206010008342 Cervix carcinoma Diseases 0.000 description 1
- 201000009182 Chikungunya Diseases 0.000 description 1
- 208000004293 Chikungunya Fever Diseases 0.000 description 1
- 108010022172 Chitinases Proteins 0.000 description 1
- 102000012286 Chitinases Human genes 0.000 description 1
- 206010061041 Chlamydial infection Diseases 0.000 description 1
- 208000035086 Chlamydophila Infections Diseases 0.000 description 1
- 206010008631 Cholera Diseases 0.000 description 1
- 206010008685 Chondritis Diseases 0.000 description 1
- 208000005243 Chondrosarcoma Diseases 0.000 description 1
- 206010008761 Choriomeningitis lymphocytic Diseases 0.000 description 1
- 208000016718 Chromosome Inversion Diseases 0.000 description 1
- 208000006344 Churg-Strauss Syndrome Diseases 0.000 description 1
- 206010009344 Clonorchiasis Diseases 0.000 description 1
- 241000193403 Clostridium Species 0.000 description 1
- 208000037384 Clostridium Infections Diseases 0.000 description 1
- 206010009657 Clostridium difficile colitis Diseases 0.000 description 1
- 241000193468 Clostridium perfringens Species 0.000 description 1
- 241000223205 Coccidioides immitis Species 0.000 description 1
- 208000010200 Cockayne syndrome Diseases 0.000 description 1
- 208000010007 Cogan syndrome Diseases 0.000 description 1
- 208000011038 Cold agglutinin disease Diseases 0.000 description 1
- 206010009868 Cold type haemolytic anaemia Diseases 0.000 description 1
- 208000001333 Colorectal Neoplasms Diseases 0.000 description 1
- 206010010099 Combined immunodeficiency Diseases 0.000 description 1
- 108091028732 Concatemer Proteins 0.000 description 1
- 208000002330 Congenital Heart Defects Diseases 0.000 description 1
- 208000025212 Constitutional neutropenia Diseases 0.000 description 1
- 208000034656 Contusions Diseases 0.000 description 1
- RYGMFSIKBFXOCR-UHFFFAOYSA-N Copper Chemical compound [Cu] RYGMFSIKBFXOCR-UHFFFAOYSA-N 0.000 description 1
- 201000006306 Cor pulmonale Diseases 0.000 description 1
- KQLDDLUWUFBQHP-UHFFFAOYSA-N Cordycepin Natural products C1=NC=2C(N)=NC=NC=2N1C1OCC(CO)C1O KQLDDLUWUFBQHP-UHFFFAOYSA-N 0.000 description 1
- 241000711573 Coronaviridae Species 0.000 description 1
- 208000011990 Corticobasal Degeneration Diseases 0.000 description 1
- 206010011258 Coxsackie myocarditis Diseases 0.000 description 1
- 108010051219 Cre recombinase Proteins 0.000 description 1
- 201000003075 Crimean-Congo hemorrhagic fever Diseases 0.000 description 1
- 208000019707 Cryoglobulinemic vasculitis Diseases 0.000 description 1
- 201000007336 Cryptococcosis Diseases 0.000 description 1
- 241000221204 Cryptococcus neoformans Species 0.000 description 1
- 241000195493 Cryptophyta Species 0.000 description 1
- 241000223935 Cryptosporidium Species 0.000 description 1
- 229920000089 Cyclic olefin copolymer Polymers 0.000 description 1
- 239000004713 Cyclic olefin copolymer Substances 0.000 description 1
- 108050006400 Cyclin Proteins 0.000 description 1
- PMATZTZNYRCHOR-CGLBZJNRSA-N Cyclosporin A Chemical compound CC[C@@H]1NC(=O)[C@H]([C@H](O)[C@H](C)C\C=C\C)N(C)C(=O)[C@H](C(C)C)N(C)C(=O)[C@H](CC(C)C)N(C)C(=O)[C@H](CC(C)C)N(C)C(=O)[C@@H](C)NC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)N(C)C(=O)[C@H](C(C)C)NC(=O)[C@H](CC(C)C)N(C)C(=O)CN(C)C1=O PMATZTZNYRCHOR-CGLBZJNRSA-N 0.000 description 1
- 108010036949 Cyclosporine Proteins 0.000 description 1
- 206010011732 Cyst Diseases 0.000 description 1
- 108010076010 Cystathionine beta-lyase Proteins 0.000 description 1
- 102000004127 Cytokines Human genes 0.000 description 1
- 108090000695 Cytokines Proteins 0.000 description 1
- 206010011831 Cytomegalovirus infection Diseases 0.000 description 1
- 102100039498 Cytotoxic T-lymphocyte protein 4 Human genes 0.000 description 1
- IGXWBGJHJZYPQS-SSDOTTSWSA-N D-Luciferin Chemical compound OC(=O)[C@H]1CSC(C=2SC3=CC=C(O)C=C3N=2)=N1 IGXWBGJHJZYPQS-SSDOTTSWSA-N 0.000 description 1
- 108010066133 D-octopine dehydrogenase Proteins 0.000 description 1
- 108020001738 DNA Glycosylase Proteins 0.000 description 1
- 108020001019 DNA Primers Proteins 0.000 description 1
- 102000028381 DNA glycosylase Human genes 0.000 description 1
- 238000000018 DNA microarray Methods 0.000 description 1
- 101710159156 DNA polymerase IV Proteins 0.000 description 1
- 108010025600 DNA polymerase iota Proteins 0.000 description 1
- 108010061914 DNA polymerase mu Proteins 0.000 description 1
- 239000003155 DNA primer Substances 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 102100024350 Dedicator of cytokinesis protein 8 Human genes 0.000 description 1
- CYCGRDQQIOGCKX-UHFFFAOYSA-N Dehydro-luciferin Natural products OC(=O)C1=CSC(C=2SC3=CC(O)=CC=C3N=2)=N1 CYCGRDQQIOGCKX-UHFFFAOYSA-N 0.000 description 1
- 206010012289 Dementia Diseases 0.000 description 1
- 208000001490 Dengue Diseases 0.000 description 1
- 206010012310 Dengue fever Diseases 0.000 description 1
- 201000004624 Dermatitis Diseases 0.000 description 1
- 206010012438 Dermatitis atopic Diseases 0.000 description 1
- 206010048768 Dermatosis Diseases 0.000 description 1
- 108700029231 Developmental Genes Proteins 0.000 description 1
- LTMHDMANZUZIPE-AMTYYWEZSA-N Digoxin Natural products O([C@H]1[C@H](C)O[C@H](O[C@@H]2C[C@@H]3[C@@](C)([C@@H]4[C@H]([C@]5(O)[C@](C)([C@H](O)C4)[C@H](C4=CC(=O)OC4)CC5)CC3)CC2)C[C@@H]1O)[C@H]1O[C@H](C)[C@@H](O[C@H]2O[C@@H](C)[C@H](O)[C@@H](O)C2)[C@@H](O)C1 LTMHDMANZUZIPE-AMTYYWEZSA-N 0.000 description 1
- 208000006926 Discoid Lupus Erythematosus Diseases 0.000 description 1
- 101100224482 Drosophila melanogaster PolE1 gene Proteins 0.000 description 1
- 208000006402 Ductal Carcinoma Diseases 0.000 description 1
- 102100035813 E3 ubiquitin-protein ligase CBL Human genes 0.000 description 1
- 102000012199 E3 ubiquitin-protein ligase Mdm2 Human genes 0.000 description 1
- 108050002772 E3 ubiquitin-protein ligase Mdm2 Proteins 0.000 description 1
- 102000001301 EGF receptor Human genes 0.000 description 1
- 201000011001 Ebola Hemorrhagic Fever Diseases 0.000 description 1
- 208000030820 Ebola disease Diseases 0.000 description 1
- 206010014096 Echinococciasis Diseases 0.000 description 1
- 208000009366 Echinococcosis Diseases 0.000 description 1
- 241000605314 Ehrlichia Species 0.000 description 1
- 201000009051 Embryonal Carcinoma Diseases 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 206010014611 Encephalitis venezuelan equine Diseases 0.000 description 1
- 101710121765 Endo-1,4-beta-xylanase Proteins 0.000 description 1
- 208000001976 Endocrine Gland Neoplasms Diseases 0.000 description 1
- 206010014733 Endometrial cancer Diseases 0.000 description 1
- 206010014759 Endometrial neoplasm Diseases 0.000 description 1
- 206010057649 Endometrial sarcoma Diseases 0.000 description 1
- 108010067770 Endopeptidase K Proteins 0.000 description 1
- 208000004232 Enteritis Diseases 0.000 description 1
- 206010014909 Enterovirus infection Diseases 0.000 description 1
- 206010014958 Eosinophilic leukaemia Diseases 0.000 description 1
- 241000124092 Escherichia virus N15 Species 0.000 description 1
- 208000000289 Esophageal Achalasia Diseases 0.000 description 1
- 208000000461 Esophageal Neoplasms Diseases 0.000 description 1
- 208000032027 Essential Thrombocythemia Diseases 0.000 description 1
- 102100034169 Eukaryotic translation initiation factor 2-alpha kinase 1 Human genes 0.000 description 1
- 101710196289 Eukaryotic translation initiation factor 2-alpha kinase 1 Proteins 0.000 description 1
- 208000001382 Experimental Melanoma Diseases 0.000 description 1
- 201000006850 Familial medullary thyroid carcinoma Diseases 0.000 description 1
- 241001126309 Fasciolopsis Species 0.000 description 1
- 208000001640 Fibromyalgia Diseases 0.000 description 1
- 201000008808 Fibrosarcoma Diseases 0.000 description 1
- BJGNCJDXODQBOB-UHFFFAOYSA-N Firefly Luciferin Natural products OC(=O)C1CSC(C=2SC3=CC(O)=CC=C3N=2)=N1 BJGNCJDXODQBOB-UHFFFAOYSA-N 0.000 description 1
- KRHYYFGTRYWZRS-UHFFFAOYSA-M Fluoride anion Chemical compound [F-] KRHYYFGTRYWZRS-UHFFFAOYSA-M 0.000 description 1
- 206010016952 Food poisoning Diseases 0.000 description 1
- 208000019331 Foodborne disease Diseases 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- 208000000259 GATA2 Deficiency Diseases 0.000 description 1
- 208000022140 GATA2 deficiency with susceptibility to MDS/AML Diseases 0.000 description 1
- 102100029974 GTPase HRas Human genes 0.000 description 1
- 102100030708 GTPase KRas Human genes 0.000 description 1
- 102100039788 GTPase NRas Human genes 0.000 description 1
- 108010093031 Galactosidases Proteins 0.000 description 1
- 102000002464 Galactosidases Human genes 0.000 description 1
- 201000000628 Gas Gangrene Diseases 0.000 description 1
- 206010017915 Gastroenteritis shigella Diseases 0.000 description 1
- 206010017916 Gastroenteritis staphylococcal Diseases 0.000 description 1
- 208000034951 Genetic Translocation Diseases 0.000 description 1
- 241000626621 Geobacillus Species 0.000 description 1
- 241000159512 Geotrichum Species 0.000 description 1
- 208000008999 Giant Cell Carcinoma Diseases 0.000 description 1
- 208000032612 Glial tumor Diseases 0.000 description 1
- 201000010915 Glioblastoma multiforme Diseases 0.000 description 1
- 206010018338 Glioma Diseases 0.000 description 1
- 208000010055 Globoid Cell Leukodystrophy Diseases 0.000 description 1
- 206010018404 Glucagonoma Diseases 0.000 description 1
- 108010073178 Glucan 1,4-alpha-Glucosidase Proteins 0.000 description 1
- 239000004366 Glucose oxidase Substances 0.000 description 1
- 108010015776 Glucose oxidase Proteins 0.000 description 1
- 108700023224 Glucose-1-phosphate adenylyltransferases Proteins 0.000 description 1
- 206010018612 Gonorrhoea Diseases 0.000 description 1
- 208000024869 Goodpasture syndrome Diseases 0.000 description 1
- 206010018691 Granuloma Diseases 0.000 description 1
- 206010072579 Granulomatosis with polyangiitis Diseases 0.000 description 1
- 208000003084 Graves Ophthalmopathy Diseases 0.000 description 1
- 208000015023 Graves' disease Diseases 0.000 description 1
- 208000035895 Guillain-Barré syndrome Diseases 0.000 description 1
- 239000007995 HEPES buffer Substances 0.000 description 1
- 206010061190 Haemophilus infection Diseases 0.000 description 1
- 208000001204 Hashimoto Disease Diseases 0.000 description 1
- 206010019375 Helicobacter infections Diseases 0.000 description 1
- 208000035186 Hemolytic Autoimmune Anemia Diseases 0.000 description 1
- 201000004331 Henoch-Schoenlein purpura Diseases 0.000 description 1
- 206010019617 Henoch-Schonlein purpura Diseases 0.000 description 1
- 208000005176 Hepatitis C Diseases 0.000 description 1
- 208000005331 Hepatitis D Diseases 0.000 description 1
- 208000028782 Hereditary disease Diseases 0.000 description 1
- 208000009889 Herpes Simplex Diseases 0.000 description 1
- 102100022103 Histone-lysine N-methyltransferase 2A Human genes 0.000 description 1
- 201000002563 Histoplasmosis Diseases 0.000 description 1
- 208000017604 Hodgkin disease Diseases 0.000 description 1
- 208000017662 Hodgkin disease lymphocyte depletion type stage unspecified Diseases 0.000 description 1
- 208000021519 Hodgkin lymphoma Diseases 0.000 description 1
- 241000282412 Homo Species 0.000 description 1
- 101000691589 Homo sapiens 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-2 Proteins 0.000 description 1
- 101000924577 Homo sapiens Adenomatous polyposis coli protein Proteins 0.000 description 1
- 101000971234 Homo sapiens B-cell lymphoma 6 protein Proteins 0.000 description 1
- 101000957728 Homo sapiens Calcium-responsive transactivator Proteins 0.000 description 1
- 101000983508 Homo sapiens Caspase recruitment domain-containing protein 9 Proteins 0.000 description 1
- 101000889276 Homo sapiens Cytotoxic T-lymphocyte protein 4 Proteins 0.000 description 1
- 101001052946 Homo sapiens Dedicator of cytokinesis protein 8 Proteins 0.000 description 1
- 101000851181 Homo sapiens Epidermal growth factor receptor Proteins 0.000 description 1
- 101000584633 Homo sapiens GTPase HRas Proteins 0.000 description 1
- 101000584612 Homo sapiens GTPase KRas Proteins 0.000 description 1
- 101000744505 Homo sapiens GTPase NRas Proteins 0.000 description 1
- 101001045846 Homo sapiens Histone-lysine N-methyltransferase 2A Proteins 0.000 description 1
- 101001017764 Homo sapiens Lipopolysaccharide-responsive and beige-like anchor protein Proteins 0.000 description 1
- 101001064870 Homo sapiens Lon protease homolog, mitochondrial Proteins 0.000 description 1
- 101000916644 Homo sapiens Macrophage colony-stimulating factor 1 receptor Proteins 0.000 description 1
- 101000954986 Homo sapiens Merlin Proteins 0.000 description 1
- 101001030211 Homo sapiens Myc proto-oncogene protein Proteins 0.000 description 1
- 101000876829 Homo sapiens Protein C-ets-1 Proteins 0.000 description 1
- 101000585703 Homo sapiens Protein L-Myc Proteins 0.000 description 1
- 101000573199 Homo sapiens Protein PML Proteins 0.000 description 1
- 101000861454 Homo sapiens Protein c-Fos Proteins 0.000 description 1
- 101000579425 Homo sapiens Proto-oncogene tyrosine-protein kinase receptor Ret Proteins 0.000 description 1
- 101000742859 Homo sapiens Retinoblastoma-associated protein Proteins 0.000 description 1
- 101000857677 Homo sapiens Runt-related transcription factor 1 Proteins 0.000 description 1
- 101000595531 Homo sapiens Serine/threonine-protein kinase pim-1 Proteins 0.000 description 1
- 101000891113 Homo sapiens T-cell acute lymphocytic leukemia protein 1 Proteins 0.000 description 1
- 101000800488 Homo sapiens T-cell leukemia homeobox protein 1 Proteins 0.000 description 1
- 101000837626 Homo sapiens Thyroid hormone receptor alpha Proteins 0.000 description 1
- 101000813738 Homo sapiens Transcription factor ETV6 Proteins 0.000 description 1
- 101000636213 Homo sapiens Transcriptional activator Myb Proteins 0.000 description 1
- 101000801234 Homo sapiens Tumor necrosis factor receptor superfamily member 18 Proteins 0.000 description 1
- 101000823316 Homo sapiens Tyrosine-protein kinase ABL1 Proteins 0.000 description 1
- 101000912503 Homo sapiens Tyrosine-protein kinase Fgr Proteins 0.000 description 1
- 101001022129 Homo sapiens Tyrosine-protein kinase Fyn Proteins 0.000 description 1
- 101001047681 Homo sapiens Tyrosine-protein kinase Lck Proteins 0.000 description 1
- 101001054878 Homo sapiens Tyrosine-protein kinase Lyn Proteins 0.000 description 1
- 206010020376 Hookworm infection Diseases 0.000 description 1
- 238000009015 Human TaqMan MicroRNA Assay kit Methods 0.000 description 1
- 241000342334 Human metapneumovirus Species 0.000 description 1
- 208000029966 Hutchinson Melanotic Freckle Diseases 0.000 description 1
- 208000037147 Hypercalcaemia Diseases 0.000 description 1
- 208000035150 Hypercholesterolemia Diseases 0.000 description 1
- 206010048643 Hypereosinophilic syndrome Diseases 0.000 description 1
- 208000033892 Hyperhomocysteinemia Diseases 0.000 description 1
- 206010058222 Hypertensive cardiomyopathy Diseases 0.000 description 1
- 210000005131 Hürthle cell Anatomy 0.000 description 1
- 201000009794 Idiopathic Pulmonary Fibrosis Diseases 0.000 description 1
- 208000031814 IgA Vasculitis Diseases 0.000 description 1
- DGAQECJNVWCQMB-PUAWFVPOSA-M Ilexoside XXIX Chemical compound C[C@@H]1CC[C@@]2(CC[C@@]3(C(=CC[C@H]4[C@]3(CC[C@@H]5[C@@]4(CC[C@@H](C5(C)C)OS(=O)(=O)[O-])C)C)[C@@H]2[C@]1(C)O)C)C(=O)O[C@H]6[C@@H]([C@H]([C@@H]([C@H](O6)CO)O)O)O.[Na+] DGAQECJNVWCQMB-PUAWFVPOSA-M 0.000 description 1
- 208000028622 Immune thrombocytopenia Diseases 0.000 description 1
- 206010052210 Infantile genetic agranulocytosis Diseases 0.000 description 1
- 102100034343 Integrase Human genes 0.000 description 1
- 108010061833 Integrases Proteins 0.000 description 1
- 108091029795 Intergenic region Proteins 0.000 description 1
- 102000013462 Interleukin-12 Human genes 0.000 description 1
- 108010065805 Interleukin-12 Proteins 0.000 description 1
- 102000013264 Interleukin-23 Human genes 0.000 description 1
- 108010065637 Interleukin-23 Proteins 0.000 description 1
- 206010022557 Intermediate uveitis Diseases 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- 229920001202 Inulin Polymers 0.000 description 1
- 102000004195 Isomerases Human genes 0.000 description 1
- 108090000769 Isomerases Proteins 0.000 description 1
- 241000567229 Isospora Species 0.000 description 1
- 206010023256 Juvenile melanoma benign Diseases 0.000 description 1
- 208000027747 Kennedy disease Diseases 0.000 description 1
- 208000008839 Kidney Neoplasms Diseases 0.000 description 1
- 208000028226 Krabbe disease Diseases 0.000 description 1
- 108010059881 Lactase Proteins 0.000 description 1
- 208000007177 Left Ventricular Hypertrophy Diseases 0.000 description 1
- 208000004554 Leishmaniasis Diseases 0.000 description 1
- 206010024229 Leprosy Diseases 0.000 description 1
- 206010024238 Leptospirosis Diseases 0.000 description 1
- 206010024305 Leukaemia monocytic Diseases 0.000 description 1
- 108090001060 Lipase Proteins 0.000 description 1
- 102000004882 Lipase Human genes 0.000 description 1
- 239000004367 Lipase Substances 0.000 description 1
- 206010024612 Lipoma Diseases 0.000 description 1
- 102100033353 Lipopolysaccharide-responsive and beige-like anchor protein Human genes 0.000 description 1
- 108090001030 Lipoproteins Proteins 0.000 description 1
- 102000004895 Lipoproteins Human genes 0.000 description 1
- 102000003820 Lipoxygenases Human genes 0.000 description 1
- 108090000128 Lipoxygenases Proteins 0.000 description 1
- 241000186781 Listeria Species 0.000 description 1
- 241000406668 Loxodonta cyclotis Species 0.000 description 1
- DDWFXDSYGUXRAY-UHFFFAOYSA-N Luciferin Natural products CCc1c(C)c(CC2NC(=O)C(=C2C=C)C)[nH]c1Cc3[nH]c4C(=C5/NC(CC(=O)O)C(C)C5CC(=O)O)CC(=O)c4c3C DDWFXDSYGUXRAY-UHFFFAOYSA-N 0.000 description 1
- 206010058467 Lung neoplasm malignant Diseases 0.000 description 1
- 208000031422 Lymphocytic Chronic B-Cell Leukemia Diseases 0.000 description 1
- 208000028018 Lymphocytic leukaemia Diseases 0.000 description 1
- 102000008072 Lymphokines Human genes 0.000 description 1
- 108010074338 Lymphokines Proteins 0.000 description 1
- 208000030289 Lymphoproliferative disease Diseases 0.000 description 1
- 239000007993 MOPS buffer Substances 0.000 description 1
- 108700012912 MYCN Proteins 0.000 description 1
- 101150022024 MYCN gene Proteins 0.000 description 1
- 102100028198 Macrophage colony-stimulating factor 1 receptor Human genes 0.000 description 1
- 208000002720 Malnutrition Diseases 0.000 description 1
- 208000009018 Medullary thyroid cancer Diseases 0.000 description 1
- 208000037196 Medullary thyroid carcinoma Diseases 0.000 description 1
- 208000000172 Medulloblastoma Diseases 0.000 description 1
- 206010027145 Melanocytic naevus Diseases 0.000 description 1
- 201000009906 Meningitis Diseases 0.000 description 1
- 206010027202 Meningitis bacterial Diseases 0.000 description 1
- 102100037106 Merlin Human genes 0.000 description 1
- 206010027406 Mesothelioma Diseases 0.000 description 1
- 206010066226 Metapneumovirus infection Diseases 0.000 description 1
- 206010027476 Metastases Diseases 0.000 description 1
- 108060004795 Methyltransferase Proteins 0.000 description 1
- 241000243190 Microsporidia Species 0.000 description 1
- 206010049567 Miller Fisher syndrome Diseases 0.000 description 1
- 108010074633 Mixed Function Oxygenases Proteins 0.000 description 1
- 102000008109 Mixed Function Oxygenases Human genes 0.000 description 1
- 208000035489 Monocytic Acute Leukemia Diseases 0.000 description 1
- 102100025725 Mothers against decapentaplegic homolog 4 Human genes 0.000 description 1
- 208000012192 Mucous membrane pemphigoid Diseases 0.000 description 1
- 206010073148 Multiple endocrine neoplasia type 2A Diseases 0.000 description 1
- 208000001089 Multiple system atrophy Diseases 0.000 description 1
- 208000005647 Mumps Diseases 0.000 description 1
- 108010014251 Muramidase Proteins 0.000 description 1
- 102000016943 Muramidase Human genes 0.000 description 1
- 241001529936 Murinae Species 0.000 description 1
- 241000699670 Mus sp. Species 0.000 description 1
- 102100038895 Myc proto-oncogene protein Human genes 0.000 description 1
- 208000001572 Mycoplasma Pneumonia Diseases 0.000 description 1
- 241000204051 Mycoplasma genitalium Species 0.000 description 1
- 201000008235 Mycoplasma pneumoniae pneumonia Diseases 0.000 description 1
- 206010028570 Myelopathy Diseases 0.000 description 1
- 208000006123 Myiasis Diseases 0.000 description 1
- 208000009525 Myocarditis Diseases 0.000 description 1
- 108010062010 N-Acetylmuramoyl-L-alanine Amidase Proteins 0.000 description 1
- 108700026495 N-Myc Proto-Oncogene Proteins 0.000 description 1
- CHJJGSNFBQVOTG-UHFFFAOYSA-N N-methyl-guanidine Natural products CNC(N)=N CHJJGSNFBQVOTG-UHFFFAOYSA-N 0.000 description 1
- 102100030124 N-myc proto-oncogene protein Human genes 0.000 description 1
- 208000002454 Nasopharyngeal Carcinoma Diseases 0.000 description 1
- 206010061306 Nasopharyngeal cancer Diseases 0.000 description 1
- 241000244206 Nematoda Species 0.000 description 1
- 206010062701 Nematodiasis Diseases 0.000 description 1
- 208000009905 Neurofibromatoses Diseases 0.000 description 1
- 108010085793 Neurofibromin 1 Proteins 0.000 description 1
- 102000007530 Neurofibromin 1 Human genes 0.000 description 1
- 108090000590 Neurotransmitter Receptors Proteins 0.000 description 1
- 102000004108 Neurotransmitter Receptors Human genes 0.000 description 1
- 208000007256 Nevus Diseases 0.000 description 1
- 101710147059 Nicking endonuclease Proteins 0.000 description 1
- GRYLNZFGIOXLOG-UHFFFAOYSA-N Nitric acid Chemical compound O[N+]([O-])=O GRYLNZFGIOXLOG-UHFFFAOYSA-N 0.000 description 1
- IOVCWXUNBOPUCH-UHFFFAOYSA-N Nitrous acid Chemical compound ON=O IOVCWXUNBOPUCH-UHFFFAOYSA-N 0.000 description 1
- 241000187654 Nocardia Species 0.000 description 1
- 206010029488 Nodular melanoma Diseases 0.000 description 1
- 206010049813 Non-obstructive cardiomyopathy Diseases 0.000 description 1
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 1
- 239000004677 Nylon Substances 0.000 description 1
- CTQNGGLPUBDAKN-UHFFFAOYSA-N O-Xylene Chemical compound CC1=CC=CC=C1C CTQNGGLPUBDAKN-UHFFFAOYSA-N 0.000 description 1
- 206010030136 Oesophageal achalasia Diseases 0.000 description 1
- 206010030155 Oesophageal carcinoma Diseases 0.000 description 1
- 102000043276 Oncogene Human genes 0.000 description 1
- 108700020796 Oncogene Proteins 0.000 description 1
- 241001420836 Ophthalmitis Species 0.000 description 1
- 208000007027 Oral Candidiasis Diseases 0.000 description 1
- 102000004316 Oxidoreductases Human genes 0.000 description 1
- 108090000854 Oxidoreductases Proteins 0.000 description 1
- 206010033701 Papillary thyroid cancer Diseases 0.000 description 1
- 208000002606 Paramyxoviridae Infections Diseases 0.000 description 1
- 206010048705 Paraneoplastic cerebellar degeneration Diseases 0.000 description 1
- 208000030852 Parasitic disease Diseases 0.000 description 1
- 208000018737 Parkinson disease Diseases 0.000 description 1
- UOZODPSAJZTQNH-UHFFFAOYSA-N Paromomycin II Natural products NC1C(O)C(O)C(CN)OC1OC1C(O)C(OC2C(C(N)CC(N)C2O)OC2C(C(O)C(O)C(CO)O2)N)OC1CO UOZODPSAJZTQNH-UHFFFAOYSA-N 0.000 description 1
- 241000606860 Pasteurella Species 0.000 description 1
- 241001494479 Pecora Species 0.000 description 1
- 206010034665 Peritoneal fibrosis Diseases 0.000 description 1
- 102000003992 Peroxidases Human genes 0.000 description 1
- 201000005702 Pertussis Diseases 0.000 description 1
- 241000423012 Phage TS2126 Species 0.000 description 1
- 108090000430 Phosphatidylinositol 3-kinases Proteins 0.000 description 1
- 102000003993 Phosphatidylinositol 3-kinases Human genes 0.000 description 1
- 108010010677 Phosphodiesterase I Proteins 0.000 description 1
- 108010064785 Phospholipases Proteins 0.000 description 1
- 102000015439 Phospholipases Human genes 0.000 description 1
- 108700019535 Phosphoprotein Phosphatases Proteins 0.000 description 1
- 102000045595 Phosphoprotein Phosphatases Human genes 0.000 description 1
- 108010073135 Phosphorylases Proteins 0.000 description 1
- 102000009097 Phosphorylases Human genes 0.000 description 1
- 235000014676 Phragmites communis Nutrition 0.000 description 1
- 241001674048 Phthiraptera Species 0.000 description 1
- 208000000609 Pick Disease of the Brain Diseases 0.000 description 1
- 208000012641 Pigmentation disease Diseases 0.000 description 1
- 208000000766 Pityriasis Lichenoides Diseases 0.000 description 1
- 206010048895 Pityriasis lichenoides et varioliformis acuta Diseases 0.000 description 1
- 206010035148 Plague Diseases 0.000 description 1
- 208000035109 Pneumococcal Infections Diseases 0.000 description 1
- 206010035737 Pneumonia viral Diseases 0.000 description 1
- 206010035742 Pneumonitis Diseases 0.000 description 1
- 208000000474 Poliomyelitis Diseases 0.000 description 1
- 206010036030 Polyarthritis Diseases 0.000 description 1
- 239000004698 Polyethylene Substances 0.000 description 1
- 239000004642 Polyimide Substances 0.000 description 1
- 208000007048 Polymyalgia Rheumatica Diseases 0.000 description 1
- 239000004793 Polystyrene Substances 0.000 description 1
- 208000031732 Post-Lyme Disease Syndrome Diseases 0.000 description 1
- 208000004347 Postpericardiotomy Syndrome Diseases 0.000 description 1
- 208000002389 Pouchitis Diseases 0.000 description 1
- 206010049422 Precancerous skin lesion Diseases 0.000 description 1
- 208000006664 Precursor Cell Lymphoblastic Leukemia-Lymphoma Diseases 0.000 description 1
- 241000605861 Prevotella Species 0.000 description 1
- 208000032319 Primary lateral sclerosis Diseases 0.000 description 1
- 208000024777 Prion disease Diseases 0.000 description 1
- 206010036790 Productive cough Diseases 0.000 description 1
- 102100036691 Proliferating cell nuclear antigen Human genes 0.000 description 1
- 208000033826 Promyelocytic Acute Leukemia Diseases 0.000 description 1
- 108090000459 Prostaglandin-endoperoxide synthases Proteins 0.000 description 1
- 102000004005 Prostaglandin-endoperoxide synthases Human genes 0.000 description 1
- 206010060862 Prostate cancer Diseases 0.000 description 1
- 208000000236 Prostatic Neoplasms Diseases 0.000 description 1
- 239000004365 Protease Substances 0.000 description 1
- 102100035251 Protein C-ets-1 Human genes 0.000 description 1
- 102000001253 Protein Kinase Human genes 0.000 description 1
- 102100030128 Protein L-Myc Human genes 0.000 description 1
- 102100026375 Protein PML Human genes 0.000 description 1
- 108010076504 Protein Sorting Signals Proteins 0.000 description 1
- 102100027584 Protein c-Fos Human genes 0.000 description 1
- 102100028286 Proto-oncogene tyrosine-protein kinase receptor Ret Human genes 0.000 description 1
- 229930185560 Pseudouridine Chemical group 0.000 description 1
- PTJWIQPHWPFNBW-UHFFFAOYSA-N Pseudouridine C Chemical group OC1C(O)C(CO)OC1C1=CNC(=O)NC1=O PTJWIQPHWPFNBW-UHFFFAOYSA-N 0.000 description 1
- 206010037151 Psittacosis Diseases 0.000 description 1
- 241000517305 Pthiridae Species 0.000 description 1
- 208000004186 Pulmonary Heart Disease Diseases 0.000 description 1
- 208000003670 Pure Red-Cell Aplasia Diseases 0.000 description 1
- CZPWVGJYEJSRLH-UHFFFAOYSA-N Pyrimidine Chemical compound C1=CN=CN=C1 CZPWVGJYEJSRLH-UHFFFAOYSA-N 0.000 description 1
- 206010037688 Q fever Diseases 0.000 description 1
- 239000013616 RNA primer Substances 0.000 description 1
- 238000003559 RNA-seq method Methods 0.000 description 1
- 238000011529 RT qPCR Methods 0.000 description 1
- 206010037742 Rabies Diseases 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 108010091086 Recombinases Proteins 0.000 description 1
- 102000018120 Recombinases Human genes 0.000 description 1
- 208000005587 Refsum Disease Diseases 0.000 description 1
- 208000033464 Reiter syndrome Diseases 0.000 description 1
- 206010038389 Renal cancer Diseases 0.000 description 1
- 208000006265 Renal cell carcinoma Diseases 0.000 description 1
- 206010061603 Respiratory syncytial virus infection Diseases 0.000 description 1
- 206010038748 Restrictive cardiomyopathy Diseases 0.000 description 1
- 102100038042 Retinoblastoma-associated protein Human genes 0.000 description 1
- 206010039085 Rhinitis allergic Diseases 0.000 description 1
- 206010061494 Rhinovirus infection Diseases 0.000 description 1
- 108010003581 Ribulose-bisphosphate carboxylase Proteins 0.000 description 1
- 241000606701 Rickettsia Species 0.000 description 1
- 206010067470 Rotavirus infection Diseases 0.000 description 1
- 102100025373 Runt-related transcription factor 1 Human genes 0.000 description 1
- 101150019443 SMAD4 gene Proteins 0.000 description 1
- 102000001332 SRC Human genes 0.000 description 1
- 108060006706 SRC Proteins 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- 206010039438 Salmonella Infections Diseases 0.000 description 1
- 208000021811 Sandhoff disease Diseases 0.000 description 1
- 241000219287 Saponaria Species 0.000 description 1
- 241000447727 Scabies Species 0.000 description 1
- 206010039587 Scarlet Fever Diseases 0.000 description 1
- 101000702553 Schistosoma mansoni Antigen Sm21.7 Proteins 0.000 description 1
- 101000714192 Schistosoma mansoni Tegument antigen Proteins 0.000 description 1
- 208000036752 Schizophrenia, paranoid type Diseases 0.000 description 1
- 201000001542 Schneiderian carcinoma Diseases 0.000 description 1
- 206010039705 Scleritis Diseases 0.000 description 1
- 206010040047 Sepsis Diseases 0.000 description 1
- 102100036077 Serine/threonine-protein kinase pim-1 Human genes 0.000 description 1
- 206010040550 Shigella infections Diseases 0.000 description 1
- 108700031298 Smad4 Proteins 0.000 description 1
- 206010041067 Small cell lung cancer Diseases 0.000 description 1
- ABBQHOQBGMUPJH-UHFFFAOYSA-M Sodium salicylate Chemical compound [Na+].OC1=CC=CC=C1C([O-])=O ABBQHOQBGMUPJH-UHFFFAOYSA-M 0.000 description 1
- 208000021712 Soft tissue sarcoma Diseases 0.000 description 1
- 244000061456 Solanum tuberosum Species 0.000 description 1
- 235000002595 Solanum tuberosum Nutrition 0.000 description 1
- 208000008582 Staphylococcal Food Poisoning Diseases 0.000 description 1
- 206010041925 Staphylococcal infections Diseases 0.000 description 1
- 108010039811 Starch synthase Proteins 0.000 description 1
- 206010072148 Stiff-Person syndrome Diseases 0.000 description 1
- 208000005718 Stomach Neoplasms Diseases 0.000 description 1
- 108010022348 Sulfate adenylyltransferase Proteins 0.000 description 1
- NINIDFKCEFEMDL-UHFFFAOYSA-N Sulfur Chemical group [S] NINIDFKCEFEMDL-UHFFFAOYSA-N 0.000 description 1
- 101100117496 Sulfurisphaera ohwakuensis pol-alpha gene Proteins 0.000 description 1
- 208000002286 Susac Syndrome Diseases 0.000 description 1
- 102100040365 T-cell acute lymphocytic leukemia protein 1 Human genes 0.000 description 1
- 102100033111 T-cell leukemia homeobox protein 1 Human genes 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- 208000022292 Tay-Sachs disease Diseases 0.000 description 1
- 239000004809 Teflon Substances 0.000 description 1
- 229920006362 Teflon® Polymers 0.000 description 1
- 208000024313 Testicular Neoplasms Diseases 0.000 description 1
- 206010057644 Testis cancer Diseases 0.000 description 1
- 206010043376 Tetanus Diseases 0.000 description 1
- 206010043395 Thalassaemia sickle cell Diseases 0.000 description 1
- 102100028702 Thyroid hormone receptor alpha Human genes 0.000 description 1
- 241000130764 Tinea Species 0.000 description 1
- 201000010618 Tinea cruris Diseases 0.000 description 1
- 206010067197 Tinea manuum Diseases 0.000 description 1
- 206010051526 Tolosa-Hunt syndrome Diseases 0.000 description 1
- 101710183280 Topoisomerase Proteins 0.000 description 1
- 102100039580 Transcription factor ETV6 Human genes 0.000 description 1
- 102100030780 Transcriptional activator Myb Human genes 0.000 description 1
- 108020004566 Transfer RNA Proteins 0.000 description 1
- 206010052779 Transplant rejections Diseases 0.000 description 1
- 206010044608 Trichiniasis Diseases 0.000 description 1
- 208000005448 Trichomonas Infections Diseases 0.000 description 1
- 206010044620 Trichomoniasis Diseases 0.000 description 1
- 241000223105 Trypanosoma brucei Species 0.000 description 1
- 208000034784 Tularaemia Diseases 0.000 description 1
- 102000044209 Tumor Suppressor Genes Human genes 0.000 description 1
- 108700025716 Tumor Suppressor Genes Proteins 0.000 description 1
- 108010078814 Tumor Suppressor Protein p53 Proteins 0.000 description 1
- 102100033728 Tumor necrosis factor receptor superfamily member 18 Human genes 0.000 description 1
- 241000287411 Turdidae Species 0.000 description 1
- 102100022596 Tyrosine-protein kinase ABL1 Human genes 0.000 description 1
- 102100026150 Tyrosine-protein kinase Fgr Human genes 0.000 description 1
- 102100035221 Tyrosine-protein kinase Fyn Human genes 0.000 description 1
- 102100024036 Tyrosine-protein kinase Lck Human genes 0.000 description 1
- 102100026857 Tyrosine-protein kinase Lyn Human genes 0.000 description 1
- 108020004417 Untranslated RNA Proteins 0.000 description 1
- 102000039634 Untranslated RNA Human genes 0.000 description 1
- 206010046298 Upper motor neurone lesion Diseases 0.000 description 1
- 241000202921 Ureaplasma urealyticum Species 0.000 description 1
- 208000007097 Urinary Bladder Neoplasms Diseases 0.000 description 1
- 208000006105 Uterine Cervical Neoplasms Diseases 0.000 description 1
- 208000002495 Uterine Neoplasms Diseases 0.000 description 1
- 206010046851 Uveitis Diseases 0.000 description 1
- 208000037009 Vaginitis bacterial Diseases 0.000 description 1
- 241000700647 Variola virus Species 0.000 description 1
- 208000002687 Venezuelan Equine Encephalomyelitis Diseases 0.000 description 1
- 201000009145 Venezuelan equine encephalitis Diseases 0.000 description 1
- 241000607598 Vibrio Species 0.000 description 1
- 241000607272 Vibrio parahaemolyticus Species 0.000 description 1
- 208000033559 Waldenström macroglobulinemia Diseases 0.000 description 1
- 201000006449 West Nile encephalitis Diseases 0.000 description 1
- 206010057293 West Nile viral infection Diseases 0.000 description 1
- 208000008383 Wilms tumor Diseases 0.000 description 1
- 206010052428 Wound Diseases 0.000 description 1
- 208000027418 Wounds and injury Diseases 0.000 description 1
- 208000006269 X-Linked Bulbo-Spinal Atrophy Diseases 0.000 description 1
- 208000003152 Yellow Fever Diseases 0.000 description 1
- 241000607734 Yersinia <bacteria> Species 0.000 description 1
- 241000607479 Yersinia pestis Species 0.000 description 1
- 241000607477 Yersinia pseudotuberculosis Species 0.000 description 1
- 208000001455 Zika Virus Infection Diseases 0.000 description 1
- 201000004296 Zika fever Diseases 0.000 description 1
- 239000008351 acetate buffer Substances 0.000 description 1
- 201000000621 achalasia Diseases 0.000 description 1
- 239000012445 acidic reagent Substances 0.000 description 1
- 208000006336 acinar cell carcinoma Diseases 0.000 description 1
- 208000037919 acquired disease Diseases 0.000 description 1
- NIXOWILDQLNWCW-UHFFFAOYSA-N acrylic acid group Chemical group C(C=C)(=O)O NIXOWILDQLNWCW-UHFFFAOYSA-N 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 125000002015 acyclic group Chemical group 0.000 description 1
- 108010036419 acyl-(acyl-carrier-protein)desaturase Proteins 0.000 description 1
- 210000005006 adaptive immune system Anatomy 0.000 description 1
- 210000002534 adenoid Anatomy 0.000 description 1
- 239000000853 adhesive Substances 0.000 description 1
- 230000001070 adhesive effect Effects 0.000 description 1
- 201000002454 adrenal cortex cancer Diseases 0.000 description 1
- 208000020990 adrenal cortex carcinoma Diseases 0.000 description 1
- 210000004100 adrenal gland Anatomy 0.000 description 1
- 208000007128 adrenocortical carcinoma Diseases 0.000 description 1
- 208000030597 adult Refsum disease Diseases 0.000 description 1
- 201000006966 adult T-cell leukemia Diseases 0.000 description 1
- 239000011543 agarose gel Substances 0.000 description 1
- PPQRONHOSHZGFQ-LMVFSUKVSA-N aldehydo-D-ribose 5-phosphate Chemical group OP(=O)(O)OC[C@@H](O)[C@@H](O)[C@@H](O)C=O PPQRONHOSHZGFQ-LMVFSUKVSA-N 0.000 description 1
- 201000009961 allergic asthma Diseases 0.000 description 1
- 201000010105 allergic rhinitis Diseases 0.000 description 1
- 108090000637 alpha-Amylases Proteins 0.000 description 1
- 230000002707 ameloblastic effect Effects 0.000 description 1
- 231100000540 amenorrhea Toxicity 0.000 description 1
- 125000000539 amino acid group Chemical group 0.000 description 1
- CBTVGIZVANVGBH-UHFFFAOYSA-N aminomethyl propanol Chemical compound CC(C)(N)CO CBTVGIZVANVGBH-UHFFFAOYSA-N 0.000 description 1
- BFNBIHQBYMNNAN-UHFFFAOYSA-N ammonium sulfate Chemical compound N.N.OS(O)(=O)=O BFNBIHQBYMNNAN-UHFFFAOYSA-N 0.000 description 1
- 229910052921 ammonium sulfate Inorganic materials 0.000 description 1
- 235000011130 ammonium sulphate Nutrition 0.000 description 1
- 210000004381 amniotic fluid Anatomy 0.000 description 1
- 235000019418 amylase Nutrition 0.000 description 1
- 206010002026 amyotrophic lateral sclerosis Diseases 0.000 description 1
- 239000012491 analyte Substances 0.000 description 1
- 208000006730 anaplasmosis Diseases 0.000 description 1
- PYKYMHQGRFAEBM-UHFFFAOYSA-N anthraquinone Natural products CCC(=O)c1c(O)c2C(=O)C3C(C=CC=C3O)C(=O)c2cc1CC(=O)OC PYKYMHQGRFAEBM-UHFFFAOYSA-N 0.000 description 1
- 150000004056 anthraquinones Chemical class 0.000 description 1
- 230000001093 anti-cancer Effects 0.000 description 1
- 230000000840 anti-viral effect Effects 0.000 description 1
- 230000006907 apoptotic process Effects 0.000 description 1
- 238000000149 argon plasma sintering Methods 0.000 description 1
- 206010003119 arrhythmia Diseases 0.000 description 1
- 206010003230 arteritis Diseases 0.000 description 1
- 244000309743 astrovirus Species 0.000 description 1
- 201000008937 atopic dermatitis Diseases 0.000 description 1
- 208000006424 autoimmune oophoritis Diseases 0.000 description 1
- 206010071572 autoimmune progesterone dermatitis Diseases 0.000 description 1
- 208000010928 autoimmune thyroid disease Diseases 0.000 description 1
- 208000029407 autoimmune urticaria Diseases 0.000 description 1
- 230000005784 autoimmunity Effects 0.000 description 1
- 210000003050 axon Anatomy 0.000 description 1
- 230000003376 axonal effect Effects 0.000 description 1
- 206010003882 axonal neuropathy Diseases 0.000 description 1
- 201000008680 babesiosis Diseases 0.000 description 1
- 230000001580 bacterial effect Effects 0.000 description 1
- 208000022362 bacterial infectious disease Diseases 0.000 description 1
- 201000009904 bacterial meningitis Diseases 0.000 description 1
- 208000003373 basosquamous carcinoma Diseases 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 108010051210 beta-Fructofuranosidase Proteins 0.000 description 1
- 108010005774 beta-Galactosidase Proteins 0.000 description 1
- WGDUUQDYDIIBKT-UHFFFAOYSA-N beta-Pseudouridine Chemical group OC1OC(CN2C=CC(=O)NC2=O)C(O)C1O WGDUUQDYDIIBKT-UHFFFAOYSA-N 0.000 description 1
- 210000000941 bile Anatomy 0.000 description 1
- 239000011230 binding agent Substances 0.000 description 1
- 238000007622 bioinformatic analysis Methods 0.000 description 1
- 238000003766 bioinformatics method Methods 0.000 description 1
- 239000012472 biological sample Substances 0.000 description 1
- 239000002981 blocking agent Substances 0.000 description 1
- 239000010836 blood and blood product Substances 0.000 description 1
- 210000000601 blood cell Anatomy 0.000 description 1
- 229940125691 blood product Drugs 0.000 description 1
- 210000004204 blood vessel Anatomy 0.000 description 1
- 239000010839 body fluid Substances 0.000 description 1
- 210000000988 bone and bone Anatomy 0.000 description 1
- 210000001185 bone marrow Anatomy 0.000 description 1
- 210000002798 bone marrow cell Anatomy 0.000 description 1
- 239000003618 borate buffered saline Substances 0.000 description 1
- 229910021538 borax Inorganic materials 0.000 description 1
- KGBXLFKZBHKPEV-UHFFFAOYSA-N boric acid Chemical compound OB(O)O KGBXLFKZBHKPEV-UHFFFAOYSA-N 0.000 description 1
- 239000004327 boric acid Substances 0.000 description 1
- 208000034526 bruise Diseases 0.000 description 1
- 239000007853 buffer solution Substances 0.000 description 1
- 239000006227 byproduct Substances 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 208000035269 cancer or benign tumor Diseases 0.000 description 1
- 239000004202 carbamide Substances 0.000 description 1
- 150000003857 carboxamides Chemical class 0.000 description 1
- 239000003183 carcinogenic agent Substances 0.000 description 1
- 208000002458 carcinoid tumor Diseases 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 239000003054 catalyst Substances 0.000 description 1
- 238000006555 catalytic reaction Methods 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 230000030833 cell death Effects 0.000 description 1
- 230000008709 cellular rearrangement Effects 0.000 description 1
- 229940106157 cellulase Drugs 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 210000001175 cerebrospinal fluid Anatomy 0.000 description 1
- 201000010881 cervical cancer Diseases 0.000 description 1
- 208000003796 chancre Diseases 0.000 description 1
- 239000002738 chelating agent Substances 0.000 description 1
- 125000003636 chemical group Chemical group 0.000 description 1
- 238000001311 chemical methods and process Methods 0.000 description 1
- 239000013626 chemical specie Substances 0.000 description 1
- 238000002512 chemotherapy Methods 0.000 description 1
- 208000006990 cholangiocarcinoma Diseases 0.000 description 1
- 210000004252 chorionic villi Anatomy 0.000 description 1
- 230000008711 chromosomal rearrangement Effects 0.000 description 1
- 208000021668 chronic eosinophilic leukemia Diseases 0.000 description 1
- 208000032852 chronic lymphocytic leukemia Diseases 0.000 description 1
- 208000013507 chronic prostatitis Diseases 0.000 description 1
- 208000024376 chronic urticaria Diseases 0.000 description 1
- 201000010002 cicatricial pemphigoid Diseases 0.000 description 1
- 210000000589 cicatrix Anatomy 0.000 description 1
- 229960001265 ciclosporin Drugs 0.000 description 1
- 210000004240 ciliary body Anatomy 0.000 description 1
- 239000011248 coating agent Substances 0.000 description 1
- 210000001072 colon Anatomy 0.000 description 1
- 208000029742 colonic neoplasm Diseases 0.000 description 1
- 210000002777 columnar cell Anatomy 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 238000009833 condensation Methods 0.000 description 1
- 230000005494 condensation Effects 0.000 description 1
- 208000028831 congenital heart disease Diseases 0.000 description 1
- 210000002808 connective tissue Anatomy 0.000 description 1
- 239000013068 control sample Substances 0.000 description 1
- 229920001577 copolymer Polymers 0.000 description 1
- 229910052802 copper Inorganic materials 0.000 description 1
- 239000010949 copper Substances 0.000 description 1
- OFEZSBMBBKLLBJ-BAJZRUMYSA-N cordycepin Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)C[C@H]1O OFEZSBMBBKLLBJ-BAJZRUMYSA-N 0.000 description 1
- OFEZSBMBBKLLBJ-UHFFFAOYSA-N cordycepine Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(CO)CC1O OFEZSBMBBKLLBJ-UHFFFAOYSA-N 0.000 description 1
- 238000012937 correction Methods 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- ALEXXDVDDISNDU-JZYPGELDSA-N cortisol 21-acetate Chemical compound C1CC2=CC(=O)CC[C@]2(C)[C@@H]2[C@@H]1[C@@H]1CC[C@@](C(=O)COC(=O)C)(O)[C@@]1(C)C[C@@H]2O ALEXXDVDDISNDU-JZYPGELDSA-N 0.000 description 1
- 201000003278 cryoglobulinemia Diseases 0.000 description 1
- 208000004921 cutaneous lupus erythematosus Diseases 0.000 description 1
- 208000035250 cutaneous malignant susceptibility to 1 melanoma Diseases 0.000 description 1
- 238000005520 cutting process Methods 0.000 description 1
- 229930182912 cyclosporin Natural products 0.000 description 1
- 208000031513 cyst Diseases 0.000 description 1
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 description 1
- 238000013480 data collection Methods 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 230000003210 demyelinating effect Effects 0.000 description 1
- 206010061811 demyelinating polyneuropathy Diseases 0.000 description 1
- 208000025729 dengue disease Diseases 0.000 description 1
- 239000005549 deoxyribonucleoside Substances 0.000 description 1
- 230000000779 depleting effect Effects 0.000 description 1
- 230000002074 deregulated effect Effects 0.000 description 1
- 230000003831 deregulation Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 239000003599 detergent Substances 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 206010012601 diabetes mellitus Diseases 0.000 description 1
- 239000010432 diamond Substances 0.000 description 1
- 229910003460 diamond Inorganic materials 0.000 description 1
- 230000004069 differentiation Effects 0.000 description 1
- LTMHDMANZUZIPE-PUGKRICDSA-N digoxin Chemical compound C1[C@H](O)[C@H](O)[C@@H](C)O[C@H]1O[C@@H]1[C@@H](C)O[C@@H](O[C@@H]2[C@H](O[C@@H](O[C@@H]3C[C@@H]4[C@]([C@@H]5[C@H]([C@]6(CC[C@@H]([C@@]6(C)[C@H](O)C5)C=5COC(=O)C=5)O)CC4)(C)CC3)C[C@@H]2O)C)C[C@@H]1O LTMHDMANZUZIPE-PUGKRICDSA-N 0.000 description 1
- 229960005156 digoxin Drugs 0.000 description 1
- LTMHDMANZUZIPE-UHFFFAOYSA-N digoxine Natural products C1C(O)C(O)C(C)OC1OC1C(C)OC(OC2C(OC(OC3CC4C(C5C(C6(CCC(C6(C)C(O)C5)C=5COC(=O)C=5)O)CC4)(C)CC3)CC2O)C)CC1O LTMHDMANZUZIPE-UHFFFAOYSA-N 0.000 description 1
- SWSQBOPZIKWTGO-UHFFFAOYSA-N dimethylaminoamidine Natural products CN(C)C(N)=N SWSQBOPZIKWTGO-UHFFFAOYSA-N 0.000 description 1
- 206010013023 diphtheria Diseases 0.000 description 1
- 239000006185 dispersion Substances 0.000 description 1
- GRWZHXKQBITJKP-UHFFFAOYSA-L dithionite(2-) Chemical compound [O-]S(=O)S([O-])=O GRWZHXKQBITJKP-UHFFFAOYSA-L 0.000 description 1
- NAGJZTKCGNOGPW-UHFFFAOYSA-N dithiophosphoric acid Chemical class OP(O)(S)=S NAGJZTKCGNOGPW-UHFFFAOYSA-N 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 230000009977 dual effect Effects 0.000 description 1
- 238000004043 dyeing Methods 0.000 description 1
- 208000019479 dysautonomia Diseases 0.000 description 1
- 230000002526 effect on cardiovascular system Effects 0.000 description 1
- 239000012636 effector Substances 0.000 description 1
- 210000003162 effector t lymphocyte Anatomy 0.000 description 1
- 208000000292 ehrlichiosis Diseases 0.000 description 1
- 230000009881 electrostatic interaction Effects 0.000 description 1
- 230000000408 embryogenic effect Effects 0.000 description 1
- 230000008451 emotion Effects 0.000 description 1
- 239000000839 emulsion Substances 0.000 description 1
- 206010014665 endocarditis Diseases 0.000 description 1
- 230000002124 endocrine Effects 0.000 description 1
- 201000011523 endocrine gland cancer Diseases 0.000 description 1
- 210000000750 endocrine system Anatomy 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 244000000015 environmental pathogen Species 0.000 description 1
- 230000002327 eosinophilic effect Effects 0.000 description 1
- 210000000981 epithelium Anatomy 0.000 description 1
- 230000003628 erosive effect Effects 0.000 description 1
- 210000003743 erythrocyte Anatomy 0.000 description 1
- 201000004101 esophageal cancer Diseases 0.000 description 1
- RTZKZFJDLAIYFH-UHFFFAOYSA-N ether Substances CCOCC RTZKZFJDLAIYFH-UHFFFAOYSA-N 0.000 description 1
- 125000001033 ether group Chemical group 0.000 description 1
- 229920006227 ethylene-grafted-maleic anhydride Polymers 0.000 description 1
- 210000003527 eukaryotic cell Anatomy 0.000 description 1
- 230000007717 exclusion Effects 0.000 description 1
- 230000001747 exhibiting effect Effects 0.000 description 1
- 108010093305 exopolygalacturonase Proteins 0.000 description 1
- 239000000284 extract Substances 0.000 description 1
- 210000003195 fascia Anatomy 0.000 description 1
- 210000004700 fetal blood Anatomy 0.000 description 1
- 210000003754 fetus Anatomy 0.000 description 1
- 210000002950 fibroblast Anatomy 0.000 description 1
- 230000003176 fibrotic effect Effects 0.000 description 1
- 108010060641 flavanone synthetase Proteins 0.000 description 1
- 238000000684 flow cytometry Methods 0.000 description 1
- GNBHRKFJIUUOQI-UHFFFAOYSA-N fluorescein Chemical class O1C(=O)C2=CC=CC=C2C21C1=CC=C(O)C=C1OC1=CC(O)=CC=C21 GNBHRKFJIUUOQI-UHFFFAOYSA-N 0.000 description 1
- 229960005102 foscarnet Drugs 0.000 description 1
- 239000012520 frozen sample Substances 0.000 description 1
- 208000024386 fungal infectious disease Diseases 0.000 description 1
- ZZUFCTLCJUWOSV-UHFFFAOYSA-N furosemide Chemical compound C1=C(Cl)C(S(=O)(=O)N)=CC(C(O)=O)=C1NCC1=CC=CO1 ZZUFCTLCJUWOSV-UHFFFAOYSA-N 0.000 description 1
- 108010074605 gamma-Globulins Proteins 0.000 description 1
- 206010017758 gastric cancer Diseases 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 230000007274 generation of a signal involved in cell-cell signaling Effects 0.000 description 1
- 201000006592 giardiasis Diseases 0.000 description 1
- 210000004195 gingiva Anatomy 0.000 description 1
- 229940116332 glucose oxidase Drugs 0.000 description 1
- 235000019420 glucose oxidase Nutrition 0.000 description 1
- 239000007986 glycine-NaOH buffer Substances 0.000 description 1
- 230000013595 glycosylation Effects 0.000 description 1
- 238000006206 glycosylation reaction Methods 0.000 description 1
- 208000001786 gonorrhea Diseases 0.000 description 1
- 210000002503 granulosa cell Anatomy 0.000 description 1
- ZJYYHGLJYGJLLN-UHFFFAOYSA-N guanidinium thiocyanate Chemical compound SC#N.NC(N)=N ZJYYHGLJYGJLLN-UHFFFAOYSA-N 0.000 description 1
- 210000003780 hair follicle Anatomy 0.000 description 1
- 201000009277 hairy cell leukemia Diseases 0.000 description 1
- 229910052736 halogen Inorganic materials 0.000 description 1
- 201000010536 head and neck cancer Diseases 0.000 description 1
- 208000014829 head and neck neoplasm Diseases 0.000 description 1
- 208000019622 heart disease Diseases 0.000 description 1
- 208000018578 heart valve disease Diseases 0.000 description 1
- 229940059442 hemicellulase Drugs 0.000 description 1
- 108010002430 hemicellulase Proteins 0.000 description 1
- 201000001505 hemoglobinuria Diseases 0.000 description 1
- 208000007475 hemolytic anemia Diseases 0.000 description 1
- 230000002008 hemorrhagic effect Effects 0.000 description 1
- 208000005252 hepatitis A Diseases 0.000 description 1
- 201000010284 hepatitis E Diseases 0.000 description 1
- 208000008675 hereditary spastic paraplegia Diseases 0.000 description 1
- 208000029080 human African trypanosomiasis Diseases 0.000 description 1
- 235000020256 human milk Nutrition 0.000 description 1
- 210000004251 human milk Anatomy 0.000 description 1
- 210000004276 hyalin Anatomy 0.000 description 1
- 238000006460 hydrolysis reaction Methods 0.000 description 1
- 230000002209 hydrophobic effect Effects 0.000 description 1
- 230000000148 hypercalcaemia Effects 0.000 description 1
- 208000030915 hypercalcemia disease Diseases 0.000 description 1
- 230000003225 hyperhomocysteinemia Effects 0.000 description 1
- 208000015210 hypertensive heart disease Diseases 0.000 description 1
- 208000036260 idiopathic disease Diseases 0.000 description 1
- 238000005286 illumination Methods 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 230000001900 immune effect Effects 0.000 description 1
- 208000026278 immune system disease Diseases 0.000 description 1
- 208000014165 immunodeficiency 21 Diseases 0.000 description 1
- 229940072221 immunoglobulins Drugs 0.000 description 1
- 230000004957 immunoregulator effect Effects 0.000 description 1
- 230000001771 impaired effect Effects 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 201000001371 inclusion conjunctivitis Diseases 0.000 description 1
- 238000011534 incubation Methods 0.000 description 1
- 201000011422 infant botulism Diseases 0.000 description 1
- 208000019715 inherited Creutzfeldt-Jakob disease Diseases 0.000 description 1
- 239000003112 inhibitor Substances 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 229940117681 interleukin-12 Drugs 0.000 description 1
- 229940124829 interleukin-23 Drugs 0.000 description 1
- 208000036971 interstitial lung disease 2 Diseases 0.000 description 1
- 210000000936 intestine Anatomy 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- JYJIGFIDKWBXDU-MNNPPOADSA-N inulin Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)OC[C@]1(OC[C@]2(OC[C@]3(OC[C@]4(OC[C@]5(OC[C@]6(OC[C@]7(OC[C@]8(OC[C@]9(OC[C@]%10(OC[C@]%11(OC[C@]%12(OC[C@]%13(OC[C@]%14(OC[C@]%15(OC[C@]%16(OC[C@]%17(OC[C@]%18(OC[C@]%19(OC[C@]%20(OC[C@]%21(OC[C@]%22(OC[C@]%23(OC[C@]%24(OC[C@]%25(OC[C@]%26(OC[C@]%27(OC[C@]%28(OC[C@]%29(OC[C@]%30(OC[C@]%31(OC[C@]%32(OC[C@]%33(OC[C@]%34(OC[C@]%35(OC[C@]%36(O[C@@H]%37[C@@H]([C@@H](O)[C@H](O)[C@@H](CO)O%37)O)[C@H]([C@H](O)[C@@H](CO)O%36)O)[C@H]([C@H](O)[C@@H](CO)O%35)O)[C@H]([C@H](O)[C@@H](CO)O%34)O)[C@H]([C@H](O)[C@@H](CO)O%33)O)[C@H]([C@H](O)[C@@H](CO)O%32)O)[C@H]([C@H](O)[C@@H](CO)O%31)O)[C@H]([C@H](O)[C@@H](CO)O%30)O)[C@H]([C@H](O)[C@@H](CO)O%29)O)[C@H]([C@H](O)[C@@H](CO)O%28)O)[C@H]([C@H](O)[C@@H](CO)O%27)O)[C@H]([C@H](O)[C@@H](CO)O%26)O)[C@H]([C@H](O)[C@@H](CO)O%25)O)[C@H]([C@H](O)[C@@H](CO)O%24)O)[C@H]([C@H](O)[C@@H](CO)O%23)O)[C@H]([C@H](O)[C@@H](CO)O%22)O)[C@H]([C@H](O)[C@@H](CO)O%21)O)[C@H]([C@H](O)[C@@H](CO)O%20)O)[C@H]([C@H](O)[C@@H](CO)O%19)O)[C@H]([C@H](O)[C@@H](CO)O%18)O)[C@H]([C@H](O)[C@@H](CO)O%17)O)[C@H]([C@H](O)[C@@H](CO)O%16)O)[C@H]([C@H](O)[C@@H](CO)O%15)O)[C@H]([C@H](O)[C@@H](CO)O%14)O)[C@H]([C@H](O)[C@@H](CO)O%13)O)[C@H]([C@H](O)[C@@H](CO)O%12)O)[C@H]([C@H](O)[C@@H](CO)O%11)O)[C@H]([C@H](O)[C@@H](CO)O%10)O)[C@H]([C@H](O)[C@@H](CO)O9)O)[C@H]([C@H](O)[C@@H](CO)O8)O)[C@H]([C@H](O)[C@@H](CO)O7)O)[C@H]([C@H](O)[C@@H](CO)O6)O)[C@H]([C@H](O)[C@@H](CO)O5)O)[C@H]([C@H](O)[C@@H](CO)O4)O)[C@H]([C@H](O)[C@@H](CO)O3)O)[C@H]([C@H](O)[C@@H](CO)O2)O)[C@@H](O)[C@H](O)[C@@H](CO)O1 JYJIGFIDKWBXDU-MNNPPOADSA-N 0.000 description 1
- 229940029339 inulin Drugs 0.000 description 1
- 239000001573 invertase Substances 0.000 description 1
- 235000011073 invertase Nutrition 0.000 description 1
- 230000001788 irregular Effects 0.000 description 1
- 208000012947 ischemia reperfusion injury Diseases 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 238000011901 isothermal amplification Methods 0.000 description 1
- 201000002215 juvenile rheumatoid arthritis Diseases 0.000 description 1
- 206010023332 keratitis Diseases 0.000 description 1
- 210000003734 kidney Anatomy 0.000 description 1
- 201000010982 kidney cancer Diseases 0.000 description 1
- 229940043355 kinase inhibitor Drugs 0.000 description 1
- 210000001865 kupffer cell Anatomy 0.000 description 1
- 229940116108 lactase Drugs 0.000 description 1
- 238000002357 laparoscopic surgery Methods 0.000 description 1
- 208000003849 large cell carcinoma Diseases 0.000 description 1
- 230000001418 larval effect Effects 0.000 description 1
- 201000010901 lateral sclerosis Diseases 0.000 description 1
- 201000002364 leukopenia Diseases 0.000 description 1
- 231100001022 leukopenia Toxicity 0.000 description 1
- 235000019421 lipase Nutrition 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- 206010024627 liposarcoma Diseases 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 238000011528 liquid biopsy Methods 0.000 description 1
- 239000011344 liquid material Substances 0.000 description 1
- 210000004185 liver Anatomy 0.000 description 1
- 201000007270 liver cancer Diseases 0.000 description 1
- 208000014018 liver neoplasm Diseases 0.000 description 1
- 238000000504 luminescence detection Methods 0.000 description 1
- 208000016992 lung adenocarcinoma in situ Diseases 0.000 description 1
- 201000005202 lung cancer Diseases 0.000 description 1
- 201000000014 lung giant cell carcinoma Diseases 0.000 description 1
- 208000020816 lung neoplasm Diseases 0.000 description 1
- 201000000966 lung oat cell carcinoma Diseases 0.000 description 1
- 210000002751 lymph Anatomy 0.000 description 1
- 210000004880 lymph fluid Anatomy 0.000 description 1
- 208000001419 lymphocytic choriomeningitis Diseases 0.000 description 1
- 201000010953 lymphoepithelioma-like carcinoma Diseases 0.000 description 1
- 201000001268 lymphoproliferative syndrome Diseases 0.000 description 1
- 239000004325 lysozyme Substances 0.000 description 1
- 229960000274 lysozyme Drugs 0.000 description 1
- 235000010335 lysozyme Nutrition 0.000 description 1
- 201000000564 macroglobulinemia Diseases 0.000 description 1
- 229910001629 magnesium chloride Inorganic materials 0.000 description 1
- 235000011147 magnesium chloride Nutrition 0.000 description 1
- 201000004792 malaria Diseases 0.000 description 1
- 208000025854 malignant tumor of adrenal cortex Diseases 0.000 description 1
- 230000001071 malnutrition Effects 0.000 description 1
- 235000000824 malnutrition Nutrition 0.000 description 1
- 208000000516 mast-cell leukemia Diseases 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
Landscapes
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
Disclosed herein, inter alia, are compositions and methods that provide a sequencing-efficient solution for detecting genetic features and aberrations.
Description
Cross reference to related applications
The present application claims the benefit of U.S. Provisional Application No. 63/218,794, filed on July 6, 2021; U.S. Provisional Application No. 63/297,078, filed on January 6, 2022; and U.S. Provisional Application No. 63/348,939, filed on 3 of 2022; each of which is incorporated herein by reference in its entirety and for all purposes.
Reference to a "Sequence Listing," a table, or a computer program listing appendix submitted as an ASCII file
The Sequence Listing written in file 051385-548001wo_seq_st25.Txt, created on month 29 of 2022, 547 bytes, machine format IBM-PC, MS Windows operating system, is incorporated herein by reference.
Background
Gene fusions are somatic alterations that may lead to cancer. Translocations, copy number changes, and inversions may lead to gene fusions, as well as to deregulated gene expression and novel molecular functions. Next-generation sequencing (NGS) methods for gene fusion detection may employ non-targeted sequencing (e.g., whole genome or whole transcriptome sequencing) or targeted sequencing of the fusion genes of interest. Targeted methods for gene fusion detection can simplify analysis and reduce cost. Popular methods for targeted sequencing of gene fusions include multiplex PCR, in which primer sets are designed to generate PCR amplicons spanning known breakpoint junctions; anchored multiplex PCR (AMP); and hybridization capture to enrich breakpoint regions of interest. However, multiplex PCR cannot identify fusions involving novel breakpoints and partners; AMP has relatively high input requirements and a more complex workflow, and is often limited to RNA analysis; and hybridization capture has a relatively complex workflow and reduced sensitivity compared to PCR-based methods. For both targeted and non-targeted approaches, robustness to sample degradation is often critical because FFPE-preserved tissue and cfDNA are widely used as input materials.
Disclosure of Invention
In view of the foregoing, there is a need for methods that achieve highly sensitive targeted analysis of gene fusions with minimal workflow complexity and input requirements, and with robustness to highly degraded material. Disclosed herein, inter alia, are solutions to these and other problems in the art.
In one aspect, there is provided a method of differentially amplifying a polynucleotide comprising a fusion gene relative to a polynucleotide not comprising the fusion gene, the method comprising: i) circularizing a plurality of linear nucleic acid molecules to form a plurality of circular template polynucleotides, wherein one or more of the linear nucleic acid molecules comprises a fusion gene, thereby forming one or more fusion gene circular template polynucleotides, and wherein one or more of the linear nucleic acid molecules does not comprise a fusion gene, thereby forming one or more non-fusion gene circular template polynucleotides; ii) binding a blocking element to the one or more non-fusion gene circular template polynucleotides; and iii) hybridizing a first primer and a second primer to the one or more non-fusion gene circular template polynucleotides and the one or more fusion gene circular template polynucleotides and extending with a polymerase to produce a first amount of non-fusion polynucleotide amplification product and a second amount of fusion polynucleotide amplification product, wherein the first amount is detectably less than the second amount; thereby differentially amplifying the polynucleotide comprising the fusion gene.
In one aspect, there is provided a method of amplifying a polynucleotide comprising a fusion gene, the method comprising: i) Binding a blocking element to a non-fusion circular template polynucleotide, wherein the non-fusion circular template does not comprise a fusion gene; ii) hybridizing the first primer and the second primer to the non-fused circular template polynucleotide; and hybridizing the first primer and the second primer to a fusion circular template polynucleotide, wherein the fusion circular template polynucleotide comprises a fusion gene; and iii) extending the first primer and the second primer with a non-strand displacement polymerase to produce a fusion polynucleotide amplification product.
In one aspect, a kit is provided comprising: a circularizing agent, wherein the circularizing agent is capable of binding the 5 'and 3' ends of a linear nucleic acid molecule; a blocking element capable of binding to one or more circular polynucleotides; a first primer and a second primer; and a polymerase.
Drawings
FIG. 1 shows outward facing primers (indicated by arrows) designed to target a region of a fusion partner of interest adjacent to a breakpoint location of interest. An element referred to as a blocking element (e.g., a non-extendable oligomer used in conjunction with a non-strand-displacing polymerase) targets the unrearranged sequence adjacent to the outward facing primers and prevents polymerase extension. The blocking element thereby selectively inhibits amplification of unrearranged templates, resulting in preferential amplification of templates containing the fusion.
FIGS. 2A-2B illustrate a blocked inverse PCR method. FIG. 2A illustrates a method that uses: (a) a pair of outward facing inverse PCR primers; (b) a 5' blocking oligomer that selectively binds to unrearranged template adjacent to the inverse PCR primer pair and upstream of the expected fusion breakpoint region; and (c) an optional second, 3' blocking oligomer positioned 3' of the expected fusion junction. The relative positioning of the blocking oligomers is indicated in the figure. A 5' blocking oligomer refers to an oligonucleotide that binds on the 5' side of an exon junction; similarly, a 3' blocking oligomer refers to an oligonucleotide that binds on the 3' side of an exon junction. In embodiments, and under suitable conditions, the 5' blocking oligomer does not bind, such that the circularized template can be amplified (e.g., when the cDNA contains a fusion junction). In embodiments, and under suitable conditions, the 3' blocking oligomer prevents amplification of fragments with insufficient coverage of the fusion junction. FIG. 2B shows in detail an embodiment in which an outward facing primer contains a target-specific sequence (A) and, optionally, a sequence (B) for downstream library preparation and analysis.
FIG. 3 shows the strategy of FIG. 1 applied to a template having a fusion (i.e., a polynucleotide having a sequence of a first region fused to a sequence of a second region at a fusion junction). The 5' blocking oligomer does not bind adjacent to the outward facing primer, allowing selective amplification of junction-containing templates from fragmented material. A 5' blocking oligomer refers to an oligonucleotide that binds on the 5' side of an exon junction; similarly, a 3' blocking oligomer refers to an oligonucleotide that binds on the 3' side of an exon junction. In embodiments, and under suitable conditions, the 5' blocking oligomer prevents amplification of unrearranged templates (e.g., cDNAs that do not contain fusion junctions). In embodiments, and under suitable conditions, the 3' blocking oligomer prevents amplification of fragments with insufficient coverage of the fusion junction.
FIG. 4 shows a circularized template comprising a fusion junction. In embodiments, the circularized template comprises two junctions: 1) a junction resulting from the gene fusion present in the sample, and 2) a junction resulting from circularization, which joins the 5' and 3' ends of the linear nucleic acid molecule. In embodiments, the latter (i.e., the junction resulting from circularization) may be used to quantify or estimate template abundance and/or to perform error correction.
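As a purely illustrative sketch (not part of the disclosed methods), the two junctions in a circularized template can be modeled with plain strings. The function name `inverse_pcr_amplicon`, the toy sequences, and the primer coordinates below are all hypothetical; the sketch only assumes that two outward facing primers sit in the targeted partner (gene B) and that extension proceeds around the circle through the circularization junction.

```python
def inverse_pcr_amplicon(fragment, fwd_start, rev_end):
    """Model the product of outward facing primers on a circularized fragment.

    fragment  : linear molecule (5'->3') before its ends are joined.
    fwd_start : index where the rightward-extending primer site begins.
    rev_end   : index one past the leftward-extending primer site.
    Returns (amplicon, circ_offset), where circ_offset is the position of the
    circularization junction inside the amplicon.
    """
    if not 0 < rev_end <= fwd_start < len(fragment):
        raise ValueError("primer sites must point away from each other")
    # Extension runs from fwd_start to the 3' end, crosses the joined ends,
    # and continues from the 5' end up to the other primer site.
    amplicon = fragment[fwd_start:] + fragment[:rev_end]
    return amplicon, len(fragment) - fwd_start


# Toy fusion molecule: gene A sequence fused to gene B sequence (hypothetical).
gene_a = "ACGTTGCAAGGCT"
gene_b = "TTGACCGGATACCGTA"
fragment = gene_a + gene_b

# Both primer sites lie in gene B, downstream of the fusion junction.
amplicon, circ_offset = inverse_pcr_amplicon(fragment, fwd_start=22, rev_end=20)
fusion_offset = circ_offset + len(gene_a)  # fusion junction within the amplicon
```

In this toy model the amplicon carries both the circularization junction (at `circ_offset`) and the fusion junction (at `fusion_offset`); because the circularization junction depends on where the original fragment ended, it differs between independent template molecules.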
FIG. 5 illustrates an exemplary overview of translocation detection. After amplification and sequencing, the sequencing reads are mapped to a reference. A translocation event may result in an excess of reads with inter-genic mappings, which align partly to the non-targeted 5' fusion gene (gene A) and partly to the targeted fusion partner (gene B) near the breakpoint.
Fig. 6 illustrates a bioinformatics workflow for breakpoint mapping. Briefly, sequencing reads of a target of interest are identified, for example, by k-mer matching or alignment. The cyclized junctions are then identified by k-mer matching or alignment. In some embodiments, k-mer matching may be achieved using a k-mer index reflecting the circularized junction of nucleic acids produced by known fusions. Next, reads are classified as having intra-genic junctions or inter-genic junctions, and mapped positions and densities of the mapped reads are determined. Direct alignment of reads to breakpoints is not necessary, but may aid in analysis.
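The classification step of this workflow can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed pipeline: the helper names (`build_index`, `classify_read`), the toy reference sequences, and the choice of k = 5 are hypothetical, and a real analysis would use longer k-mers and genome-scale references.

```python
def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the set of reference names that contain it."""
    index = {}
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def classify_read(read, index, k):
    """Label a read by the references its unambiguous k-mers come from."""
    genes = set()
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k])
        if hits and len(hits) == 1:  # ignore k-mers shared between references
            genes |= hits
    if not genes:
        return "off_target"
    return "intragenic" if len(genes) == 1 else "intergenic"


# Hypothetical toy references for gene A (non-targeted) and gene B (targeted).
references = {
    "geneA": "ACGTTGCAAGGCTAGCTA",
    "geneB": "TTGACCGGATACCGTAAC",
}
index = build_index(references, k=5)
```

A read whose k-mers all come from the targeted gene is labeled as having an intra-genic junction, whereas a read with k-mers from both genes is labeled inter-genic and would be a candidate for breakpoint mapping.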
FIG. 7 illustrates an embodiment of a method described herein applied to analysis of IGH V(D)J rearrangements. (A) Traditional methods of amplifying IGH rearrangements involve multiplex PCR primers targeting variable gene framework regions in combination with one or more joining gene primers. Such methods are limited by the need for complex primer pools, the inability to detect rearrangements with somatic hypermutation within the primer binding sites, and the inability to identify translocations involving the IGHJ genes. (B) In contrast, blocked inverse PCR of the IGH locus uses outward facing primers targeting rarely mutated joining gene regions. The method minimizes the number of primers required, avoids dropout due to somatic hypermutation, enables detection of IGHJ translocations, and allows estimation of template copy number by analysis of circularization junctions. Including blocking elements increases the proportion of rearrangement-containing amplicons, thereby facilitating downstream sequencing analysis.
FIG. 8 illustrates an embodiment of a design strategy for applying the methods described herein to IGH rearrangements. Outward facing primers are designed to amplify each IGHJ gene, while blocking oligomers target the region upstream of and adjacent to each joining gene.
FIG. 9 illustrates an embodiment of a workflow for analyzing B cell rearrangements by the methods described herein. Amplification of the IGH, IGK, and IGL loci is followed by next-generation sequencing. The resulting reads are filtered to remove short and off-target products, circularization junctions are identified, unique sequences are collapsed, and the reads are then annotated for the presence of V(D)J rearrangements using IgBLAST or a similar tool. Reads with valid V(D)J rearrangements are used to determine the frequency and template count of each rearrangement and to identify clonal rearrangements consistent with the presence of a B cell malignancy. Reads lacking a V(D)J rearrangement are assessed for the presence or absence of a translocation using k-mer analysis or methods known in the art (e.g., GeneFuse). A final report is generated indicating the V(D)J clonality and translocation status of the sample.
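The template-counting and frequency steps of such a workflow might be sketched as below. This is an illustrative assumption rather than the disclosed implementation: reads are represented as hypothetical (rearrangement label, circularization junction position) pairs that an upstream annotation step is presumed to supply, and the `clonal_fraction` cutoff is an arbitrary placeholder, not a value taken from the disclosure.

```python
from collections import defaultdict


def summarize_rearrangements(reads, clonal_fraction=0.05):
    """Collapse reads into templates and report per-rearrangement frequency.

    reads           : iterable of (rearrangement_label, circ_junction) pairs.
    clonal_fraction : hypothetical cutoff above which a rearrangement is
                      flagged as clonal.
    Reads sharing a rearrangement and a circularization junction are treated
    as copies of one original template molecule.
    """
    templates = defaultdict(set)
    for rearrangement, circ_junction in reads:
        templates[rearrangement].add(circ_junction)

    total = sum(len(junctions) for junctions in templates.values())
    summary = {}
    for rearrangement, junctions in templates.items():
        fraction = len(junctions) / total
        summary[rearrangement] = {
            "templates": len(junctions),
            "fraction": fraction,
            "clonal": fraction >= clonal_fraction,
        }
    return summary


# Hypothetical annotated reads: two PCR copies of one template plus a second
# template for clone_A, and a single template for clone_B.
example_reads = [
    ("clone_A", 101),
    ("clone_A", 101),
    ("clone_A", 250),
    ("clone_B", 77),
]
example_summary = summarize_rearrangements(example_reads, clonal_fraction=0.5)
```

Counting distinct circularization junctions rather than raw reads keeps PCR duplicates from inflating the apparent abundance of a rearrangement.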
FIG. 10 illustrates an embodiment in which outward facing primers (shown as a pair of arrows pointing away from each other), designed to target a region of a fusion partner of interest adjacent to a breakpoint location of interest, are used in combination with inward facing primers (shown as a pair of arrows pointing toward each other) designed to target somatic mutations (e.g., single nucleotide polymorphisms (SNPs), insertions, deletions, copy number variations (CNVs), etc.). An element referred to as a blocking element (e.g., a non-extendable oligomer used in conjunction with a non-strand-displacing polymerase) targets the unrearranged sequence adjacent to the outward facing primers and prevents polymerase extension. The blocking element thereby selectively inhibits amplification of unrearranged templates, resulting in preferential amplification of templates containing the fusion. After circularization, PCR amplification with the inward facing primers amplifies, for example, the SNP-containing region.
FIGS. 11A-11C illustrate amplification of a region of interest (e.g., a single region of interest or tandem repeats of a region of interest) using a single-pool multiplex amplification reaction (e.g., a single-pool multiplex PCR reaction). FIG. 11A shows an example in which two pairs of overlapping inward facing primers (e.g., 1F and 1R, and 2F and 2R) are used to amplify a target region, resulting in three amplification products (e.g., three PCR products): amplicon 1 (the amplification product of the 1F and 1R primer pair), amplicon 2 (the amplification product of the 2F and 2R primer pair), and a maximum amplicon (the amplification product of the 1F and 2R primer pair). U.S. patent publication US2016/0340746 describes lower amplification efficiency caused by stable secondary structure. With overlapping inward facing primers, the amplification reaction products are the same regardless of whether a linear template or a circular template is used. When tandem repeats of the region of interest are amplified, a duplication-specific amplicon is also produced (e.g., the amplification product of the 2R and 1F primer pair). The duplication-specific amplicon is identified by the presence of a unique primer pair in the amplicon and a circularization junction within the amplicon (represented by dashed lines).
FIG. 12 shows a graph highlighting the temporal aspect of monitoring measurable residual disease (MRD) in acute lymphoblastic leukemia (ALL). Each line represents the level of residual disease over time in a different hypothetical patient monitored at different time points following therapeutic intervention (e.g., radiation and/or chemotherapy). The response curves include: DP (disease persistence), VER (very early relapse), ER (early relapse), LR (late relapse), VLR (very late relapse), and NR (no relapse). The value 10⁻² denotes a proportion of leukemic cells and represents the approximate lower limit of detection for VER.
FIG. 13 shows blocking element efficiency as determined by gel electrophoresis. Synthetic oligomers were generated to represent an IGH rearrangement (fusion, F) and the unrearranged IGHJ6 gene (wild type, W). Each template was amplified by PCR (as illustrated in FIG. 1) using inverse PCR primers in the presence or absence (indicated by +/-) of a non-extendable blocking oligomer capable of hybridizing to the W template but not to the F template. The PCR amplification products were then visualized on an agarose gel. Arrows indicate the location of the desired product.
FIG. 14 shows the bioinformatic reconstruction of a breakpoint region detected within the BCL2 locus on chromosome 18 using the methods described herein. Each grey horizontal line represents one sequenced fragment, and coverage is depicted at the top.
Detailed Description
Described herein are novel methods for detecting gene fusions within and across different independent chromosomes.
I. Definitions
Practice of the techniques described herein will employ, unless indicated to the contrary, conventional methods of chemistry, biochemistry, organic chemistry, molecular biology, microbiology, recombinant DNA technology, genetics, immunology and cell biology within the skill of the art, many of which are described below for purposes of illustration. Examples of such techniques are available in the literature. Methods, devices, and materials similar or equivalent to those described herein can be used in the practice of the present invention.
All patents, patent applications, articles and publications mentioned herein, including above and below, are hereby expressly incorporated by reference in their entirety.
Unless defined otherwise herein, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Various scientific dictionaries containing the terms contained herein are well known and available to those of skill in the art. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure, some preferred methods and materials are described. Accordingly, the terms defined immediately below are more fully described by reference to the entire specification. It is to be understood that this disclosure is not limited to the particular methods, protocols, and reagents described, as these may vary depending on the context in which they are used by those skilled in the art. The following definitions are provided to facilitate understanding of certain terms used frequently herein and are not meant to limit the scope of the present disclosure.
As used herein, the singular terms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Reference throughout this specification to "one embodiment," "an embodiment," "another embodiment," "a particular embodiment," "a related embodiment," "an embodiment," "other embodiments," or "other embodiments" or combinations thereof means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, the foregoing phrases appearing throughout the specification do not necessarily all refer to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
As used herein, the term "about" means a range of values that includes the specified value, which one of ordinary skill in the art would consider reasonably similar to the specified value. In an embodiment, the term "about" means within the standard deviation of using measurements generally acceptable in the art. In an embodiment, about means extending to a range of +/-10% of the specified value. In an embodiment, about means a specified value.
Throughout this specification, unless the context requires otherwise, the words "comprise", "comprising", and "include" will be understood to imply the inclusion of a stated step or element or group of steps or elements but not the exclusion of any other step or element or group of steps or elements. "consisting of … …" means including and limited to things after the phrase "consisting of … …". Thus, the phrase "consisting of … …" indicates that the listed elements are required or mandatory and that no other elements may be present. "consisting essentially of … …" means any element listed after the phrase is included and is limited to other elements that do not interfere with or affect the activity or effect described in the disclosure with respect to the listed elements. Thus, the phrase "consisting essentially of … …" indicates that the listed elements are required or necessary, but that other elements are optional and may be present or absent depending on whether they affect the activity or effect of the listed elements.
As used herein, the term "control" or "control experiment" is used in accordance with its plain and ordinary meaning and refers to an experiment in which the subjects or reagents of the experiment are treated as in a parallel experiment, except that a procedure, reagent, or variable of the experiment is omitted. In some cases, the control is used as a standard of comparison in evaluating experimental effects.
As used herein, the term "complement" is used in accordance with its plain and ordinary meaning and refers to a nucleotide (e.g., RNA nucleotide or DNA nucleotide) or nucleotide sequence capable of base pairing with a complementary nucleotide or nucleotide sequence. As described herein and well known in the art, the complementary (matching) nucleotide of adenosine is thymidine in DNA, or alternatively, in RNA, the complementary (matching) nucleotide of adenosine is uracil, and the complementary (matching) nucleotide of guanine is cytosine. Thus, the complement may comprise a nucleotide sequence that base pairs with the corresponding complementary nucleotides of a second nucleic acid sequence. The nucleotides of the complement may partially or completely match the nucleotides of the second nucleic acid sequence. When the nucleotides of the complement completely match each nucleotide of the second nucleic acid sequence, the complement forms a base pair with each nucleotide of the second nucleic acid sequence. When the nucleotides of the complement partially match the nucleotides of the second nucleic acid sequence, only some of the nucleotides of the complement form base pairs with nucleotides of the second nucleic acid sequence. Examples of complementary sequences include coding and non-coding sequences, wherein the non-coding sequence contains nucleotides complementary to the coding sequence and thus forms the complement of the coding sequence. A further example of complementary sequences is sense and antisense sequences, wherein the sense sequence contains nucleotides complementary to the antisense sequence and thus forms the complement of the antisense sequence. By "double-stranded" is meant that at least two fully or partially complementary oligonucleotides and/or polynucleotides undergo Watson-Crick type base pairing between all or most of their nucleotides, thereby forming a stable complex.
As described herein, complementarity of sequences may be partial, where only some of the nucleic acids match according to base pairing, or complete, where all of the nucleic acids match according to base pairing. Thus, two sequences that are complementary to each other may have a specified percentage of nucleotides that complement each other (e.g., about 60%, preferably 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more complementarity within a specified region). In embodiments, two sequences are complementary when they are completely complementary, having 100% complementarity. In embodiments, the sequences in a pair of complementary sequences form parts of a single polynucleotide with non-base-pairing nucleotides (e.g., a hairpin structure with or without an overhang) or parts of separate polynucleotides. In embodiments, one or both of a pair of complementary sequences forms part of a longer polynucleotide, which may or may not comprise additional complementary regions.
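By way of non-limiting illustration only, the notions of a complement and of percent complementarity within a specified region can be sketched as follows. The sketch assumes DNA sequences over the four standard bases, both written 5' to 3', of equal length, and aligned antiparallel without gaps; the function names `reverse_complement` and `percent_complementarity` are hypothetical and are introduced here solely for illustration.

```python
# Watson-Crick pairing for DNA (A-T, G-C).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the fully complementary strand of `seq`, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(seq_a, seq_b):
    """Percentage of positions at which `seq_a` base pairs with `seq_b`.

    Both sequences are written 5' to 3'; `seq_b` is reversed so that the
    two strands are compared in antiparallel orientation, without gaps.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    antiparallel_b = seq_b.upper()[::-1]
    pairs = sum(
        COMPLEMENT.get(a) == b for a, b in zip(seq_a.upper(), antiparallel_b)
    )
    return 100.0 * pairs / len(seq_a)


if __name__ == "__main__":
    print(reverse_complement("ATGC"))               # complete complement
    print(percent_complementarity("ATGC", "GCAT"))  # complete complementarity
    print(percent_complementarity("ATGC", "GCAA"))  # partial complementarity
```

In this sketch, complete complementarity corresponds to a returned value of 100, and any lower value corresponds to partial complementarity over the compared region.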
As used herein, the term "contacting" is used in accordance with its plain and ordinary meaning and refers to a process of bringing at least two different species (e.g., chemical compounds, including biomolecules, or cells) into close enough proximity to react, interact, or physically touch. However, the resulting reaction product may be produced directly from the reaction between the added reagents or from intermediates from one or more of the added reagents that may be produced in the reaction mixture. The term "contacting" may comprise allowing two species, which may be compounds, nucleic acids, proteins, or enzymes (e.g., DNA polymerase), to react, interact, or physically touch.
As used herein, the term "nucleic acid" is used in accordance with its plain and ordinary meaning and refers to nucleotides (e.g., deoxyribonucleotides or ribonucleotides) and polymers thereof or complements thereof in single-stranded, double-stranded, or multi-stranded form. The terms "polynucleotide," "oligonucleotide," "oligomer," and the like refer, in the usual and customary sense, to a sequence of nucleotides. The term "nucleotide" refers, in the usual and customary sense, to a single unit of a polynucleotide, i.e., a monomer. The nucleotide may be a ribonucleotide, a deoxyribonucleotide, or a modified form thereof. Examples of polynucleotides contemplated herein include single- and double-stranded DNA, single- and double-stranded RNA having a linear or circular framework, and hybrid molecules having mixtures of single- and double-stranded DNA and RNA. Non-limiting examples of polynucleotides include genes, gene fragments, exons, introns, intergenic DNA (including but not limited to heterochromatic DNA), messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of a sequence, isolated RNA of a sequence, nucleic acid probes, and primers. Polynucleotides useful in the methods of the present disclosure may comprise natural nucleic acid sequences and variants thereof, artificial nucleic acid sequences, or combinations of such sequences. A "nucleoside" is similar in structure to a nucleotide, but lacks a phosphate moiety. An example of a nucleoside analog is one in which a label is attached to the base and no phosphate group is attached to the sugar molecule. As used herein, the terms "nucleic acid oligomer" and "oligonucleotide" are used interchangeably and are intended to include, but are not limited to, nucleic acids 200 nucleotides or less in length.
In some embodiments, the oligonucleotide is a nucleic acid of 2 to 200 nucleotides, 2 to 150 nucleotides, 5 to 150 nucleotides, or 5 to 100 nucleotides in length.
As used herein, the term "primer" is defined as one or more nucleic acid fragments that can specifically hybridize to a nucleic acid template, be bound by a polymerase, and be extended during template-directed nucleic acid synthesis. The primer may be of any length, depending on the particular technique for which it is to be used. For example, PCR primers are typically between 10 and 40 nucleotides in length. In some embodiments, the primer is 200 nucleotides or less in length. In certain embodiments, the primer is 10 to 150 nucleotides, 15 to 150 nucleotides, 5 to 100 nucleotides, 5 to 50 nucleotides, or 10 to 50 nucleotides in length. The length and complexity of the nucleic acid immobilized on the nucleic acid template is not critical. The skilled artisan can adjust these factors to provide optimal hybridization and signal generation for a given hybridization procedure and to provide a desired resolution between different genes or genomic locations. Primers allow the addition of nucleotide residues thereto, or the synthesis of oligonucleotides or polynucleotides therefrom, under suitable conditions known in the art. In one embodiment, the primer is a DNA primer, i.e., a primer consisting of, or consisting essentially of, deoxyribonucleotide residues. The primer is designed to have a sequence complementary to the region of the template/target DNA to which the primer hybridizes. The addition of a nucleotide residue to the 3' end of the primer by formation of a phosphodiester bond results in a DNA extension product. The addition of a nucleotide residue to the 3' end of the DNA extension product by formation of a phosphodiester bond results in a further DNA extension product. In another embodiment, the primer is an RNA primer. In embodiments, the primer is hybridized to the target polynucleotide.
A "primer" is complementary to a polynucleotide template and complexes with the template by hydrogen bonding or hybridization to give a primer/template complex for priming synthesis by a polymerase; the primer is extended during DNA synthesis by the addition of covalently bonded bases, complementary to the template, at its 3' end.
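By way of non-limiting illustration only, template-directed extension of a hybridized primer at its 3' end can be sketched as follows. The sketch assumes a DNA template over the four standard bases, a primer that is fully complementary to its binding site, and extension that proceeds to the 5' end of the template; the function names `reverse_complement` and `extend_primer` are hypothetical, and polymerase kinetics, mismatches, and partial hybridization are not modeled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand of `seq`, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def extend_primer(template, primer):
    """Return the extension product of `primer` hybridized to `template`.

    Both arguments are written 5' to 3'. The primer anneals where the
    template contains the primer's reverse complement, and nucleotides
    complementary to the template are then added to the primer's 3' end
    until the 5' end of the template is reached.
    """
    template, primer = template.upper(), primer.upper()
    site_start = template.find(reverse_complement(primer))
    if site_start == -1:
        raise ValueError("primer does not hybridize to the template")
    # Template bases 5' of the binding site are copied during extension.
    upstream = template[:site_start]
    return primer + reverse_complement(upstream)


if __name__ == "__main__":
    template = "TTGACCATGC"          # illustrative template, 5' to 3'
    primer = "GCAT"                  # anneals to the 3' end of the template
    print(extend_primer(template, primer))
```

Because the illustrative primer anneals at the 3' end of the template, the extension product in this example is the complement of the entire template.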
As used herein, the terms "solid support," "substrate," and "solid surface" refer to a discrete solid or semi-solid surface to which a plurality of primers may be attached. The solid support may encompass any type of solid, porous, or hollow sphere, cylinder, or other similar configuration composed of a plastic, ceramic, metal, or polymeric material (e.g., hydrogel) to which a nucleic acid may be immobilized (e.g., covalently or non-covalently). The solid support may include discrete particles, which may be spherical (e.g., microspheres) or have non-spherical or irregular shapes, such as cubic, rectangular, pyramidal, cylindrical, conical, elliptical, disc-shaped, etc. Solid supports in the form of discrete particles may be referred to herein as "beads," a term which alone does not imply or require any particular shape. The shape of the beads may be non-spherical. The solid support may further comprise a polymer or hydrogel on the surface to which the primer is attached (e.g., a splint primer is covalently attached to the polymer, wherein the polymer is in direct contact with the solid support). Exemplary solid supports include, but are not limited to, glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon™, cyclic olefin copolymers, polyimides, etc.), nylon, ceramics, resins, Zeonor, silica or silica-based materials (including silicon and modified silicon), carbon, metals, inorganic glasses, fiber bundles, photopatterned dry film resists, UV-cured adhesives, and polymers. The solid support of some embodiments has at least one surface positioned within a flow cell. The solid support or a region thereof may be substantially planar. The solid support may have surface features such as wells, pits, channels, ridges, raised areas, posts, columns, and the like.
The term solid support encompasses substrates (e.g., flow cells) having a surface comprising a polymeric coating covalently attached thereto. In embodiments, the solid support is a flow cell. The term "flow cell" as used herein refers to a chamber containing a solid surface across which one or more fluidic reagents can flow. Examples of flow cells and related fluidic systems and detection platforms that can be readily used in the methods of the present disclosure are described, for example, in Bentley et al., Nature 456:53-59 (2008).
In some embodiments, the nucleic acid comprises a capture nucleic acid. A capture nucleic acid refers to a nucleic acid that is attached to a substrate (e.g., covalently attached). In some embodiments, the capture nucleic acid comprises a primer. In some embodiments, the capture nucleic acid is a nucleic acid configured to specifically hybridize to a portion of one or more nucleic acid templates (e.g., templates of a library). In some embodiments, a capture nucleic acid configured to specifically hybridize to a portion of one or more nucleic acid templates is substantially complementary to an appropriate portion of the nucleic acid templates or amplicons thereof. In some embodiments, the capture nucleic acid is configured to specifically hybridize to a portion of an adapter or a portion thereof. In some embodiments, the capture nucleic acid or a portion thereof is substantially complementary to a portion of the adapter or complement thereof. In some embodiments, the capture nucleic acid is a probe oligonucleotide. Typically, the probe oligonucleotide is complementary to the target polynucleotide or a portion thereof, and further comprises a label (e.g., a binding moiety) or is attached to the surface such that hybridization to the probe oligonucleotide allows selective separation of unbound polynucleotides from probe-bound polynucleotides in the population. The probe oligonucleotide may or may not be used as a primer.
Nucleic acids, including, for example, nucleic acids having phosphorothioate backbones, may comprise one or more reactive moieties. As used herein, the term reactive moiety comprises any group capable of reacting with another molecule, e.g., a nucleic acid or polypeptide, through covalent, non-covalent, or other interactions. For example, a nucleic acid may comprise an amino acid reactive moiety that reacts with an amino acid on a protein or polypeptide by covalent, non-covalent, or other interactions.
Polynucleotides are typically composed of a specific sequence of four nucleotide bases: adenine (A); cytosine (C); guanine (G); and thymine (T) (with uracil (U) in place of thymine (T) when the polynucleotide is RNA). Thus, the term "polynucleotide sequence" is the alphabetical representation of a polynucleotide molecule; alternatively, the term may be applied to the polynucleotide molecule itself. This alphabetical representation can be entered into a database in a computer having a central processing unit and used for bioinformatic applications such as functional genomics and homology searching. The polynucleotide may optionally comprise one or more non-standard nucleotides, nucleotide analogs, and/or modified nucleotides.
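By way of non-limiting illustration only, the alphabetical representation of a polynucleotide, and the substitution of uracil (U) for thymine (T) when the polynucleotide is RNA, can be sketched as follows. The sketch assumes sequences restricted to the four standard bases, so non-standard or modified nucleotides are rejected rather than represented; the function names `validate_dna` and `dna_to_rna` are hypothetical.

```python
DNA_ALPHABET = frozenset("ACGT")


def validate_dna(seq):
    """Return `seq` in upper case, or raise if it contains a non-standard letter."""
    seq = seq.upper()
    unexpected = set(seq) - DNA_ALPHABET
    if unexpected:
        raise ValueError(f"non-standard letters: {sorted(unexpected)}")
    return seq


def dna_to_rna(seq):
    """Rewrite a DNA letter string as the corresponding RNA letter string."""
    return validate_dna(seq).replace("T", "U")


if __name__ == "__main__":
    print(dna_to_rna("ATGCTT"))
```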
As used herein, the term "template nucleic acid" refers to any polynucleotide molecule that can be bound by a polymerase and used as a template for nucleic acid synthesis. The template nucleic acid may be a target nucleic acid. In general, the term "target nucleic acid" refers to a nucleic acid molecule or polynucleotide in a starting population of nucleic acid molecules having a target sequence whose presence, amount, and/or nucleotide sequence or changes in one or more of them need to be determined. In general, the term "target sequence" refers to a nucleic acid sequence on a single strand of nucleic acid. The target sequence may be a portion of a gene, a regulatory sequence, genomic DNA, cDNA, RNA comprising mRNA, miRNA, rRNA, or the like. The target sequence may be a target sequence from a sample or a secondary target, such as the product of an amplification reaction. The target nucleic acid need not be any single molecule or sequence. For example, depending on the reaction conditions, the target nucleic acid may be any of a variety of target nucleic acids in a reaction, or all nucleic acids in a given reaction. For example, in a nucleic acid amplification reaction with random primers, all polynucleotides in the reaction may be amplified. As a further example, a pool of targets may be determined simultaneously in a single reaction using polynucleotide primers directed to multiple targets. As yet another example, all or a subset of the polynucleotides in a sample may be modified by the addition of primer binding sequences (e.g., by ligating adaptors containing primer binding sequences) such that each modified polynucleotide becomes a target nucleic acid in a reaction with a corresponding primer polynucleotide. In the context of selective sequencing, a "target nucleic acid" refers to a subset of nucleic acids sequenced from within an initial population of nucleic acids.
The term "polynucleotide fusion" is used in accordance with its plain and ordinary meaning and refers to a polynucleotide formed by the joining of two regions of a reference sequence (e.g., a reference genome) that are not so joined in the reference sequence, thereby forming a fusion junction between the two regions that is not present in the reference sequence. Polynucleotide fusions can be formed through a number of processes, including inter-chromosomal translocations, intra-chromosomal translocations, and other chromosomal rearrangements (e.g., inversions and duplications). A polynucleotide fusion may involve fusion between two gene sequences, which is referred to as a "gene fusion" and results in a "fusion gene." In some cases, the fusion gene is expressed as a fusion transcript (e.g., a fusion mRNA transcript) comprising the sequences of both genes or portions thereof.
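By way of non-limiting illustration only, a fusion junction can be represented as the position at which two regions of a reference sequence are joined, and a sequenced fragment can be tested for whether it spans that position. The function names `make_fusion` and `read_spans_junction`, the zero-based coordinates, and the `min_flank` parameter are hypothetical and are introduced here solely for illustration; the sketch is not the detection method of the present disclosure.

```python
def make_fusion(left_region, right_region):
    """Join two reference regions and report the fusion junction.

    Returns the fused sequence and the zero-based index of the first base
    contributed by `right_region`, which marks the fusion junction.
    """
    return left_region + right_region, len(left_region)


def read_spans_junction(read_start, read_length, junction, min_flank=1):
    """Return True if a read covers the junction with flanking bases.

    The read occupies fused-sequence positions `read_start` (inclusive) to
    `read_start + read_length` (exclusive). At least `min_flank` bases are
    required on each side of the junction.
    """
    read_end = read_start + read_length
    return read_start <= junction - min_flank and read_end >= junction + min_flank


if __name__ == "__main__":
    fused, junction = make_fusion("AAAC", "GTTT")   # illustrative sequences
    print(fused, junction)
    print(read_spans_junction(2, 4, junction))      # covers both sides
    print(read_spans_junction(4, 4, junction))      # right-hand region only
```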
"fusion gene" is used in accordance with its ordinary meaning in the art and refers to a hybrid gene or portion thereof formed from two previously independent genes or portions thereof (e.g., in a cell). A "fusion junction" is a point in the sequence of a fusion gene between two previously independent genes or portions thereof. Hybrid genes may be caused by translocation of the gene or gene portion, interstitial deletions and/or chromosomal inversion. An "exon junction" is a point or position in a fusion gene sequence between two previously independent exon sequences or portions thereof.
The nucleic acid may be amplified by a suitable method. The term "amplification" as used herein refers to a process of linearly or exponentially generating amplicon nucleic acids having the same or substantially the same (e.g., substantially identical) nucleotide sequence as a target nucleic acid or a segment thereof in a sample, and/or a complement thereof. In some embodiments, the amplification reaction comprises a suitable thermostable polymerase. Thermostable polymerases are known in the art and are stable for extended periods of time at temperatures above 80 °C compared to common polymerases found in most mammals. In certain embodiments, the term "amplification" refers to a method comprising the polymerase chain reaction (PCR). The conditions conducive to amplification (i.e., amplification conditions) typically comprise at least a suitable polymerase, a suitable template, a suitable primer or set of primers, suitable nucleotides (e.g., dNTPs), a suitable buffer, and suitable annealing, hybridization, and/or extension times and temperatures. In certain embodiments, the amplification product (e.g., amplicon) may contain one or more additional and/or different nucleotides than the template sequence, or portion thereof, from which the amplicon was generated (e.g., the primer may contain "additional" nucleotides, such as a 5' portion that does not hybridize to the template, or one or more mismatched bases within the hybridizing portion of the primer).
As used herein, "differential amplification" refers to the degree of amplification of a gene of interest being greater than the degree of amplification of a reference gene, thereby resulting in a greater amount of amplified product from the gene of interest relative to the amount of amplified product from the reference gene. In embodiments, the gene of interest comprises a polynucleotide sequence that includes a fusion gene, and the reference gene comprises a polynucleotide that does not include a fusion gene.
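By way of non-limiting illustration only, differential amplification can be expressed numerically by comparing the degree (fold) of amplification of a gene of interest with that of a reference gene. The function names and the copy numbers in the usage example are hypothetical and are not measurements from the present disclosure.

```python
def fold_amplification(copies_before, copies_after):
    """Degree of amplification: amplified copies per starting copy."""
    if copies_before <= 0:
        raise ValueError("starting copy number must be positive")
    return copies_after / copies_before


def is_differentially_amplified(target_fold, reference_fold):
    """True when the gene of interest is amplified more than the reference gene."""
    return target_fold > reference_fold


def relative_enrichment(target_fold, reference_fold):
    """Ratio of the two degrees of amplification (greater than 1 means enrichment)."""
    return target_fold / reference_fold


if __name__ == "__main__":
    # Hypothetical copy numbers chosen only to illustrate the arithmetic.
    target = fold_amplification(100, 102400)     # gene of interest
    reference = fold_amplification(100, 3200)    # reference gene
    print(target, reference)
    print(is_differentially_amplified(target, reference))
    print(relative_enrichment(target, reference))
```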
As used herein, the term "rolling circle amplification (RCA)" refers to a nucleic acid amplification reaction that amplifies a circular nucleic acid template (e.g., a single-stranded DNA circle) by a rolling circle mechanism. A rolling circle amplification reaction is initiated by hybridization of a primer to a circular (usually single-stranded) nucleic acid template. The nucleic acid polymerase then extends the primer hybridized to the circular nucleic acid template by continuing around the circular nucleic acid template, replicating the sequence of the nucleic acid template over and over again (rolling circle mechanism). Rolling circle amplification generally produces concatemers comprising tandem repeat units of the circular nucleic acid template sequence. Rolling circle amplification may be linear RCA (LRCA), which exhibits linear amplification kinetics (e.g., RCA using a single specific primer), or may be exponential RCA (ERCA), which exhibits exponential amplification kinetics. Rolling circle amplification can also be performed using multiple primers (multiply primed rolling circle amplification, or MPRCA) to generate hyperbranched concatemers. For example, in a double-primed RCA, one primer may be complementary to the circular nucleic acid template, as in linear RCA, while the other may be complementary to the tandem repeat unit nucleic acid sequence of the RCA product. Thus, a double-primed RCA can proceed as a chain reaction with exponential (geometric) amplification kinetics, characterized by a branching cascade of multiple hybridization, primer extension, and strand displacement events involving both primers. This typically produces a discrete set of concatemeric double-stranded nucleic acid amplification products. Rolling circle amplification can be performed in vitro under isothermal conditions using a suitable nucleic acid polymerase, such as Phi29 DNA polymerase.
RCA may be performed by using any DNA polymerase known in the art (e.g., Phi29 DNA polymerase, Bst DNA polymerase, or SD polymerase).
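By way of non-limiting illustration only, the concatemer produced by linear rolling circle amplification can be sketched as tandem repeats of the complement of the circular template. The sketch assumes a single-stranded DNA circle written 5' to 3' as a linear string that begins immediately after the primer binding site, so that the product begins with the primer sequence; the function names and the `laps` parameter are hypothetical, and strand displacement, branching, and polymerase kinetics are not modeled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand of `seq`, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def rolling_circle_product(circle, laps):
    """Return the concatemer formed after `laps` full turns around `circle`.

    Each full turn of the polymerase around the circular template adds one
    repeat unit that is the reverse complement of the circle sequence.
    """
    if laps < 1:
        raise ValueError("at least one full turn is required")
    return reverse_complement(circle) * laps


if __name__ == "__main__":
    product = rolling_circle_product("ATGC", 3)   # illustrative 4-base circle
    print(product)
    print(len(product) // len("ATGC"))            # number of tandem repeat units
```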
The nucleic acid may be amplified by a thermal cycling method or an isothermal amplification method. In some embodiments, rolling circle amplification methods are used. In some embodiments, the amplification occurs on a solid support (e.g., within a flow cell) to which the nucleic acid, nucleic acid library, or portion thereof is immobilized. In some sequencing methods, a nucleic acid library is added to a flow cell and immobilized by hybridization to anchors under suitable conditions. This type of nucleic acid amplification is commonly referred to as solid phase amplification. In some embodiments of solid phase amplification, all or part of the amplified product is synthesized by extension initiated from an immobilized primer. A solid phase amplification reaction is similar to standard solution phase amplification except that at least one of the amplification oligonucleotides (e.g., primers) is immobilized on a solid support.
In some embodiments, the solid phase amplification comprises a nucleic acid amplification reaction comprising only one species of oligonucleotide primer immobilized to a surface or substrate. In certain embodiments, the solid phase amplification comprises a plurality of different immobilized oligonucleotide primer species. In some embodiments, the solid phase amplification may comprise a nucleic acid amplification reaction comprising one species of oligonucleotide primer immobilized on a solid surface and a second, different oligonucleotide primer species in solution. Multiple different species of immobilized or solution-based primers may be used. Non-limiting examples of solid phase nucleic acid amplification reactions include interfacial amplification, bridge PCR amplification, emulsion PCR, WildFire amplification (e.g., U.S. Patent Publication No. US 20130012399), and the like, or combinations thereof.
In embodiments, the target nucleic acid is a cell-free nucleic acid. In general, the terms "cell-free," "circulating," and "extracellular" as applied to nucleic acids (e.g., "cell-free DNA" (cfDNA) and "cell-free RNA" (cfRNA)) are used interchangeably to refer to nucleic acids present in a sample from a subject or portion thereof that can be isolated or otherwise manipulated without applying a lysis step to the sample as originally collected (e.g., as in extraction from cells or viruses). Thus, even before a sample is collected from the subject, cell-free nucleic acids are unencapsulated or "free" from the cells or viruses from which they originate. Cell-free nucleic acids can be produced as a byproduct of cell death (e.g., apoptosis or necrosis) or cell shedding, which releases the nucleic acids into the surrounding body fluid or circulation. Thus, cell-free nucleic acids may be isolated from non-cellular fractions of blood (e.g., serum or plasma), from other bodily fluids (e.g., urine), or from non-cellular fractions of other types of samples.
As used herein, the term "analog" when referring to a chemical compound refers to a compound that has a structure similar to that of another chemical compound but differs from it in one or more atoms, functional groups, or substructures, which are replaced by one or more other atoms, functional groups, or substructures. In the context of nucleotides, "nucleotide analog" and "modified nucleotide" refer to a compound that, like the nucleotide of which it is an analog, can be incorporated into a nucleic acid molecule (e.g., an extension product) by a suitable polymerase, e.g., a DNA polymerase in the context of a nucleotide analog. The term also encompasses nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, or non-naturally occurring, which have binding properties similar to those of the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides. Examples of such analogs include, but are not limited to, phosphodiester derivatives including, for example, phosphoramidate, phosphorodiamidate, phosphorothioate (also known as phosphothioate, having a double-bonded sulfur replacing oxygen in the phosphate), phosphorodithioate, phosphonocarboxylic acids, phosphonocarboxylates, phosphonoacetic acid, phosphonoformic acid, methyl phosphonate, boron phosphonate, or O-methylphosphoroamidite linkages (see, e.g., Eckstein, OLIGONUCLEOTIDES AND ANALOGUES: A PRACTICAL APPROACH, Oxford University Press), as well as modifications to the nucleotide bases, such as in 5-methylcytidine or pseudouridine; and peptide nucleic acid backbones and linkages.
Other analog nucleic acids include those with positive backbones; non-ionic backbones, modified sugars, and non-ribose backbones (e.g., phosphorodiamidate morpholino oligonucleotides or locked nucleic acids (LNA)), including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7 of ACS Symposium Series 580, CARBOHYDRATE MODIFICATIONS IN ANTISENSE RESEARCH, Sanghvi and Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be made for a variety of reasons, for example, to increase the stability and half-life of such molecules in physiological environments, or for use as probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be prepared; alternatively, mixtures of different nucleic acid analogs can be prepared. In embodiments, the internucleotide linkages in DNA are phosphodiester, phosphodiester derivatives, or a combination of both.
As used herein, "natural" nucleotide is used in accordance with its plain and ordinary meaning and refers to a naturally occurring nucleotide that does not contain an exogenous label (e.g., a fluorescent dye or other label) or a chemical modification (e.g., a reversible terminating moiety) such as may characterize a nucleotide analog. Examples of natural nucleotides that can be used to perform the procedures described herein include: dATP (2'-deoxyadenosine-5'-triphosphate); dGTP (2'-deoxyguanosine-5'-triphosphate); dCTP (2'-deoxycytidine-5'-triphosphate); dTTP (2'-deoxythymidine-5'-triphosphate); and dUTP (2'-deoxyuridine-5'-triphosphate).
As used herein, the term "modified nucleotide" refers to a nucleotide that is modified in some way. Typically, a nucleotide contains a single 5-carbon sugar moiety, a single nitrogenous base moiety, and 1 to 3 phosphate moieties. In embodiments, the nucleotide may comprise a blocking moiety (alternatively referred to herein as a reversible terminator moiety) and/or a labeling moiety. The blocking moiety on a nucleotide prevents a covalent bond from forming between the 3' hydroxyl moiety of the nucleotide and the 5' phosphate of another nucleotide. The blocking moiety on a nucleotide may be reversible, whereby the blocking moiety may be removed or modified to allow the 3' hydroxyl group to form a covalent bond with the 5' phosphate of another nucleotide. The blocking moiety may be effectively irreversible under the particular conditions used in the methods set forth herein. In embodiments, the blocking moiety is attached to the 3' oxygen of the nucleotide and is independently -NH₂, -CN, -CH₃, C₂-C₆ allyl (e.g., -CH₂-CH=CH₂), methoxyalkyl (e.g., -CH₂-O-CH₃), or -CH₂N₃. In embodiments, the blocking moiety is attached to the 3' oxygen of the nucleotide and is independently [chemical structures not reproduced in the source text]. The labeling moiety of a nucleotide may be any moiety that allows the nucleotide to be detected, for example, using spectroscopic methods. Exemplary labeling moieties are fluorescent labels, mass labels, chemiluminescent labels, electrochemical labels, detectable labels, and the like. One or more of the above moieties may be absent from a nucleotide used in the methods and compositions set forth herein. For example, the nucleotide may lack a labeling moiety or a blocking moiety, or both.
Examples of nucleotide analogs include, but are not limited to, 7-deaza-adenine, 7-deaza-guanine, analogs of the deoxynucleotides shown herein, analogs in which a label is attached through a cleavable linker to the 5-position of cytosine or thymine or to the 7-position of deaza-adenine or deaza-guanine, and analogs in which a small chemical moiety is used to cap the -OH group at the 3' position of deoxyribose. Nucleotide analogs and DNA polymerase-based DNA sequencing are also described in U.S. Patent No. 6,664,079, which is incorporated herein by reference in its entirety for all purposes.
In embodiments, the nucleotides of the present disclosure use cleavable linkers to attach the label to the nucleotide. The use of a cleavable linker ensures that the label can be removed (if desired) after detection, avoiding any interfering signal with any labeled nucleotide incorporated subsequently. The use of the term "cleavable linker" is not intended to imply that the entire linker needs to be removed from the nucleotide base. The cleavage site may be located at a position on the linker that ensures that a portion of the linker remains attached to the nucleotide base after cleavage. The linker can be attached at any position on the nucleotide base as long as Watson-Crick base pairing is still possible. In the context of purine bases, it is preferred that the linker is attached via the 7-position of the purine or of the preferred deazapurine analog, via an 8-modified purine, or via an N-6 modified adenosine or an N-2 modified guanine. For pyrimidines, attachment is preferably via the 5-position on cytidine, thymidine, or uracil and the N-4 position on cytosine. The term "cleavable linker" or "cleavable moiety" as used herein refers to a divalent or monovalent moiety, respectively, that is capable of being separated (e.g., detached, split, disconnected, or hydrolyzed, or having a stable bond within the moiety broken) into distinct entities. A cleavable linker is cleavable (e.g., specifically cleavable) in response to an external stimulus (e.g., an enzyme, a nucleophilic/basic reagent, a reducing agent, light irradiation, an electrophilic/acidic reagent, an organometallic or metal reagent, or an oxidizing agent). A chemically cleavable linker refers to a linker that is capable of being split in response to the presence of a chemical (e.g., an acid, a base, an oxidizing agent, a reducing agent, Pd(0), tris-(2-carboxyethyl)phosphine, dilute nitrous acid, fluoride, tris(3-hydroxypropyl)phosphine, sodium dithionite (Na₂S₂O₄), or hydrazine (N₂H₄)). A chemically cleavable linker is cleaved non-enzymatically.
In an embodiment, the cleavable linker is cleaved by contacting the cleavable linker with a cleaving agent. In an embodiment, the cleaving agent is a phosphine-containing compound (e.g., TCEP or THPP), sodium dithionite (Na2S2O4), a weak acid, hydrazine (N2H4), Pd(0), or light irradiation (e.g., ultraviolet radiation). In an embodiment, cleaving includes removing. In the context of polynucleotides, a "cleavable site" or "scissile linkage" is a site that allows for controlled cleavage of a polynucleotide strand (e.g., a linker, primer, or polynucleotide) by chemical, enzymatic, or photochemical means known in the art and described herein. A scissile site may refer to the linkage of a nucleotide between two other nucleotides in a nucleotide strand (i.e., an internucleoside linkage). In embodiments, the scissile linkage may be located at any position within one or more nucleic acid molecules, including at or near a terminus (e.g., the 3' end of an oligonucleotide), or in an internal position of the one or more nucleic acid molecules. In embodiments, the conditions suitable for separating the scissile linkage comprise adjusting pH and/or temperature. In embodiments, the scissile site may comprise at least one acid-labile linkage. For example, the acid-labile linkage may comprise a phosphoramidate linkage. In embodiments, the phosphoramidate linkage can be hydrolyzed under acidic conditions, including mildly acidic conditions such as trifluoroacetic acid and a suitable temperature (e.g., 30 °C), or other conditions known in the art, such as those described in Matthias Mag et al., Tetrahedron Letters, Volume 33, Issue 48, 1992, 7319-7322. In an embodiment, the scissile site may comprise at least one photolabile internucleoside linkage (e.g., an o-nitrobenzyl linkage, as described in Walker et al., J. Am. Chem. Soc. 1988, 110, 21, 7170-7177), such as an o-nitrobenzyloxymethyl or p-nitrobenzyloxymethyl group. In embodiments, the scissile site comprises at least one uracil nucleobase. 
In embodiments, uracil nucleobases can be cleaved with uracil DNA glycosylase (UDG) or formamidopyrimidine DNA glycosylase (Fpg). In embodiments, the scissile linkage site comprises a sequence-specific nicking site having a nucleotide sequence that is recognized and nicked by a nicking endonuclease or uracil DNA glycosylase.
As used herein, the term "removable" group, such as a label, a blocking group, or a protecting group, is used in accordance with its plain and ordinary meaning and refers to a chemical group that can be removed from a nucleotide analog such that a DNA polymerase can extend a nucleic acid (e.g., a primer or extension product) by incorporating at least one additional nucleotide. Removal may be by any suitable method, including enzymatic, chemical, or photolytic cleavage. Removal of a removable group, such as a blocking group, need not remove the entire removable group, but need only remove a sufficient portion thereof so that the DNA polymerase can extend the nucleic acid by incorporating at least one additional nucleotide or nucleotide analog.
As used herein, the terms "blocking moiety", "reversible blocking group", "reversible terminator" and "reversible terminator moiety" are used in accordance with their plain and ordinary meanings and refer to a cleavable moiety that does not interfere with incorporation of the nucleotide comprising it by a polymerase (e.g., a DNA polymerase or modified DNA polymerase), but prevents further strand extension until removed ("unblocked"). For example, a reversible terminator may refer to a blocking moiety located, for example, at the 3' position of a nucleotide, and may be a chemically cleavable moiety, such as allyl, azidomethyl, or methoxymethyl, or may be an enzymatically cleavable group, such as a phosphate. Suitable nucleotide blocking moieties are described in WO 2004/018497, U.S. Pat. Nos. 7,057,026 and 7,541,444, WO 96/07669, and U.S. Pat. Nos. 5,763,594, 5,808,045, 5,872,244, and 6,232,465, the contents of which are incorporated herein by reference in their entirety. Nucleotides may be labeled or unlabeled. The nucleotide may be modified with a reversible terminator useful in the methods provided herein, and may be a 3'-O-blocked reversible terminator or a 3'-unblocked reversible terminator. In nucleotides with 3'-O-blocked reversible terminators, the blocking group may be represented as -OR [reversible terminating (capping) group], where O is the oxygen atom of the 3'-OH of the pentose and R is the blocking group, while the label is attached to the base, which acts as a reporter and can be cleaved. 3'-O-blocked reversible terminators are known in the art and may be, for example, a 3'-ONH2 reversible terminator, a 3'-O-allyl reversible terminator, or a 3'-O-azidomethyl reversible terminator. 
In an embodiment, the reversible terminator moiety is [chemical structure not reproduced]. As described herein, the term "allyl" refers to an unsubstituted methylene group attached to a vinyl group (i.e., -CH=CH2), having the formula [chemical structure not reproduced]. In an embodiment, the reversible terminator moiety is as described in US 10,738,072 (which is incorporated herein by reference for all purposes). For example, a nucleotide comprising a reversible terminator moiety can be represented by the formula: [chemical structure not reproduced]
wherein the nucleobase is adenine or an adenine analog, thymine or a thymine analog, guanine or a guanine analog or cytosine or a cytosine analog.
As used herein, the term "label" is used in accordance with its plain and ordinary meaning and refers to a molecule that can directly or indirectly produce or result in a detectable signal, either by itself or upon interaction with another molecule. Non-limiting examples of detectable labels include fluorescent dyes, biotin, digoxin, haptens, and epitopes. In general, a dye is a molecule, compound, or substance that can provide an optically detectable signal, such as a colorimetric, luminescent, bioluminescent, chemiluminescent, phosphorescent, or fluorescent signal. In an embodiment, the label is a dye. In an embodiment, the dye is a fluorescent dye. Non-limiting examples of dyes, some of which are commercially available, include CF dyes (Biotium, Inc.), Alexa Fluor dyes (Thermo Fisher), DyLight dyes (Thermo Fisher), Cy dyes (GE Healthscience), IRDyes (Li-Cor Biosciences, Inc.), and HiLyte dyes (Anaspec, Inc.). In embodiments, a particular nucleotide type is associated with a particular label, such that identifying the label identifies the nucleotide with which it is associated. In embodiments, the label is luciferin, which reacts with luciferase to produce a detectable signal in response to one or more bases being incorporated into an elongated complementary strand, as in pyrosequencing. In embodiments, the nucleotide comprises a label (e.g., a dye). In an embodiment, the label is not associated with any particular nucleotide, but detection of the label identifies whether one or more nucleotides of known identity were added during an extension step (as in the case of pyrosequencing).
In an embodiment, the detectable label is a fluorescent dye. In an embodiment, the detectable label is a fluorescent dye (e.g., a Fluorescence Resonance Energy Transfer (FRET) chromophore) capable of exchanging energy with another fluorescent dye.
In an embodiment, the detectable moiety is a moiety of a derivative of one of the detectable moieties described immediately above, wherein the derivative differs from that detectable moiety by a modification resulting from conjugation of the detectable moiety to a compound described herein.
The term "cyanine" or "cyanine moiety" as described herein refers to a detectable moiety containing two nitrogen groups separated by a polymethine chain. In an embodiment, the cyanine moiety has 3 methine structures (i.e., cyanine 3 or Cy3). In an embodiment, the cyanine moiety has 5 methine structures (i.e., cyanine 5 or Cy5). In an embodiment, the cyanine moiety has 7 methine structures (i.e., cyanine 7 or Cy7).
As used herein, the terms "DNA polymerase" and "nucleic acid polymerase" are used in accordance with their ordinary and customary meanings and refer to enzymes capable of synthesizing nucleic acid molecules from nucleotides (e.g., deoxyribonucleotides). Typically, a DNA polymerase adds nucleotides to the 3' end of a DNA strand, one nucleotide at a time. In embodiments, the DNA polymerase is a Pol I DNA polymerase, Pol II DNA polymerase, Pol III DNA polymerase, Pol IV DNA polymerase, Pol V DNA polymerase, Pol β DNA polymerase, Pol μ DNA polymerase, Pol λ DNA polymerase, Pol σ DNA polymerase, Pol α DNA polymerase, Pol δ DNA polymerase, Pol ε DNA polymerase, Pol η DNA polymerase, Pol ι DNA polymerase, Pol κ DNA polymerase, Pol ζ DNA polymerase, Pol γ DNA polymerase, Pol θ DNA polymerase, Pol ν DNA polymerase, or a thermophilic nucleic acid polymerase (e.g., Therminator γ, 9°N polymerase (exo-), Therminator II, Therminator III, or Therminator IX). In embodiments, the DNA polymerase is a modified archaeal DNA polymerase. In embodiments, the polymerase is a reverse transcriptase. In an embodiment, the polymerase is a mutant P. abyssi polymerase (e.g., a mutant P. abyssi polymerase as described in WO 2018/148723 or WO 2020/056044).
As used herein, the term "exonuclease activity" is used according to its ordinary meaning in the art and refers to the removal of nucleotides from a nucleic acid by a DNA polymerase. For example, during polymerization, nucleotides are added to the 3' end of the primer strand. Occasionally, a DNA polymerase incorporates an incorrect nucleotide at the 3'-OH end of the primer strand, wherein the incorrect nucleotide cannot form hydrogen bonds with the corresponding base in the template strand. Such an incorrectly added nucleotide is removed from the primer by the 3' to 5' exonuclease activity of the DNA polymerase. In embodiments, the exonuclease activity may be referred to as "proofreading". When referring to 3'-5' exonuclease activity, it is understood that the DNA polymerase promotes a hydrolysis reaction that breaks the phosphodiester bond at the 3' end of the polynucleotide strand to excise the nucleotide. In an embodiment, 3'-5' exonuclease activity refers to the sequential removal of nucleotides from single-stranded DNA in the 3' to 5' direction, thereby releasing deoxyribonucleoside 5'-monophosphates one by one. Methods for quantifying exonuclease activity are known in the art; see, for example, Southworth et al., PNAS, Vol. 93, 8281-8285 (1996).
As used herein, the term "incorporating" or "chemically incorporating," when used in reference to a primer and a cognate nucleotide, refers to the process of joining the cognate nucleotide to the primer or extension product thereof by forming a phosphodiester bond.
As used herein, the term "selectivity" or the like of a compound refers to the ability of the compound to distinguish between molecular targets. When used in the context of sequencing, as in "selective sequencing," this term refers to sequencing one or more target polynucleotides from an original starting polynucleotide population, rather than sequencing non-target polynucleotides from the starting population. Typically, selectively sequencing one or more target polynucleotides involves manipulating the target polynucleotides differentially based on known sequences. For example, the target polynucleotide may be hybridized to a probe oligonucleotide, which may be labeled (e.g., with a member of a binding pair) or bound to a surface. In an embodiment, hybridizing the target polynucleotide to the probe oligonucleotide comprises the step of displacing one strand of the double-stranded nucleic acid. The target polynucleotide to which the probe hybridizes may then be separated from the non-hybridized polynucleotides, such as by removing probe-bound polynucleotides from the starting population or by washing away non-probe-bound polynucleotides. The result is a selected subset of the initial population of polynucleotides, which are then sequenced, thereby selectively sequencing the one or more target polynucleotides.
As used herein, the term "specificity" or the like of a compound refers to the ability of a compound to exert a specific effect (e.g., binding) on a particular molecular target with little or no effect on other proteins in the cell.
As used herein, the terms "bind" and "binding" are used in accordance with their plain and ordinary meanings and refer to association between atoms or molecules. The association may be direct or indirect. For example, the bound atoms or molecules may be directly bound to each other, such as by covalent or non-covalent bonds (e.g., electrostatic interactions (e.g., ionic bonds, hydrogen bonds, halogen bonds), van der Waals interactions (e.g., dipole-dipole, dipole-induced dipole, London dispersion), ring stacking (pi effects), hydrophobic interactions, and the like). As a further example, two molecules may bind indirectly to each other by way of direct binding to one or more intermediate molecules, thereby forming a complex.
As used herein, the terms "sequencing," "sequence determination," "determining nucleotide sequence," and the like encompass the determination of partial and complete sequence information, encompass the identification, ordering, or location of nucleotides comprising a polynucleotide being sequenced, and encompass the physical processes used to generate such sequence information. That is, the term encompasses information levels regarding the target polynucleotide, such as sequence comparison, fingerprinting, and the like, as well as the unequivocal identification and ordering of nucleotides in the target polynucleotide. The term also encompasses the identification, ordering and location of one, two or three of the four types of nucleotides within a target polynucleotide. Sequencing methods, as outlined in U.S. Pat. No. 5,302,509, can be performed using the nucleotides described herein. The sequencing method is preferably performed with the target polynucleotide arranged on a solid substrate. The plurality of target polynucleotides may be immobilized to a solid support via a linker molecule, or may be attached to a particle (e.g., microsphere) that may also be attached to a solid substrate. The solid substrate is in the form of a chip, bead, well, capillary, slide, wafer, filter, fiber, porous medium, or column. In embodiments, the solid substrate is gold, quartz, silica, plastic, glass, diamond, silver, metal, or polypropylene. In an embodiment, the solid substrate is porous.
As used herein, the term "sequencing reaction mixture" is used in accordance with its plain and ordinary meaning and refers to an aqueous mixture containing reagents sufficient to allow a DNA polymerase to add a dNTP or dNTP analogue to a DNA strand. In an embodiment, the sequencing reaction mixture comprises a buffer. In embodiments, the buffer comprises an acetate buffer, a 3-(N-morpholino)propanesulfonic acid (MOPS) buffer, an N-(2-acetamido)-2-aminoethanesulfonic acid (ACES) buffer, a phosphate-buffered saline (PBS) buffer, a 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES) buffer, an N-(1,1-dimethyl-2-hydroxyethyl)-3-amino-2-hydroxypropanesulfonic acid (AMPSO) buffer, a borate buffer (e.g., borate buffered saline, sodium borate buffer, boric acid buffer), a 2-amino-2-methyl-1,3-propanediol (AMPD) buffer, an N-cyclohexyl-2-hydroxy-3-aminopropanesulfonic acid (CAPSO) buffer, a 2-amino-2-methyl-1-propanol (AMP) buffer, a 4-(cyclohexylamino)-1-butanesulfonic acid (CABS) buffer, a glycine-NaOH buffer, an N-cyclohexyl-2-aminoethanesulfonic acid (CHES) buffer, a tris(hydroxymethyl)aminomethane (Tris) buffer, or an N-cyclohexyl-3-aminopropanesulfonic acid (CAPS) buffer. In an embodiment, the buffer is a borate buffer. In an embodiment, the buffer is a CHES buffer. In an embodiment, the sequencing reaction mixture comprises nucleotides, wherein the nucleotides comprise a reversible terminating moiety and a label covalently linked to the nucleotide through a cleavable linker. In an embodiment, the sequencing reaction mixture comprises a buffer, a DNA polymerase, a detergent (e.g., Triton X), a chelating agent (e.g., EDTA), or a salt (e.g., ammonium sulfate, magnesium chloride, sodium chloride, or potassium chloride).
As used herein, the term "sequencing cycle" is used in accordance with its ordinary and customary meaning and refers to the incorporation of one or more nucleotides (e.g., nucleotide analogs) into the 3' end of a polynucleotide with a polymerase and the detection of one or more labels identifying the incorporated one or more nucleotides. Sequencing can be accomplished by, for example, sequencing by synthesis, pyrosequencing, and the like. In embodiments, the sequencing cycle comprises extending a complementary polynucleotide by incorporating a first nucleotide using a polymerase, wherein the polynucleotide is hybridized to a template nucleic acid, detecting the first nucleotide, and identifying the first nucleotide. In an embodiment, to begin the sequencing cycle, one or more differently labeled nucleotides and a DNA polymerase may be introduced. After the nucleotides are added, the resulting signal can be detected (e.g., by excitation and emission of a detectable label) to determine the identity of the incorporated nucleotide (based on the label on the nucleotide). Reagents may then be added to remove the 3' reversible terminator and remove the label from each incorporated base. Reagents, enzymes, and other materials can be removed between steps by washing. Cycling may involve repeating these steps and reading the sequence of each cluster over multiple iterations.
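The repeated incorporate, detect, and cleave steps of a sequencing cycle can be illustrated with a toy model. The following Python sketch is an illustration only and is not part of the disclosed methods: the function name `run_sequencing_cycles`, the dye names, and the convention that the template is supplied in the 3' to 5' direction (so that the nascent strand reads 5' to 3') are all assumptions made for this example.

```python
# Toy model of sequencing-by-synthesis cycles (illustrative assumptions only).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Hypothetical one-dye-per-base labeling scheme; dye names are placeholders.
DYE_FOR_BASE = {"A": "dye_1", "C": "dye_2", "G": "dye_3", "T": "dye_4"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}


def run_sequencing_cycles(template_3_to_5: str, n_cycles: int) -> str:
    """Return the read produced by up to n_cycles cycles on one template.

    Each loop iteration models one cycle: a single reversibly terminated,
    labeled nucleotide is incorporated, its label is detected and converted
    to a base call, and the label and 3' block are then removed (modeled
    here simply by advancing to the next template position).
    """
    read = []
    for position in range(min(n_cycles, len(template_3_to_5))):
        incorporated = COMPLEMENT[template_3_to_5[position]]  # incorporation
        signal = DYE_FOR_BASE[incorporated]                   # detection
        read.append(BASE_FOR_DYE[signal])                     # base call
        # Cleavage of label and terminator, then wash: nothing to track here.
    return "".join(read)
```

In this sketch the number of cycles bounds the read length, mirroring the way each cycle contributes one identified base.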
"hybridization" shall mean the attachment of a single stranded nucleic acid sequence (e.g., a primer) to another nucleic acid sequence based on well known principles of sequence complementarity. In one embodiment, the other nucleic acid sequence is a single stranded nucleic acid. The propensity for hybridization between nucleic acid sequences depends on the temperature and ionic strength of their environment, the length of the nucleic acid, and the degree of complementarity. The effect of these parameters on hybridization is described, for example, in Sambrook J, fritsch E.F., maniatisT., molecular cloning: a laboratory Manual (Molecular cloning: a laboratory manual), described in Cold spring harbor laboratory Press (Cold Spring Harbor Laboratory Press, new York) (1989). As used herein, hybridization of a primer or DNA extension product, respectively, can be extended by creating a phosphodiester bond with an available nucleotide or nucleotide analogue capable of forming a phosphodiester bond therewith. For example, hybridization may be performed at a temperature in the range of 15℃to 95 ℃. In some embodiments, hybridization is performed at about 20 ℃, about 25 ℃, about 30 ℃, about 35 ℃, about 40 ℃, about 45 ℃, about 50 ℃, about 55 ℃, about 60 ℃, about 65 ℃, about 70 ℃, about 75 ℃, about 80 ℃, about 85 ℃, about 90 ℃, or about 95 ℃. In other embodiments, the stringency of hybridization can be further altered by adding or removing components of the buffer solution. In some embodiments, nucleic acids or portions thereof configured to hybridize are generally about 80% or more, 81% or more, 82% or more, 83% or more, 84% or more, 85% or more, 86% or more, 87% or more, 88% or more, 89% or more, 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, 99% or more, or 100% complementary to each other over consecutive portions of the nucleic acid sequence. 
Specific hybridization is favored over non-specific hybridization interactions (e.g., between two nucleic acids that are not configured for specific hybridization, such as two nucleic acids that are 80% or less, 70% or less, 60% or less, or 50% or less complementary) by about 2-fold or more, typically about 10-fold or more, and sometimes about 100-fold or more, 1000-fold or more, 10,000-fold or more, 100,000-fold or more, or 1,000,000-fold or more. Two strands of nucleic acid hybridized to each other may form a duplex comprising a double-stranded portion of nucleic acid.
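The percent complementarity thresholds recited above can be made concrete with a short calculation. The following Python sketch is a minimal illustration under stated assumptions: both sequences are DNA written 5' to 3', they are of equal length, and complementarity is scored position by position without gaps. The function names `reverse_complement` and `percent_complementarity` are hypothetical and are not taken from the disclosure.

```python
# Minimal sketch: ungapped percent complementarity between two DNA strands.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(primer: str, target_site: str) -> float:
    """Percent of primer positions that base-pair with the target site.

    Both arguments are written 5'->3' and must have the same length.
    The primer is compared with the reverse complement of the target
    site, which is the sequence a perfectly complementary primer has.
    """
    if not primer or len(primer) != len(target_site):
        raise ValueError("sequences must be non-empty and of equal length")
    perfect_primer = reverse_complement(target_site)
    matches = sum(p == q for p, q in zip(primer.upper(), perfect_primer))
    return 100.0 * matches / len(primer)
```

For example, a 10-nucleotide primer with two mismatched positions against its binding site scores 80% in this model, which sits at the lower bound of the range listed above.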
As used herein, the term "extend" or "elongation" is used in accordance with its ordinary and customary meaning and refers to the synthesis, by a polymerase, of a new polynucleotide strand complementary to a template strand by adding free nucleotides (e.g., dNTPs) from a reaction mixture that are complementary to the template in the 5' to 3' direction. Extension involves condensing the 5'-phosphate group of the dNTP with the 3'-hydroxyl group at the end of the nascent (elongating) DNA strand.
As used herein, the term "sequencing read" is used in accordance with its plain and ordinary meaning and refers to an inferred sequence of base pairs (or base pair probabilities) corresponding to all or part of a single DNA fragment. Sequencing techniques produce reads of varying lengths. Sequencing reads can comprise 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, or more nucleotide bases. Reads of 20-40 base pairs (bp) in length are known as ultrashort. Typical sequencers produce reads ranging from 100 to 500bp in length. The length of the reads is a factor that may affect the outcome of the biological study. For example, longer read lengths increase the resolution of de novo genome assembly and structural variant detection. In some embodiments, a sequencing read may comprise 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, or more nucleotide bases.
As used herein, the term "k-mer" is used in accordance with its plain and ordinary meaning and refers to a subsequence of a larger sequence string, wherein each k-mer has a length of k. Algorithms for determining the overlap between sequence data may involve identifying k-mers shared between reads. Without being bound by theory, sequences sharing a large number of k-mers are likely to be from the same region of the sequence to be identified, e.g., a genomic sequence. The k value is the length of the matching region and is typically about 10-30 base pairs. These regions can be found quickly using data structures such as suffix trees or hash tables. For two overlapping reads to share a k-mer, the two reads typically need to have a low error rate or be long enough to compensate for a high error rate. However, for sequencing reads with relatively frequent errors, the method can be modified to allow for errors in the k-mers. For example, previously developed algorithms use spaced k-mers with "don't care" positions to allow for substitutions and to increase sensitivity relative to contiguous k-mers. Algorithms for such spaced k-mers are described, for example, in Navarro, G. (2001) ACM Computing Surveys 33:31-88; and Farach-Colton et al. (2007) J. Comput. Syst. Sci. 73:1035-1044, the disclosures of which are incorporated herein by reference in their entirety for all purposes.
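As an illustration of the hash-table approach mentioned above, the following Python sketch indexes contiguous k-mers in a dictionary and counts how many distinct k-mers each pair of reads shares. It is a simplified sketch rather than any particular published algorithm: it uses exact, contiguous k-mers (not spaced k-mers), and the function names `kmers`, `build_kmer_index`, and `shared_kmer_counts` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def kmers(seq: str, k: int) -> list:
    """Return every contiguous k-mer of seq, in order (empty if len(seq) < k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_kmer_index(reads: list, k: int) -> dict:
    """Map each k-mer to the set of read indices that contain it (hash table)."""
    index = defaultdict(set)
    for read_id, seq in enumerate(reads):
        for kmer in kmers(seq, k):
            index[kmer].add(read_id)
    return index


def shared_kmer_counts(reads: list, k: int) -> dict:
    """Count distinct k-mers shared by each pair of reads.

    Keys are (i, j) tuples of read indices with i < j; pairs that share
    no k-mer are absent from the result.
    """
    counts = defaultdict(int)
    for read_ids in build_kmer_index(reads, k).values():
        for pair in combinations(sorted(read_ids), 2):
            counts[pair] += 1
    return dict(counts)
```

Pairs with high counts are candidates for overlap and could then be passed to a full alignment step; tolerating sequencing errors would require an extension such as the spaced k-mers discussed above, which this sketch does not implement.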
As used herein, "single cell" refers to one cell. Individual cells useful in the methods described herein may be obtained from a tissue of interest, or from a biopsy, blood sample, or cell culture. In addition, cells from a particular organ, tissue, tumor, neoplasm, etc., may be obtained and used in the methods described herein. In general, cells from any population can be used in the method, such as a population of prokaryotic or eukaryotic organisms, including bacteria or yeast.
The term "cellular component" is used in accordance with its ordinary meaning in the art and refers to any organelle, nucleic acid, protein, or analyte found in a prokaryotic, eukaryotic, archaebacterial, or other organism cell type. Examples of cellular components (e.g., components of a cell) include RNA transcripts, proteins, membranes, lipids, and other analytes.
"Gene" refers to a polynucleotide sequence capable of conferring a biological function upon transcription and/or translation. Functionally, the genome is subdivided into genes. Each gene is a nucleic acid sequence encoding an RNA or polypeptide. Genes are transcribed from DNA to RNA, which may be non-coding (ncRNA) with direct function, or intermediate messengers (mRNA) that are subsequently translated into proteins. Typically, a gene comprises a plurality of sequence elements, such as coding elements (i.e., sequences encoding functional proteins), non-coding elements, and regulatory elements. Each element can be as short as a few bp to 5kb. In embodiments, the gene is a protein coding sequence of RNA. Non-limiting examples of genes include developmental genes (e.g., adhesion molecules, cyclin kinase inhibitors, wnt family members, pax family members, winged-helix family members, hox family members, cytokines/lymphokines and their receptors, growth/differentiation factors and their receptors, neurotransmitters and their receptors); oncogenes (e.g., ABL1, BCL2, BCL6, CBFA2, CBL, CSF1R, ERBA, ERBB, EBRB2, ETS1, ETV6, FGR, FOS, FYN, HCR, HRAS, JUN, KRAS, LCK, LYN, MDM2, MLL, MYB, MYC, MYCL1, MYCN, NRAS, PIM1, PML, RET, SRC, TAL1, TCL3, and YES); tumor suppressor genes (e.g., APC, BRCA1, BRCA2, MADH4, MCC, NF1, NF2, RB1, TP53, and WT 1); and enzymes (e.g., ACC synthase and oxidase, ACP desaturase and hydroxylase, ADP-glucose pyrophosphorylase, atpase, alcohol dehydrogenase, amylase, amyloglucosidase, catalase, cellulase, chalcone synthase, chitinase, cyclooxygenase, decarboxylase, dextrinase, DNA and RNA polymerase, galactosidase, glucanase, glucose oxidase, granule-bound starch synthase, gtpase, helicase, hemicellulase, integrase, inulin enzyme, invertase, isomerase, kinase, lactase, lipase, lipoxygenase, lysozyme, nopaline synthase, octopine synthase, pectinase, peroxidase, phosphatase, phospholipase, phosphorylase, phytase, plant growth regulator synthase, 
polygalacturonase, protease and peptidase, pullulanase, recombinase, reverse transcriptase, RUBISCO, topoisomerase, and xylanase). In embodiments, the gene comprises at least one mutation associated with a disease or condition mediated by a mutated form of the gene.
Provided herein are methods and compositions for analyzing a sample (e.g., sequencing nucleic acids within a sample). Samples (e.g., samples comprising nucleic acids) may be obtained from a suitable subject. The sample may be isolated or obtained directly from the subject or a portion thereof. In some embodiments, the sample is obtained indirectly from an individual or medical professional. The sample may be any specimen isolated or obtained from a subject or a portion thereof. The sample may be any specimen isolated or obtained from a plurality of subjects. Non-limiting examples of a sample include fluid or tissue from a subject, including but not limited to blood or blood products (e.g., serum, plasma, platelets, buffy coat, etc.), umbilical cord blood, chorionic villi, amniotic fluid, cerebrospinal fluid, spinal fluid, lavage fluid (e.g., lung, gastric, peritoneal, ductal, ear, arthroscopic), biopsy samples, laparoscopy samples, cells (blood cells, lymphocytes, placental cells, stem cells, bone marrow-derived cells, embryonic or fetal cells) or portions thereof (e.g., mitochondria, nuclei, extracts, etc.), urine, stool, sputum, saliva, nasal mucus, prostate fluid, semen, lymph fluid, bile, tears, sweat, milk, breast milk, etc., or combinations thereof. The fluid or tissue sample from which the nucleic acid is extracted may be acellular (e.g., cell-free). Non-limiting examples of tissue include organ tissue (e.g., liver, kidney, lung, thymus, adrenal gland, skin, bladder, reproductive organ, intestine, colon, spleen, brain, etc., or portions thereof), epithelial tissue, hair follicles, ducts, canals, bones, eyes, nose, mouth, throat, ear, nails, etc., portions thereof, or combinations thereof. The sample may comprise normal, healthy, diseased (e.g., infected), and/or cancerous (e.g., cancer cells) cells or tissue. 
Samples obtained from a subject may comprise cells or cellular material (e.g., nucleic acids) of a variety of organisms (e.g., viral nucleic acids, fetal nucleic acids, bacterial nucleic acids, parasite nucleic acids).
In some embodiments, the sample comprises a nucleic acid or fragment thereof. The sample may comprise nucleic acids obtained from one or more subjects. In some embodiments, the sample comprises nucleic acid obtained from a single subject. In some embodiments, the sample comprises a mixture of nucleic acids. The mixture of nucleic acids may comprise two or more nucleic acid species having different nucleotide sequences, different fragment lengths, different sources (e.g., genomic sources, cellular or tissue sources, subject sources, etc., or combinations thereof), or combinations thereof. The sample may comprise synthetic nucleic acids.
The subject may be any living or non-living organism, including but not limited to a human, a non-human animal, a plant, a bacterium, a fungus, a virus, or a protozoan. The subject may be of any age (e.g., embryo, fetus, infant, child, adult). The subject may be of any sex (e.g., male, female, or a combination thereof). The subject may be pregnant. In some embodiments, the subject is a mammal. In some embodiments, the subject is a human subject. The subject may be a patient (e.g., a human patient). In some embodiments, the subject is suspected of having a genetic variation or a disease or condition associated with a genetic variation.
As used herein, the term "consensus sequence" refers to a sequence that shows the nucleotide most commonly found at each position among a set of sequences (e.g., a set of sequencing reads) aligned at that position. The consensus sequence is typically "assembled" from shorter sequence reads that overlap at least in part. In the case where two sequences contain overlapping sequence information aligned at one end and non-overlapping sequence information at the opposite end, the consensus sequence formed by the two sequences will be longer than either sequence alone. Alignment of multiple such sequences allows for the assembly of many short sequences into longer consensus sequences representing longer sample polynucleotides. In embodiments, aligned sequences used to generate consensus sequences may contain gaps (e.g., representing nucleotides that are not present in a given read because they were extended during a dark cycle and were not identified).
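The per-position majority rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, under the following assumptions: the reads have already been aligned and padded to a common coordinate system, gaps (including positions skipped during a dark cycle) are written as "-", and positions covered by no read are reported as "N". The function name `consensus` is hypothetical.

```python
from collections import Counter


def consensus(aligned_reads: list, gap: str = "-") -> str:
    """Return the majority base at each aligned position.

    Gap characters are ignored when voting. A column with no base calls
    is reported as "N". When two bases tie, the one seen first among the
    reads (in input order) is reported.
    """
    if not aligned_reads:
        return ""
    length = max(len(read) for read in aligned_reads)
    called = []
    for i in range(length):
        column = [read[i] for read in aligned_reads
                  if i < len(read) and read[i] != gap]
        if column:
            called.append(Counter(column).most_common(1)[0][0])
        else:
            called.append("N")
    return "".join(called)
```

For example, three partially overlapping six-base reads that together span ten positions yield a ten-base consensus, longer than any single read, and an isolated miscall in one read is outvoted wherever at least two other reads cover the same position.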
In some embodiments, the nucleic acid (e.g., an adapter, linear nucleic acid molecule, or primer) comprises a molecular identifier or a molecular barcode. As used herein, the term "molecular barcode" (which may be referred to as a "tag," "barcode," "molecular identifier," "identifier sequence," or "unique molecular identifier" (UMI)) refers to any material (e.g., nucleotide sequence, nucleic acid molecular features) that is capable of distinguishing between individual molecules in a large heterogeneous population of molecules. In embodiments, barcodes are unique in a pool of barcodes that differ in sequence from each other or are uniquely associated with a particular sample polynucleotide in a pool of sample polynucleotides. In embodiments, each barcode in the pool of adaptors is unique such that sequencing reads comprising the barcode can be identified as originating from a single sample polynucleotide molecule based solely on the barcode. In other embodiments, a single barcode sequence may be used more than once, but adaptors comprising duplicate barcodes are correlated with different sequences and/or different combinations of barcode adaptors, such that sequence reads may still be uniquely distinguished as originating from a single sample polynucleotide molecule based on the barcodes and adjacent sequence information (e.g., sample polynucleotide sequence and/or one or more adjacent barcodes). In embodiments, the length of the barcode is about or at least about 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 75, or more nucleotides. In embodiments, the length of the barcode is shorter than 20, 15, 10, 9, 8, 7, 6, or 5 nucleotides. In embodiments, the bar code is about 10 to about 50 nucleotides in length, such as about 15 to about 40 or about 20 to about 30 nucleotides in length. In a pool of different barcodes, the length of the barcodes may be the same or different. 
Typically, barcodes are of sufficient length and comprise sequences that are sufficiently different to allow identification of sequencing reads derived from the same sample polynucleotide molecule. In embodiments, each barcode of the plurality of barcodes differs from every other barcode of the plurality of barcodes in at least three nucleotide positions, such as at least 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotide positions. In some embodiments, a substantially degenerate barcode may be referred to as random. In some embodiments, the barcode may comprise a nucleic acid sequence from a pool of known sequences. In some embodiments, the barcode may be predefined.
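The requirement that every barcode differ from every other barcode in at least three nucleotide positions corresponds to a minimum pairwise Hamming distance of three. A minimal sketch of such a check is given below; it assumes barcodes of equal length, and the function names and the example pool are hypothetical.

```python
from itertools import combinations

def hamming(a, b):
    """Count the positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes barcodes of equal length")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the pool."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))

# Hypothetical pool of 6-nucleotide barcodes.
pool = ["AAAAAA", "AACCGG", "TTTAAA", "GGGTTT"]
meets_requirement = min_pairwise_distance(pool) >= 3
print(min_pairwise_distance(pool), meets_requirement)
```

A pool with a minimum distance of three tolerates a single substitution error in a barcode read without that read becoming closer to a different barcode in the pool.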
In embodiments, the nucleic acid (e.g., an adapter, linear nucleic acid molecule, or primer) comprises a sample barcode. Typically, a "sample barcode" is a nucleotide sequence that is sufficiently different from other sample barcodes to allow the sample source to be identified on the basis of the sample barcode sequence associated with it. In embodiments, a plurality of polynucleotides (e.g., all polynucleotides from a particular sample source or sub-sample thereof) is joined to a first sample barcode, while a different plurality of polynucleotides (e.g., all polynucleotides from a different sample source or a different sub-sample) is joined to a second sample barcode, thereby associating each plurality of polynucleotides with a different sample barcode indicative of its sample source. In embodiments, each sample barcode of the plurality of sample barcodes differs from every other sample barcode of the plurality of sample barcodes in at least three nucleotide positions, such as at least 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotide positions. In some embodiments, a substantially degenerate sample barcode may be referred to as random. In some embodiments, the sample barcode may comprise a nucleic acid sequence from a pool of known sequences. In some embodiments, the sample barcode may be predefined. In embodiments, the sample barcode comprises about 1 to about 10 nucleotides. In embodiments, the sample barcode comprises about 3, 4, 5, 6, 7, 8, 9, or about 10 nucleotides. In embodiments, the sample barcode comprises about 3 nucleotides. In embodiments, the sample barcode comprises about 5 nucleotides. In embodiments, the sample barcode comprises about 7 nucleotides. In embodiments, the sample barcode comprises about 10 nucleotides. In embodiments, the sample barcode comprises about 6 to about 10 nucleotides.
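Identifying a sample source from its sample barcode can be illustrated with a small demultiplexing sketch. Several details here are assumptions made for illustration and are not stated in the specification: the barcode is taken to sit at the start of each read, one mismatch is tolerated, and the sample names, barcodes, and reads are hypothetical.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def assign_sample(read, sample_barcodes, max_mismatches=1):
    """Return the sample whose barcode matches the read prefix, or None.

    A read is assigned only when exactly one sample barcode lies within
    max_mismatches of the prefix, so ambiguous reads are left unassigned.
    """
    hits = []
    for name, barcode in sample_barcodes.items():
        prefix = read[:len(barcode)]
        if len(prefix) == len(barcode) and hamming(prefix, barcode) <= max_mismatches:
            hits.append(name)
    return hits[0] if len(hits) == 1 else None

# Hypothetical 5-nucleotide sample barcodes that differ pairwise in every position.
samples = {"sample_1": "ACGTA", "sample_2": "TGCAT", "sample_3": "CATGC"}

exact = assign_sample("ACGTAGGCTTA", samples)       # perfect barcode match
one_error = assign_sample("ACGAAGGCTTA", samples)   # one substitution in the barcode
unknown = assign_sample("GGGGGTTTACG", samples)     # no barcode within tolerance
print(exact, one_error, unknown)
```

Tolerating one mismatch is safe only when the sample barcodes differ from one another in at least three positions, as in the embodiments above; with closer barcodes, a single error could make a read equally near two samples.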
Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or non-stated intervening value in that stated or smaller range is encompassed within the invention. The upper and lower limits of any such smaller ranges (in the more widely enumerated ranges) may independently be included in the smaller ranges, or may be specified in the specific value itself, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
The term "kit" is used in accordance with its ordinary meaning and refers to any delivery system for delivering materials or reagents for practicing the methods of the invention. Such delivery systems include systems that allow for the storage, transport, or delivery of reaction reagents (e.g., nucleotides, enzymes, nucleic acid templates, etc., in appropriate containers) and/or supporting materials (e.g., buffers, written instructions for performing the reaction, etc.) from one location to another. For example, a kit comprises one or more enclosures (e.g., boxes) containing the relevant reaction reagents and/or supporting materials. Such contents may be delivered to the intended recipient together or separately. For example, a first container may contain an enzyme, while a second container contains nucleotides. In embodiments, the kit comprises a vessel containing one or more enzymes, primers, adapters, or other reagents as described herein. The vessel may comprise any structure capable of supporting or containing a liquid or solid material, and may comprise a tube, vial, jar, container, tip, or the like. In embodiments, a wall of the vessel may permit the transmission of light through the wall. In embodiments, the vessel may be optically transparent. The kit may comprise the enzymes and/or nucleotides in a buffer.
The methods and kits of the present disclosure can be applied, mutatis mutandis, to RNA sequencing or for determining the identity of ribonucleotides.
An aqueous solution herein refers to a liquid comprising at least 20vol% water. In embodiments, the aqueous solution comprises at least 50vol%, such as at least 75vol%, at least 95vol%, greater than 98vol%, or 100vol% water as the continuous phase.
The term "nucleic acid sequencing device" or the like means an integrated system of one or more chambers, ports, and channels that are interconnected and in fluid communication and that are designed for carrying out an analytical reaction or process, either alone or in cooperation with an appliance or instrument that provides support functions, such as sample introduction, fluid and/or reagent driving means, temperature control, detection systems, data collection, and/or integration systems, for the purpose of determining the nucleic acid sequence of a template polynucleotide. The nucleic acid sequencing device may further comprise valves, pumps, and specialized functional coatings on interior walls. The nucleic acid sequencing device may comprise a receiving unit, or platen, that orients the flow cell such that a maximal surface area of the flow cell is available for exposure to an optical lens. Other nucleic acid sequencing devices include those provided by Illumina™, Inc. (e.g., HiSeq™, MiSeq™, NextSeq™, or NovaSeq™ systems), Life Technologies™ (e.g., ABI PRISM™ or SOLiD™ systems), Pacific Biosciences (e.g., systems using SMRT™ technology, such as the Sequel™ or RS II™ systems), or Qiagen (e.g., Genereader™ system).
"Disease" or "condition" or "disease state" refers to any abnormal biological state or condition of a cell, tissue, or organism. A disease may refer to a state of being or health status of a patient or subject. In some embodiments, the disease is a disease associated with (e.g., caused by) an activated or overactive kinase or aberrant kinase activity. A disease state may be the result of, inter alia, an environmental pathogen, such as a viral infection (e.g., HIV/AIDS, hepatitis B, hepatitis C, influenza, measles, etc.), a bacterial infection, a parasitic infection, a fungal infection, or an infection by some other organism. A disease state may also be the result of some other environmental agent, such as a chemical toxin or a chemical carcinogen. As used herein, a disease state further includes genetic disorders in which one or more copies of a gene are altered or disrupted, thereby affecting its biological function. Exemplary genetic diseases include, but are not limited to, polycystic kidney disease, familial multiple endocrine neoplasia type I, neurofibromatosis, Tay-Sachs disease, Huntington's disease, sickle cell anemia, thalassemia, and Down's syndrome, among others (see, e.g., The Metabolic and Molecular Bases of Inherited Diseases, 7th edition, McGraw-Hill Inc., New York). Other exemplary diseases include, but are not limited to, cancer, hypertension, Alzheimer's disease, neurodegenerative diseases, and neuropsychiatric disorders such as bipolar disorder or paranoid schizophrenia. Disease states are monitored to determine the level or severity (e.g., the stage or progression) of one or more disease states in a subject and, more specifically, to detect changes in the biological state of the subject that are associated with the one or more disease states (see, e.g., U.S. Patent No. 6,218,122, which is incorporated herein by reference in its entirety).
In embodiments, the methods provided herein are also applicable to monitoring a disease state of a subject undergoing one or more therapies. Thus, in some embodiments, the present disclosure also provides methods for determining or monitoring the efficacy of one or more therapies on a subject (i.e., determining the level of therapeutic effect). In embodiments, the methods of the present disclosure may be used to assess treatment efficacy in clinical trials, e.g., as early surrogate markers of success or failure in such clinical trials. Within eukaryotic cells, there are hundreds to thousands of interconnected signaling pathways. Thus, perturbation of intracellular protein function has many effects on transcription of other proteins and other genes linked by primary, secondary, and sometimes tertiary pathways. This wide interconnection between the functions of the various proteins means that changes in any one protein may result in compensatory changes in a large number of other proteins. In particular, partial disruption of even a single protein within a cell, such as by exposure to a drug or by a disease state that regulates gene copy number (e.g., genetic mutation), results in sufficiently many other characteristic compensatory changes in gene transcription that can be used to define "signatures" of specific transcript alterations associated with functional disruption, e.g., specific disease states or therapies, even at stages where changes in protein activity are undetectable.
As used herein, the term "neurodegenerative disease" refers to a disease or condition in which the function of the subject's nervous system is impaired. Examples of neurodegenerative diseases that can be detected using the methods described herein include Alexander's disease, Alzheimer's disease, amyotrophic lateral sclerosis, ataxia telangiectasia, Batten disease (also known as Spielmeyer-Vogt-Sjogren-Batten disease), bovine spongiform encephalopathy (BSE), Canavan disease, Cockayne syndrome, corticobasal degeneration, Creutzfeldt-Jakob disease, frontotemporal dementia, Gerstmann-Straussler-Scheinker syndrome, Huntington's disease, HIV-associated dementia, Kennedy's disease, Krabbe's disease, kuru, Lewy body dementia, Machado-Joseph disease (spinocerebellar ataxia type 3), multiple sclerosis, multiple system atrophy, narcolepsy, neuroborreliosis, Parkinson's disease, Pelizaeus-Merzbacher disease, Pick's disease, primary lateral sclerosis, prion diseases, Refsum's disease, Sandhoff's disease, Schilder's disease, or spinocerebellar ataxia.
As used herein, the term "autoimmune disease" refers to a disease or condition in which the subject's immune system mounts an aberrant immune response against one or more components (e.g., biomolecules, proteins, cells, tissues, organs, etc.) of the subject. In some embodiments, the autoimmune disease is a condition in which the subject's immune system responds aberrantly to one or more components of the subject as if those components were non-self. Exemplary autoimmune diseases that can be detected using the methods provided herein include acute disseminated encephalomyelitis (ADEM), acute necrotizing hemorrhagic leukoencephalitis, Addison's disease, agammaglobulinemia, asthma, allergic rhinitis, alopecia areata, amyloidosis, ankylosing spondylitis, anti-GBM/anti-TBM nephritis, antiphospholipid syndrome (APS), arthritis, autoimmune aplastic anemia, autoimmune dysautonomia, autoimmune hepatitis, autoimmune hyperlipidemia, autoimmune immunodeficiency, autoimmune inner ear disease (AIED), autoimmune myocarditis, autoimmune pancreatitis, autoimmune retinopathy, autoimmune thrombocytopenic purpura (ATP), autoimmune thyroid disease, axonal or neuronal neuropathies, Balo disease, Behcet's disease, bullous pemphigoid, cardiomyopathy, Castleman disease, celiac disease, Chagas disease, chronic inflammatory demyelinating polyneuropathy (CIDP), chronic recurrent multifocal osteomyelitis (CRMO), Churg-Strauss syndrome, cicatricial pemphigoid/benign mucosal pemphigoid, Crohn's disease, Cogan's syndrome, cold agglutinin disease, congenital heart block, chronic demyelinating polyneuropathy (CIDP), Coxsackie myocarditis, CREST disease, essential mixed cryoglobulinemia, demyelinating neuropathies, dermatitis herpetiformis, dermatomyositis, Devic's disease (neuromyelitis optica), discoid lupus erythematosus, Dressler's syndrome, endometriosis, eosinophilic fasciitis, erythema nodosum, experimental allergic encephalomyelitis, Evans syndrome, fibrosing alveolitis, giant cell arteritis (temporal arteritis), glomerulonephritis, Goodpasture's syndrome, Graves' disease, Graves' ophthalmopathy, Guillain-Barre syndrome, Hashimoto's encephalitis, Hashimoto's thyroiditis, hemolytic anemia, Henoch-Schonlein purpura, herpes gestationis, hypogammaglobulinemia, ichthyosis, idiopathic thrombocytopenic purpura (ITP), IgA nephropathy, IgG4-related sclerosing disease, immunoregulatory lipoproteins, inclusion body myositis, inflammatory bowel disease, insulin-dependent diabetes mellitus (type 1), interstitial cystitis, juvenile arthritis, juvenile diabetes, Kawasaki syndrome, Lambert-Eaton syndrome, leukocytoclastic vasculitis, lichen planus, lichen sclerosus, ligneous conjunctivitis, linear IgA disease (LAD), lupus (SLE), chronic Lyme disease, Meniere's disease, microscopic polyangiitis, mixed connective tissue disease (MCTD), Mooren's ulcer, Mucha-Habermann disease, multiple sclerosis, myasthenia gravis, myositis, narcolepsy, neuromyelitis optica (Devic's), neutropenia, ocular cicatricial pemphigoid, optic neuritis, palindromic rheumatism, PANDAS (pediatric autoimmune neuropsychiatric disorders associated with Streptococcus), paraneoplastic cerebellar degeneration, paroxysmal nocturnal hemoglobinuria (PNH), Parry Romberg syndrome, Parsonage-Turner syndrome, pars planitis (peripheral uveitis), pemphigus, peripheral neuropathy, perivenous encephalomyelitis, pernicious anemia, POEMS syndrome, polyarteritis nodosa, autoimmune polyglandular syndromes type I, type II, and type III, polymyalgia rheumatica, polymyositis, postmyocardial infarction syndrome, postpericardiotomy syndrome, autoimmune progesterone dermatitis, primary biliary cirrhosis, primary sclerosing cholangitis, psoriasis, psoriatic arthritis, idiopathic pulmonary fibrosis, pyoderma gangrenosum, pure red cell aplasia, Raynaud's phenomenon, reflex sympathetic dystrophy, Reiter's syndrome, relapsing polychondritis, restless legs syndrome, retroperitoneal fibrosis, rheumatic fever, rheumatoid arthritis, sarcoidosis, Schmidt syndrome, scleritis, scleroderma, Sjogren's syndrome, sperm and testicular autoimmunity, stiff person syndrome, subacute bacterial endocarditis (SBE), Susac's syndrome, sympathetic ophthalmia, Takayasu's arteritis, temporal arteritis/giant cell arteritis, thrombocytopenic purpura (TTP), Tolosa-Hunt syndrome, transverse myelitis, ulcerative colitis, undifferentiated connective tissue disease (UCTD), uveitis, vasculitis, vesiculobullous dermatosis, vitiligo, or Wegener's granulomatosis.
Primary immunodeficiency diseases (PIDDs) comprise rare genetic conditions that impair the immune system. Without a functional immune response, people with a PIDD may be subject to chronic, debilitating infections, such as Epstein-Barr virus (EBV) infection, which can increase the risk of developing cancer. Non-limiting examples of primary immunodeficiency diseases include autoimmune lymphoproliferative syndrome (ALPS), APS-1 (APECED), BENTA disease, caspase eight deficiency state (CEDS), CARD9 deficiency and other syndromes of susceptibility to candidiasis, chronic granulomatous disease (CGD), common variable immunodeficiency (CVID), congenital neutropenia syndromes, CTLA4 deficiency, DOCK8 deficiency, GATA2 deficiency, glycosylation disorders with immunodeficiency, hyper-immunoglobulin E syndrome (HIES), hyper-immunoglobulin M syndrome, interferon gamma, interleukin 12, and interleukin 23 deficiencies, leukocyte adhesion deficiency (LAD), LRBA deficiency, PI3 kinase disease, PLCG2-associated antibody deficiency and immune dysregulation (PLAID), severe combined immunodeficiency (SCID), STAT3 dominant-negative disease, STAT3 gain-of-function disease, warts, hypogammaglobulinemia, infections, and myelokathexis (WHIM) syndrome, Wiskott-Aldrich syndrome, and X-linked lymphoproliferative disease.
As used herein, the term "cardiovascular disease" refers to a disease or condition that affects the heart or blood vessels. In embodiments, the cardiovascular disease comprises a disease caused or exacerbated by atherosclerosis. Exemplary cardiovascular diseases that can be detected using the methods provided herein include alcoholic cardiomyopathy, coronary artery disease, congenital heart disease, arrhythmogenic right ventricular cardiomyopathy, restrictive cardiomyopathy, non-obstructive cardiomyopathy, diabetes mellitus, hypertension, hyperhomocysteinemia, hypercholesterolemia, atherosclerosis, ischemic heart disease, heart failure, pulmonary heart disease, hypertensive heart disease, left ventricular hypertrophy, coronary heart disease, (congestive) heart failure, hypertensive cardiomyopathy, cardiac arrhythmias, inflammatory heart disease, endocarditis, inflammatory cardiomegaly, myocarditis, valvular heart disease, stroke, or myocardial infarction. In embodiments, the disease is a cardiovascular disease associated with a gene fusion. Genome-wide association (GWA) studies have revealed many potentially disease-modifying gene fusion events; see, e.g., Paone et al., Frontiers in Cardiovascular Medicine (Front. Cardiovasc. Med.), 6.01, 2018, which is incorporated herein by reference.
As used herein, the term "cancer" refers to all types of cancers, neoplasms, or malignant tumors found in mammals, including leukemia, carcinoma, and sarcoma. Exemplary cancers that can be detected using the methods provided herein include thyroid cancer, endocrine system cancer, brain cancer, breast cancer, cervical cancer, colon cancer, head and neck cancer, liver cancer, kidney cancer, lung cancer, non-small cell lung cancer, melanoma, mesothelioma, ovarian cancer, pancreatic cancer, sarcoma, gastric cancer, uterine cancer, or medulloblastoma. Further examples include Hodgkin's Disease, non-Hodgkin's Lymphoma, multiple myeloma, neuroblastoma, glioma, glioblastoma multiforme, ovarian cancer, rhabdomyosarcoma, primary thrombocythemia, primary macroglobulinemia, primary brain tumor, malignant pancreatic cancer, malignant carcinoid, bladder cancer, precancerous skin lesions, testicular cancer, lymphoma, thyroid cancer, neuroblastoma, esophageal cancer, genitourinary tract cancer, malignant hypercalcemia, endometrial cancer, adrenal cortex cancer, endocrine or exocrine pancreatic neoplasm, medullary thyroid cancer, melanoma, colorectal cancer, papillary thyroid cancer, hepatocellular carcinoma or prostate cancer.
The term "leukemia" refers to progressive, malignant diseases of the blood-forming organs and is generally characterized by dysregulated proliferation and development of leukocytes and their precursors in the blood and bone marrow. Leukemia is generally classified clinically on the basis of (1) the duration and character of the disease: acute or chronic; (2) the type of cell involved: myeloid (myelogenous), lymphoid (lymphogenous), or monocytic; and (3) the increase or non-increase in the number of abnormal cells in the blood: leukemic or aleukemic (subleukemic). Exemplary leukemias that can be detected using the methods provided herein include, for example, acute nonlymphocytic leukemia, chronic lymphocytic leukemia, acute myelogenous leukemia, chronic myelogenous leukemia, acute promyelocytic leukemia, adult T-cell leukemia, aleukemic leukemia, leukocythemic leukemia, basophilic leukemia, blast cell leukemia, bovine leukemia, chronic myelocytic leukemia, leukemia cutis, embryonal leukemia, eosinophilic leukemia, Gross' leukemia, hairy cell leukemia, hemoblastic leukemia, hemocytoblastic leukemia, histiocytic leukemia, stem cell leukemia, acute monocytic leukemia, leukopenic leukemia, lymphoblastic leukemia, lymphogenous leukemia, lymphoid leukemia, lymphosarcoma cell leukemia, mast cell leukemia, megakaryocytic leukemia, micromyeloblastic leukemia, monocytic leukemia, myeloblastic leukemia, myelocytic leukemia, myelomonocytic leukemia, Naegeli leukemia, plasma cell leukemia, multiple myeloma, plasmacytic leukemia, promyelocytic leukemia, Rieder cell leukemia, Schilling's leukemia, subleukemic leukemia, or undifferentiated cell leukemia.
The term "sarcoma" generally refers to a tumor that is made up of a substance resembling embryonic connective tissue and is generally composed of closely packed cells embedded in a fibrillar or homogeneous substance. Sarcomas that can be detected using the methods provided herein include chondrosarcoma, fibrosarcoma, lymphosarcoma, melanosarcoma, myxosarcoma, osteosarcoma, Abernethy's sarcoma, liposarcoma, alveolar soft part sarcoma, ameloblastic sarcoma, botryoid sarcoma, chloroma sarcoma, choriocarcinoma, embryonal sarcoma, Wilms' tumor sarcoma, endometrial sarcoma, stromal sarcoma, Ewing's sarcoma, fascial sarcoma, fibroblastic sarcoma, giant cell sarcoma, granulocytic sarcoma, Hodgkin's sarcoma, idiopathic multiple pigmented hemorrhagic sarcoma, immunoblastic sarcoma of B cells, lymphoma, immunoblastic sarcoma of T cells, Jensen's sarcoma, Kaposi's sarcoma, Kupffer cell sarcoma, angiosarcoma, leukosarcoma, reticulocytic sarcoma, or telangiectatic sarcoma.
The term "melanoma" means a tumor arising from the melanocytic system of the skin and other organs. Melanomas that can be detected using the methods provided herein include, for example, acral-lentiginous melanoma, amelanotic melanoma, benign juvenile melanoma, Cloudman's melanoma, S91 melanoma, Harding-Passey melanoma, juvenile melanoma, lentigo maligna melanoma, malignant melanoma, nodular melanoma, subungual melanoma, or superficial spreading melanoma.
The term "carcinoma" refers to a malignant new growth made up of epithelial cells that tend to infiltrate the surrounding tissues and give rise to metastases. Exemplary carcinomas that can be detected using the methods provided herein include, for example, medullary thyroid carcinoma, familial medullary thyroid carcinoma, acinar carcinoma, adenocystic carcinoma, adenoid cystic carcinoma, carcinoma adenomatosum, carcinoma of the adrenal cortex, alveolar cell carcinoma, basaloid carcinoma, basosquamous cell carcinoma, bronchioalveolar carcinoma, bronchiolar carcinoma, cerebriform carcinoma, cholangiocellular carcinoma, chorionic carcinoma, colloid carcinoma, comedo carcinoma, corpus carcinoma, cribriform carcinoma, carcinoma en cuirasse, carcinoma cutaneum, cylindrical cell carcinoma, ductal carcinoma, carcinoma durum, embryonal carcinoma, encephaloid carcinoma, epidermoid carcinoma, carcinoma epitheliale adenoides, exophytic carcinoma, carcinoma ex ulcere, carcinoma fibrosum, gelatiniform carcinoma, gelatinous carcinoma, giant cell carcinoma, glandular carcinoma, granulosa cell carcinoma, hair-matrix carcinoma, hematoid carcinoma, hepatocellular carcinoma, Hurthle cell carcinoma, hyaline carcinoma, hypernephroid carcinoma, infantile embryonal carcinoma, carcinoma in situ, intraepidermal carcinoma, intraepithelial carcinoma, Krompecher's carcinoma, Kulchitzky-cell carcinoma, large-cell carcinoma, lenticular carcinoma, carcinoma lenticulare, lipomatous carcinoma, lymphoepithelial carcinoma, medullary carcinoma, melanotic carcinoma, carcinoma molle, mucinous carcinoma, carcinoma muciparum, carcinoma mucocellulare, mucoepidermoid carcinoma, carcinoma mucosum, mucous carcinoma, carcinoma myxomatodes, nasopharyngeal carcinoma, oat cell carcinoma, carcinoma ossificans, osteoid carcinoma, papillary carcinoma, periportal carcinoma, preinvasive carcinoma, prickle cell carcinoma, pultaceous carcinoma, renal cell carcinoma, reserve cell carcinoma, carcinoma sarcomatodes, schneiderian carcinoma, scirrhous carcinoma, carcinoma scroti, signet-ring cell carcinoma, carcinoma simplex, small-cell carcinoma, solanoid carcinoma, spheroidal cell carcinoma, spindle cell carcinoma, carcinoma spongiosum, squamous carcinoma, squamous cell carcinoma, string carcinoma, carcinoma telangiectaticum, carcinoma telangiectodes, transitional cell carcinoma, carcinoma tuberosum, tuberous carcinoma, verrucous carcinoma, or carcinoma villosum.
As used herein, the term "abnormal" refers to a difference from normal. When used to describe enzymatic activity, abnormal refers to activity that is greater or less than the average activity of a normal control or a normal non-diseased control sample. Abnormal activity may refer to an amount of activity that results in a disease, wherein returning the abnormal activity to a normal or non-disease-associated amount (e.g., by administering a compound) results in a reduction of the disease or of one or more symptoms of the disease.
"Blocking element" refers to an agent (e.g., a polynucleotide, protein, or nucleotide) that reduces and/or inhibits nucleotide incorporation (i.e., extension of a primer) relative to the absence of the blocking element. In embodiments, the blocking element is a non-extendable oligomer (e.g., a 3'-blocked oligonucleotide). A blocking element on a nucleotide may be reversible, whereby the blocking moiety can be removed or modified to allow the 3' hydroxyl group to form a covalent bond with the 5' phosphate of another nucleotide. For example, a reversible terminator may refer to a blocking moiety located, for example, at the 3' position of the nucleotide, and may be a chemically cleavable moiety such as an allyl group, an azidomethyl group, or a methoxymethyl group. In embodiments, the blocking moiety is irreversible (e.g., a blocking element comprising the blocking moiety irreversibly prevents extension). In embodiments, the blocking element comprises an oligonucleotide with a 3' dideoxynucleotide or a similar modification that prevents polymerase extension, and is used in conjunction with a non-strand-displacing polymerase. In another exemplary embodiment, the blocking element comprises one or more modified nucleotides comprising a cleavable linker (e.g., linked to the 5' position, the 3' position, or the nucleobase) that comprises PEG, thereby blocking extension. In another exemplary embodiment, the blocking element comprises one or more modified nucleotides linked to biotin, to which a protein (e.g., streptavidin) can bind, thereby blocking polymerase extension. In another exemplary embodiment, the blocking element comprises modified nucleotides that are complementary only to each other, such as isodGTP or isodCTP; in a polymerization reaction lacking the appropriate complementary modified nucleotide, extension of the primer is halted.
In another exemplary embodiment, the blocking element comprises one or more sequences that are recognized and bound by one or more single-stranded DNA-binding proteins, thereby blocking polymerase extension at the binding site. In another exemplary embodiment, the blocking element comprises one or more sequences that are recognized and bound by one or more short RNA or PNA oligonucleotides, thereby blocking extension by a DNA polymerase that is incapable of strand-displacing RNA or PNA.
The term "clonotype" is used in accordance with its ordinary meaning in the art and refers to a recombined nucleic acid encoding an immune receptor or a portion thereof. For example, clonotype refers to a recombined nucleic acid, usually extracted from a T cell or B cell but possibly also derived from a cell-free source, that encodes a T cell receptor (TCR) or B cell receptor (BCR) or a portion thereof. In embodiments, a clonotype may encode all or a portion of a VDJ rearrangement of IgH, a DJ rearrangement of IgH, a VJ rearrangement of IgK, a VJ rearrangement of IgL, a VDJ rearrangement of TCRβ, a DJ rearrangement of TCRβ, a VJ rearrangement of TCRα, a VJ rearrangement of TCRγ, a VDJ rearrangement of TCRδ, a VD rearrangement of TCRδ, a Kde-V rearrangement, or the like. Clonotypes may also encode translocation breakpoint regions that involve immune receptor genes, such as Bcl1-JH or Bcl2-JH. In one aspect, clonotypes have sequences that are long enough to represent or reflect the diversity of the immune molecules from which they are derived; consequently, clonotypes may vary widely in length. In some embodiments, clonotypes range in length from 25 to 400 nucleotides; in other embodiments, clonotypes range in length from 25 to 200 nucleotides.
It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.
II. Methods
In one aspect, a method of detecting a genetic feature in one or more nucleic acid molecules is provided, the method comprising: a) providing one or more linear nucleic acid molecules; b) circularizing the one or more linear nucleic acid molecules to form one or more circular template polynucleotides, each comprising a continuous strand lacking free 5' and 3' ends, and amplifying the one or more circular template polynucleotides to produce a plurality of amplification products; c) sequencing the plurality of amplification products to produce a plurality of sequencing reads; d) identifying the presence or absence of a genetic feature in a nucleic acid molecule by analyzing the plurality of sequencing reads (e.g., analyzing the plurality of sequencing reads relative to a control or reference); and e) detecting the genetic feature in the one or more nucleic acid molecules when the presence of the genetic feature is identified in the plurality of sequencing reads, wherein the genetic feature comprises an intrachromosomal rearrangement or a gene fusion. In embodiments, the genetic feature is a clonotype. In embodiments, the genetic feature is a polynucleotide fusion (e.g., a fusion gene).
In one aspect, a method is provided for detecting a polynucleotide fusion comprising a sequence of a first region fused to a sequence of a second region at a fusion junction. In an embodiment, the method comprises: (a) Circularizing one or more linear nucleic acid molecules to form a circular template polynucleotide comprising a continuous strand lacking free 5 'and 3' ends; (b) Amplifying a circular template polynucleotide comprising a fusion junction in an amplification reaction comprising a first primer, a second primer, a blocking element, and a polymerase to produce a fusion amplification product; and (c) detecting the fusion amplification product, thereby detecting polynucleotide fusion. In an embodiment, the method comprises: (a) Circularizing one or more linear nucleic acid molecules to form a circular template polynucleotide comprising a continuous strand lacking free 5 'and 3' ends; (b) Amplifying a circular template polynucleotide comprising a fusion junction in an amplification reaction comprising a first primer, a second primer, a blocking element, and a polymerase to produce a fusion amplification product, wherein: (i) The first region comprises a first strand comprising, from 5 'to 3', a sequence that specifically binds to the blocking element, a sequence that specifically hybridizes to the first primer, and a sequence that is complementary to the sequence that specifically hybridizes to the second primer; (ii) The fusion junction is located between the sequence that specifically binds to the blocking element and the sequence that specifically hybridizes to the first primer; (iii) The blocking element inhibits extension of the polymerase along the sequence to which it binds; and (iv) the circular template polynucleotide comprising the fusion junction does not comprise a sequence or complement thereof that specifically binds to the blocking element; and (c) detecting the fusion amplification product, thereby detecting polynucleotide fusion.
In another aspect, a method of differentially amplifying a polynucleotide comprising a fusion gene relative to a polynucleotide not comprising the fusion gene is provided. In an embodiment, the method comprises: i) Circularizing a plurality of linear nucleic acid molecules to form a plurality of circular template polynucleotides, wherein one or more of the linear nucleic acid molecules comprises a fusion gene, thereby forming one or more fusion gene circular template polynucleotides, and wherein one or more of the linear nucleic acid molecules does not comprise a fusion gene, thereby forming one or more non-fusion gene circular template polynucleotides; ii) binding a blocking element to the one or more non-fusion gene circular template polynucleotides; and iii) hybridizing a first primer and a second primer to the one or more non-fusion gene circular template polynucleotides and the one or more fusion gene circular template polynucleotides and extending with a polymerase to produce a first amount of non-fusion polynucleotide amplification product and a second amount of fusion polynucleotide amplification product, wherein the first amount is detectably less than the second amount; thereby differentially amplifying polynucleotides comprising fusion genes (e.g., fusion genes comprising fusion junctions). In embodiments, the circular template polynucleotide comprises a continuous strand lacking free 5' and 3' ends. In an embodiment, the first amount is a quantity or number. In an embodiment, the second amount is a quantity or number. In an embodiment, the first amount is a plurality. In an embodiment, the second amount is a plurality.
In one aspect, a method of differentially amplifying a polynucleotide comprising a fusion gene relative to a polynucleotide not comprising the fusion gene is provided. In an embodiment, the method comprises: i) Binding a blocking element to one or more non-fusion circular template polynucleotides; and ii) hybridizing the first and second primers to the one or more non-fusion circular template polynucleotides; iii) Hybridizing the first primer and the second primer to one or more fusion circular template polynucleotides; and iv) extending with a polymerase to produce a first amount of non-fused polynucleotide amplification product and a second amount of fused polynucleotide amplification product, wherein the first amount is detectably less than the second amount; thereby differentially amplifying polynucleotides comprising fusion genes (e.g., fusion genes comprising fusion junctions). In embodiments, the circular template polynucleotide comprises a continuous strand lacking free 5 'and 3' ends. In embodiments, prior to step i) (i.e., binding blocking element), the method further comprises circularizing a plurality of linear nucleic acid molecules to form a plurality of circular template polynucleotides, wherein one or more of the linear nucleic acid molecules comprises a fusion gene, thereby forming one or more fusion gene circular template polynucleotides, and wherein one or more of the linear nucleic acid molecules does not comprise a fusion gene, thereby forming one or more non-fusion gene circular template polynucleotides.
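The differential amplification described above can be illustrated with a toy in silico model. The sketch below is hypothetical throughout: the sequences, the function names, and the per-cycle efficiencies are assumptions chosen for illustration and are not values taken from this disclosure. It treats a circular template as blocked when the sequence bound by the blocking element occurs anywhere on the circle, including across the circularization point, and applies a lower assumed amplification efficiency to blocked templates.

```python
def contains_circular(circle: str, site: str) -> bool:
    """Return True if `site` occurs on the circular sequence `circle`,
    including occurrences that span the circularization (ligation) point."""
    if not site or len(site) > len(circle):
        return False
    # Appending the first len(site) - 1 bases exposes wrap-around matches.
    return site in circle + circle[: len(site) - 1]


def expected_yield(circle: str, blocker_site: str, cycles: int = 10,
                   eff_open: float = 0.9, eff_blocked: float = 0.1) -> float:
    """Fold amplification under a simple exponential model.

    eff_open and eff_blocked are ASSUMED per-cycle efficiencies for
    templates without and with the blocker-binding site, respectively.
    """
    blocked = contains_circular(circle, blocker_site)
    eff = eff_blocked if blocked else eff_open
    return (1.0 + eff) ** cycles


# Hypothetical sequences: the blocker binds a wild-type segment that is
# replaced by partner-gene sequence in the fusion template.
BLOCKER_SITE = "GGATCCAAGT"
non_fusion = "AAGT" + "TTTTCCCCAAAA" + "GGATCC"  # site spans the ligation point
fusion = "ACGTACGTAC" + "TTTTCCCCAAAA"           # site absent

first_amount = expected_yield(non_fusion, BLOCKER_SITE)   # non-fusion product
second_amount = expected_yield(fusion, BLOCKER_SITE)      # fusion product
print(first_amount < second_amount)
```

Searching the circle rather than the linear string matters here because circularization joins the two ends of the original molecule, so a binding site can be split across the ends of the linear representation.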
In one aspect, there is provided a method of amplifying a polynucleotide comprising a fusion gene, the method comprising: i) Binding a blocking element to a non-fusion circular template polynucleotide, wherein the non-fusion circular template does not comprise a fusion gene; ii) hybridizing a first primer and a second primer to the non-fusion circular template polynucleotide; and hybridizing the first primer and the second primer to a fusion circular template polynucleotide, wherein the fusion circular template polynucleotide comprises a fusion gene; and iii) extending the first primer and the second primer with a non-strand displacement polymerase to produce a fusion polynucleotide amplification product.
In another aspect, a method of amplifying a plurality of polynucleotides is provided, the method comprising: circularizing a plurality of linear nucleic acid molecules to form a plurality of circular template polynucleotides, wherein one or more of the linear nucleic acid molecules comprises a target sequence (e.g., a sequence of interest, such as a gene, SNV, CNV, indel, or fusion gene); binding a blocking element to one or more circular template polynucleotides that do not contain the target sequence; and hybridizing a first primer and a second primer to the circular template polynucleotides and extending with a polymerase to produce amplification products, wherein the amount of amplification product comprising the target sequence is greater than the amount of amplification product not comprising the target sequence. In embodiments, the target sequences comprise cancer somatic mutations, copy number variations, and gene fusions, including those involving novel partners or breakpoints.
In yet another aspect, a method of amplifying a polynucleotide comprising an unknown sequence is provided. In an embodiment, the method comprises: contacting a plurality of circular nucleic acid molecules with a plurality of blocking elements, wherein one or more of the circular nucleic acid molecules comprises an unknown sequence and one or more of the circular nucleic acid molecules comprises a known sequence, and wherein a blocking element binds to a known sequence; contacting the plurality of circular nucleic acid molecules with a plurality of first primers and a plurality of second primers; and extending the first primer and the second primer to produce a plurality of amplification products comprising known and unknown sequences, wherein a greater amount of amplification product comprising the unknown sequence is produced relative to the amplification product comprising the known sequence. In embodiments, the method further comprises detecting (e.g., sequencing) an amplification product comprising an unknown sequence.
In one aspect, a method of differentially amplifying a polynucleotide comprising a first fusion gene relative to a polynucleotide comprising a second fusion gene is provided. In an embodiment, the method comprises: i) Circularizing a plurality of linear nucleic acid molecules to form a plurality of circular template polynucleotides, wherein one or more of the linear nucleic acid molecules comprises a first fusion gene, thereby forming one or more fusion gene circular template polynucleotides, and wherein one or more of the linear nucleic acid molecules comprises a second fusion gene, thereby forming one or more second fusion gene circular template polynucleotides; ii) binding a blocking element to the one or more second fusion gene circular template polynucleotides; and iii) hybridizing the first and second primers to the one or more second fusion gene circular template polynucleotides and the one or more fusion circular template polynucleotides and extending with a polymerase to produce a first amount of second fusion gene polynucleotide amplification product and a second amount of fusion polynucleotide amplification product, wherein the first amount is detectably less than the second amount; thereby differentially amplifying the polynucleotide comprising the first fusion gene. In embodiments, the circular template polynucleotide comprises a continuous strand lacking free 5 'and 3' ends.
In one aspect, a method is provided for identifying the frequency of convergence of a subject's immune repertoire (e.g., for predicting a clinical response of a subject to a therapy by identifying the frequency of convergence of the subject's immune repertoire prior to receiving the therapy). In an embodiment, the method comprises: a) Obtaining a sample from a subject, the sample comprising one or more linear nucleic acid molecules comprising an immunoreceptor sequence (e.g., a T Cell Receptor (TCR), B Cell Receptor (BCR), or antibody (Ab) target); b) Circularizing the one or more linear nucleic acid molecules to form circular template polynucleotides comprising contiguous strands lacking free 5' and 3' ends, and amplifying the one or more circular template polynucleotides to produce a plurality of amplification products comprising an immunoreceptor sequence; c) Sequencing the plurality of amplification products to produce a plurality of sequencing reads; d) Identifying immune receptor clones by analyzing the plurality of sequencing reads; and e) detecting convergent immune receptor clones among the immune receptor clones, wherein the convergent immune receptor clones have similar or identical amino acid sequences and different nucleotide sequences. In embodiments, the method comprises hybridizing a blocking element to the one or more circular template polynucleotides prior to amplification. In embodiments, the method does not comprise hybridizing a blocking element to the one or more circular template polynucleotides. In embodiments, the method further comprises determining the frequency of convergent immune receptor clones in the sample. In embodiments, the method further comprises treating the subject with immunotherapy when the frequency of convergent immune receptor clones in the sample is greater than a convergence frequency cutoff value, wherein the sequence identifying the convergent immune receptor clones comprises a CDR3 sequence.
As used herein, the term "immune repertoire" refers to a collection of T cell receptors and B cell receptors (e.g., immunoglobulins) that make up the adaptive immune system of an organism. As used herein, "frequency of convergence" refers to the aggregate frequency (excluding allele information) of clones sharing variable genes.
In an embodiment, the amplification comprises a multiplex amplification reaction comprising a plurality of amplification primer pairs comprising a plurality of joining (J) gene primers for a majority of the J genes of the target immune receptor (i.e., the primer pairs comprise sequences complementary to J genes). The methods described herein allow the J genes to be targeted with outward-facing primers, thereby detecting V(D)J regions without directly targeting each V gene. In an example, V gene identity and sequences comprising CDR3 amino acid sequences are used to identify convergent immune receptor clones. In embodiments, the sequences identifying the convergent immune receptor clones comprise CDR1 and CDR3 sequences or CDR2 and CDR3 sequences. In an embodiment, the convergent immune receptor clones have the same CDR3 amino acid sequence. In embodiments, the target immunoreceptor nucleic acid molecule comprises FR1, CDR1, FR2, CDR2, FR3, and CDR3 coding regions of the target immunoreceptor.
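To illustrate why outward-facing J gene primers can capture a V(D)J region on a circular template without any V gene primer, the following minimal sketch performs an in silico inverse PCR. All sequences and names are hypothetical placeholders, not real V, CDR3, or J gene sequences or primers from this disclosure, and the model assumes exact primer matches on a single displayed strand.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def inverse_pcr_amplicon(circle: str, fwd_primer: str, rev_primer: str):
    """Return the amplicon bounded by two primers on a circular template.

    `fwd_primer` matches the displayed strand and extends 5'->3' along it;
    `rev_primer` is the reverse complement of its site on the displayed
    strand. Returns None if either site is missing or the product would
    exceed one lap of the circle.
    """
    n = len(circle)
    tripled = circle * 3  # lets sites and products wrap past the ligation point
    start = tripled.find(fwd_primer)
    if start < 0 or start >= n:
        return None
    rev_site = revcomp(rev_primer)
    site_pos = tripled.find(rev_site, start)
    if site_pos < 0:
        return None
    end = site_pos + len(rev_site)
    if end - start > n:
        return None
    return tripled[start:end]


# Hypothetical linear molecule: unknown V segment + CDR3 + known J segment.
V = "ATGCCGTTAGCA"
CDR3 = "TGTGCCAGCAGT"
J = "CTGAACACTGAAGCTTTCTTTGGA"
circle = V + CDR3 + J  # circularization joins the 3' end of J to the 5' end of V

fwd = J[14:]          # points out of the 3' end of J, across the ligation point
rev = revcomp(J[:8])  # points out of the 5' end of J, back into CDR3 and V

amplicon = inverse_pcr_amplicon(circle, fwd, rev)
print(amplicon)
```

In this toy case both primers sit inside the J segment, yet the product runs from J across the ligation point and through the V and CDR3 segments before returning to J, which is the behavior the paragraph above attributes to outward-facing primers.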
As used herein, a "convergent TCR set" is a set of T Cell Receptors (TCRs) that are similar in amino acid sequence and functionally equivalent, or that are identical or hypothesized to be identical in amino acid sequence. Because of this amino acid similarity, the members of a convergent TCR set are generally assumed to recognize the same antigen. In some embodiments, the members of a convergent TCR set are identical or assumed to be identical in the variable gene and CDR3 amino acid sequences, despite having different nucleotide sequences. Members of a convergent TCR set may arise from differences in the non-templated nucleotide bases at VDJ junctions that occur during the generation of productive TCR gene rearrangements. To assess TCR convergence, for example, TCR β chains that are identical in amino acid sequence but differ in nucleotide sequence are identified.
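The definitions above suggest a straightforward calculation, sketched below under stated assumptions. Clones are grouped by variable gene (with allele information stripped) and CDR3 amino acid sequence; a group is treated as convergent when it contains more than one distinct nucleotide sequence, and the frequency of convergence is the summed read count of convergent groups divided by the total. The input layout, the gene names, and the function name are hypothetical, and a real pipeline may define clones and counts differently.

```python
from collections import defaultdict


def convergence_frequency(clones) -> float:
    """Aggregate frequency of convergent clones.

    `clones` is an iterable of (v_gene, cdr3_aa, cdr3_nt, count) tuples.
    Allele suffixes such as '*01' are ignored when grouping by V gene.
    """
    groups = defaultdict(lambda: defaultdict(int))
    total = 0
    for v_gene, cdr3_aa, cdr3_nt, count in clones:
        v_no_allele = v_gene.split("*")[0]
        groups[(v_no_allele, cdr3_aa)][cdr3_nt] += count
        total += count
    if total == 0:
        return 0.0
    convergent = sum(
        sum(nt_counts.values())
        for nt_counts in groups.values()
        if len(nt_counts) > 1  # same amino acids, different nucleotides
    )
    return convergent / total


# Hypothetical clones: the first two share a V gene and CDR3 amino acid
# sequence but differ in nucleotide sequence, so they are convergent.
example_clones = [
    ("TRBV5-1*01", "CASSL", "TGTGCCAGCAGCTTG", 30),
    ("TRBV5-1*02", "CASSL", "TGTGCTAGCAGCTTA", 10),
    ("TRBV7-9*01", "CASSQ", "TGTGCCAGCAGCCAA", 60),
]
freq = convergence_frequency(example_clones)
print(freq)
```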
In some embodiments, the subject is treated with therapy in a manner that depends on the frequency of convergent immune receptor clones. For example, in some embodiments, a frequency of convergent immune receptor clones greater than a convergence frequency cutoff value indicates that the subject is a candidate for therapy, and a frequency of convergent immune receptor clones less than the convergence frequency cutoff value indicates that the subject is not a candidate for therapy. In some embodiments, provided methods comprise identifying convergent immune receptor clones from immune receptor clones present in a sample at a frequency of greater than 1/50,000. In some embodiments, the convergence frequency cutoff value is a frequency greater than 0.01. In some embodiments, the subject has cancer and is a candidate for immunotherapy. In other embodiments, the subject is a candidate for vaccination against an infectious agent or infectious disease. In other embodiments, the subject is a candidate for treatment with an autoimmune inhibitor.
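The two thresholds mentioned above can be expressed as a short sketch. The function and variable names are hypothetical, and using exactly 0.01 as the cutoff is an assumption made for illustration only, since the text states only that the cutoff is a frequency greater than 0.01.

```python
MIN_CLONE_FREQUENCY = 1 / 50_000  # clone-level floor taken from the text above
CONVERGENCE_CUTOFF = 0.01         # ASSUMED illustrative cutoff value


def clones_above_floor(clone_counts: dict, floor: float = MIN_CLONE_FREQUENCY) -> dict:
    """Keep clones whose frequency in the sample exceeds `floor`."""
    total = sum(clone_counts.values())
    if total == 0:
        return {}
    return {clone: n for clone, n in clone_counts.items() if n / total > floor}


def is_candidate(convergence_freq: float, cutoff: float = CONVERGENCE_CUTOFF) -> bool:
    """True when the convergence frequency exceeds the cutoff."""
    return convergence_freq > cutoff


# Hypothetical read counts: clone_B and clone_C fall below 1/50,000.
counts = {"clone_A": 99_998, "clone_B": 1, "clone_C": 1}
kept = clones_above_floor(counts)
print(kept, is_candidate(0.4), is_candidate(0.005))
```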
In some embodiments, provided methods comprise using V gene identity and sequences comprising CDR3 amino acid sequences to identify convergent immunoreceptor clones. In some embodiments, provided methods comprise identifying a converged immune receptor clone using a sequence comprising CDR3 sequences, CDR1 and CDR3 sequences, or CDR2 and CDR3 sequences.
In some embodiments, provided methods comprise identifying convergent TCR clones as those having variable gene and CDR3 rearrangements that are similar or identical in amino acid sequence but different in nucleotide sequence. For example, a significant portion of TCRs that differ from each other by one amino acid residue may have similar or identical specificity for an antigen, and thus such TCRs may be considered convergent.
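One simple way to test whether two CDR3 amino acid sequences differ by at most one residue is a Hamming distance check, sketched below. This is an assumed similarity criterion for illustration: it considers substitutions only and treats sequences of different lengths as dissimilar, and the example sequences are hypothetical.

```python
def differ_by_at_most_one_residue(cdr3_a: str, cdr3_b: str) -> bool:
    """True if two equal-length amino acid sequences have <= 1 mismatch.

    Insertions and deletions are not considered; sequences of different
    lengths are reported as not similar.
    """
    if len(cdr3_a) != len(cdr3_b):
        return False
    mismatches = sum(a != b for a, b in zip(cdr3_a, cdr3_b))
    return mismatches <= 1


# Hypothetical CDR3 amino acid sequences.
print(differ_by_at_most_one_residue("CASSLGQYF", "CASSLGEYF"))  # one substitution
print(differ_by_at_most_one_residue("CASSLGQYF", "CASSPGEYF"))  # two substitutions
```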
In some embodiments, a change in the frequency of convergent TCR clones during treatment with a therapy can be used as a predictor of response to the therapy. Depending on the type of disease and the treatment, in some embodiments, responders may be distinguished from non-responders by an increase in the frequency of convergent TCR clones during therapy. For example, in cancers (or chronic viral infections) in which the convergent TCR clones of the T cell population consist primarily of T cells having a progenitor exhausted phenotype, a terminally exhausted phenotype, or an effector phenotype, among other effector T cell phenotypes, an increase in the frequency of convergent TCR clones during treatment may be indicative of an increase in anti-cancer (or anti-viral) T cell activity. In other cancers, the convergent TCR clones may have predominantly a T regulatory phenotype, and an increase in the frequency of convergent TCR clones during therapy may indicate poor prognosis.
In some embodiments, the measurement or determination of the frequency of convergent TCR clones is combined with other T cell repertoire features, such as a measurement of T cell clonal expansion, to improve prediction of clinical responsiveness. In some embodiments, the measurement or determination of the frequency of convergent TCR clones is combined with a measurement of B cell repertoire features, such as B cell clonal expansion, to improve prediction of clinical responsiveness. In some embodiments, the measurement or determination of the frequency of convergent TCR clones is combined with measurement or detection of expression of one or more genes associated with immune responses to improve prediction of clinical responsiveness. Such immune response related genes include, but are not limited to, PD-1 and/or PD-L1 genes, interferon-gamma pathway genes, and myeloid-derived suppressor cell related genes. Procedures and reagents for detecting or measuring such gene expression are known in the art and include, but are not limited to, quantitative or semi-quantitative PCR assays, comparative hybridization methods, and sequencing procedures, as well as reagents and kits for such use, including, but not limited to, TaqMan assays and the Oncomine Immune Response Research Assay (Thermo Fisher Scientific).
In embodiments, the method further comprises identifying a clonotype. In embodiments, the method further comprises quantifying the clonotypes present in the sample (e.g., generating a clonotype profile). A "clonotype profile" refers to a collection of different clonotypes derived from a lymphocyte population and their relative abundances, which may be expressed, for example, as frequencies (i.e., values between 0 and 1) in a given population. Typically, the lymphocyte population is obtained from a tissue sample. The term "clonotype profile" is related to, but more general than, the immunological concept of the immune "repertoire" as described in: Arstila et al., Science, 286: 958-961 (1999); and Kedzierska et al., Mol. Immunol., 45(3): 607-618 (2008).
In an embodiment, the clonotype profile comprises at least 10³ different clonotypes. In an embodiment, the clonotype profile comprises at least 10⁸ different clonotypes. In an embodiment, the clonotype profile comprises at least 10⁵ different clonotypes. In an embodiment, the clonotype profile comprises at least 10⁶ different clonotypes. In embodiments, such clonotype profiles may further comprise the abundance (i.e., quantification) or relative frequency of each different clonotype. In an embodiment, the clonotype profile is a set of different recombinant nucleotide sequences (and abundances thereof), or fragments thereof, encoding a T Cell Receptor (TCR) or B Cell Receptor (BCR), respectively, in a lymphocyte population of an individual, wherein the nucleotide sequences of the set have a correspondence (e.g., a 1:1 correspondence) with different lymphocytes, or clonal subpopulations thereof, of substantially all lymphocytes of the population.
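As a minimal sketch of how a clonotype profile with relative frequencies might be tabulated from sequencing output, the example below counts one clonotype call per read and normalizes the counts. The input format (one clonotype label per read), the labels themselves, and the function name are assumptions for illustration; clonotype calling from raw reads is outside the scope of the sketch.

```python
from collections import Counter


def clonotype_profile(read_clonotypes) -> dict:
    """Map each distinct clonotype to its relative frequency (0 to 1).

    `read_clonotypes` is an iterable with one clonotype label per read.
    """
    counts = Counter(read_clonotypes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {clonotype: n / total for clonotype, n in counts.items()}


# Hypothetical per-read clonotype calls.
reads = ["clonotype_1"] * 6 + ["clonotype_2"] * 3 + ["clonotype_3"]
profile = clonotype_profile(reads)
print(profile)
```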
In embodiments, the first primer hybridizes to one or more non-fused circular template polynucleotides and the second primer hybridizes to one or more fused circular template polynucleotides. In embodiments, the second primer hybridizes to one or more non-fused circular template polynucleotides and the first primer hybridizes to one or more fused circular template polynucleotides. In an embodiment, the plurality of first primers hybridizes to the plurality of non-fused circular template polynucleotides. In an embodiment, the plurality of second primers hybridizes to the plurality of fusion circular template polynucleotides.
In embodiments, the one or more linear nucleic acid molecules comprise DNA, RNA, or cDNA; optionally wherein the DNA or RNA is cell-free nucleic acid. In embodiments, the one or more linear nucleic acid molecules comprise RNA or cDNA and the fusion junction comprises an exon junction. In embodiments, the one or more linear nucleic acid molecules comprise cDNA and the fusion junctions comprise exon junctions. In embodiments, the one or more linear nucleic acid molecules comprise RNA and the fusion junction comprises an exon junction. In embodiments, the one or more linear nucleic acid molecules comprise DNA and the fusion junction comprises an exon junction. In embodiments, the one or more linear nucleic acid molecules comprise a sample barcode sequence, a molecular identifier sequence, or both a sample barcode sequence and a molecular identifier sequence.
In embodiments, the fusion gene comprises an interchromosomal translocation (e.g., a fusion junction between two different chromosomes) or an intrachromosomal translocation (e.g., a fusion junction within the same chromosome). In embodiments, the fusion gene comprises an interchromosomal translocation. In embodiments, the fusion gene comprises an intrachromosomal translocation. In embodiments, the chromosomal translocation comprises a partially or fully rearranged B cell or T cell antigen receptor. In embodiments, the intrachromosomal translocation comprises a partially rearranged B cell antigen receptor. In embodiments, the intrachromosomal translocation comprises a partially rearranged T cell antigen receptor. In embodiments, the intrachromosomal translocation comprises a fully rearranged B cell antigen receptor. In embodiments, the intrachromosomal translocation comprises a fully rearranged T cell antigen receptor.
In embodiments, the sequence of the first region comprises the sequence of a first gene (e.g., the entire gene sequence or a portion thereof) and the sequence of the second region comprises the sequence of a second gene (e.g., the entire gene sequence or a portion thereof). In embodiments, the location at which the first gene is linked to the second gene by an internucleoside linkage is a fusion junction.
In an embodiment, the linear nucleic acid molecules are obtained from a peripheral blood sample using conventional techniques. For example, white blood cells can be isolated from a blood sample using conventional techniques, such as the RosetteSep kit. The volume of the blood sample may range from 100 μL to 10 mL. In embodiments, the volume of the blood sample ranges from 100 μL to 2 mL. Nucleic acid molecules (e.g., DNA and/or RNA) can then be extracted from such blood samples using conventional techniques, such as the DNeasy Blood & Tissue Kit. Optionally, subsets of leukocytes, such as lymphocytes, may be further isolated using conventional techniques, such as Fluorescence Activated Cell Sorting (FACS) or Magnetically Activated Cell Sorting (MACS). Cell-free nucleic acid molecules (e.g., cell-free DNA) may also be extracted from peripheral blood samples using conventional techniques as described in: US 6,258,540 or Huang et al., Methods Mol. Biol., 444: 203-208 (2008), each of which is incorporated herein by reference. For example, peripheral blood may be collected in EDTA tubes and then fractionated into plasma, white blood cell, and red blood cell components by centrifugation. DNA from the cell-free plasma fraction (e.g., 0.5 to 2.0 mL) can be extracted using a QIAamp DNA Blood Mini Kit according to the manufacturer's protocol. Various methods and commercially available kits for isolating different subpopulations of T cells and B cells are known in the art and include, but are not limited to, subset selection by immunomagnetic bead isolation or flow cytometric cell sorting using antibodies specific for one or more of any of a variety of known T cell and B cell surface markers. Illustrative markers include, but are not limited to, one or a combination of the following: CD2, CD3, CD4, CD8, CD14, CD19, CD20, CD25, CD28, CD45RO, CD45RA, CD54, CD62L, CDw137 (4-1BB), CD154, GITR, and FoxP3.
For example, and as known to those of skill in the art, cell surface markers such as CD2, CD3, CD4, CD8, CD14, CD19, CD20, CD45RA, and CD45RO can be used to determine T cell, B cell, and monocyte lineages and subpopulations in flow cytometry. Similarly, forward light scatter, side scatter, and/or cell surface markers, such as CD25, CD62L, CD54, CD137, and CD154, may be used to determine the activation status and functional characteristics of the cells. The linear nucleic acid molecules (e.g., DNA or RNA) can be extracted from cells in a sample, such as a blood or lymph sample or other sample from a subject known to have or suspected of having a disease (e.g., a lymphoid hematological malignancy), using standard methods known in the art or commercially available kits.
In embodiments, the blocking element comprises an oligonucleotide, a protein, or a combination thereof. In an embodiment, the blocking element comprises an oligonucleotide. In an embodiment, the blocking element is an oligonucleotide. In an embodiment, the blocking element is an oligonucleotide having 5-25 nucleotides. In an embodiment, the blocking element is an oligonucleotide having 10-50 nucleotides. In an embodiment, the blocking element is an oligonucleotide having 20-75 nucleotides. In embodiments, the blocking element is an oligonucleotide having about 5, about 10, about 20, about 25, about 50, or about 75 nucleotides. In an embodiment, the blocking element is a non-extendable oligomer. In embodiments, the blocking element comprises two or more oligonucleotides arranged in tandem. In embodiments, the blocking element comprises an oligonucleotide and an oligonucleotide that is the inverse complement or partial inverse complement of the oligonucleotide (e.g., producing a pair of partially overlapping oligonucleotides). In an embodiment, the blocking element is a single stranded oligonucleotide having a 5 'end and a 3' end. In an embodiment, the blocking element comprises a 3' -blocked oligonucleotide. In an embodiment, the blocking element comprises a blocking moiety on the 3' nucleotide. The blocking moiety on a nucleotide may be reversible, whereby the blocking moiety may be removed or modified to allow the 3 'hydroxyl group to form a covalent bond with the 5' phosphate of another nucleotide. For example, a reversible terminator may refer to a blocking moiety located, for example, at the 3' position of a nucleotide, and may be a chemically cleavable moiety, such as allyl, azidomethyl, or methoxymethyl, or may be an enzymatically cleavable group, such as a phosphate. In embodiments, the blocking moiety is irreversible (e.g., a blocking element comprising the blocking moiety irreversibly prevents extension).
In an embodiment, the blocking element is a non-extendable oligonucleotide. Blocking groups known in the art may be placed at or near the 3' end of an oligonucleotide (e.g., a primer) to prevent extension, as described in US 2010/0167353. Primers or other oligonucleotides may be modified at the 3' terminal nucleotide to prevent or inhibit the initiation of DNA synthesis by, for example, adding a 3' deoxyribonucleotide residue (e.g., cordycepin), a 2',3'-dideoxyribonucleotide residue, a non-nucleotide linkage, or an alkane-diol modification (see, e.g., U.S. Pat. No. 5,554,516). Alkane-diol modifications that can be used to inhibit or block primer extension are also described by Wilk et al. (1990, Nucleic Acids Res., 18(8): 2065) and Arnold et al. (U.S. Pat. No. 6,031,091). Further examples of suitable blocking groups include a 3' hydroxyl substitution (e.g., 3'-phosphate, 3'-triphosphate, or a 3'-phosphodiester with an alcohol, such as 3-hydroxypropyl), a 2',3'-cyclic phosphate, and a 2' hydroxyl substitution of a terminal RNA base (e.g., phosphate or a sterically bulky group, such as triisopropylsilyl (TIPS) or tert-butyldimethylsilyl (TBDMS)). 2'-alkylsilyl groups, such as TIPS and TBDMS, substituted at the 3' end of oligonucleotides are described in US 2007/0218490, which is incorporated herein by reference. Bulky substituents may also be incorporated on the base of the 3' terminal residue of the oligonucleotide to block primer extension.
In embodiments, the blocking element comprises an oligonucleotide with a 3' dideoxynucleotide or similar modification that prevents polymerase extension, and is used in conjunction with a non-strand-displacing polymerase. In some embodiments, the blocking oligomer contains one or more non-natural bases (e.g., LNA bases) that facilitate hybridization of the blocking oligomer to the target sequence. In some embodiments, the blocking oligomer contains additional modifications that increase resistance to exonuclease digestion (e.g., one or more phosphorothioate linkages). In an embodiment, the blocking element is an oligonucleotide comprising one or more modified nucleotides that are complementary to each other, such as iso-dGTP or iso-dCTP; in a polymerization reaction lacking the complementary modified nucleotide, extension is blocked. In another embodiment, the blocking element is an oligonucleotide comprising a 3' cleavable linker comprising PEG, thereby blocking extension. In another embodiment, the blocking element is an oligonucleotide comprising one or more sequences recognized and bound by one or more short RNA or PNA oligonucleotides, thereby blocking extension by a strand-displacing DNA polymerase that is not capable of displacing RNA or PNA. In embodiments, the blocking element is a modified nucleotide (e.g., a nucleotide comprising a reversible terminator, such as a 3'-reversible terminating moiety).
In embodiments, the blocking element comprises an oligonucleotide, a protein, or a combination thereof. In an embodiment, the blocking element comprises a protein. In embodiments, the blocking element comprises one or more proteins. The blocking element need not be an oligomer; in some embodiments, for example, the blocking element is a protein that selectively binds to the target sequence and prevents polymerase extension. In embodiments, the blocking element is an oligonucleotide comprising one or more modified nucleotides. In embodiments, the blocking element is an oligonucleotide comprising one or more modified nucleotides, wherein the one or more modified nucleotides are linked to biotin to which a protein (e.g., streptavidin) can bind, thereby blocking polymerase extension. In embodiments, the blocking element comprises one or more sequences that are recognized and bound by one or more single-stranded DNA binding proteins, thereby blocking polymerase extension at the binding site.
In embodiments, the blocking element comprises a CRISPR-Cas9 complex. For example, guide RNAs that specifically target non-fusion sequences are introduced into a sample containing circularized ssDNA. The CRISPR-Cas9 complex then targets and cleaves the non-fusion sequences present in any circular ssDNA molecule. After the non-fusion circular ssDNA molecules are linearized by the CRISPR complex, an exonuclease digestion can be performed to digest the linear ssDNA molecules, thereby enriching for the circular ssDNA molecules containing the fusion gene (e.g., molecules lacking the non-fusion gene sequence targeted by the guide RNA).
In an embodiment, the blocking element comprises biotin. For example, after circularization, the biotinylated blocking element is hybridized to a non-fusion gene sequence. The circular ssDNA molecules hybridized to the biotinylated blocking elements are then pulled down using, for example, streptavidin-coated magnetic beads, thereby depleting the sample of non-fusion circular molecules prior to amplification.
In embodiments, the blocking element comprises a restriction site. For example, the blocking element acts as a splint to enable restriction enzyme-mediated digestion of non-fusion circular ssDNA molecules into non-amplifiable linear fragments. A methylated blocking oligomer can be used in combination with a methylation-sensitive restriction enzyme (e.g., NotI, NaeI, NsbI, SalI, HapII, or HaeII).
In an embodiment, binding the blocking element comprises binding the blocking element upstream of the first primer. The terms "upstream" and "downstream" are used in accordance with their ordinary meaning in the art and refer to positioning toward the 5' end (upstream) or toward the 3' end (downstream) when referring to a nucleic acid. In an embodiment, the blocking element binds about 1 to 150 nucleotides upstream relative to the first primer. In an embodiment, the blocking element binds about 1 to 15 nucleotides upstream relative to the first primer. In embodiments, the blocking element binds about 10 to about 25 nucleotides upstream relative to the first primer.
In embodiments, the first primer hybridizes to the one or more fusion circular template polynucleotides about 1 to 100 nucleotides downstream of the fusion junction within the fusion gene. In embodiments, the first primer hybridizes to the one or more fusion circular template polynucleotides about 10 to about 50 nucleotides downstream of the fusion junction within the fusion gene. In embodiments, the first primer hybridizes to the one or more fusion circular template polynucleotides about 50 to about 200 nucleotides downstream of the fusion junction within the fusion gene. In embodiments, the first primer hybridizes to the one or more fusion circular template polynucleotides about 50 to about 100 nucleotides downstream of the fusion junction within the fusion gene. In embodiments, the first primer hybridizes to the one or more fusion circular template polynucleotides about 25 to about 50 nucleotides downstream of the fusion junction within the fusion gene. In embodiments, the first primer hybridizes to the one or more fusion circular template polynucleotides about 50 nucleotides downstream of the fusion junction within the fusion gene. In embodiments, the first primer hybridizes to the one or more fusion circular template polynucleotides about 25 nucleotides downstream of the fusion junction within the fusion gene. In embodiments, the first primer hybridizes to the one or more fusion circular template polynucleotides about 10 nucleotides downstream of the fusion junction within the fusion gene.
In embodiments, the method further comprises binding a second blocking element to the one or more non-fusion circular template polynucleotides downstream relative to the second primer. In embodiments, the second blocking element binds about 100 to about 300 nucleotides downstream relative to the second primer. In embodiments, the second blocking element binds about 75 to about 150 nucleotides downstream relative to the second primer. In embodiments, the second blocking element binds about 50 to about 300 nucleotides downstream relative to the second primer. In embodiments, the second blocking element binds about 100 to about 400 nucleotides downstream relative to the second primer.
In an embodiment, the method further comprises repeating steps ii) and iii). In an embodiment, the method further comprises repeating the following: ii) binding a blocking element to the one or more non-fused circular template polynucleotides; and iii) hybridizing a first primer and a second primer to the one or more non-fusion circular template polynucleotides and the one or more fusion circular template polynucleotides and extending with a polymerase to produce a first amount of non-fusion polynucleotide amplification product and a second amount of fusion polynucleotide amplification product, wherein the first amount is detectably less than the second amount; thereby differentially amplifying polynucleotides comprising fusion genes (e.g., fusion genes comprising fusion junctions).
In embodiments, the first primer and the second primer hybridize to complementary sequences of the one or more fused circular template polynucleotides and the one or more non-fused circular template polynucleotides, wherein the first primer and the second primer are about 1 to about 50 nucleotides apart. In embodiments, the first primer and the second primer hybridize to complementary sequences of the one or more fused circular template polynucleotides and the one or more non-fused circular template polynucleotides, wherein the first primer and the second primer are about 1 to about 10 nucleotides apart. In embodiments, the first primer and the second primer hybridize to complementary sequences of the one or more fused circular template polynucleotides and the one or more non-fused circular template polynucleotides, wherein the first primer and the second primer are about 5 to about 25 nucleotides apart. In an embodiment, the first primer and the second primer are about 10 nucleotides apart. In an embodiment, the first primer and the second primer are about 25 nucleotides apart. In an embodiment, the first primer and the second primer are about 50 nucleotides apart. In an embodiment, the first primer and the second primer are about 75 nucleotides apart. In an embodiment, the first primer and the second primer are separated by about 100 nucleotides.
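The primer spacing described above can be checked numerically once the two annealing sites are expressed as coordinates on the circular template. The following is a minimal sketch only. It assumes that both sites have been projected onto a single strand as 0-based half-open intervals and that "apart" means the shorter gap between the two sites around the circle. The function names and the coordinate convention are hypothetical and are not taken from this disclosure.

```python
def circular_gap(template_length, site_a, site_b):
    """Shorter gap (in nucleotides) between two annealing sites on a circle.

    Each site is a (start, end) pair, 0-based and half-open, projected onto
    one strand of the circular template (an assumption of this sketch).
    """
    if template_length <= 0:
        raise ValueError("template_length must be positive")
    start_a, end_a = site_a
    start_b, end_b = site_b
    # Gap walking from the end of site A to the start of site B ...
    forward = (start_b - end_a) % template_length
    # ... and from the end of site B to the start of site A (wrapping around).
    backward = (start_a - end_b) % template_length
    return min(forward, backward)


def spacing_in_range(template_length, site_a, site_b, low=1, high=50):
    """True if the two sites are between `low` and `high` nucleotides apart."""
    return low <= circular_gap(template_length, site_a, site_b) <= high


if __name__ == "__main__":
    # Hypothetical 300-nt circle; the second site wraps past the origin.
    print(circular_gap(300, (280, 298), (5, 25)))
    print(spacing_in_range(300, (280, 298), (5, 25)))
```

The modulo arithmetic is what lets a site near the end of the linearized coordinate system count as adjacent to a site near the start.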
In embodiments, the second amount is about 1%, about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 40%, about 50%, about 75% greater than the first amount. In embodiments, the second amount is about 0.01%, about 0.05%, about 0.010%, about 0.015%, about 0.020%, about 0.025%, about 0.030%, about 0.040%, about 0.050%, about 0.075% greater than the first amount. In embodiments, the second amount is about 0.1%, about 0.5%, about 0.10%, about 0.15%, about 0.20%, about 0.25%, about 0.30%, about 0.40%, about 0.50%, about 0.75% greater than the first amount. In an embodiment, the second number is greater than the first number. In embodiments, the first amount is about 1%, about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 40%, about 50%, about 75% less than the second amount. In embodiments, the first amount is about 0.01%, about 0.05%, about 0.010%, about 0.015%, about 0.020%, about 0.025%, about 0.030%, about 0.040%, about 0.050%, about 0.075% less than the second amount. In embodiments, the first amount is about 0.1%, about 0.5%, about 0.10%, about 0.15%, about 0.20%, about 0.25%, about 0.30%, about 0.40%, about 0.50%, about 0.75% less than the second amount.
In embodiments, the second number is about 2 times, at least about 1.5 times, at least about 2.0 times, at least about 2.5 times, at least about 5 times, at least about 10 times, or more than about 10 times the first number. In an embodiment, the second number is about 1.0 times the first number. In an embodiment, the second number is about 2.0 times the first number. In an embodiment, the second number is about 5.0 times the first number. In an embodiment, the second number is about 20 times the first number.
In an embodiment, the second amount, quantified after one extension cycle, is measurably higher than the first amount. In an embodiment, the method produces a first amount of non-fusion polynucleotide amplification product and a second amount of fusion polynucleotide amplification product at a ratio of 1.00:1.01. In an embodiment, the ratio of the first amount to the second amount is 1.00:1.02. In an embodiment, the ratio of the first amount to the second amount is 1.00:1.05. In an embodiment, the ratio of the first amount to the second amount is 1.00:1.10. For example, at a ratio of 1.00:1.02 per cycle, 35 extension cycles (e.g., 35 PCR cycles, each of which comprises the steps of primer hybridization, primer extension, and denaturation) produce a second amount enriched about 1.999-fold relative to the first amount, wherein the fold enrichment is $1.02^{35}$. In an embodiment, the second amount, quantified after a plurality of extension cycles (e.g., 5, 10, 15, 20), is measurably higher than the first amount. In embodiments, the second amount quantified after 1 minute, 2 minutes, 3 minutes, 4 minutes, 5 minutes, 10 minutes, 15 minutes, or 20 minutes of amplification (e.g., eRCA) is measurably higher than the first amount.
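The compounding of a small per-cycle bias into a larger cumulative enrichment can be illustrated with a short calculation. The sketch below assumes that the same fusion-to-non-fusion ratio applies in every extension cycle, which is an idealization; the function name is hypothetical.

```python
def fold_enrichment(per_cycle_ratio, cycles):
    """Cumulative enrichment of fusion over non-fusion product.

    Assumes an identical per-cycle ratio in every cycle (idealized model),
    so the enrichment is per_cycle_ratio raised to the number of cycles.
    """
    if per_cycle_ratio <= 0:
        raise ValueError("per_cycle_ratio must be positive")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return per_cycle_ratio ** cycles


if __name__ == "__main__":
    # Per-cycle ratios named in the text (1.00:1.01 through 1.00:1.10).
    for ratio in (1.01, 1.02, 1.05, 1.10):
        print(f"1.00:{ratio:.2f} per cycle, 35 cycles: "
              f"{fold_enrichment(ratio, 35):.3f}-fold")
```

Under this model a 2% advantage per cycle roughly doubles the relative abundance of the fusion product over 35 cycles, consistent with the approximately 1.999-fold figure given above.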
In embodiments, the one or more linear nucleic acid molecules are about 20 to about 1000 nucleotides in length, about 100 to about 300 nucleotides in length, about 300 to about 500 nucleotides in length, or about 500 to about 1000 nucleotides in length. In embodiments, the one or more linear nucleic acid molecules are about 20 to 1000 nucleotides in length. In embodiments, the one or more linear nucleic acid molecules are about 100 to about 300 nucleotides in length. In embodiments, the one or more linear nucleic acid molecules are about 300 to about 500 nucleotides in length. In embodiments, the one or more linear nucleic acid molecules are about 500 to about 1000 nucleotides in length. In embodiments, the one or more linear nucleic acid molecules are about 20, about 50, about 75, about 100, about 150, about 200, about 250, about 300, about 350, about 400, about 450, about 500, about 550, about 600, about 650, about 700, about 750, about 800, about 850, about 900, about 950, or about 1000 nucleotides in length.
In an embodiment, the linear molecule is derived from a biological sample. In an embodiment, the linear molecule is derived from a sample. In an embodiment, the linear molecule is derived from a patient suffering from a disease. In embodiments, the linear molecule is derived from a cancer patient. "Patient" refers to a living organism (i.e., a subject) suffering from or susceptible to a disease or condition. Non-limiting examples include humans, other mammals, cows, rats, mice, dogs, monkeys, goats, sheep, deer, and other non-mammalian animals. In some embodiments, the patient is a human.
In embodiments, the one or more linear nucleic acid molecules comprise DNA, RNA, or cDNA; optionally wherein the DNA or RNA is a cell-free nucleic acid molecule. In embodiments, the one or more linear nucleic acid molecules comprise RNA or cDNA and the fusion junction is located at an exon junction. In embodiments, the one or more linear nucleic acid molecules comprise RNA or cDNA and the fusion gene comprises an exon junction formed by alternative splicing. In embodiments, the one or more linear nucleic acid molecules comprise RNA or cDNA and the fusion gene comprises an exon junction formed by a splice defect.
In embodiments, the one or more linear nucleic acid molecules comprise a barcode sequence. In embodiments, a plurality of linear nucleic acid molecules (e.g., all linear nucleic acid molecules from a particular sample source or sub-sample thereof) are conjugated to a first barcode sequence, while a different plurality of linear nucleic acid molecules (e.g., all linear nucleic acid molecules from a different sample source or different sub-sample) are conjugated to a second barcode sequence, thereby correlating each of the plurality of linear nucleic acid molecules with a different barcode sequence indicative of the sample source. In embodiments, each barcode sequence of the plurality of barcode sequences differs from each other barcode sequence of the plurality of barcode sequences by at least three nucleotide positions, such as at least 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotide positions. In some embodiments, a substantially degenerate barcode sequence may be referred to as random. In some embodiments, the barcode sequence may comprise a nucleic acid sequence from a pool of known sequences. In some embodiments, the barcode sequence may be predefined. In embodiments, the barcode sequence comprises about 1 to about 10 nucleotides. In embodiments, the barcode sequence comprises about 3, 4, 5, 6, 7, 8, 9, or about 10 nucleotides. In an embodiment, the barcode sequence comprises about 3 nucleotides. In an embodiment, the barcode sequence comprises about 5 nucleotides. In an embodiment, the barcode sequence comprises about 7 nucleotides. In an embodiment, the barcode sequence comprises about 10 nucleotides. In embodiments, the barcode sequence comprises about 6 to about 10 nucleotides.
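The requirement that every barcode differ from every other barcode at three or more nucleotide positions corresponds to a minimum pairwise Hamming distance. The following is an illustrative sketch only. The example barcodes are made up for demonstration and the function names are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance over all pairs in the set."""
    if len(barcodes) < 2:
        raise ValueError("need at least two barcodes")
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def is_valid_set(barcodes, min_distance=3):
    """True if every pair of barcodes differs at >= min_distance positions."""
    return min_pairwise_distance(barcodes) >= min_distance


if __name__ == "__main__":
    # Made-up 6-nt sample barcodes, for illustration only.
    example = ["AAAAAA", "CCCAAA", "AAACCC", "GGGGGG"]
    print(min_pairwise_distance(example), is_valid_set(example))
```

A minimum distance of three means that a single sequencing error in a barcode still leaves it closer to its true source than to any other barcode in the set.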
FIG. 1 and Example 1 describe examples of how cDNA can be fragmented to produce linear nucleic acid molecules. In embodiments, the polynucleotide is fragmented to an average length of about 150, about 250, or about 350 base pairs prior to circularizing one or more linear nucleic acid molecules. Fragmentation can be achieved by methods known in the art (e.g., enzymatic fragmentation, acoustic fragmentation). In embodiments, the polynucleotide is fragmented using enzymatic or acoustic fragmentation to produce linear nucleic acid molecules. In embodiments, the input polynucleotide is derived from a fresh or fresh-frozen sample and is minimally degraded prior to fragmentation. Next, the ssDNA fragments are circularized using CircLigase™ or other methods described herein. In some embodiments, circularization is facilitated by denaturing the nucleic acid prior to circularization. Residual linear DNA molecules may optionally be digested. This can be accomplished by methods known in the art (e.g., treatment with Exo I and/or Exo III enzymes).
In embodiments, circularization comprises intramolecular joining of the 5' and 3' ends of the linear nucleic acid molecules. In an embodiment, circularizing comprises a ligation reaction. In an embodiment, the two ends of the linear nucleic acid molecule are directly linked together. In an embodiment, the two ends of the linear nucleic acid molecule are joined together with the aid of bridging oligonucleotides (sometimes referred to as splint oligonucleotides) that are complementary to the two ends of the linear nucleic acid molecule. Methods for forming circular DNA templates are known in the art; e.g., linear polynucleotides are circularized in a non-template-driven reaction with a circularizing ligase, such as CircLigase™, CircLigase™ II, Taq DNA ligase, HiFi Taq DNA ligase, T4 DNA ligase, or Ampligase DNA ligase. In some embodiments, circularization is facilitated by denaturing the double-stranded linear nucleic acid prior to circularization. Residual linear DNA molecules may optionally be digested. In some embodiments, circularization is promoted by chemical ligation (e.g., click chemistry, e.g., a copper-catalyzed reaction of an alkyne (e.g., a 3' alkyne) and an azide (e.g., a 5' azide)). In an embodiment, the linear DNA fragment is A-tailed (e.g., A-tailed using Taq DNA polymerase) prior to circularization.
In an embodiment, the circularization of the linear nucleic acid molecule is performed with CircLigase™ enzyme. In embodiments, circularization of the linear nucleic acid molecule is performed with a thermostable RNA ligase or a mutant thereof. In an embodiment, the circularization of the linear nucleic acid molecule is performed with RNA ligase from phage TS2126 or mutants thereof. For example, the RNA ligase may be a TS2126 RNA ligase as described in U.S. patent publication 2005/0266439, which is incorporated herein by reference in its entirety.
In embodiments, circularization comprises ligating a first hairpin adaptor and a second hairpin adaptor to a linear nucleic acid molecule, thereby forming a circular polynucleotide.
In embodiments, the hairpin adaptors comprise a single nucleic acid strand comprising a stem loop structure. The hairpin adaptors may be of any suitable length. In some embodiments, the hairpin adaptors are at least 40, at least 50, or at least 100 nucleotides in length. In some embodiments, the hairpin adaptors are in the range of 45 to 500 nucleotides, 75 to 500 nucleotides, 45 to 250 nucleotides, 60 to 250 nucleotides, or 45 to 150 nucleotides in length. In some embodiments, the hairpin adaptors comprise a nucleic acid having a 5 'end, a 5' portion, a loop, a 3 'portion, and a 3' end (e.g., arranged in a 5 'to 3' orientation). In some embodiments, the 5 'portion of the hairpin adapter anneals to and/or hybridizes with the 3' portion of the hairpin adapter, thereby forming the stem portion of the hairpin adapter. In some embodiments, the 5 'portion of the hairpin adapter is substantially complementary to the 3' portion of the hairpin adapter. In certain embodiments, the hairpin adaptors comprise a stem portion (i.e., a stem) and a loop, wherein the stem portion is substantially double-stranded, thereby forming a duplex. In some embodiments, the loop of the hairpin adapter comprises a nucleic acid strand that is non-complementary (e.g., substantially non-complementary) to itself or any other portion of the hairpin adapter. In some embodiments, the second adapter comprises a sample barcode sequence, a molecular identifier sequence, or both a sample barcode sequence and a molecular identifier sequence. In some embodiments, the second adapter comprises a sample barcode sequence.
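The stem-loop arrangement described above, in which the 5' portion of the adapter pairs with its 3' portion, can be checked computationally by comparing the two ends of the sequence base by base. The sketch below is illustrative only. It counts perfect Watson-Crick pairs, ignores "substantially complementary" stems with mismatches, and uses made-up sequences that are not adapters from this disclosure.

```python
_PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(_PAIR[base] for base in reversed(seq.upper()))


def stem_length(adapter):
    """Length of the perfectly paired stem formed by the 5' and 3' ends.

    Base i from the 5' end is compared with base i from the 3' end;
    counting stops at the first position that is not a Watson-Crick pair.
    """
    seq = adapter.upper()
    paired = 0
    while paired < len(seq) // 2 and _PAIR[seq[paired]] == seq[-1 - paired]:
        paired += 1
    return paired


if __name__ == "__main__":
    # Made-up example: 12-nt stem arms flanking an 18-nt loop.
    five_prime = "GATCGGAAGAGC"
    loop = "ACTCTTTCCCTACACGAC"
    adapter = five_prime + loop + reverse_complement(five_prime)
    stem = stem_length(adapter)
    print(f"stem: {stem} nt, loop: {len(adapter) - 2 * stem} nt")
```

A design tool would typically also confirm that the loop itself has no significant self-complementarity, which this sketch does not attempt.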
In some embodiments, the duplex region or stem portion of the hairpin adapter comprises an end configured for ligation to an end of a double-stranded nucleic acid (e.g., a nucleic acid fragment, e.g., a library insert). In embodiments, the end of the duplex region or stem portion of the hairpin adapter comprises a 5 'overhang or 3' overhang that is complementary to a 3 'overhang or 5' overhang of one end of the double stranded nucleic acid. In some embodiments, one end of the duplex region or stem portion of the hairpin adapter comprises a blunt end that can be linked to a blunt end of a double-stranded nucleic acid. In certain embodiments, the end of the duplex region or stem portion of the hairpin adapter comprises a phosphorylated 5' end. In some embodiments, the stem portion of the hairpin adapter is at least 15, at least 25, or at least 40 nucleotides in length. In some embodiments, the stem portion of the hairpin adapter ranges in length from 15 to 500 nucleotides, 15 to 250 nucleotides, 15 to 200 nucleotides, 15 to 150 nucleotides, 20 to 100 nucleotides, or 20 to 50 nucleotides.
In some embodiments, the loop of the hairpin adapter comprises one or more of the following: primer binding sites, capture nucleic acid binding sites (e.g., nucleic acid sequences complementary to capture nucleic acids), UMI, sample barcodes, sequencing adaptors, tags, and the like, or a combination thereof. In certain embodiments, the loop of the hairpin adapter comprises a primer binding site. In certain embodiments, the loop of the hairpin adapter comprises a primer binding site and UMI. In certain embodiments, the loop of the hairpin adapter comprises a binding motif.
In some embodiments, the predicted, calculated, average, mean, or absolute melting temperature (Tm) of the loop of the hairpin adapter is greater than 50 ℃, greater than 55 ℃, greater than 60 ℃, greater than 65 ℃, greater than 70 ℃, or greater than 75 ℃. In some embodiments, the predicted, estimated, calculated, average, mean, or absolute melting temperature (Tm) of the loop of the hairpin adapter is in the range of 50-100 ℃, 55-100 ℃, 60-100 ℃, 65-100 ℃, 70-100 ℃, 55-95 ℃, 65-95 ℃, 70-95 ℃, 55-90 ℃, 65-90 ℃, 70-90 ℃, or 60-85 ℃. In an embodiment, the Tm of the loop is about 65 ℃. In an embodiment, the Tm of the loop is about 75 ℃. In an embodiment, the Tm of the loop is about 85 ℃. The Tm of the loop of the hairpin adapter can be altered (e.g., increased) to the desired Tm using suitable methods, such as by altering (e.g., increasing) GC content, altering (e.g., increasing) length, and/or by including modified nucleotides, nucleotide analogs, and/or modified nucleotide linkages, non-limiting examples of which include locked nucleic acids (LNAs, e.g., bicyclic nucleic acids), bridged nucleic acids (BNAs, e.g., constrained nucleic acids), C5-modified pyrimidine bases (e.g., 5-methyl-dC, propynyl pyrimidines, etc.), and alternative backbone chemistries, such as peptide nucleic acids (PNAs), morpholinos, etc., or combinations thereof. Thus, in some embodiments, the loop of the hairpin adapter comprises one or more modified nucleotides, nucleotide analogs, and/or modified nucleotide linkages.
In some embodiments, the loops of the hairpin adaptors independently comprise a GC content of greater than 40%, greater than 50%, greater than 55%, greater than 60%, greater than 65%, or greater than 70%. In certain embodiments, the loops of the hairpin adaptors independently comprise a GC content in the range of 40-100%, 50-100%, 60-100%, or 70-100%. In embodiments, the GC content of the loop is about or greater than about 40%. In embodiments, the GC content of the loop is about or greater than about 50%. In embodiments, the GC content of the loop is about or greater than about 60%. Non-base modifiers can also be incorporated into the loop of the hairpin adapter to increase Tm, non-limiting examples of which include minor groove binders (MGBs), spermine, G-clamp, Uaq anthraquinone caps, and the like, or combinations thereof. The loop of the hairpin adapter may be of any suitable length. In some embodiments, the loop of the hairpin adapter is at least 15, at least 25, or at least 40 nucleotides in length. In some embodiments, the loop of the hairpin adapter is in the range of 15 to 500 nucleotides, 15 to 250 nucleotides, 20 to 200 nucleotides, 30 to 150 nucleotides, or 50 to 100 nucleotides in length.
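The relationship between GC content, length, and Tm of a loop sequence can be approximated with a simple calculation. The sketch below is an assumption-laden illustration: this disclosure does not specify how Tm is predicted, and the formula used here is a commonly cited basic GC-content approximation for unmodified oligonucleotides longer than 13 nucleotides. It ignores salt and oligonucleotide concentration and does not account for LNA, BNA, PNA, or other modifications. The function names and the example sequence are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def basic_tm(seq):
    """Rough Tm estimate in degrees Celsius from GC count and length.

    Uses Tm = 64.9 + 41 * (GC_count - 16.4) / N, a basic approximation
    intended for unmodified oligonucleotides longer than 13 nt. This is
    an illustrative stand-in, not the method of the disclosure.
    """
    seq = seq.upper()
    if len(seq) < 14:
        raise ValueError("approximation is intended for sequences > 13 nt")
    gc_count = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc_count - 16.4) / len(seq)


if __name__ == "__main__":
    loop = "ACTCTTTCCCTACACGACGCTCTTCCGATC"  # made-up loop sequence
    print(f"GC content: {gc_fraction(loop):.0%}, "
          f"estimated Tm: {basic_tm(loop):.1f} C")
```

For design work, a nearest-neighbor thermodynamic model with explicit salt and strand concentrations would generally be preferred over this GC-count approximation.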
In certain embodiments, the predicted, estimated, calculated, average, mean or absolute Tm of the duplex region or stem region of the hairpin adapter is in the range of 30-70 ℃, 35-65 ℃, 35-60 ℃, 40-65 ℃, 40-60 ℃, 35-55 ℃, 40-55 ℃, 45-50 ℃ or 40-50 ℃. In embodiments, the Tm of the stem region is about or above about 35 ℃. In embodiments, the Tm of the stem region is about or above about 40 ℃. In embodiments, the Tm of the stem region is about or above about 45 ℃. In embodiments, the Tm of the stem region is about or above about 50 ℃.
In embodiments, circularization comprises contacting the double-stranded polynucleotide with at least one protelomerase. In embodiments, the double-stranded polynucleotide comprises complementary protelomerase target sequences at both ends (e.g., the 5' and 3' ends of each strand comprise a protelomerase recognition sequence or complement thereof). For example, a double-stranded enzyme recognition sequence is inserted at both ends of a target double-stranded DNA molecule (e.g., double-stranded protelomerase recognition sequences, such as TelN protelomerase recognition sequences, have been ligated to each end of the dsDNA molecule). Then, for example, E. coli phage N15 protelomerase (TelN) acts on the double-stranded recognition sequences at both ends of the target double-stranded DNA molecule to produce a circularized target double-stranded DNA molecule. The TelN recognition sequence is TATCAGCACACAATTGCCCATTATACGCGCGTATAATGGACTATTGTGTGCTGATA (SEQ ID NO: 1). TelN cleaves this sequence at its midpoint and joins the ends of the complementary strands to form covalently closed ends. Additional methods for protelomerase-mediated circularization and additional protelomerases are disclosed in PCT patent publications WO2021236792 and WO2021/078947 and U.S. patent publication 2013/0216562, each of which is incorporated herein by reference in its entirety.
In embodiments, circularization comprises hybridizing a splint to both ends of the linear nucleic acid molecule and either i) ligating adjacent ends, or ii) extending the 3' end of the linear nucleic acid molecule along the splint to create a splint complementary sequence, and ligating the 3' end of the complementary sequence to the 5' end of the linear nucleic acid molecule. In an embodiment, the splint comprises a barcode. In embodiments, the splint comprises primer binding sites (e.g., sequences complementary to amplification or sequencing primers).
In one embodiment, enzymes are used to ligate the two ends of the linear nucleic acid molecule. For example, linear polynucleotides are circularized in a non-template-driven reaction using a circularizing ligase (e.g., CircLigase™ enzyme, Taq DNA ligase, HiFi Taq DNA ligase, T4 DNA ligase, PBCV-1 DNA ligase (also known as SplintR ligase), or Ampligase DNA ligase). Non-limiting examples of ligases include DNA ligases such as DNA ligase I, DNA ligase II, DNA ligase III, DNA ligase IV, T4 DNA ligase, T7 DNA ligase, T3 DNA ligase, E. coli DNA ligase, PBCV-1 DNA ligase (also known as SplintR ligase), or Taq DNA ligase. In embodiments, the ligase comprises a T4 DNA ligase, T4 RNA ligase 1, T4 RNA ligase 2, T3 DNA ligase, or T7 DNA ligase. In an embodiment, the enzymatic ligation is performed by a mixture of ligases. In embodiments, the ligase is selected from the group consisting of: T4 DNA ligase, T4 RNA ligase 1, T4 RNA ligase 2, RtcB ligase, T3 DNA ligase, T7 DNA ligase, Taq DNA ligase, PBCV-1 DNA ligase, thermostable DNA ligases (e.g., 5' AppDNA/RNA ligases), ATP-dependent DNA ligases, RNA-dependent DNA ligases (e.g., SplintR ligase), and combinations thereof. In an embodiment, the two ends of the template polynucleotide are ligated together with the aid of splint primers that are complementary to the two ends of the template polynucleotide. For example, the T4 DNA ligase reaction may be performed by combining a linear polynucleotide, ligation buffer, ATP, T4 DNA ligase, and water, and incubating the mixture between about 20 ℃ and about 45 ℃ for about 5 minutes to about 30 minutes. In some embodiments, the T4 ligation reaction is incubated at 37 ℃ for 30 minutes. In some embodiments, the T4 ligation reaction is incubated at 45 ℃ for 30 minutes. In embodiments, the ligase reaction is terminated by adding Tris buffer with high EDTA and incubating for 1 minute.
In embodiments, the linear nucleic acid molecule may undergo intramolecular circularization (by ligation or annealing) without being ligated to a circularization adaptor (e.g., self-circularization). Circularization without circularization adaptors can be achieved with a ligase at about 4 ℃ to about 35 ℃. In an embodiment, a linear nucleic acid molecule of interest may be ligated to a loxP adaptor, and circularization may be mediated by a Cre recombinase reaction at about 4-35 ℃; see, e.g., US 6,465,254, which is incorporated herein by reference.
In embodiments, the circular polynucleotide is about 100 to about 1000 nucleotides in length, about 100 to about 300 nucleotides in length, about 300 to about 500 nucleotides in length, or about 500 to about 1000 nucleotides in length. In embodiments, the circular polynucleotide is about 300 to about 600 nucleotides in length. In embodiments, the circular polynucleotides are about 100-1000 nucleotides, about 150-950 nucleotides, about 200-900 nucleotides, about 250-850 nucleotides, about 300-800 nucleotides, about 350-750 nucleotides, about 400-700 nucleotides, or about 450-650 nucleotides in length. In embodiments, the circular polynucleotide molecule is about 100-1000 nucleotides in length. In embodiments, the circular polynucleotide molecule is about 100-300 nucleotides in length. In embodiments, the circular polynucleotide molecule is about 300-500 nucleotides in length. In embodiments, the circular polynucleotide molecule is about 500-1000 nucleotides in length. In an embodiment, the circular polynucleotide molecule is about 100 nucleotides in length. In an embodiment, the circular polynucleotide molecule is about 300 nucleotides in length. In an embodiment, the circular polynucleotide molecule is about 500 nucleotides in length. In an embodiment, the circular polynucleotide molecule is about 1000 nucleotides in length. The circular polynucleotides may be conveniently isolated by conventional purification columns, digestion of non-circular DNA by one or more suitable exonucleases, or both.
In embodiments, the sequence that specifically binds to the blocking element, the sequence that specifically hybridizes to the first primer, or both is about 1 to about 100 nucleotides from the fusion junction. In embodiments, the sequence that specifically binds to the blocking element, the sequence that specifically hybridizes to the first primer, or both, is about 5 to about 100 nucleotides from the fusion junction. In embodiments, the sequence that specifically binds to the blocking element, the sequence that specifically hybridizes to the first primer, or both is about 10 to about 100 nucleotides from the fusion junction. In embodiments, the sequence that specifically binds to the blocking element, the sequence that specifically hybridizes to the first primer, or both, is about 25 to about 100 nucleotides from the fusion junction. In embodiments, the sequence that specifically binds to the blocking element, the sequence that specifically hybridizes to the first primer, or both, is about 50 to about 100 nucleotides from the fusion junction. In embodiments, the sequence that specifically binds to the blocking element, the sequence that specifically hybridizes to the first primer, or both is about 75 to about 100 nucleotides from the fusion junction. In embodiments, the sequence that specifically binds to the blocking element, the sequence that specifically hybridizes to the first primer, or both is about 1, about 5, about 10, about 25, about 50, about 75, or about 100 nucleotides from the fusion junction. In an embodiment, the sequence that specifically hybridizes to the first primer and the sequence that specifically hybridizes to the blocking element do not overlap. In embodiments, the sequence that specifically hybridizes to the first primer and the sequence that specifically hybridizes to the blocking element are about 5, about 10, or about 20 nucleotides apart. 
In an embodiment, the sequence that specifically binds to the blocking element and the sequence that specifically hybridizes to the first primer are about the same distance from the fusion junction. In an embodiment, the sequence that specifically binds to the blocking element and the sequence that specifically hybridizes to the first primer are at different distances from the fusion junction.
In embodiments, the sequence that specifically hybridizes to the first primer is about 1 to about 50 nucleotides apart from the sequence that is complementary to the sequence that specifically hybridizes to the second primer. In embodiments, the sequence that specifically hybridizes to the first primer is about 5 to about 50 nucleotides apart from the sequence that is complementary to the sequence that specifically hybridizes to the second primer. In embodiments, the sequence that specifically hybridizes to the first primer is about 10 to about 50 nucleotides apart from the sequence that is complementary to the sequence that specifically hybridizes to the second primer. In embodiments, the sequence that specifically hybridizes to the first primer is about 20 to about 50 nucleotides apart from the sequence that is complementary to the sequence that specifically hybridizes to the second primer. In embodiments, the sequence that specifically hybridizes to the first primer is about 30 to about 50 nucleotides apart from the sequence that is complementary to the sequence that specifically hybridizes to the second primer. In embodiments, the sequence that specifically hybridizes to the first primer is about 40 to about 50 nucleotides apart from the sequence that is complementary to the sequence that specifically hybridizes to the second primer. In embodiments, the sequence that specifically hybridizes to the first primer is separated by about 1, about 5, about 10, about 20, about 30, about 40, or about 50 nucleotides from the sequence that is complementary to the sequence that specifically hybridizes to the second primer.
In an embodiment, the sequence that specifically hybridizes to the first primer and the sequence that is complementary to the sequence that specifically hybridizes to the second primer are located within the same exon of the target gene. In an embodiment, the sequence that specifically hybridizes to the first primer and the sequence that is complementary to the sequence that specifically hybridizes to the second primer are located within different exons of the target gene. In an embodiment, the sequence that specifically hybridizes to the first primer and the sequence that is complementary to the sequence that specifically hybridizes to the second primer are located within adjacent exons of the target gene. Specific hybridization is distinguished from non-specific hybridization interactions (e.g., between two nucleic acids that are not configured for specific hybridization, e.g., two nucleic acids that are 80% or less, 70% or less, 60% or less, or 50% or less complementary) by about 2-fold or more, typically about 10-fold or more, and sometimes about 100-fold or more, 1000-fold or more, 10,000-fold or more, 100,000-fold or more, or 1,000,000-fold or more. Two strands of nucleic acid hybridized to each other may form a duplex comprising a double-stranded portion of nucleic acid.
In embodiments, the linear nucleic acid molecule is a single stranded nucleic acid molecule. In embodiments, the linear nucleic acid molecule is a double stranded nucleic acid molecule. In embodiments, the method comprises less than 200ng of linear nucleic acid molecules. In embodiments, the method comprises less than 100ng of linear nucleic acid molecules. In embodiments, the method comprises less than 50ng of a linear nucleic acid molecule. In embodiments, the method comprises less than 20ng of linear nucleic acid molecules. In embodiments, the method comprises less than 10ng of linear nucleic acid molecules. In embodiments, the method comprises about 200ng of the linear nucleic acid molecule. In embodiments, the method comprises about 100ng of a linear nucleic acid molecule. In embodiments, the method comprises about 50ng of a linear nucleic acid molecule. In embodiments, the method comprises about 20ng of a linear nucleic acid molecule. In embodiments, the method comprises about 10ng of a linear nucleic acid molecule.
In some embodiments, the double-stranded nucleic acid comprises two complementary strands of nucleic acid. In certain embodiments, the double-stranded nucleic acid comprises a first strand and a second strand that are complementary or substantially complementary to each other. The first strand of double-stranded nucleic acid is sometimes referred to herein as the forward strand, and the second strand of double-stranded nucleic acid is sometimes referred to herein as the reverse strand. In some embodiments, the double-stranded nucleic acid comprises two opposite ends. Thus, a double-stranded nucleic acid typically comprises a first end and a second end. The ends of the double-stranded nucleic acid may comprise 5' overhangs, 3' overhangs, or blunt ends. In some embodiments, one or both ends of the double-stranded nucleic acid are blunt ends. In certain embodiments, one or both ends of the double-stranded nucleic acid are manipulated using a suitable method to comprise a 5' overhang, a 3' overhang, or a blunt end. In some embodiments, one or both ends of the double-stranded nucleic acid are manipulated during library preparation, using a suitable method, such that one or both ends are configured for ligation to adaptors. For example, one or both ends of the double-stranded nucleic acid may be digested with a restriction enzyme, polished, end repaired, filled in, phosphorylated (e.g., by addition of a 5'-phosphate), dT-tailed, dA-tailed, or the like, or a combination thereof.
In embodiments, (i) the first primer comprises a 5' sequence that does not hybridize under amplification conditions to the first strand of the first region; and/or (ii) the second primer comprises a 5' sequence that does not hybridize under amplification conditions to the complement of the first strand of the first region. In embodiments, (i) the first primer comprises a 5' sequence that does not hybridize under amplification conditions to the first strand of the first region; and (ii) the second primer comprises a 5' sequence that does not hybridize under amplification conditions to the complement of the first strand of the first region. In embodiments, (i) the first primer comprises a 5' sequence that does not hybridize under amplification conditions to the first strand of the first region; or (ii) the second primer comprises a 5' sequence that does not hybridize under amplification conditions to the complement of the first strand of the first region. In some embodiments, the 5' sequence of the first primer that does not hybridize to the first strand of the first region comprises a primer binding site for secondary amplification. In some embodiments, the 5' sequence of the first primer that does not hybridize to the first strand of the first region comprises a first sequencing adapter for clustering templates on a flow cell. In some embodiments, the 5' sequence of the first primer that does not hybridize to the first strand of the first region comprises a sample barcode. In some embodiments, the 5' sequence of the second primer that does not hybridize to the complement of the first strand of the first region comprises a primer binding site for secondary amplification. In some embodiments, the 5' sequence of the second primer that does not hybridize to the complement of the first strand of the first region comprises a second sequencing adapter for clustering templates on the flow cell. In some embodiments, the 5' sequence of the second primer that does not hybridize to the complement of the first strand of the first region comprises a sample barcode.
In embodiments, (i) the amplification reaction further comprises a second blocking element that inhibits polymerase extension along the sequence to which it binds, and (ii) the first region comprises a first strand comprising, from 5' to 3', a sequence complementary to the sequence that specifically hybridizes to the second primer, and a sequence complementary to the sequence to which the second blocking element specifically binds. In embodiments, the sequence complementary to the sequence that specifically hybridizes to the second primer is about 100 to about 300 nucleotides apart from the sequence complementary to the sequence that specifically binds to the second blocking element. In embodiments, the sequence complementary to the sequence that specifically hybridizes to the second primer is about 100 to about 200 nucleotides apart from the sequence complementary to the sequence that specifically binds to the second blocking element. In embodiments, the sequence complementary to the sequence that specifically hybridizes to the second primer is about 100 to about 150 nucleotides apart from the sequence complementary to the sequence that specifically binds to the second blocking element. In embodiments, the sequence complementary to the sequence that specifically hybridizes to the second primer is about 100, about 150, about 200, or about 300 nucleotides apart from the sequence complementary to the sequence that specifically binds to the second blocking element.
In an embodiment, the method further comprises: iv) amplifying the one or more non-fused circular template polynucleotides to produce a third amount of non-fused polynucleotide amplification products; and amplifying the one or more fusion circular template polynucleotides to produce a fourth amount of fusion polynucleotide amplification products, wherein the third amount and the fourth amount are substantially the same. In embodiments, amplifying the one or more non-fused circular template polynucleotides comprises hybridizing a third primer and a fourth primer to the one or more non-fused circular template polynucleotides and extending both primers with a polymerase, and amplifying the one or more fusion circular template polynucleotides comprises hybridizing the third primer and the fourth primer to the one or more fusion circular template polynucleotides and extending both primers with a polymerase. In embodiments, the third primer hybridizes upstream (e.g., in the 5' direction) and the fourth primer hybridizes downstream (e.g., in the 3' direction) of the target sequence, wherein the target sequence comprises a single nucleotide variant, an insertion, a deletion, an internal tandem repeat, or a copy number variant. In embodiments, the target sequence comprises one or more single nucleotide variants, one or more insertions, one or more deletions, one or more internal tandem repeats, and/or one or more copy number variants. In an embodiment, the method further comprises repeating steps ii), iii) and iv).
In an embodiment, amplification of the circularized or linear polynucleotide comprises a plurality of cycles comprising the steps of primer hybridization, primer extension, and denaturation in the presence of a first primer, a blocking element, and a second primer. While each cycle will contain each of these three events (hybridization, extension, and denaturation), the events within a cycle may or may not be discrete. For example, each step may have different reagents and/or reaction conditions (e.g., temperature). Alternatively, some steps may be performed without changing the reaction conditions. For example, the extension may be performed under the same conditions (e.g., the same temperature) as the hybridization. After extension, the conditions are changed to start a new cycle with a new denaturation step, thereby amplifying the polynucleotide. Primer extension products from earlier cycles can serve as templates for later amplification cycles. In an embodiment, the plurality of cycles is from about 5 to about 50 cycles. In an embodiment, the plurality of cycles is from about 10 to about 45 cycles. In an embodiment, the plurality of cycles is from about 10 to about 20 cycles. In an embodiment, the plurality of cycles is from about 20 to about 30 cycles. In an embodiment, the plurality of cycles is 10 to 45 cycles. In an embodiment, the plurality of cycles is 10 to 20 cycles. In an embodiment, the plurality of cycles is 20 to 30 cycles.
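By way of non-limiting illustration, because extension products of earlier cycles serve as templates for later cycles, the yield of an unblocked template grows geometrically with the number of cycles. The following sketch uses the conventional approximation that each cycle multiplies the template count by (1 + efficiency); this model and the name `fold_amplification` are assumptions introduced for illustration and are not taken from the present disclosure.

```python
def fold_amplification(cycles, efficiency=1.0):
    """Idealized fold increase after a number of cycles.

    efficiency is the fraction of templates copied per cycle (0.0 to 1.0);
    a value of 1.0 corresponds to perfect doubling in every cycle.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


# Compare a few of the recited cycle numbers under perfect doubling.
for n in (10, 20, 30):
    print(n, fold_amplification(n))
```

Under this approximation, a template whose extension is inhibited by a blocking element would correspond to a lower effective efficiency, which is one way to picture how a difference per cycle compounds over 10 to 45 cycles.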
In embodiments, amplifying comprises exponentially amplifying circular template polynucleotides comprising fusion junctions. In embodiments, the amplification comprises exponential rolling circle amplification (eRCA). Exponential RCA is similar to the linear process, except that it uses a second primer having the same sequence as at least a portion of the circular template (Lizardi et al., Nature Genet., 19:225 (1998)). This two-primer system achieves isothermal, exponential amplification. Exponential RCA has been applied to the amplification of non-circular DNA by using linear probes that bind at both of their ends to contiguous regions of the target DNA, followed by circularization using DNA ligase (Nilsson et al., Science, 265(5181):2085-2088 (1994)). In an embodiment, the amplification comprises hyperbranched rolling circle amplification (HRCA). Hyperbranched RCA uses a second primer complementary to the first amplification product. This allows replication of the product by a strand displacement mechanism, which can produce a dramatic amplification in isothermal reactions (Lage et al., Genome Research, 13:294-307 (2003), which is incorporated herein by reference in its entirety).
In embodiments, methods for amplification include, but are not limited to, polymerase chain reaction (PCR), strand displacement amplification (SDA), transcription-mediated amplification (TMA), and nucleic acid sequence-based amplification (NASBA), e.g., as described in U.S. Pat. No. 8,003,354, which is incorporated herein by reference in its entirety. The amplification methods described above may be used to amplify one or more nucleic acids of interest. For example, PCR, multiplex PCR, SDA, TMA, NASBA, and the like can be used to amplify immobilized nucleic acid fragments resulting from the first amplification method of the two-step methods described herein.
In embodiments, the amplifying comprises bridge amplification, for example as described in U.S. Pat. Nos. 5,641,658; 7,115,400; and 7,790,418; and U.S. Patent Publication No. 2008/0009420, each of which is incorporated herein by reference in its entirety. Typically, bridge amplification uses repeated steps of primer annealing to the template, primer extension, and separation of the extended primer from the template. Because the forward primer and the reverse primer are attached to the solid support, the extension product released upon separation from the initial template is also attached to the solid support. The two strands are preferably immobilized on the solid support at the 5' end by covalent attachment. The 3' end of the amplified product is then allowed to anneal to a nearby reverse primer, thereby forming a "bridge" structure. The reverse primer is then extended to produce an additional template molecule that can form another bridge. During bridge PCR, additional chemical additives may be included in the reaction mixture, wherein the DNA strands are denatured by flowing a denaturant over the DNA, thereby chemically denaturing the complementary strands. The denaturant is then washed out and the polymerase is reintroduced under buffer conditions that allow the primers to anneal and extend.
In an embodiment, the amplification comprises thermal bridge polymerase chain reaction (t-bPCR) amplification. In an embodiment, t-bPCR amplification comprises incubation in an additive that reduces the denaturation temperature of the DNA. In embodiments, the additive is betaine, dimethyl sulfoxide (DMSO), ethylene glycol, formamide, glycerol, guanidine thiocyanate, 4-methylmorpholine-4-oxide (NMO), or a mixture thereof. In embodiments, the additive is betaine, DMSO, ethylene glycol, or mixtures thereof. In embodiments, the additive is betaine, DMSO, or ethylene glycol.
In an embodiment, the amplification comprises chemical bridge polymerase chain reaction (c-bPCR) amplification. In an embodiment, the c-bPCR amplification comprises denaturation using a chemical denaturant. In embodiments, the c-bPCR amplification comprises denaturation using acetic acid, hydrochloric acid, nitric acid, formamide, guanidine, sodium salicylate, sodium hydroxide, dimethyl sulfoxide (DMSO), propylene glycol, urea, or mixtures thereof. In an embodiment, the chemical denaturant is sodium hydroxide or formamide. Chemical bridge polymerase chain reaction involves fluid circulation of a denaturing agent (e.g., formamide) and maintaining the temperature within a narrow temperature range (e.g., +/-5 ℃). In contrast, a thermal bridge polymerase chain reaction comprises a thermal cycle between a high temperature (e.g., 85 ℃ to 95 ℃) and a low temperature (e.g., 60 ℃ to 70 ℃). The thermal bridge polymerase chain reaction may also contain denaturing agents, typically at a much lower concentration than conventional chemical bridge polymerase chain reactions.
In an embodiment, the amplification comprises fluidic cycling between an extension mixture, comprising a polymerase and dNTPs, and a chemical denaturant. In embodiments, the polymerase is a strand displacement polymerase or a non-strand displacement polymerase. In an embodiment, the solution is thermally cycled between about 40 ℃ and about 65 ℃ during fluid circulation of the extension mixture and the chemical denaturant. For example, the extension cycle is maintained at a temperature of 55 ℃ to 65 ℃ and then the denaturation cycle is maintained at a temperature of 40 ℃ to 65 ℃, or the temperature of the denaturation step begins at 60 ℃ to 65 ℃ and drops to 40 ℃ prior to exchanging reagents. In an embodiment, amplifying comprises adjusting the reaction temperature before starting the next cycle. In embodiments, the denaturation cycle and/or extension cycle is maintained at a temperature for a sufficient time and the temperature is adjusted (e.g., increased relative to the starting temperature or decreased relative to the starting temperature) before starting the next cycle. In an embodiment, the denaturation cycle is carried out at a temperature of 60-65 ℃ for about 5-45 seconds, and then the temperature is reduced (e.g., to about 40 ℃) prior to the initiation of the extension cycle (i.e., prior to the introduction of the extension mixture). When the amplicon is exposed to conditions that promote hybridization, the reduced temperature facilitates primer hybridization in a subsequent step, even in the presence of a chemical denaturant. In embodiments, the extension cycle is performed at a temperature of 50 ℃ to 60 ℃ for about 0.5 to 2 minutes, then the temperature is raised (e.g., to between about 60 ℃ and about 70 ℃, or between about 65 ℃ and about 72 ℃) after the extension mixture is introduced.
In embodiments, the cycling between the extension mixture and the chemical denaturant is performed at least 5 times, at least 10 times, at least 20 times, at least 30 times, at least 40 times, at least 50 times, at least 75 times, at least 100 times, or at least 200 times. In embodiments, the cycling between the extension mixture and the chemical denaturant is performed about 5 times, about 10 times, about 20 times, about 30 times, about 40 times, about 50 times, about 75 times, about 100 times, or about 200 times. In embodiments, the cycling between the extension mixture and the chemical denaturant is performed a total of 5, 10, 20, 30, 40, 50, 75, 100, 200, or more times. In an embodiment, the fluid circulation is performed in the presence of about 2 to about 15 mM Mg2+. In embodiments, the fluid circulation is performed in the presence of about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, or about 15 mM Mg2+.
In embodiments, detecting the fusion amplification product comprises detecting (e.g., quantifying) the length of the fusion amplification product, detecting one or more probes bound to the fusion amplification product, or sequencing the fusion amplification product. In an embodiment, detecting the fusion amplification product comprises sequencing the fusion amplification product to generate a sequencing read.
In embodiments, the method comprises detecting a first amount of non-fusion polynucleotide amplification product and a second amount of fusion polynucleotide amplification product. In an embodiment, the method comprises: detecting the length of the non-fusion polynucleotide amplification product and the length of the fusion polynucleotide amplification product; detecting one or more probes bound to the non-fusion polynucleotide amplification product and the fusion polynucleotide amplification product; or sequencing the non-fused polynucleotide amplification product and the fused polynucleotide amplification product.
In embodiments, sequencing comprises hybridizing one or more sequencing primers to the fusion amplification product and extending the one or more sequencing primers (e.g., extending the one or more sequencing primers with modified, labeled nucleotides and detecting incorporation of the modified, labeled nucleotides).
In embodiments, sequencing the non-fused polynucleotide amplification product and the fused polynucleotide amplification product produces one or more sequencing reads. In embodiments, the method further comprises aligning a substring of the one or more sequencing reads with a reference sequence and quantifying the number of sequencing reads of the circular template polynucleotide comprising the fusion junction. In embodiments, the method further comprises quantifying the number of sequencing reads of the fusion gene circular template polynucleotide, wherein quantifying comprises aligning a substring of the sequencing reads with the reference sequence. In embodiments, the method further comprises aligning the one or more sequencing reads to a reference sequence.
In an embodiment, the method comprises comparing the k-mer substrings of one or more sequencing reads to a k-mer table of a fusion gene reference. In embodiments, the method comprises quantifying the number of k-mer substrings shared (i.e., measured and/or detected) between a sequencing read and a fusion gene reference. In an embodiment, the method comprises: (i) grouping one or more sequencing reads based on the barcode sequence and/or the sequence comprising the fusion junction; and (ii) within each group, aligning the reads and forming a consensus sequence of reads having the same barcode sequence and/or sequence comprising a fusion junction. In embodiments, sequencing further comprises generating sequencing reads that span the circularized junction formed between the 5' and 3' ends of the linear nucleic acid molecule, and quantifying the number of different circularized junction sequences containing the fusion gene (fusion gene circular template polynucleotides).
In embodiments, sequencing comprises sequencing by synthesis, sequencing by binding, sequencing by hybridization, sequencing by ligation, or pyrosequencing. A variety of sequencing methods may be used, such as sequencing by synthesis (SBS), pyrosequencing, sequencing by ligation (SBL), or sequencing by hybridization (SBH). Pyrosequencing detects the release of inorganic pyrophosphate (PPi) as specific nucleotides are incorporated into a nascent nucleic acid strand (Ronaghi et al., Analytical Biochemistry, 242(1), 84-9 (1996); Ronaghi, Genome Research, 11(1), 3-11 (2001); Ronaghi et al., Science, 281(5375), 363 (1998); U.S. Pat. Nos. 6,210,891; 6,258,568; and 6,274,320, each of which is incorporated herein by reference in its entirety). In pyrosequencing, the released PPi can be detected by its conversion to adenosine triphosphate (ATP) by ATP sulfurylase, and the level of ATP produced can be detected via light produced by luciferase. In this way, the sequencing reaction may be monitored by a luminescence detection system. In both SBL and SBH methods, repeated cycles of oligonucleotide delivery and detection are performed on target nucleic acids and their amplicons present at features of an array. SBL methods include those described in Shendure et al., Science, 309:1728-1732 (2005); U.S. Pat. No. 5,599,675; and U.S. Pat. No. 5,750,341, each of which is incorporated herein by reference in its entirety. SBH methods include those described in Bains et al., Journal of Theoretical Biology, 135(3), 303-7 (1988); Drmanac et al., Nature Biotechnology, 16, 54-58 (1998); Fodor et al., Science, 251(4995), 767-773 (1995); and WO 1989/10977, each of which is incorporated herein by reference in its entirety.
In SBS, the extension of a nucleic acid primer along a nucleic acid template is monitored to determine the nucleotide sequence in the template. The underlying chemical process may be catalyzed by a polymerase, in which fluorescently labeled nucleotides are added to the primer (thereby extending the primer) in a template-dependent manner, such that detection of the order and type of nucleotides added to the primer can be used to determine the sequence of the template. A plurality of different nucleic acid fragments that have been attached at different locations of an array may be subjected to an SBS technique under conditions where events occurring for different templates are distinguishable due to their location in the array. In embodiments, the sequencing step comprises annealing and extending a sequencing primer to incorporate a detectable label indicative of the identity of a nucleotide in the target polynucleotide, detecting the detectable label, and repeating the extending and detecting steps. In embodiments, the methods comprise sequencing one or more bases of a target nucleic acid by extending a sequencing primer that is hybridized to the target nucleic acid (e.g., an amplification product produced by an amplification method described herein). In an embodiment, the sequencing step may be accomplished by a sequencing-by-synthesis (SBS) process. In an embodiment, sequencing comprises a sequencing-by-synthesis process, wherein individual nucleotides are iteratively identified as they are polymerized to form a growing complementary strand. In an embodiment, the nucleotides added to the growing complementary strand comprise both a label and a reversible chain terminator that prevents further extension, such that the nucleotide can be identified by the label before the terminator is removed to add and identify another nucleotide. Such reversible chain terminators comprise a removable 3' blocking group, for example as described in U.S. Pat. Nos. 7,541,444, 7,057,026 and 10,738,072. Once such a modified nucleotide has been incorporated into the growing polynucleotide strand complementary to the region of the template being sequenced, no free 3'-OH group is available to direct additional sequence extension, and thus no additional nucleotides can be added by the polymerase. Once the identity of the base incorporated into the growing strand has been determined, the 3' block can be removed to allow the addition of the next consecutive nucleotide. By ordering the products derived using these modified nucleotides, it is possible to infer the DNA sequence of the DNA template. Sequencing can be performed using any suitable sequencing-by-synthesis (SBS) technique in which modified nucleotides are added in succession to a free 3' hydroxyl group, which is typically initially provided by a sequencing primer, resulting in synthesis of the polynucleotide strand in the 5' to 3' direction. In embodiments, sequencing comprises detecting a sequence of signals. In an embodiment, sequencing comprises extending the sequencing primer with labeled nucleotides. Examples of sequencing include, but are not limited to, sequencing-by-synthesis (SBS) processes in which reversibly terminated nucleotides carrying fluorescent dyes are incorporated into a growing strand that is complementary to the target strand being sequenced. In an embodiment, the nucleotides are labeled with up to four unique fluorescent dyes. In embodiments, the nucleotides are labeled with at least two unique fluorescent dyes. In an embodiment, the readout is done by epifluorescence imaging. Non-limiting examples of suitable labels are described in the following: U.S. Pat. No. 8,178,360; U.S. Pat. No. 5,188,934 (4,7-dichlorofluorescein dyes); U.S. Pat. No. 5,366,860 (spectrally resolvable rhodamine dyes); U.S. Pat. No. 5,847,162 (4,7-dichlororhodamine dyes); U.S. Pat. No. 4,318,846 (ether-substituted fluorescein dyes); U.S. Pat. No. 5,800,996 (energy transfer dyes); U.S. Pat. No. 5,066,580 (xanthene dyes); U.S. Pat. No. 5,688,648 (energy transfer dyes); etc.
In embodiments, generating the first sequencing read or the second sequencing read comprises sequencing by binding (see, e.g., U.S. Patent Publications US2017/0022553 and US2019/0048404, each of which is incorporated herein by reference in its entirety). As used herein, "sequencing by binding" refers to a sequencing technique in which specific binding of a polymerase and a cognate nucleotide to a primed template nucleic acid molecule (e.g., a blocked primed template nucleic acid molecule) is used to identify the next correct nucleotide to be incorporated into the primer strand of the primed template nucleic acid molecule. The specific binding interaction need not result in chemical incorporation of the nucleotide into the primer. In some embodiments, the specific binding interaction may precede chemical incorporation of the nucleotide into the primer strand, or may precede chemical incorporation of an analogous next correct nucleotide into the primer. Thus, detection of the next correct nucleotide can be performed without incorporating the next correct nucleotide. As used herein, the "next correct nucleotide" (sometimes referred to as a "cognate" nucleotide) is a nucleotide having a base complementary to the base of the next template nucleotide. The next correct nucleotide will hybridize at the 3' end of the primer to complement the next template nucleotide. The next correct nucleotide may, but need not, be capable of being incorporated at the 3' end of the primer. For example, the next correct nucleotide may be a member of a ternary complex that will complete the incorporation reaction, or alternatively, the next correct nucleotide may be a member of a stabilized ternary complex that does not catalyze the incorporation reaction. Nucleotides having bases that are not complementary to the next template base are referred to as "incorrect" (or "non-cognate") nucleotides.
The use of the sequencing methods outlined above is a non-limiting example, as essentially any sequencing method that relies on successive incorporation of nucleotides into a polynucleotide strand can be used. Suitable alternative techniques include, for example, pyrosequencing methods, FISSEQ (fluorescent in situ sequencing), MPSS (massively parallel signature sequencing), or ligation-based sequencing methods.
In embodiments, sequencing comprises multiple sequencing cycles. In embodiments, a sequencing cycle comprises extending a complementary polynucleotide by incorporating a first nucleotide using a polymerase, wherein the complementary polynucleotide is hybridized to the template nucleic acid, detecting the first nucleotide, and identifying the first nucleotide. In an embodiment, to begin the sequencing cycle, one or more differently labeled nucleotides and a DNA polymerase may be introduced. After the nucleotides are added, the resulting signal can be detected (e.g., by excitation and emission of a detectable label) to determine the identity of the incorporated nucleotide (based on the label on the nucleotide). Reagents may then be added to remove the 3' reversible terminator and remove the label from each incorporated base. Reagents, enzymes, and other materials can be removed between steps by washing. Cycling may involve repeating these steps and reading the sequence of each cluster over multiple iterations. In an embodiment, the reads generated by sequencing are greater than 25 bp in read length. In an embodiment, the reads generated by sequencing are greater than 50 bp in read length. In an embodiment, the reads generated by sequencing are greater than 75 bp in read length. In an embodiment, the reads generated by sequencing are greater than 100 bp in read length. In an embodiment, the reads generated by sequencing are greater than 150 bp in read length. In embodiments, generating a sequencing read comprises determining the identity of a nucleotide in a template polynucleotide.
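By way of non-limiting illustration, determining the identity of the incorporated nucleotide from the detected signal in each cycle can be sketched as follows. The sketch assumes four differently labeled nucleotides with one detection channel per base, and assumes that the strongest channel in a cycle identifies the base; the function name `call_bases` and the intensity values are hypothetical, and practical base calling typically also corrects for cross-talk and phasing.

```python
def call_bases(cycle_signals):
    """Return the read implied by per-cycle channel intensities.

    cycle_signals is a list with one dict per sequencing cycle, mapping each
    base ("A", "C", "G", "T") to the intensity of its label in that cycle.
    """
    read = []
    for signals in cycle_signals:
        # The base whose label gives the strongest signal is taken as the call.
        read.append(max(signals, key=signals.get))
    return "".join(read)


# Three hypothetical cycles for a single cluster.
signals = [
    {"A": 0.91, "C": 0.05, "G": 0.02, "T": 0.04},
    {"A": 0.03, "C": 0.88, "G": 0.06, "T": 0.02},
    {"A": 0.04, "C": 0.03, "G": 0.07, "T": 0.90},
]
print(call_bases(signals))  # ACT
```

The length of the returned string equals the number of cycles, which corresponds to the read lengths recited above.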
In an embodiment, the sequencing method relies on the use of modified nucleotides that can act as reversible terminators. Once the modified nucleotide has been incorporated into the growing polynucleotide strand complementary to the region of the template being sequenced, no free 3'-OH group is available to direct additional sequence extension, and therefore no additional nucleotide can be added by the polymerase. Once the identity of the base incorporated into the growing strand has been determined, the 3' reversible terminator can be removed to allow the addition of the next consecutive nucleotide. These reactions can be performed in a single experiment if each modified nucleotide carries a different label known to correspond to a particular base, so that the bases added in each incorporation step can be distinguished. Alternatively, separate reactions, each containing one of the modified nucleotides, may be performed.
The modified nucleotide may carry a label (e.g., a fluorescent label) to facilitate its detection. Each nucleotide type may carry a different fluorescent label. However, the detectable label need not be a fluorescent label. Any label that allows detection of the incorporated nucleotide may be used. A method for detecting fluorescently labeled nucleotides comprises using a laser of a wavelength specific to the labeled nucleotides, or using other suitable illumination sources. Fluorescence from the label on the nucleotide may be detected (e.g., by a CCD camera or other suitable detection means).
In embodiments, a method of sequencing a nucleic acid comprises extending a complementary polynucleotide (e.g., a primer) hybridized to a nucleic acid by incorporating a first nucleotide (e.g., a modified, labeled nucleotide). In embodiments, the method comprises a buffer exchange or wash step. In an embodiment, a method of sequencing a nucleic acid comprises a sequencing solution. The sequencing solution comprises (a) adenine nucleotides or analogs thereof; (b) (i) a thymine nucleotide or analogue thereof, or (ii) a uracil nucleotide or analogue thereof; (c) a cytosine nucleotide or analog thereof; and (d) guanine nucleotide or analog thereof.
In an embodiment, sequencing comprises extending the sequencing primer by incorporating labeled nucleotides or labeled nucleotide analogs, and detecting the label to generate a signal for each incorporated nucleotide or nucleotide analog, wherein the sequencing primer hybridizes to one of the fusion amplification products.
In an embodiment, detecting the fusion amplification product comprises aligning the substring of each sequencing read with a reference sequence and quantifying the number of aligned sequencing reads of the fusion gene circular template polynucleotide.
In an embodiment, detecting the fusion amplification product comprises comparing the k-mer substring of each sequencing read to a k-mer table of fusion junction references, and quantifying the number of k-mers shared between the sequencing reads and the fusion junction references. The term "fusion junction reference" refers to a collection of previously detected fusion sequences involving the one or more genes of interest.
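As a non-limiting sketch of the k-mer comparison described above, the following builds a k-mer table from a set of fusion junction references and counts, for one sequencing read, how many distinct k-mers it shares with each reference. The function names, the reference names, the sequences, and the choice of k = 4 are all hypothetical and chosen only to keep the example short.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return every length-k substring of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_kmer_table(junction_refs, k):
    """Map each k-mer to the names of the junction references containing it."""
    table = defaultdict(set)
    for name, seq in junction_refs.items():
        for kmer in kmers(seq, k):
            table[kmer].add(name)
    return table


def shared_kmer_counts(read, table, k):
    """Count distinct k-mers of the read found in each junction reference."""
    counts = defaultdict(int)
    for kmer in set(kmers(read, k)):
        for name in table.get(kmer, ()):
            counts[name] += 1
    return dict(counts)


# Made-up reference sequences, for illustration only.
refs = {"FUSION_A": "ACGTACGGTTCA", "FUSION_B": "TTTTGGGGCCCC"}
table = build_kmer_table(refs, k=4)
hits = shared_kmer_counts("TACGGTTC", table, k=4)
```

In practice a threshold on the shared k-mer count (an additional assumption, not specified above) could be used to decide whether a read supports a given fusion junction.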
In an embodiment, detecting the fusion amplification product comprises: (i) grouping sequencing reads based on the barcode sequence and/or the sequence comprising the fusion junction; and (ii) within each group, aligning the reads and forming a consensus sequence of reads having the same barcode sequence and/or sequence comprising the fusion junction.
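The grouping and consensus steps (i) and (ii) can be sketched as follows. This is a simplified illustration that assumes the reads within a group are already aligned and of equal length, and that uses a per-position majority vote as the consensus rule; the function names and the example barcodes and reads are hypothetical.

```python
from collections import Counter, defaultdict


def group_by_barcode(reads):
    """Group read sequences by barcode; `reads` is a list of (barcode, sequence)."""
    groups = defaultdict(list)
    for barcode, seq in reads:
        groups[barcode].append(seq)
    return dict(groups)


def consensus(seqs):
    """Majority-vote consensus over pre-aligned reads.

    Assumes equal-length, already aligned reads; zip() stops at the
    shortest read, so longer reads would be silently truncated.
    """
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))


# Made-up barcodes and reads, for illustration only.
reads = [
    ("AAT", "ACGT"),
    ("AAT", "ACGT"),
    ("AAT", "ACCT"),  # one read with a single-base error
    ("GGC", "TTGA"),
]
consensus_by_barcode = {
    barcode: consensus(seqs) for barcode, seqs in group_by_barcode(reads).items()
}
```

Collapsing each group to one consensus sequence suppresses isolated sequencing errors, such as the third read above, before the fusion junction is called.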
In embodiments, sequencing further comprises generating sequencing reads comprising circularized junctions formed between the 5' and 3' ends of the linear nucleic acid molecules, and quantifying the number of different circularized junction sequences comprising the fusion junctions.
In embodiments, the method further comprises quantifying the fusion amplification product. The molecular count of the fusion amplification product may be used for diagnostic purposes. As described herein, polynucleotides containing fusions are preferentially amplified, enabling accurate quantification over a large background. Conventional bioinformatic analysis can be used to quantify fusion amplification products. In some embodiments, the bioinformatic analysis may involve counting the number of unique circularized junctions associated with a particular fusion amplification product. In other embodiments, quantification of the fusion amplification product is achieved by comparing the number of sequencing reads or circularized junctions corresponding to the fusion amplification product to the number corresponding to a control (e.g., a spike-in control) that is present at a predetermined number of template copies. In still other embodiments, quantification may be performed by qPCR or semi-quantitative PCR.
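By way of a hedged example, the two bioinformatic quantification approaches mentioned above (counting unique circularized junctions, and scaling against a control of known copy number) might be sketched as below. The function names are hypothetical, and the linear scaling assumes that the fusion product and the control are amplified and sequenced with equal efficiency, which is an assumption of this sketch rather than a statement from the disclosure.

```python
def count_unique_junctions(junction_seqs):
    """Number of distinct circularized junction sequences observed."""
    return len(set(junction_seqs))


def estimate_template_copies(fusion_count, control_count, control_copies):
    """Scale a fusion count by a control present at a known copy number.

    Assumes counts are proportional to input copies for both the fusion
    product and the control (equal efficiency).
    """
    if control_count <= 0:
        raise ValueError("control_count must be positive")
    return fusion_count * control_copies / control_count


# Made-up values: 50 fusion junctions, 200 control junctions, and
# 1000 control template copies added to the reaction.
unique = count_unique_junctions(["AAGT", "AAGT", "CCTA"])
copies = estimate_template_copies(fusion_count=50, control_count=200,
                                  control_copies=1000)
```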
In embodiments, the one or more linear nucleic acid molecules are derived from a sample of the subject, optionally wherein the sample is an FFPE sample. In an example, FFPE samples were incubated with xylene and washed with ethanol to remove embedded wax, followed by treatment with proteinase K to permeabilize the tissue. In embodiments, the one or more linear nucleic acid molecules are derived from a liquid biopsy (e.g., plasma).
In embodiments, the polynucleotide fusion is a biomarker for cancer, autoimmune disease, primary immunodeficiency, or infectious disease. In embodiments, the polynucleotide fusion is a biomarker for cancer. In embodiments, the polynucleotide fusion is a biomarker for lymphoid malignancies. In embodiments, the polynucleotide fusion is a biomarker of primary immunodeficiency. In embodiments, the polynucleotide fusion is a biomarker for infectious disease. A "biomarker" is a substance associated with a particular property, such as a disease or condition. The change in biomarker levels may be associated with the risk or progression of the disease or the susceptibility of the disease to a given treatment.
In embodiments, the fusion gene causes a disease in a subject in whom the fusion gene is found. In embodiments, the fusion gene is associated with a disease. In embodiments, the disease is cancer, an autoimmune disease, a primary immunodeficiency, or an infectious disease. In some embodiments, the disease is an infectious disease, an autoimmune disease, a genetic disease, or cancer. In embodiments, the disease is an acute disease, a chronic disease (e.g., a disease that exists for more than 6 months), an idiopathic disease, or a syndrome (e.g., Down syndrome). In embodiments, the disease is a recurrent disease (e.g., a disease detectable after an undetectable period of time).
In embodiments, the infectious disease is a disease or disorder associated with infection from a pathogenic organism. In embodiments, the infectious disease is a. African sleeping disease (african trypanosomiasis), AIDS (acquired immunodeficiency syndrome), amebiasis, anaplasmosis, angiostromatosis, xenobiotic, anthrax, cryptosporidiosis, argentina hemorrhagic fever, ascariasis, aspergillosis, astrovirus infection, babesia, bacillus cereus infection, bacterial meningitis, bacterial pneumonia, bacterial vaginosis, bacteroides infection, pouchitis, bartonasis, belis ascariasis infection, BK virus infection, black rot, blastomycosis, bolivia hemorrhagic fever, botulism (and infant botulism), brazil hemorrhagic fever, brucellosis, black stills, burkholderia infection, bruise ulcers, calix virus infection (norovirus and saponaria variabilis), mycosis, candidiasis (white fungus disease; thrush), capillary nematodiasis, calicheasis, cat scratch disease, cellulitis, chagas disease (trypanosomiasis in the United states), chancre, varicella, chikungunya fever, chlamydia pneumoniae infection (taiwan acute respiratory pathogen or TWAR), cholera, blastomycosis, pot disease, clonorchiasis, clostridium difficile colitis, coccidioidomycosis, colorado Tick Fever (CTF), common cold (acute viral nasopharyngitis; acute rhinitis), 2019 coronavirus disease (COVID-19), creutzfeldt-Jakob disease (CJD), creutzfeldt-Jakob disease (CCHF), cryptococcosis, cryptosporidiosis, cryptosporidium, skin larval transfer (CLM), cyclosporine, cyst-tail, cytomegalovirus infection, dengue fever, chain-belt algae infection, binuclear amoeba, diphtheria, schizocephaliasis, melilosis, ebola hemorrhagic fever, echinococcosis, ehrlichiosis, enterobiasis (enterobiasis), enterococci infection, enterovirus infection, epidemic typhus, infectious erythema (fifth disease), infant eruption (sixth disease), fasciolopsis, gingiva, fatal Familial Insomnia (FFI), filariasis, food poisoning caused by clostridium 
perfringens, free living amoeba infection, clostridium infection, gas gangrene (clostridium necrosis), geotrichum, epidemic typhus, infectious erythema (fifth disease), infant eruption (sixth disease), fasciolopathy Gettman-Stlausle-Shen Kezeng syndrome (GSS), giardiasis, meliosis, jaw nematode disease, gonorrhea, inguinal granuloma (Du Nuofan disease), group A streptococcal infection, group B streptococcal infection, haemophilus influenzae infection, hand-foot-and-mouth disease (HFMD), hantavirus Pulmonary Syndrome (HPS), protoviral disease, helicobacter pylori infection, hemolytic Uremic Syndrome (HUS), hemorrhagic fever with renal syndrome (HFRS), hundla virus infection, hepatitis A, hepatitis B, hepatitis C, hepatitis B, hepatitis D, hepatitis E, herpes simplex, histoplasmosis, hookworm infection, human Bokavirus infection, human Ehrlichia disease, human Granulocytopenia (HGA), human metapneumovirus infection, human monocytic Epstein-Barr disease, human Papilloma Virus (HPV) infection, human parainfluenza virus infection, membranous taeniasis, epstein-Barr virus infectious mononucleosis (Mono), influenza (influenza), isospora, nakaki disease, keratitis, gold Geobacillus infection, kuru, laxafever, legionella (Legionella's disease), ponticke's fever, leishmaniasis, leprosy, leptospirosis, listeria, lyme (Leymbosch borreliosis), lyme filariasis (elephant's disease), lymphocytic choriomeningitis, malaria, marburg Hemorrhagic Fever (MHF), measles, middle East Respiratory Syndrome (MERS), melenoid (Whiter's disease), meningitis, meningococcal disease, postamblymatosis microsporidian, molluscum Contagiosum (MC), monkey pox, mumps, murine typhoid (typhoid), mycoplasma pneumonia, mycoplasma genitalium infection, podophyllosis, myiasis, neonatal conjunctivitis (neonate's eye), nippon virus infection, norovirus, variant Creutzfeldt-Jakob disease (vCJD, nvCJD), nocardia, onchocerciasis (river blindness), posttestosterone, paracoccidioidosis (southern metazosis), 
pneumonitis, pasteurella, head lice (head lice), body lice (body lice), pubic lice (ani, hair lice), pelvic Inflammatory Disease (PID), pertussis (tussilags), plague, pneumococcal infection, pneumoconiosis (PCP), pneumonia, poliomyelitis, prevotella infection, primary amenorrhea encephalitis (PAM), progressive multifocal leukoencephalopathy, psittacosis, Q fever, rabies, regressive fever, respiratory syncytial virus infection, rhinosporosis, rhinovirus infection, rickettsia pox, rift Valley Fever (RVF), chinesemetic fever (RMSF), rotavirus infection, rubella, salmonellosis, severe Acute Respiratory Syndrome (SARS), scabies, scarlet fever, schistosomiasis, septicemia, shigellosis (bacillary dysentery), shingles, smallpox, sporozoites, staphylococcal food poisoning, staphylococcal infection, round-wire disease, subacute sclerotic encephalitis, non-sexual syphilis, syphilis and yas taeniasis, tetanus (dental autism), contact-dyeing type sores (tinea barbae), tinea capitis (tinea capitis), tinea corporis (tinea corporis), tinea cruris, tinea manuum, tinea nigrum, tinea pedis, tinea unguium (onychomycosis), tinea versicolor (pityriasis versicolor), toxic Shock Syndrome (TSS), toxoplasmosis (ocular larva transitional syndrome (OLM)), toxoplasmosis (visceral larva transitional syndrome (VLM)), toxoplasmosis, trachoma, trichinosis, trichomoniasis, whipworm disease (whipworm infection), tuberculosis, tularemia, typhoid fever, ureaplasma urealyticum infection, valley fever, venezuelan equine encephalitis, wound vibrio vulnerae infection, vibrio parahaemolyticus enteritis, viral pneumonia, west nile fever, white hair sarcoidosis (white sores), yersinia pseudotuberculosis, yersinia, yellow fever, zis-bala, zika fever or binomiasis.
In embodiments, the disease is an autoimmune disease. In embodiments, the autoimmune disease is arthritis, rheumatoid arthritis, psoriatic arthritis, juvenile idiopathic arthritis, multiple sclerosis, systemic lupus erythematosus (SLE), myasthenia gravis, juvenile onset diabetes, type 1 diabetes, Guillain-Barre syndrome, Hashimoto's encephalitis, Hashimoto's thyroiditis, ankylosing spondylitis, psoriasis, Sjogren's syndrome, vasculitis, glomerulonephritis, autoimmune thyroiditis, Behcet's disease, Crohn's disease, ulcerative colitis, bullous pemphigoid, sarcoidosis, ichthyosis, Graves' ophthalmopathy, inflammatory bowel disease, Addison's disease, vitiligo, asthma, allergic asthma, acne vulgaris, celiac disease, chronic prostatitis, pelvic inflammatory disease, reperfusion injury, ischemia-reperfusion injury, stroke, transplant rejection, interstitial cystitis, atherosclerosis, scleroderma, or atopic dermatitis.
In embodiments, the autoimmune disease is achalasia, addison's disease, adult Shi Dier disease (Addison's disease), human agammaglobulins, alopecia areata, amyloidosis, ankylosing spondylitis, anti-GBM/anti-TBM nephritis, antiphospholipid syndrome, autoimmune angioedema, autoimmune familial autonomic nerve abnormality, autoimmune encephalomyelitis, autoimmune hepatitis, autoimmune Inner Ear Disease (AIED), autoimmune myocarditis, autoimmune oophoritis, autoimmune orchitis, autoimmune pancreatitis, autoimmune retinopathy, autoimmune urticaria, axons and neuronal neuropathy (an), bal, bullous disease (Bal amese), white plug disease (Behcet's disease), benign pemphigomphosis, bullous disease, karst disease (Castleman disease, CD), disease, gazewal disease (chase), chronic demyelinating disease (chronic demyelinating disease) (chronic myelopathy), chronic myelogenous inflammation (chronic demyelinating disease) Allergic granulomatosis syndrome (CSS) or Eosinophilic Granulomatosis (EGPA), cicatricial pemphigus, crohn's syndrome (Cogan's syndrome), condensed set disease, congenital heart block, coxsackie viral myocarditis (Coxsackie myocarditis), CREST syndrome, crohn's disease, dermatitis herpetiformis, dermatomyositis, devickers disease (Devic's disease) (neuromyelitis), discoid lupus, deler's syndrome, endometriosis, eosinophilic esophagitis (EoE), eosinophilic fasciitis, erythema nodosum, mixed condensed globulinemia, evans syndrome (Evans syndrome), fibromyalgia, fibrotic inflammation, giant cell arteritis, giant cell myositis, glomerulonephritis, pneumococcal syndrome (Goodyear's disease), schodder's disease, multiple sclerosis-finger-haemopoiesis, graves-barren's disease, graves-barren's syndrome (Graves-Barre) disease, HSP), herpes gestation or Pemphigoid Gestation (PG), hidradenitis Suppurativa (HS) (acne), hypogammaglobulinemia, igA nephropathy, igG 4-related sclerosing diseases, immune Thrombocytopenic Purpura (ITP), inclusion Body Myositis (IBM), interstitial 
Cystitis (IC), and, juvenile arthritis, juvenile diabetes (type 1 diabetes), juvenile Myositis (JM), kawasaki disease (Kawasaki disease), lambert-Eaton syndrome (Lambert-Eaton syndrome), leukocyte-fragmenting vasculitis, lichen planus, lichen sclerosus, conjunctivitis, linear IgA disease (LAD), lupus, chronic lyme disease (Lyme disease chronic), meniere's disease, microscopic multiple vasculitis (MPA), mixed Connective Tissue Disease (MCTD), silkworm erosion keratoulcer (Mooren's ulcer), mu Haer disease (Mucha-Habermann disease), multifocal Motor Neuropathy (MMN) or ncb, multiple sclerosis, myasthenia gravis, myositis, narcolepsy, neonatal lupus, neuromyelitis, PR neutropenia, ocular cicatrix, optic neuritis, recurrent rheumatism (PANDAS), PANDAS, secondary tumors), and nocturnal degeneration (nocturnal) of the blood of the human eye (PNH), pampers Luo Zeng syndrome (Parry Romberg syndrome), parson-Turner syndrome (parson-Turner syndrome), pemphigus, peripheral neuropathy, perivenous encephalomyelitis, pernicious Anemia (PA), POEMS syndrome, polyarteritis nodosa, type I, II, III polyadenylic syndrome, polymyalgia rheumatica, polymyositis, post myocardial infarction syndrome, post pericardial opening syndrome, primary biliary cirrhosis, primary sclerosing cholangitis, progesterone dermatitis, psoriasis, psoriatic arthritis, pure red cell aplastic anemia (PRCA), pyoderma gangrene, raynaud's phenomenon, reactive arthritis, reflex neurotrophic malnutrition, recurrent polyarthritis, polyarteritis nodosa (RLS), retroperitoneal fibrosis, rheumatic arthritis, sarcoidosis, schmidkinetosyndt, schmidday syndrome, scleroderma, sympathocritic picornase, sclerodermasyndrome), sperm and testis autoimmunity, stiff Person Syndrome (SPS), subacute Bacterial Endocarditis (SBE), sosaxogram syndrome (Susac 'ssyndrome), sympathogenic Ophthalmitis (SO), aortic inflammation (Takayasu's arttis), temporal arteritis/giant cell arteritis, thrombocytopenic purpura (TTP), thyroiditis (TED), 
Tolosa-Hunt syndrome (THS), transverse myelitis, type 1 diabetes, ulcerative colitis (UC), undifferentiated connective tissue disease (UCTD), uveitis, vasculitis, vitiligo, or Vogt-Koyanagi-Harada disease.
In embodiments, the disease is a genetic disease. In embodiments, the genetic disease is cystic fibrosis, alpha thalassemia, beta thalassemia, sickle cell anemia (sickle cell disease), Marfan syndrome, fragile X syndrome, Huntington's disease, or hemochromatosis.
In an embodiment, the amplification reaction further comprises: (a) one or more different first primers that specifically hybridize to different portions of the first strand of the first region; (b) for each different first primer, a different second primer that specifically hybridizes to a complement of a portion of the first strand of the first region, the complement being located 3' relative to the position at which the corresponding different first primer specifically hybridizes; and (c) for each different first primer, a different blocking oligonucleotide that specifically hybridizes to a portion of the first strand of the first region at a position 5' relative to the position at which the different first primer specifically hybridizes.
In embodiments, the method further comprises detecting one or more different polynucleotide fusions, each different polynucleotide fusion comprising a sequence of a different first region fused to a sequence of a different second region at a different fusion junction, wherein the amplification reaction further comprises a corresponding first primer, a corresponding second primer, and a corresponding blocking oligonucleotide for each different first region.
In embodiments, the polynucleotide fusion comprises a sequence of a first region fused to a sequence of a second region at a fusion junction, wherein the fusion is between two gene sequences and is referred to as a gene fusion. A fusion junction may represent a location where a first nucleotide sequence (e.g., a first gene sequence or gene fragment) meets or joins with a second nucleotide sequence (e.g., a second gene or gene fragment). In an embodiment, the polynucleotide fusion is a hybrid gene formed from two previously independent genes (or gene fragments). In some embodiments, the fusion junction is located between the sequence that specifically binds to the blocking element and the sequence that specifically binds to the first primer. In embodiments, the polynucleotide fusion comprises a gene fusion or a gene fragment of any of the following fusions: AGTRAP-BRAF, AKAP9-BRAF, ATIC-ALK, CCDC6-RET, CD74-NRG1, CD74-ROS1, CEP89-BRAF, CLCN6-BRAF, DCTN1-ALK, EML4-ALK, EZR-ROS1, FAM131B-BRAF, FCHSD1-BRAF, GATM-BRAF, GNAI1-BRAF, GOLGA5-RET, GOPC-ROS1, HIP1-ALK, HOOK3-RET, KIF5B-ALK, KIF5B-RET, KTN1-RET, LRIG3-ROS1, LSM14A-BRAF, MKRN1-BRAF, MSN-ALK, MYO5A-ROS1, NCOA4-RET, PCM1-RET, RANBP2-ALK, RELCH-RET, RNF130-BRAF, SDC4-ROS1, SLC34A2-ROS1, SLC3A2-NRG1, SLC45A3-BRAF, SQSTM1-ALK, STRN-ALK, TFG-ALK, TPM3-ROS1, TPR-ALK, TRIM24-BRAF, TRIM24-RET, TRIM27-RET, TRIM33-RET, VCL-ALK, WDCP-ALK, or ZCCHC8-ROS1.
In embodiments, the polynucleotide fusion comprises a gene fusion or a gene fragment of any of the following fusions: ACSL3-ETV1, ACTB-GLI1, AGPAT5-MCPH1, AGTRAP-BRAF, AKAP9-BRAF, ARID1A-MAST2, ATIC-ALK, BBS9-PKD1L1, BCR-JAK2, CBFA2T3-GLIS2, CCDC6-RET, CD74-NRG1, CD74-ROS1, CENPK-KMT2A, CEP89-BRAF, CLCN6-BRAF, COL1A1-PDGFB, COL1A2-PLAG1, CRTC3-MAML2, DCTN1-ALK, DDX5-ETV4, DHH-RHEBL1, DNAJB1-PRKACA, EIF3E-RSPO2, EIF3K-CYP39A1, EML4-ALK, EPC1-PHF1, ETV6-ITPR2, ETV6-JAK2, ETV6-PDGFRB, ETV6-RUNX1, EZR-ERBB4, EZR-ROS1, FAM131B-BRAF, FBXL18-RNF216, FCHSD1-BRAF, FUS-ATF1, FUS-CREB3L2, FUS-FEV, GATM-BRAF, GMDS-PDE8B, GNAI1-BRAF, GOLGA5-RET, GOPC-ROS1, HACL1-RAF1, HAS2-PLAG1, HIP1-ALK, HOOK3-RET, IL6R-ATP8B2, INTS4-GAB2, IRF2BP2-CDX1, JAZF1-PHF1, JAZF1-SUZ12, JPT1-USH1G, KIF5B-ALK, KIF5B-RET, KLK2-ETV1, KLK2-ETV4, KMT2A-ABI1, KMT2A-ACTN4, KMT2A-AFF3, KMT2A-AFF4, KMT2A-ARHGAP26, KMT2A-ARHGEF12, KMT2A-BTBD18, KMT2A-CASP8AP2, KMT2A-CBL, KMT2A-CEP170B, KMT2A-CIP2A, KMT2A-CREBBP, KMT2A-EEFSEC, KMT2A-ELL, KMT2A-EP300, KMT2A-EPS15, KMT2A-FOXO4, KMT2A-FRYL, KMT2A-GAS7, KMT2A-GMPS, KMT2A-GPHN, KMT2A-KNL1, KMT2A-LASP1, KMT2A-LPP, KMT2A-MAPRE1, KMT2A-MLLT11, KMT2A-MLLT3, KMT2A-MLLT6, KMT2A-MYO1F, KMT2A-NCKIPSD, KMT2A-NRIP3, KMT2A-PDS5A, KMT2A-PICALM, KMT2A-SARNP, KMT2A-SH3GL1, KMT2A-TET1, KMT2A-ZFYVE19, KTN1-RET, LIFR-PLAG1, LRIG3-ROS1, LSM14A-BRAF, MBOAT2-PRKCE, MBTD1-CXorf67, MEAF6-PHF1, MKRN1-BRAF, MN1-ETV6, MSN-ALK, MYO5A-ROS1, NAB2-STAT6, NCOA4-RET, NF1-ASIC2, NONO-TFE3, NOTCH1-GABBR2, NTN1-ACLY, NUP107-LGR5, NUP98-KDM5A, PAX3-FOXO1, PAX3-NCOA2, PAX5-JAK2, PAX7-FOXO1, PCM1-JAK2, PCM1-RET, PLA2R1-RBMS1, PLXND1-TMCC1, PML-RARA, PRCC-TFE3, RANBP2-ALK, RBM14-PACS1, RELCH-RET, RNF130-BRAF, SDC4-ROS1, SEC16A-NOTCH1, SFPQ-TFE3, SLC26A6-PRKAR2A, SLC34A2-ROS1, SLC3A2-NRG1, SLC45A3-BRAF, SLC45A3-ELK4, SLC45A3-ETV1, SLC45A3-ETV5, SND1-BRAF, SQSTM1-ALK, SRGAP3-RAF1, SS18-SSX1, SS18-SSX2, SS18-SSX4B, SS18L1-SSX1, STRN-ALK, TADA2A-MAST1, TBL1XR1-TP63, TCEA1-PLAG1, TCF3-PBX1, TFG-ALK, TPM3-ROS1, TPR-ALK, TRIM24-BRAF, TRIM24-RET, TRIM27-RET, TRIM33-RET, VCL-ALK, WDCP-ALK, YWHAE-NUTM2A, YWHAE-NUTM2B, ZC3H7B-BCOR, or ZCCHC8-ROS1. In embodiments, the polynucleotide fusion comprises a sequence of a first region fused to a sequence of a second region at a fusion junction, wherein the first region and the second region comprise different genes. In embodiments, the polynucleotide fusion comprises a gene fusion of CREBBP-SRGAP2B, DNAH-IKZF1, ETV6-SNUPN, or ETV6-NUFIP1. The genes described herein correspond to registered genes identified in the gene catalog of the National Center for Biotechnology Information (NCBI) of the National Library of Medicine, accessible at www.ncbi.nlm.nih.gov/gene/. Alternatively, the gene may be a fusion gene found in a database of known fusion genes, such as ChimerDB (see, e.g., Ye Eun Jang et al., Nucleic Acids Research, Volume 48, Issue D1, Pages D817-D824) or FusionGDB (see, e.g., Kim P and Zhou X, Nucleic Acids Research, 2019 Jan 8; 47(D1): D994-D1004), each of which is incorporated herein by reference.
In embodiments, the polynucleotide fusion comprises a sequence of a first region fused to a sequence of a second region at a fusion junction, wherein the first region comprises an ABI1 gene or part thereof, an ACLY gene or part thereof, an ACSL3 gene or part thereof, an ACTB gene or part thereof, an ACTN4 gene or part thereof, an AFF3 gene or part thereof, an AFF4 gene or part thereof, an AGPT 5 gene or part thereof, an AKAP9 gene or part thereof, an ALK gene or part thereof, an ARHGAP26 gene or part thereof, an ARHGEF12 gene or part thereof, an ARID1A gene or part thereof, an ASIC2 gene or part thereof, an ATF1 gene or part thereof, an ATIC gene or part thereof, an ATP8B2 gene or part thereof, a BBS9 gene or part thereof, a BCOR gene or part thereof, a BRAF gene or part thereof, a BTBD18 gene or part thereof, a CASP8AP2 gene or part thereof, a CBFA2T3 gene or part thereof, a CBL gene or part thereof, a CCDC6 gene or part thereof, a CD74 gene or part thereof, a CDX1 gene or part thereof, a BCR gene or part thereof, a BRAF 2 gene or part thereof, a BRF 2T3 gene or part thereof, a BL 18 gene or part thereof CENPK gene or a part thereof, CEP170B gene or a part thereof, CEP89 gene or a part thereof, CIP2A gene or a part thereof, CLCN6 gene or a part thereof, COL1A1 gene or a part thereof, COL1A2 gene or a part thereof, CREB3L1 gene or a part thereof, CREBBP gene or a part thereof, CRTC3 gene or a part thereof, CXorf67 gene or a part thereof, CYP39A1 gene or a part thereof, DCTN1 gene or a part thereof, CREB3L2 gene or a part thereof, CRBP gene or a part thereof, CRTC3 gene or a part thereof, CXorf67 gene or a part thereof, CYP39A1 gene or a part thereof, DCTN1 gene or a part thereof, CRTN 1 gene or a part thereof, CREB 3A2 gene or a part thereof, CEL 1B gene or a part thereof, CEL DDX5 gene or a part thereof, DHH gene or a part thereof, DNAJB1 gene or a part thereof, EEFSEC gene or a part thereof, EIF3E gene or a part thereof, EIF3K gene or a part thereof, ELK4 gene or 
a part thereof, ELL gene or a part thereof, EML4 gene or a part thereof, EP300 gene or a part thereof, EPC1 gene or a part thereof, EPS15 gene or a part thereof, ERBB4 gene or a part thereof, ETV1 gene or a part thereof, ETV4 gene or a part thereof, E1E 2 gene 1E 2 or a part E or a part E or part E or part or portions, ETV5 gene or a part thereof, ETV6 gene or a part thereof, EZR gene or a part thereof, FAM131B gene or a part thereof, FBXL18 gene or a part thereof, FCHSD1 gene or a part thereof, FEV gene or a part thereof, FOXO1 gene or a part thereof, FOXO4 gene or a part thereof, FRYL gene or a part thereof, FUS gene or a part thereof, GAB2 gene or a part thereof, GABBR2 gene or a part thereof, GAS7 gene or a part thereof, GATM gene or a part thereof, GLI1 gene or a part thereof, GLIS2 gene or a part thereof, GABBR2 gene or a part thereof, GAS7 gene or a part thereof, GATM gene or a part thereof, GABBR2 gene or a part thereof, GABBR 7 gene or a part thereof, GABBR2 gene or a part thereof, GATM gene or a part thereof GMDS gene or part thereof, GMPS gene or part thereof, GNAI1 gene or part thereof, GOLGA5 gene or part thereof, GOPC gene or part thereof, GPHN gene or part thereof, HACL1 gene or part thereof, HAS2 gene or part thereof, HIP1 gene or part thereof, HOOK3 gene or part thereof, IL6R gene or part thereof, INTS4 gene or part thereof, IRF2BP2 gene or part thereof, ITPR2 gene or part thereof, JAK2 gene or part thereof, JAZF1 gene or part thereof, and a JPT1 gene or portion thereof, a KDM5A gene or portion thereof, a KIF5B gene or portion thereof, a KLK2 gene or portion thereof, a KMT2A gene or portion thereof, a KNL1 gene or portion thereof, a KTN1 gene or portion thereof, a LGR5 gene or portion thereof, a LIFR gene or portion thereof, an LPP gene or portion thereof, a LRIG3 gene or portion thereof, a LSM14A gene or portion thereof, a MAml2 gene or portion thereof, a MApre1 gene or portion thereof, a MAST2 gene or portion thereof, a MBOAT2 gene or portion 
thereof, a td1 gene or portion thereof, a MCPH1 gene or portion thereof, a MEAF6 gene or portion thereof, a MLLT1 gene or portion thereof, a MLLT11 gene or portion thereof, a MLLT6 gene or portion thereof, a MN1 gene or portion thereof, a MSN 1 gene or portion thereof, a MYO1 gene or portion thereof, a myf 1 gene or portion thereof, a myla 2 gene or portion thereof, a psd 2 gene or portion thereof, a kib 2 or portion thereof, a k 1 gene or portion thereof, a, NCOA1 gene or a part thereof, NCOA2 gene or a part thereof, NCOA4 gene or a part thereof, NF1 gene or a part thereof, NONO gene or a part thereof, NOTCH1 gene or a part thereof, NRG1 gene or a part thereof, NRIP3 gene or a part thereof, NTN1 gene or a part thereof, NUP107 gene or a part thereof, NUP98 gene or a part thereof, NUTM2A gene or a part thereof, NUTM2B gene or a part thereof, PACS1 gene or a part thereof, PAX3 gene or a part thereof, PAX5 gene or a part thereof, PAX7 gene or a part thereof, PBX1 gene or a part thereof, PCM1 gene or a part thereof, PDE8B gene or a part thereof, PDGFB gene or a part thereof, PDS5A gene or a part thereof, PHF1 gene or a part thereof, PICALM gene or a part thereof, PKD1L1 gene or a part thereof, PLA2R1 gene or a part thereof, XAG 1 gene or a part thereof, PMND 1 gene or a part thereof, KACC 1 gene or a part thereof, PRC 2 or a part thereof; PRKCE gene or a part thereof, RAF1 gene or a part thereof, RANBP2 gene or a part thereof, RARA gene or a part thereof, RBM14 gene or a part thereof, RBMS1 gene or a part thereof, RELCH gene or a part thereof, RET gene or a part thereof, RHEBL1 gene or a part thereof, RNF130 gene or a part thereof, RNF216 gene or a part thereof, ROS1 gene or a part thereof, RSPO2 gene or a part thereof, RUNX1 gene or a part thereof, SARNP gene or a part thereof, SDC4 gene or a part thereof, SEC16A gene or a part thereof, SFPQ gene or a part thereof, SH3GL1 gene or a part thereof, SLC26A6 gene or a part thereof, SLC34A2 gene or a part thereof, SLC3A2 
gene or a part thereof, SLC45A3 gene or a part thereof, SND1 gene or a part thereof, SQSTM1 gene or a part thereof, SRGAP3 gene or a part thereof, SS18L1 gene or a part thereof, SSX2 or a part thereof, SSX4 or a part thereof, SSB 4 or a part thereof, or a part thereof STRN gene or a part thereof, SUZ12 gene or a part thereof, TADA2A gene or a part thereof, TBL1XR1 gene or a part thereof, TCEA1 gene or a part thereof, TCF3 gene or a part thereof, TET1 gene or a part thereof, TFE3 gene or a part thereof, TFG gene or a part thereof, TMCC1 gene or a part thereof, TP63 gene or a part thereof, TPM3 gene or a part thereof, TPR gene or a part thereof, TRIM24 gene or a part thereof, TRIM27 gene or a part thereof, TRIM33 gene or a part thereof, USH1G gene or a part thereof, VCL gene or a part thereof, WDCP gene or a part thereof, ywae gene or a part thereof, ZC3H7B gene or a part thereof, zchc 8 gene or a part thereof, or ZFYVE19 gene or a part thereof.
In embodiments, the polynucleotide fusion comprises a sequence of a first region fused to a sequence of a second region at a fusion junction, wherein the second region comprises an ABI1 gene or part thereof, an ACLY gene or part thereof, an ACSL3 gene or part thereof, an ACTN4 gene or part thereof, an AFF3 gene or part thereof, an AFF4 gene or part thereof, an AGPAT5 gene or part thereof, an AKAP9 gene or part thereof, an ALK gene or part thereof, an ARHGAP26 gene or part thereof, an ARHGEF12 gene or part thereof, an ARID1A gene or part thereof, an ASIC2 gene or part thereof, an ATF1 gene or part thereof, an ATIC gene or part thereof, an ATP8B2 gene or part thereof, a BBS9 gene or part thereof, a BCOR gene or part thereof, a BRAF gene or part thereof, a BTBD18 gene or part thereof, a CASP8AP2 gene or part thereof, a CBFA2 gene or part thereof, a CBL gene or part thereof, a CCDC6 gene or part thereof, a CD74 gene or part thereof, a CDX1 gene or part thereof, a BX 2 gene or part thereof, a BRAF 8B2 gene or part thereof, a BTBD18 gene or part thereof, a CASP8AP2 gene or part thereof, a CBA 3 gene or part thereof CENPK gene or a part thereof, CEP170B gene or a part thereof, CEP89 gene or a part thereof, CIP2A gene or a part thereof, CLCN6 gene or a part thereof, COL1A1 gene or a part thereof, COL1A2 gene or a part thereof, CREB3L1 gene or a part thereof, CREBBP gene or a part thereof, CRTC3 gene or a part thereof, CXorf67 gene or a part thereof, CYP39A1 gene or a part thereof, DCTN1 gene or a part thereof, DDX5 gene or a part thereof, DHH gene or a part thereof, DNAJB1 gene or a part thereof, EEFSEC gene or a part thereof, EIF3E gene or a part thereof, EIF3K gene or a part thereof, ELK4 gene or a part thereof, ELL gene or a part thereof, EML4 gene or a part thereof, EPC 300 gene or a part thereof, EPS1 gene or a part thereof, 15 gene or a part thereof, BB4 gene or a part thereof, ETV1 gene or a part thereof, ETV4 gene or a part thereof, or a part thereof ETV6 gene or a 
part thereof, EZR gene or a part thereof, FAM131B gene or a part thereof, FBXL18 gene or a part thereof, FCHSD1 gene or a part thereof, FEV gene or a part thereof, FOXO1 gene or a part thereof, FOXO4 gene or a part thereof, frayl gene or a part thereof, FUS gene or a part thereof, GAB2 gene or a part thereof, GABBR2 gene or a part thereof, GAS7 gene or a part thereof, GATM gene or a part thereof, GLI1 gene or a part thereof, GLIs2 gene or a part thereof, GMDS gene or a part thereof, GMPS gene or a part thereof, GNAI1 gene or a part thereof, GOLGA5 gene or a part thereof, GOLGA gene or a part thereof, GPHN 1 gene or a part thereof, HACL1 gene or a part thereof, hasa HAS2 gene or a part thereof, HIP1 gene or a part thereof, hos 3 gene or a part thereof, IL6R gene or a part thereof, INTS4 gene or a part thereof, IRF2BP2 gene or a part thereof, JAK2 gene or a part thereof, zpr 1 or a part thereof, JAK2 gene or a part thereof, zpr 1 or a part thereof; KDM5A gene or a part thereof, KIF5B gene or a part thereof, KLK2 gene or a part thereof, KMT2A gene or a part thereof, KNL1 gene or a part thereof, KTN1 gene or a part thereof, LASP1 gene or a part thereof, LGR5 gene or a part thereof, LIFR gene or a part thereof, LPP gene or a part thereof, LRIG3 gene or a part thereof, LSM14A gene or a part thereof, MAML2 gene or a part thereof, MAPRE1 gene or a part thereof, MAST2 gene or a part thereof, LRIG3 gene or a part thereof, MAST2 gene or a part thereof, MBOAT2 gene or a part thereof, MBTD1 gene or a part thereof, MCPH1 gene or a part thereof, MEAF6 gene or a part thereof, MKRN1 gene or a part thereof, MLLT11 gene or a part thereof, MLLT3 gene or a part thereof, MLLT6 gene or a part thereof, MN1 gene or a part thereof, MSN gene or a part thereof, MYO1F gene or a part thereof, MYO5A gene or a part thereof, NAB2 gene or a part thereof, NCKIPSD gene or a part thereof, NCOA1 gene or a part thereof, MSN gene or a part thereof, and, NCOA2 gene or a part thereof, NCOA4 gene or a part 
thereof, NF1 gene or a part thereof, NONO gene or a part thereof, NOTCH1 gene or a part thereof, NRG1 gene or a part thereof, NUP 3 gene or a part thereof, NUP107 gene or a part thereof, NUP98 gene or a part thereof, NUTM2A gene or a part thereof, NUTM2B gene or a part thereof, PACS1 gene or a part thereof, PAX3 gene or a part thereof, PAX5 gene or a part thereof, PAX7 gene or a part thereof, PBX1 gene or a part thereof, PCM1 gene or a part thereof, PDE8B gene or a part thereof, PDGFRB gene or a part thereof, PDS5A gene or a part thereof, PHF1 gene or a part thereof, PICAL gene or a part thereof, PLXND1 gene or a part thereof, PLR 2A gene or a part thereof, PLAG1 gene or a part thereof, PML gene or a part thereof, KACC 1 gene or a part thereof, PRCC 1 gene or a part thereof, PRC 2 gene or a part thereof, PRC 2 or a part thereof; RAF1 gene or a part thereof, RANBP2 gene or a part thereof, RARA gene or a part thereof, RBM14 gene or a part thereof, RBMS1 gene or a part thereof, RELCH gene or a part thereof, RET gene or a part thereof, RHEBL1 gene or a part thereof, RNF130 gene or a part thereof, RNF216 gene or a part thereof, ROS1 gene or a part thereof, RSPO2 gene or a part thereof, RUNX1 gene or a part thereof, SARNP gene or a part thereof, SDC4 gene or a part thereof, SEC16A gene or a part thereof, SFPQ gene or a part thereof, SH3GL1 gene or a part thereof, SLC26A6 gene or a part thereof, SLC34A2 gene or a part thereof, SLC3A2 gene or a part thereof, SLC45A3 gene or a part thereof, SND1 gene or a part thereof, SQSTM1 gene or a part thereof, SRGAP 18 gene or a part thereof, SS18L1 gene or a part thereof, SSX2 gene or a part thereof, SSX4 or a part thereof, STR 6A6 or a part thereof SUZ12 gene or a part thereof, TADA2A gene or a part thereof, TBL1XR1 gene or a part thereof, TCEA1 gene or a part thereof, TCF3 gene or a part thereof, TET1 gene or a part thereof, TFE3 gene or a part thereof, TFG gene or a part thereof, TMCC1 gene or a part thereof, TP63 gene or a part 
thereof, TPM3 gene or a part thereof, TPR gene or a part thereof, TRIM24 gene or a part thereof, TRIM27 gene or a part thereof, TRIM33 gene or a part thereof, USH1G gene or a part thereof, VCL gene or a part thereof, WDCP gene or a part thereof, YWHAE gene or a part thereof, ZC3H7B gene or a part thereof, ZCC 8 gene or a part thereof, or ZFYVE19 gene or a part thereof.
In embodiments, the fusion junction may be an unknown fusion junction event, as the methods disclosed herein do not require a priori knowledge of the exact nature of the genomic rearrangement to detect and characterize the fusion. In an embodiment, only the sequence of the first region is known prior to cyclization. In an embodiment, only the sequence of the second region is known prior to cyclization.
In embodiments, the first region and the second region are located on the same chromosome. In embodiments, the first region and the second region are located on different chromosomes.
In embodiments, the polynucleotide fusion comprises a gene encoding a kinase domain or a portion thereof. In embodiments, the polynucleotide fusion comprises a gene fusion of BCL1-JH, BCL2-JH, or MYC-IGL.
In embodiments, the polynucleotide fusion comprises a B-cell or T-cell intrachromosomal rearrangement. In embodiments, the polynucleotide fusion comprises a B cell intrachromosomal rearrangement. In embodiments, the polynucleotide fusion comprises a T cell intrachromosomal rearrangement.
In embodiments, the polynucleotide fusion comprises the following fusion: a rearranged T cell antigen receptor or fragment thereof, a T cell receptor alpha variable (TRAV) gene or fragment thereof, a T cell receptor alpha joining (TRAJ) gene or fragment thereof, a T cell receptor alpha constant (TRAC) gene or fragment thereof, a T cell receptor beta variable (TRBV) gene or fragment thereof, a T cell receptor beta diversity (TRBD) gene or fragment thereof, a T cell receptor beta joining (TRBJ) gene or fragment thereof, a T cell receptor beta constant (TRBC) gene or fragment thereof, a T cell receptor gamma variable (TRGV) gene or fragment thereof, a T cell receptor gamma joining (TRGJ) gene or fragment thereof, a T cell receptor gamma constant (TRGC) gene or fragment thereof, a T cell receptor delta variable (TRDV) gene or fragment thereof, a T cell receptor delta diversity (TRDD) gene or fragment thereof, or a T cell receptor delta constant (TRDC) gene or fragment thereof.
In embodiments, the polynucleotide fusion comprises the following fusion: a rearranged B cell antigen receptor or fragment thereof, an IGHV gene or fragment thereof, an IGHD gene or fragment thereof, an IGHJ gene or fragment thereof, an IGHJC gene or fragment thereof, an IGKV gene or fragment thereof, an IGKJ gene or fragment thereof, an IGKC gene or fragment thereof, an IGLV gene or portion thereof, an IGLJ gene or portion thereof, an IGLC gene or fragment thereof, an IGK kappa deletion element or portion thereof, or an IGK intron enhancer element or portion thereof. In embodiments, the polynucleotide fusion comprises the following fusion: ALK gene or a part thereof, BRAF gene or a part thereof, EGFR gene or a part thereof, ERBB2 gene or a part thereof, KRAS gene or a part thereof, MET gene or a part thereof, NRG1 gene or a part thereof, FGFR2 gene or a part thereof, FGFR3 gene or a part thereof, NTRK1 gene or a part thereof, NTRK2 gene or a part thereof, NTRK3 gene or a part thereof, RET gene or a part thereof, or ROS1 gene or a part thereof.
III. Compositions and kits
In one aspect, a composition is provided that includes a blocking element, a first primer, and a second primer. In embodiments, the composition further comprises an annealing solution (alternatively referred to herein as a hybridization buffer or hybridization solution). In embodiments, the annealing solution comprises an aqueous solution, which may contain a buffer (e.g., saline-sodium citrate (SSC), tris(hydroxymethyl)aminomethane, or "Tris"), an aqueous salt (e.g., KCl or (NH4)2SO4), chelating agents (e.g., EDTA), detergents, surfactants, crowding agents, or stabilizers (e.g., PEG, Tween-20, BSA). In an embodiment, the annealing solution comprises Tris and the pH is maintained at about 8.0 to about 9.0. In an embodiment, the composition comprises an extension solution. In embodiments, the extension solution comprises an aqueous solution, which may contain a buffer (e.g., saline-sodium citrate (SSC), tris(hydroxymethyl)aminomethane, or "Tris"), aqueous salts (e.g., KCl or (Mg)2SO4), nucleotides, polymerases, detergents, chelating agents (e.g., EDTA), surfactants, crowding agents, or stabilizers (e.g., PEG, Tween-20, BSA). In embodiments, the composition further comprises an additive that reduces the denaturation temperature of DNA. In embodiments, the composition comprises an additive such as betaine, dimethyl sulfoxide (DMSO), ethylene glycol, formamide, glycerol, guanidine thiocyanate, 4-methylmorpholine 4-oxide (NMO), or mixtures thereof. In embodiments, the composition further comprises a denaturant. The denaturant may be acetic acid, hydrochloric acid, nitric acid, formamide, guanidine, sodium salicylate, sodium hydroxide, dimethyl sulfoxide (DMSO), propylene glycol, urea, or mixtures thereof.
In embodiments, the composition comprises a circularization solution (e.g., a circularizing agent). In an embodiment, the circularization solution comprises a circularization ligase, e.g., CircLigase™, Taq DNA ligase, HiFi Taq DNA ligase, T4 ligase, or Ampligase DNA ligase. In an embodiment, the circularization solution comprises a splint primer. "Splint primer" is used in accordance with its plain and ordinary meaning and refers to a primer having two or more sequences complementary to two or more portions of a template polynucleotide. In embodiments, the two sequences are complementary to adaptor sequences, wherein one sequence binds to (i.e., hybridizes to) the 5' portion of the template polynucleotide and the other binds to (i.e., hybridizes to) the 3' portion of the template polynucleotide. In an embodiment, the circularization solution comprises a crowding agent, such as PEG (e.g., 20%-25% PEG-8000). In an embodiment, the circularization solution comprises polyethylene glycol (PEG), such as PEG 4000 or PEG 6000, dextran, and/or Ficoll.
In embodiments, the splint primers are about 5 to about 25 nucleotides in length. In embodiments, the splint primers are about 10 to about 40 nucleotides in length. In embodiments, the splint primers are about 5 to about 100 nucleotides in length. In embodiments, the splint primers are about 20 to about 200 nucleotides in length. In embodiments, the splint primers are about or at least about 5, 6, 7, 8, 9, 10, 12, 15, 18, 20, 25, 30, 35, 40, 50 or more nucleotides in length. In embodiments, the splint primers are about or at least about 10 nucleotides in length. In embodiments, the splint primers are about or at least about 15 nucleotides in length. In embodiments, the splint primers are about or at least about 25 nucleotides in length.
In one aspect, a kit is provided comprising: a circularizing agent, wherein the circularizing agent is capable of binding the 5 'and 3' ends of a linear nucleic acid molecule; a blocking element capable of binding to one or more circular polynucleotides; a first primer and a second primer; and a polymerase. In an embodiment, the first primer and the second primer form a primer set. In an embodiment, the kit comprises a plurality of primer sets. In embodiments, the kit comprises 5, 10, 20, 25, 50 or more primer sets.
In an embodiment, the kit comprises at least 22 different primers, for example one forward primer (1F) and six reverse primers (6R) for the IGH locus; three forward primers (3F) and six reverse primers (6R) for the IGK locus; and one forward primer (1F) and five reverse primers (5R) for the IGL locus. In an embodiment, the kit comprises about 18 blocking elements (i.e., 18 blocking elements targeting 18 different regions). In an embodiment, the kit comprises primers targeting 7 different sequences of the IGH locus. In an embodiment, the kit comprises primers targeting 9 different sequences of the IGK locus. In an embodiment, the kit comprises primers targeting 6 different sequences of the IGL locus. In embodiments, the kit comprises a plurality of different populations of blocking elements, each population of blocking elements binding to a particular sequence.
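The primer counts in the example panel above can be tallied with a short sketch. The dictionary layout and function names below are illustrative assumptions, not part of the disclosure; only the per-locus forward and reverse counts are taken from the text.

```python
# Illustrative tally of the example primer panel described above.
# Only the forward/reverse counts per locus come from the text;
# the data layout and helper names are hypothetical.
PANEL = {
    "IGH": {"forward": 1, "reverse": 6},
    "IGK": {"forward": 3, "reverse": 6},
    "IGL": {"forward": 1, "reverse": 5},
}


def primers_per_locus(panel):
    """Return the number of distinct primers targeting each locus."""
    return {locus: c["forward"] + c["reverse"] for locus, c in panel.items()}


def total_primers(panel):
    """Return the total number of distinct primers in the panel."""
    return sum(primers_per_locus(panel).values())


if __name__ == "__main__":
    print(primers_per_locus(PANEL))
    print(total_primers(PANEL))
```

The per-locus sums (7, 9, and 6) match the numbers of targeted sequences stated for the IGH, IGK, and IGL loci, and together they give the 22 primers of the example.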
In one aspect, a kit is provided containing the components necessary to perform the methods as described herein, including the examples. Typically, the kit comprises one or more containers that provide the composition and one or more additional reagents (e.g., buffers suitable for polynucleotide extension). The kit may also comprise template nucleic acids (DNA and/or RNA), one or more primer polynucleotides, nucleotides (including, for example, deoxyribonucleotides, ribonucleotides, labeled nucleotides, and/or modified nucleotides), buffers, salts, and/or labels (e.g., fluorophores). In embodiments, the kit further comprises instructions. In embodiments, the kit comprises one or more housings (e.g., cassettes, bottles, cartridges) containing the relevant reagents and/or support materials. In embodiments, the kit comprises components useful for circularizing a template polynucleotide using chemical ligation techniques. In embodiments, the kit comprises components useful for circularizing the template polynucleotide using a ligase (e.g., CircLigase™ ligase, Taq DNA ligase, HiFi Taq DNA ligase, T4 DNA ligase, or Ampligase DNA ligase). In embodiments, the ligase is an RNA-dependent DNA ligase (e.g., SplintR ligase). For example, such a kit further comprises the following components: (a) a reaction buffer that controls pH and provides an optimized salt composition for the ligase (e.g., CircLigase™ ligase, Taq DNA ligase, HiFi Taq DNA ligase, T4 DNA ligase, or Ampligase DNA ligase), and (b) ligase cofactors. In embodiments, the kit further comprises instructions for its use.
In an embodiment, the kit comprises a plurality of primers, wherein the primers are capable of hybridizing to linear nucleic acid molecules. Nucleic acid hybridization techniques can be used to assess the hybridization specificity of the primers described herein. Hybridization techniques are well known in the art. For example, suitable moderately stringent conditions for testing hybridization of a polynucleotide as provided herein to other polynucleotides comprise pre-washing in a solution of 5x SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0); hybridizing in 5x SSC at 50 °C to 60 °C; and then washing twice at 65 °C for 20 minutes with each of 2x, 0.5x, and 0.2x SSC containing 0.1% SDS.
In an embodiment, the kit comprises a primer set. In an embodiment, the kit comprises a plurality of primer sets. The number of primers in the first set may be the same as or different from the number of primers in the second set. As used herein, "primer set" or "primer pair" refers to two or more primers that target two or more regions of a polynucleotide. Typically, the primer set comprises a first primer that hybridizes to a 5' portion of the polynucleotide and a second primer that hybridizes to a 3' portion of the polynucleotide. For example, the forward primer and the reverse primer flank the target region of the polynucleotide, and the forward primer and the reverse primer are collectively referred to as a primer set. In an embodiment, the kit comprises a first set of "upstream" or "forward" primers and a second set of "downstream" or "reverse" primers. In an embodiment, the kit further comprises forward and reverse primer sets for specifically amplifying recombinant nucleic acids encoding IgH (VDJ), IgH (DJ), and IgK. In some embodiments, the kit further comprises forward and reverse primer sets that specifically amplify recombinant nucleic acids encoding TCRβ, TCRδ, and TCRγ. In embodiments, the kit comprises a plurality of V segment primers (i.e., primers having a sequence complementary to the V coding region) and a plurality of J segment primers (i.e., primers having a sequence complementary to the J coding region), wherein the plurality of V segment primers and the plurality of J segment primers amplify substantially all combinations of V segments and J segments of the rearranged immunoreceptor locus. Substantially all combinations means at least 95%, 96%, 97%, 98%, 99% or more of all combinations of V and J segments of rearranged immunoreceptor loci. In certain embodiments, the plurality of V segment primers and the plurality of J segment primers amplify all combinations of V segments and J segments of the rearranged immunoreceptor locus.
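As a rough illustration of the "substantially all combinations" criterion, the following sketch computes the fraction of V/J segment combinations that a primer panel amplifies and compares it with the 95% lower bound given above. The segment labels, the set of amplified pairs, and the function names are hypothetical; how a pair is judged to be amplified is outside the scope of this sketch.

```python
from itertools import product


def vj_coverage(v_segments, j_segments, amplified_pairs):
    """Fraction of all V/J combinations present in amplified_pairs."""
    all_pairs = set(product(v_segments, j_segments))
    if not all_pairs:
        raise ValueError("at least one V segment and one J segment are required")
    covered = all_pairs & set(amplified_pairs)
    return len(covered) / len(all_pairs)


def covers_substantially_all(v_segments, j_segments, amplified_pairs, threshold=0.95):
    """True if coverage meets the threshold (0.95 is the lowest value recited)."""
    return vj_coverage(v_segments, j_segments, amplified_pairs) >= threshold


if __name__ == "__main__":
    # Hypothetical segment labels: 5 V segments x 4 J segments = 20 combinations.
    v = ["V1", "V2", "V3", "V4", "V5"]
    j = ["J1", "J2", "J3", "J4"]
    amplified = set(product(v, j)) - {("V5", "J4")}  # one combination missed
    print(vj_coverage(v, j, amplified))
    print(covers_substantially_all(v, j, amplified))
```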
In embodiments, the primer may comprise or be at least about 15 nucleotides long, having a sequence identical to or complementary to a contiguous sequence of 15 nucleotides in length of the target V or J segment (i.e., a portion of a genomic polynucleotide encoding a V or J region polypeptide). Longer primers, such as about 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 45, or 50 nucleotide long primers having a sequence identical to or complementary to the contiguous sequence of the polynucleotide segment encoding the target V or J region, may also be used in the methods and kits described herein. In an embodiment, the kit comprises an inwardly facing primer. In an embodiment, the kit comprises an outward facing primer. The primer set may comprise more than two different primers, e.g., one forward primer (1F) and six reverse primers (6R) of the IGH locus, collectively referred to as a primer set for the IGH locus.
In embodiments, the kit further comprises forward and reverse primer sets for amplifying one or more target sequences comprising single nucleotide variants, insertions, deletions, internal tandem repeats, and/or copy number variants. In embodiments, the kit further comprises forward and reverse primer sets for amplifying one or more target sequences comprising one or more single nucleotide variants, one or more insertions, one or more deletions, one or more internal tandem repeats, or one or more copy number variants.
In embodiments, the kit comprises at least 2, 4, 6, 8, 10, 20, 40, 60, 80, 100, 120, 140, 160, 180, 200 or more primer sets. In embodiments, the kit comprises 2 to 10, 10 to 40, 40 to 80, 80 to 150, 150 to 300 or more primer sets. The number of primer sets provided in the kit can be tailored for a particular application, e.g., detecting a known number of recombinant nucleic acids, and/or detecting a known number of single nucleotide variants, insertions, deletions, internal tandem repeats, and/or copy number variants. In embodiments, the kit comprises a plurality of primer sets for amplifying a single genomic feature.
In an embodiment, the kit comprises a sequencing polymerase and one or more amplification polymerases. In embodiments, the sequencing polymerase is capable of incorporating modified nucleotides. In an embodiment, the polymerase is a DNA polymerase. In embodiments, the DNA polymerase is a Pol I DNA polymerase, Pol II DNA polymerase, Pol III DNA polymerase, Pol IV DNA polymerase, Pol V DNA polymerase, Pol β DNA polymerase, Pol μ DNA polymerase, Pol λ DNA polymerase, Pol σ DNA polymerase, Pol α DNA polymerase, Pol δ DNA polymerase, Pol ε DNA polymerase, Pol η DNA polymerase, Pol ι DNA polymerase, Pol κ DNA polymerase, Pol ζ DNA polymerase, Pol γ DNA polymerase, Pol θ DNA polymerase, Pol ν DNA polymerase, or a thermophilic nucleic acid polymerase (e.g., Therminator γ, 9°N polymerase (exo-), Therminator II, Therminator III, or Therminator IX). In embodiments, the DNA polymerase is a thermophilic nucleic acid polymerase. In embodiments, the DNA polymerase is a modified archaeal DNA polymerase. In embodiments, the polymerase is a reverse transcriptase. In embodiments, the polymerase is a mutant P. abyssi polymerase (e.g., a mutant P. abyssi polymerase as described in WO 2018/148723 or WO 2020/056044, each of which is incorporated herein by reference for all purposes). In an embodiment, the kit comprises a strand-displacing polymerase. In embodiments, the kit comprises a strand-displacing polymerase, such as a phi29 polymerase, a Bst polymerase (e.g., Bst LF), a phi29 mutant polymerase, or a thermostable phi29 mutant polymerase.
In an embodiment, the kit comprises a buffer solution. Typically, the buffer solutions contemplated herein are made from weak acids and their conjugate bases or weak bases and their conjugate acids. For example, sodium acetate and acetic acid are buffers that may be used to form an acetate buffer. Other examples of buffers that may be used to prepare the buffer solution include, but are not limited to, Tris, Bicine, Tricine, HEPES, TES, MOPS, MOPSO, and PIPES. In addition, other buffers that may be used in enzymatic, hybridization, and detection reactions are known in the art. In an embodiment, the buffer solution may comprise Tris. With respect to the embodiments described herein, the pH of the buffer solution may be adjusted to allow for any of the described reactions. In some embodiments, the pH of the buffer solution may be greater than pH 7.0, greater than pH 7.5, greater than pH 8.0, greater than pH 8.5, greater than pH 9.0, greater than pH 9.5, greater than pH 10, greater than pH 10.5, greater than pH 11.0, or greater than pH 11.5. In other embodiments, the pH of the buffer solution may be in the range of, for example, about pH 6 to about pH 9, about pH 8 to about pH 10, or about pH 7 to about pH 9. In embodiments, the buffer solution may comprise one or more divalent cations. Examples of divalent cations may include, but are not limited to, Mg2+, Mn2+, Zn2+, and Ca2+. In embodiments, the buffer solution may contain one or more divalent cations in a concentration sufficient to allow hybridization of the nucleic acids. In an embodiment, the kit comprises an annealing solution, an extension solution, and a chemical denaturant. In an embodiment, the kit further comprises an internal standard comprising a plurality of nucleic acids having a length and composition representative of the target nucleic acid, wherein the internal standard is provided at a known concentration.
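The relationship between a weak acid/conjugate base pair and the resulting pH, as in the acetate buffer mentioned above, is commonly estimated with the Henderson-Hasselbalch equation, pH = pKa + log10([base]/[acid]). The sketch below is illustrative only: the function name is hypothetical, the pKa of acetic acid (about 4.76 at 25 °C) is a textbook value rather than a value from this disclosure, and the equation is an approximation that ignores ionic strength and temperature effects.

```python
import math


def buffer_ph(pka, base_conc, acid_conc):
    """Estimate buffer pH with the Henderson-Hasselbalch approximation.

    base_conc and acid_conc are the conjugate base and weak acid
    concentrations in the same units (e.g., mol/L).
    """
    if base_conc <= 0 or acid_conc <= 0:
        raise ValueError("concentrations must be positive")
    return pka + math.log10(base_conc / acid_conc)


if __name__ == "__main__":
    ACETIC_ACID_PKA = 4.76  # textbook value at 25 degrees C (assumption)
    # Equal parts sodium acetate and acetic acid: pH equals the pKa.
    print(buffer_ph(ACETIC_ACID_PKA, 0.1, 0.1))
    # Ten-fold excess of conjugate base raises the pH by about one unit.
    print(buffer_ph(ACETIC_ACID_PKA, 1.0, 0.1))
```

The same function applies to a weak base buffer such as Tris when given the pKa of its conjugate acid.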
The kit may further comprise one or more additional containers comprising PCR and sequencing buffers, diluents, subject sample extraction means (e.g., syringe, swab, etc.), and package inserts with instructions for use. Additionally, labels with instructions for use, such as those described above, may be provided on the containers; and/or instructions and/or other information may also be contained on the insert contained with the kit; and/or by the website address provided therein. The kit may also contain laboratory tools such as sample tubes, plate sealers, microcentrifuge tube openers, labels, magnetic particle separators, foam inserts, ice bags, dry ice bags, insulating materials, and the like. The kit may further comprise a pre-packaged or dedicated functionalized substrate as described herein to amplify and/or detect the library molecules. In embodiments, the substrate may comprise a surface suitable for performing a sequencing reaction therein.
In one aspect, a kit is provided, wherein the kit comprises: i) an enzyme that circularizes a nucleic acid (e.g., a circularizing agent as described herein, such as a thermostable ATP-dependent ligase that catalyzes intramolecular ligation of ssDNA templates with a 5'-phosphate and a 3'-hydroxyl); ii) a plurality of oligonucleotide primers; iii) a plurality of blocking elements (e.g., blocking elements as described herein); iv) a polymerase (e.g., a non-strand-displacing polymerase); and v) a plurality of nucleotides (e.g., dNTPs for amplification, extension, and/or sequencing in a suitable buffer).
In embodiments, the plurality of oligonucleotide primers comprises at least 7 primers (e.g., for the IGH locus). In an embodiment, a subset of the plurality of primers all target a joining gene. In embodiments, the plurality of oligonucleotide primers comprises at least two different primer populations (e.g., first and second primer pairs, or primer sets). In embodiments, the plurality of oligonucleotide primers comprises about 1, 2, 3, 4, 5, 10, 15, 25, 50, 75, 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900, or 1000 different primer sets. In embodiments, each primer set is provided at a concentration of about 25 nM to about 200 nM. In an embodiment, each primer set is provided at a concentration of about 100 nM. In an embodiment, one blocking element is provided per primer set.
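To relate the primer concentrations above to a bench volume, the standard dilution relation C1 x V1 = C2 x V2 can be applied. The sketch below is a hedged illustration: the 10 µM stock concentration and 50 µL reaction volume are assumptions chosen for the example and do not appear in the disclosure, and the function name is hypothetical.

```python
def stock_volume_ul(final_nm, reaction_ul, stock_nm):
    """Volume of primer stock (in microliters) needed for a target final concentration.

    Applies C1 * V1 = C2 * V2 with concentrations in nM and volumes in microliters.
    """
    if stock_nm <= 0 or reaction_ul <= 0 or final_nm < 0:
        raise ValueError("stock and reaction volume must be positive; final must be non-negative")
    if final_nm > stock_nm:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_nm * reaction_ul / stock_nm


if __name__ == "__main__":
    STOCK_NM = 10_000   # assumed 10 uM primer stock
    REACTION_UL = 50    # assumed 50 uL reaction volume
    for final_nm in (25, 100, 200):  # concentrations recited in the text
        print(final_nm, "nM ->", stock_volume_ul(final_nm, REACTION_UL, STOCK_NM), "uL")
```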
In an embodiment, the plurality of blocking elements comprises at least two different populations of blocking elements. In embodiments, the plurality of blocking elements comprises at least 6 different blocking elements (e.g., for the IGH locus, 6 blocking elements are used to target each joining gene).
In embodiments, the polymerase is a high-fidelity DNA polymerase, Taq DNA polymerase, Bst DNA polymerase, T7 DNA polymerase, Sulfolobus DNA polymerase, or DNA polymerase I.
In embodiments, the kit further comprises a fragmenting enzyme (e.g., an enzyme capable of fragmenting a high molecular weight DNA sample into about 200-300bp DNA fragments). In some embodiments, the primers are used in a single-pool PCR reaction. In other embodiments, the primers are used in a multiplex PCR reaction.
In embodiments, the kit further comprises a restriction enzyme or CRISPR/Cas9 protein for depleting wild-type (WT) DNA circles. For example, in an embodiment, WT DNA-specific depletion is mediated by a WT DNA-specific oligonucleotide (e.g., a blocking element); that is, Cas9 is guided by a "blocker" guide RNA (i.e., the blocking element is a guide RNA) to linearize the WT DNA circles, preventing their exponential amplification in subsequent steps. In embodiments, the kit further comprises a plurality of adaptors. In embodiments, the kit further comprises instructions.
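The practical effect of preventing exponential amplification of WT templates can be illustrated with an idealized toy model. The model is an assumption made for illustration and is not taken from the disclosure: fusion-containing templates are treated as doubling each cycle, whereas blocked or linearized WT templates are treated as accumulating at most linearly (one additional copy per starting template per cycle). Function names and the efficiency parameter are hypothetical.

```python
def exponential_copies(start, cycles, efficiency=1.0):
    """Copies after PCR-like cycling when both primers are productively extended."""
    return start * (1 + efficiency) ** cycles


def linear_copies(start, cycles, efficiency=1.0):
    """Copies when each cycle adds at most one product per starting template."""
    return start * (1 + efficiency * cycles)


def fold_enrichment(cycles, efficiency=1.0):
    """Ratio of exponential to linear accumulation for equal starting amounts."""
    return exponential_copies(1, cycles, efficiency) / linear_copies(1, cycles, efficiency)


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, exponential_copies(1, n), linear_copies(1, n), fold_enrichment(n))
```

Under these simplifying assumptions, even a modest number of cycles yields a large excess of fusion-derived product over WT-derived product; real reactions deviate from this because efficiency is below 1 and blocking is not perfect.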
In an embodiment, the kit further comprises a blocking element comprising biotin. In embodiments, the kit further comprises a blocking element comprising a restriction site. In embodiments, the kit further comprises a methylation-sensitive restriction enzyme (e.g., NotI, NaeI, NsbI, SalI, HapII, or HaeII).
In one aspect, a microfluidic device is provided, wherein the microfluidic device is capable of performing any of the methods described herein, including embodiments. Microfluidic devices are suitable for amplifying, processing, and/or detecting samples of analytes of interest in flow cells. In this application, the fluidic system is described with reference to nucleic acid sequencing (i.e., a genomic instrument), which allows for sequencing of nucleic acid molecules. However, the techniques disclosed herein may be applied to any system that uses a reaction vessel, such as a flow cell, to detect an analyte of interest, and in which a solution is introduced into the system during preparation, reaction, detection, or any other process on or within the reaction vessel. The term "microfluidic device" means an integrated system of one or more chambers, ports, and channels that are interconnected and in fluid communication and that is designed to carry out an analytical reaction or process, either alone or in cooperation with an appliance or instrument that provides support functions, such as sample introduction, fluid and/or reagent driving means, temperature control, detection systems, data collection, and/or integration systems, for determining the nucleic acid sequence of a template polynucleotide. In an embodiment, the device includes a light source that illuminates the sample, an objective lens, and a sensor array (e.g., a complementary metal-oxide-semiconductor (CMOS) array or a charge-coupled device (CCD) array). The nucleic acid sequencing device may further comprise special functional coatings on valves, pumps, and internal walls. For example, the microfluidic device is a nucleic acid sequencing device provided by: Singular Genomics™ (e.g., the G4™ sequencing platform), Illumina (e.g., HiSeq™, MiSeq™, NextSeq™, or NovaSeq™ systems), Life Technologies (e.g., ABI PRISM™ or SOLiD™ systems), Pacific Biosciences (e.g., systems using SMRT™ technology, such as the Sequel™ or RS II™ systems), or Qiagen (e.g., the Genereader™ system).
P Embodiments
The present disclosure provides the following illustrative embodiments.
Embodiment P1. A method of detecting a polynucleotide fusion comprising a sequence of a first region fused to a sequence of a second region at a fusion junction, the method comprising: (a) circularizing one or more linear nucleic acid molecules to form a circular template polynucleotide comprising a contiguous strand lacking free 5' and 3' ends; (b) amplifying a circular template polynucleotide comprising the fusion junction in an amplification reaction comprising a first primer, a second primer, a blocking element, and a polymerase to produce a fusion amplification product, wherein: (i) the first region comprises a first strand comprising, from 5' to 3', a sequence that specifically binds to the blocking element, a sequence that specifically hybridizes to the first primer, and a sequence that is complementary to the sequence that specifically hybridizes to the second primer; (ii) the fusion junction is located between the sequence that specifically binds to the blocking element and the sequence that specifically hybridizes to the first primer; (iii) the blocking element inhibits polymerase extension along the sequence to which it binds; and (iv) the circular template polynucleotide comprising the fusion junction does not comprise the sequence or complement thereof that specifically binds to the blocking element; and (c) detecting the fusion amplification product, thereby detecting the polynucleotide fusion.
Embodiment P2. The method of embodiment P1, wherein the one or more linear nucleic acid molecules comprise DNA, RNA, or cDNA; optionally wherein said DNA or said RNA is cell-free nucleic acid.
Embodiment P3. The method of embodiment P2, wherein the one or more linear nucleic acid molecules comprises RNA or cDNA and the fusion junction is located at an exon junction.
Embodiment P4. The method of any one of embodiments P1 to P3, wherein the fusion comprises an inter- or intra-chromosomal translocation.
Embodiment P5. The method of embodiment P4, wherein the intrachromosomal translocation comprises a partially or fully rearranged B cell or T cell antigen receptor.
Embodiment P6. The method of any one of embodiments P1 to P5, wherein the sequence of the first region comprises a sequence of a first gene and the sequence of the second region comprises a sequence of a second gene.
Embodiment P7. The method of any one of embodiments P1 to P6, wherein the blocking element comprises an oligonucleotide, a protein, or a combination thereof.
Embodiment P8. The method of any one of embodiments P1 to P7, wherein the one or more linear nucleic acid molecules are about 20 to about 1000 nucleotides in length, about 100 to about 300 nucleotides in length, about 300 to about 500 nucleotides in length, or about 500 to about 1000 nucleotides in length.
Embodiment P9. The method of any one of embodiments P1 to P8, wherein the one or more linear nucleic acid molecules comprise a barcode sequence.
Embodiment P10. The method of any one of embodiments P1 to P9, wherein the circularizing comprises intramolecular joining of the 5' and 3' ends of the linear nucleic acid molecules.
Embodiment P11. The method of any one of embodiments P1 to P10, wherein the circularizing comprises a ligation reaction.
Embodiment P12. The method of any one of embodiments P1 to P11, wherein the sequence that specifically binds to the blocking element, the sequence that specifically hybridizes to the first primer, or both is about 1 to about 100 nucleotides from the fusion junction.
Embodiment P13. The method of any one of embodiments P1 to P12, wherein the sequence that specifically hybridizes to the first primer is separated from the sequence that is complementary to the sequence that specifically hybridizes to the second primer by about 1 to about 50 nucleotides.
Embodiment P14. The method of any one of embodiments P1 to P13, wherein the sequence that specifically hybridizes to the first primer and the sequence that is complementary to the sequence that specifically hybridizes to the second primer are located within the same exon of a target gene.
Embodiment P15. The method of any one of embodiments P1 to P14, wherein the linear nucleic acid molecule is single stranded.
Embodiment P16. The method of any one of embodiments P1 to P14, wherein the linear nucleic acid molecule is double stranded.
Embodiment P17. The method of any one of embodiments P1 to P16, wherein (i) the first primer comprises a 5' sequence that does not hybridize to the first strand of the first region under amplification conditions; and/or (ii) the second primer comprises a 5' sequence that does not hybridize under amplification conditions to the complement of the first strand of the first region.
Embodiment P18. The method of any one of embodiments P1 to P17, wherein (i) the amplification reaction further comprises a second blocking element that inhibits polymerase extension along the sequence to which it binds, and (ii) the first region comprises a first strand comprising, from 5' to 3', a sequence complementary to the sequence that specifically hybridizes to the second primer, and a sequence complementary to the sequence that specifically binds to the second blocking element.
Embodiment P19. The method of embodiment P18, wherein the sequence complementary to the sequence specifically hybridizing to the second primer is separated from the sequence complementary to the sequence specifically binding to the second blocking element by about 100 to about 300 nucleotides.
Embodiment P20. The method of any one of embodiments P1 to P19, wherein the amplifying comprises a plurality of cycles comprising the steps of primer hybridization, primer extension, and denaturation in the presence of the first primer, the blocking element, and the second primer.
Embodiment P21 the method of any one of embodiments P1 to P20, wherein the amplifying comprises exponentially amplifying the circular template polynucleotide comprising the fusion junction.
Embodiment P22 the method of any one of embodiments P1 to P21, wherein detecting the fusion amplification product comprises detecting the length of the fusion amplification product, detecting one or more probes that bind to the fusion amplification product, or sequencing the fusion amplification product.
Embodiment P23 the method of any one of embodiments P1 to P21, wherein detecting the fusion amplification product comprises sequencing the fusion amplification product to generate sequencing reads of the sequences of the first region and the second region.
Embodiment P24 the method of embodiment P23, wherein the sequencing comprises hybridizing one or more sequencing primers to the fusion amplification product and extending the one or more sequencing primers.
Embodiment P25 the method of embodiment P23, wherein the sequencing comprises sequencing by synthesis, sequencing by hybridization, sequencing by ligation, or pyrosequencing.
Embodiment P26 the method of embodiment P23, wherein the sequencing comprises a plurality of sequencing cycles.
Embodiment P27. The method of embodiment P26, wherein the sequencing results in reads greater than 25bp in read length.
Embodiment P28 the method of embodiment P23, wherein the sequencing comprises extending a sequencing primer by incorporating labeled nucleotides or labeled nucleotide analogs and detecting the label to generate a signal for each incorporated nucleotide or nucleotide analog, wherein the sequencing primer hybridizes to one of the fusion amplification products.
Embodiment P29 the method of any one of embodiments P23 to P28, wherein detecting the fusion amplification product comprises aligning the substring of each sequencing read with a reference sequence and quantifying the number of sequencing reads of the circular template polynucleotide comprising the fusion junction.
Embodiment P30 the method of any one of embodiments P23 to P28, wherein detecting the fusion amplification product comprises comparing the k-mer substring of each sequencing read to a k-mer table of fusion junction references, and quantifying the number of k-mers shared between the sequencing reads and the fusion junction references.
Embodiment P31 the method of any one of embodiments P23 to P28, wherein detecting the fusion amplification product comprises (i) grouping sequencing reads based on a barcode sequence and/or a sequence comprising the fusion junction; and (ii) within each group, aligning reads and forming a consensus sequence of reads having the same barcode sequence and/or sequences comprising the fusion junction.
Embodiment P32 the method of any one of embodiments P23 to P31, wherein the sequencing further comprises generating sequencing reads spanning the circularized junction formed between the 5 'and 3' ends of the linear nucleic acid molecule, and quantifying the number of different circularized junction sequences comprising the fusion junction.
Embodiment P33 the method of any one of embodiments P1 to P32, further comprising quantifying the fusion amplification product.
Embodiment P34 the method of any one of embodiments P1 to P33, wherein the one or more linear nucleic acid molecules are derived from a sample of a subject, optionally wherein the sample is an FFPE sample.
Embodiment P35 the method of any one of embodiments P1 to P34, wherein the polynucleotide fusion is a biomarker for cancer, autoimmune disease, primary immunodeficiency or infectious disease.
Embodiment P36 the method of embodiment P35, wherein the polynucleotide fusion is a biomarker for cancer.
Embodiment P37 the method of embodiment P35, wherein the polynucleotide fusion is a biomarker for lymphoid malignancies.
Embodiment P38 the method of any one of embodiments P1 to P37, wherein the amplification reaction further comprises: (a) One or more different first primers that specifically hybridize to different portions of the first strand of the first region; (b) For each different first primer, a different second primer that specifically hybridizes to a complement of a portion of the first strand of the first region, the complement being in a 3' position relative to the corresponding different first primer specific hybridization; and (c) for each different first primer, a different blocking oligonucleotide that specifically hybridizes to a portion of the first strand of the first region at a position of 5' relative to the specific hybridization of the different first primer.
Embodiment P39 the method of any one of embodiments P1 to P38, further comprising detecting one or more different polynucleotide fusions, each different polynucleotide fusion comprising a fusion between a sequence of a different first region at a different fusion junction with a sequence of a different second region, wherein the amplification reaction further comprises a corresponding first primer, a corresponding second primer, and a corresponding blocking oligonucleotide for each different first region.
Embodiment P40 the method according to any one of embodiments P1 to P39, wherein the polynucleotide fusion comprises AGTRAP-BRAF, AKAP9-BRAF, ATIC-ALK, CCDC6-RET, CD74-NRG1, CD74-ROS1, CEP89-BRAF, CLCN6-BRAF, DCTN1-ALK, EML4-ALK, EZR-ROS1, FAM131B-BRAF, FCHSD1-BRAF, GATM-BRAF, GNAI1-BRAF, GOLGA5-RET, GOPC-ROS1, HIP1-ALK, HOOK3-RET, KIF5B-ALK, KIF5B-RET, KTN1-RET, LRIG3-ROS1, LSM14A-BRAF, MKRN1-BRAF, MSN-ALK, MYO5A-ROS1, NCOA4-RET, PCM1-RET, RANBP2-ALK, RELCH-RET, RNF130-BRAF, SDC4-ROS1, SLC34A2-ROS1, SLC3A2-NRG1, SLC45A3-BRAF, SQSTM1-ALK, STRN-ALK, TFG-ALK, TPM3-ROS1, TPR-ALK, TRIM24-BRAF, TRIM24-RET, TRIM27-RET, TRIM33-RET, VCL-ALK, WDCP-ALK, or ZCCHC8-ROS1.
Embodiment P41 the method of any one of embodiments P1 to P39, wherein the polynucleotide fusion comprises a gene encoding a kinase domain or a portion thereof.
Embodiment P42 the method of any one of embodiments P1 to P39, wherein the polynucleotide fusion comprises a gene fusion of BCL1-JH, BCL2-JH, or MYC-IGL.
Embodiment P43 the method of any one of embodiments P1 to P39, wherein the polynucleotide fusion comprises a fusion of: a rearranged T cell antigen receptor or fragment thereof, a T cell receptor alpha variable (TRAV) gene or fragment thereof, a T cell receptor alpha junction (TRAJ) gene or fragment thereof, a T cell receptor alpha constant (TRAC) gene or fragment thereof, a T cell receptor beta variable (TRBV) gene or fragment thereof, a T cell receptor beta diversity (TRBD) gene or fragment thereof, a T cell receptor beta junction (TRBJ) gene or fragment thereof, a T cell receptor beta constant (TRBC) gene or fragment thereof, a T cell receptor gamma variable (TRGV) gene or fragment thereof, a T cell receptor gamma constant (TRGC) gene or fragment thereof, a T cell receptor delta variable (TRDV) gene or fragment thereof, a T cell receptor delta diversity (TRDD) gene or fragment thereof, or a T cell receptor delta constant (TRDC) gene or fragment thereof.
Embodiment P44 the method of any one of embodiments P1 to P39, wherein the polynucleotide fusion comprises a fusion of: a rearranged B cell antigen receptor or fragment thereof, an IGHV gene or fragment thereof, an IGHD gene or fragment thereof, an IGHJ gene or fragment thereof, an IGHJC gene or fragment thereof, an IGKV gene or fragment thereof, an IGKJ gene or fragment thereof, an IGKC gene or fragment thereof, an IGLV gene or portion thereof, an IGLJ gene or portion thereof, an IGLC gene or fragment thereof, an IGK kappa deletion element or portion thereof, or an IGK intron enhancer element or portion thereof.
Embodiment P45 the method of any one of embodiments P1 to P39, wherein the polynucleotide fusion comprises a fusion of: ALK gene or a part thereof, BRAF gene or a part thereof, EGFR gene or a part thereof, ERBB2 gene or a part thereof, KRAS gene or a part thereof, MET gene or a part thereof, NRG1 gene or a part thereof, FGFR2 gene or a part thereof, FGFR3 gene or a part thereof, NTRK1 gene or a part thereof, NTRK2 gene or a part thereof, NTRK3 gene or a part thereof, RET gene or a part thereof, or ROS1 gene or a part thereof.
Embodiment P46 the method of any one of embodiments P1 to P39, wherein the polynucleotide fusion comprises a B-cell or T-cell intrachromosomal rearrangement.
Embodiment P47 a method of differentially amplifying a polynucleotide comprising a fusion gene relative to a polynucleotide not comprising the fusion gene, the method comprising: i) Circularizing a plurality of linear nucleic acid molecules to form a plurality of circular template polynucleotides, wherein one or more of the linear nucleic acid molecules comprises the fusion gene, thereby forming one or more fusion gene circular template polynucleotides, and wherein one or more of the linear nucleic acid molecules does not comprise the fusion gene, thereby forming one or more non-fusion gene circular template polynucleotides; ii) binding a blocking element to the one or more non-fused circular template polynucleotides; and iii) hybridizing a first primer and a second primer to the one or more non-fusion circular template polynucleotides and the one or more fusion circular template polynucleotides and extending with a polymerase to produce a first amount of non-fusion polynucleotide amplification product and a second amount of fusion polynucleotide amplification product, wherein the first amount is detectably less than the second amount; whereby the polynucleotides comprising the fusion gene are differentially amplified.
Embodiment P48 the method of embodiment P47, wherein binding the blocking element comprises binding the blocking element upstream of the first primer.
Embodiment P49 the method of embodiment P47 or embodiment P48, wherein the second amount is about 1%, about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 40%, about 50%, or about 75% greater than the first amount.
Embodiment P50 the method of embodiment P47 or embodiment P48, wherein the second amount is about 2 times, at least about 1.5 times, at least about 2.0 times, at least about 2.5 times, at least about 5 times, at least about 10 times, or more than about 10 times the first amount.
Embodiment P51 the method of any one of embodiments P47-P50, further comprising detecting the first amount of non-fusion polynucleotide amplification product and the second amount of fusion polynucleotide amplification product.
Embodiment P52 the method of any one of embodiments P47-P51, wherein the one or more linear nucleic acid molecules comprises DNA, RNA, or cDNA; optionally wherein said DNA or said RNA is a cell-free nucleic acid molecule.
Embodiment P53 the method of any one of embodiments P47-P51, wherein the one or more linear nucleic acid molecules comprises RNA or cDNA and the fusion gene comprises an exon junction.
Embodiment P54 the method of any one of embodiments P47-P51, wherein the one or more linear nucleic acid molecules comprises RNA or cDNA and the fusion gene comprises an exon junction formed by alternative splicing.
Embodiment P55 the method of any one of embodiments P47-P51, wherein the one or more linear nucleic acid molecules comprises RNA or cDNA and the fusion gene comprises an exon junction formed by a splice defect.
Embodiment P56 the method of any one of embodiments P47-P55, wherein the fusion gene comprises an inter-or intra-chromosomal translocation.
Embodiment P57 the method of embodiment P56, wherein the intrachromosomal translocation comprises a partially or fully rearranged B cell or T cell antigen receptor.
Embodiment P58 the method of any one of embodiments P47-P57, wherein the blocking element comprises an oligonucleotide, a protein, or a combination thereof.
Embodiment P59 the method of any one of embodiments P47-P57, wherein the one or more linear nucleic acid molecules are about 20 to about 1000 nucleotides in length, about 100 to about 300 nucleotides in length, about 300 to about 500 nucleotides in length, or about 500 to about 1000 nucleotides in length.
Embodiment P60 the method of any one of embodiments P47-P59, wherein the blocking element binds to about 1 to 150 nucleotides upstream relative to the first primer.
Embodiment P61 the method of any one of embodiments P47-P59, wherein the first primer hybridizes to the one or more fusion circular template polynucleotides about 1 to 100 nucleotides downstream of a fusion junction within the fusion gene.
Embodiment P62. The method of any one of embodiments P47-P59, wherein the first primer and the second primer hybridize to complementary sequences of the one or more fused circular template polynucleotides and the one or more non-fused circular template polynucleotides, wherein the first primer and the second primer are about 1 to about 50 nucleotides apart.
Embodiment P63 the method of any one of embodiments P47-P62, further comprising binding a second blocking element to the one or more non-fusion circular template polynucleotides downstream relative to the second primer.
Embodiment P64 the method of embodiment P63, wherein the second blocking element binds to about 100 to about 300 nucleotides downstream relative to the second primer.
Embodiment P65 the method of any one of embodiments P47-P64, further comprising repeating steps ii) and iii).
Embodiment P66 the method of any one of embodiments P47-P65, further comprising: detecting the length of the non-fusion polynucleotide amplification product and the length of the fusion polynucleotide amplification product; detecting one or more probes bound to the non-fusion polynucleotide amplification product and the fusion polynucleotide amplification product; or sequencing the non-fused polynucleotide amplification product and the fused polynucleotide amplification product.
Embodiment P67 the method of embodiment P66, wherein sequencing the non-fused polynucleotide amplification product and the fused polynucleotide amplification product produces one or more sequencing reads.
Embodiment P68 the method of embodiment P67, further comprising aligning the substring of one or more sequencing reads to a reference sequence.
Embodiment P69 the method of embodiment P67, further comprising comparing the k-mer substring of the one or more sequencing reads to a k-mer table of a fusion gene reference.
Embodiment P70 the method of embodiment P67, further comprising: grouping one or more sequencing reads based on the barcode sequence and/or the sequence comprising the fusion gene; and within the set, aligning the reads and forming a consensus sequence of reads having the same barcode sequence and/or sequences comprising the fusion gene.
Embodiment P71 the method of embodiment P66, wherein sequencing further comprises: generating one or more sequencing reads comprising a circularized junction formed between the 5 'and 3' ends of the linear nucleic acid molecule; and quantifying the number of different circularized junction sequences comprising said fusion gene.
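The k-mer comparison recited in embodiments P30 and P69 can be illustrated with a minimal Python sketch. It is not the implementation of the present disclosure: the function names, the toy sequences, the placeholder fusion name `GENE_A-GENE_B`, and the choice to index only k-mers that span the junction are assumptions made for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-length substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_junction_table(references, k):
    """Build a k-mer table from fusion junction references.

    references maps a fusion name to (left_seq, right_seq); the fusion
    junction lies between the two sequences. Only k-mers containing at
    least one base from each side of the junction are stored (assumption).
    """
    table = {}
    for name, (left, right) in references.items():
        joined = left + right
        junction = len(left)
        first = max(0, junction - k + 1)
        last = min(junction, len(joined) - k + 1)
        for start in range(first, last):
            table.setdefault(joined[start:start + k], set()).add(name)
    return table


def count_shared_kmers(reads, table, k):
    """Count, per fusion reference, the read k-mers found in the table."""
    counts = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            for name in table.get(kmer, ()):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Toy data; sequences and the fusion name are hypothetical.
    refs = {"GENE_A-GENE_B": ("ACGTACGTAA", "TTGCATGCAT")}
    table = build_junction_table(refs, k=6)
    reads = ["GTACGTAATTGCATG", "ACGTACGTAACCGGTTAACC"]
    print(count_shared_kmers(reads, table, k=6))
```

In this sketch a read that crosses the junction shares several k-mers with the table, whereas a read derived from the unfused first region shares none; a threshold on the shared k-mer count could then be used to call a fusion.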
Further embodiments
The present disclosure provides the following additional illustrative embodiments.
Embodiment 1. A method of differentially amplifying a polynucleotide comprising a fusion gene relative to a polynucleotide not comprising the fusion gene, the method comprising: i) Circularizing a plurality of linear nucleic acid molecules to form a plurality of circular template polynucleotides, wherein one or more of the linear nucleic acid molecules comprises the fusion gene, thereby forming one or more fusion gene circular template polynucleotides, and wherein one or more of the linear nucleic acid molecules does not comprise the fusion gene, thereby forming one or more non-fusion gene circular template polynucleotides; ii) binding a blocking element to the one or more non-fused circular template polynucleotides; and iii) hybridizing a first primer and a second primer to the one or more non-fusion circular template polynucleotides and the one or more fusion circular template polynucleotides and extending with a polymerase to produce a first amount of non-fusion polynucleotide amplification product and a second amount of fusion polynucleotide amplification product, wherein the first amount is detectably less than the second amount; whereby the polynucleotides comprising the fusion gene are differentially amplified.
Embodiment 2. The method of embodiment 1, wherein binding the blocking element comprises binding the blocking element upstream of the first primer.
Embodiment 3. The method of embodiment 1 or 2, wherein the second amount is about 1%, about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 40%, about 50%, or about 75% greater than the first amount.
Embodiment 4. The method of embodiment 1 or 2, wherein the second amount is about 2 times, at least about 1.5 times, at least about 2.0 times, at least about 2.5 times, at least about 5 times, at least about 10 times, or more than about 10 times the first amount.
Embodiment 5. The method of any one of embodiments 1 to 4, further comprising detecting the first amount of non-fusion polynucleotide amplification product and the second amount of fusion polynucleotide amplification product.
Embodiment 6. The method of any one of embodiments 1 to 5, wherein the one or more linear nucleic acid molecules comprise DNA, RNA, or cDNA; optionally wherein said DNA or said RNA is a cell-free nucleic acid molecule.
Embodiment 7. The method of any one of embodiments 1 to 5, wherein the one or more linear nucleic acid molecules comprise RNA or cDNA and the fusion gene comprises an exon junction.
Embodiment 8. The method of any one of embodiments 1 to 5, wherein the one or more linear nucleic acid molecules comprise RNA or cDNA and the fusion gene comprises an exon junction formed by alternative splicing.
Embodiment 9. The method of any one of embodiments 1 to 5, wherein the one or more linear nucleic acid molecules comprise RNA or cDNA and the fusion gene comprises an exon junction formed by a splice defect.
Embodiment 10. The method of any one of embodiments 1 to 9, wherein the fusion gene comprises an inter-or intra-chromosomal translocation.
Embodiment 11. The method of embodiment 10, wherein the intrachromosomal translocation comprises a partially or fully rearranged B cell or T cell antigen receptor.
Embodiment 12. The method of any one of embodiments 1 to 11, wherein the blocking element comprises an oligonucleotide, a protein, or a combination thereof.
Embodiment 13. The method of any one of embodiments 1 to 11, wherein the one or more linear nucleic acid molecules are about 20 to about 1000 nucleotides in length, about 100 to about 300 nucleotides in length, about 300 to about 500 nucleotides in length, or about 500 to about 1000 nucleotides in length.
Embodiment 14. The method of any one of embodiments 1 to 13, wherein the blocking element binds to about 1 to 150 nucleotides upstream relative to the first primer.
Embodiment 15. The method of any one of embodiments 1 to 13, wherein the first primer hybridizes to the one or more fusion circular template polynucleotides about 1 to 100 nucleotides downstream of a fusion junction within the fusion gene.
Embodiment 16. The method of any one of embodiments 1 to 13, wherein the first primer and the second primer hybridize to complementary sequences of the one or more fused circular template polynucleotides and the one or more non-fused circular template polynucleotides, wherein the first primer and the second primer are about 1 to about 50 nucleotides apart.
Embodiment 17. The method of any one of embodiments 1 to 16, further comprising binding a second blocking element to the one or more non-fusion circular template polynucleotides downstream relative to the second primer.
Embodiment 18. The method of embodiment 17 wherein the second blocking element binds to about 100 to about 300 nucleotides downstream relative to the second primer.
Embodiment 19. The method of any one of embodiments 1 to 18, further comprising repeating steps ii) and iii).
Embodiment 20. The method of any one of embodiments 1 to 19, further comprising iv) amplifying the one or more non-fusion circular template polynucleotides to produce a third amount of non-fusion polynucleotide amplification products; and amplifying the one or more fusion circular template polynucleotides to produce a fourth quantity of fusion polynucleotide amplification products, wherein the third quantity and the fourth quantity are substantially the same.
Embodiment 21. The method of embodiment 20, wherein amplifying the one or more non-fused circular template polynucleotides comprises hybridizing a third primer and a fourth primer to the one or more non-fused circular template polynucleotides and extending both primers with a polymerase, and wherein amplifying the one or more fused circular template polynucleotides comprises hybridizing a third primer and a fourth primer to the one or more fused circular template polynucleotides and extending both primers with a polymerase.
Embodiment 22. The method of embodiment 21 wherein the third primer hybridizes upstream of the target sequence and the fourth primer hybridizes downstream of the target sequence, wherein the target sequence comprises a single nucleotide variant, an insertion, a deletion, an internal tandem repeat, or a copy number variant.
Embodiment 23. The method of any of embodiments 1 to 22, further comprising: detecting the length of the non-fusion polynucleotide amplification product and the length of the fusion polynucleotide amplification product; detecting one or more probes bound to the non-fusion polynucleotide amplification product and the fusion polynucleotide amplification product; or sequencing the non-fused polynucleotide amplification product and the fused polynucleotide amplification product.
Embodiment 24. The method of embodiment 23, wherein sequencing the non-fused polynucleotide amplification product and the fused polynucleotide amplification product produces one or more sequencing reads.
Embodiment 25. The method of embodiment 24, further comprising aligning the substring of one or more sequencing reads to a reference sequence.
Embodiment 26. The method of embodiment 24, further comprising comparing the k-mer substring of the one or more sequencing reads to a k-mer table of a fusion gene reference.
Embodiment 27. The method of embodiment 24, further comprising: grouping one or more sequencing reads based on the barcode sequence and/or the sequence comprising the fusion gene; and within the set, aligning the reads and forming a consensus sequence of reads having the same barcode sequence and/or sequences comprising the fusion gene.
Embodiment 28. The method of embodiment 23, wherein sequencing further comprises: generating one or more sequencing reads comprising a circularized junction formed between the 5 'and 3' ends of the linear nucleic acid molecule; and quantifying the number of different circularized junction sequences comprising said fusion gene.
Embodiment 29. A kit comprising: a circularizing agent, wherein the circularizing agent is capable of joining the 5' and 3' ends of a linear nucleic acid molecule; a blocking element capable of binding to one or more circular polynucleotides; a first primer and a second primer; and a polymerase.
Embodiment 30. A method of amplifying a polynucleotide comprising a fusion gene, the method comprising: i) Binding a blocking element to a non-fusion circular template polynucleotide, wherein the non-fusion circular template does not include the fusion gene; ii) hybridizing a first primer and a second primer to the non-fusion circular template polynucleotide; and hybridizing the first primer and the second primer to a fusion circular template polynucleotide, wherein the fusion circular template polynucleotide comprises the fusion gene; and iii) extending the first primer and the second primer with a non-strand displacement polymerase to produce a fusion polynucleotide amplification product.
Embodiment 31. The method of embodiment 30, wherein binding the blocking element comprises binding the blocking element upstream of the first primer.
Embodiment 32. The method of any one of embodiments 30 to 31, further comprising detecting the fusion polynucleotide amplification product.
Embodiment 33. The method of any one of embodiments 30 to 32, wherein the circular template polynucleotide (e.g., non-fused circular template polynucleotide and/or the fused circular template polynucleotide) comprises DNA, RNA, or cDNA; optionally wherein said DNA or said RNA is a cell-free nucleic acid molecule.
Embodiment 34. The method of any one of embodiments 30 to 32, wherein the circular template polynucleotide (e.g., non-fused circular template polynucleotide and/or the fused circular template polynucleotide) is RNA or cDNA and the fusion gene comprises an exon junction.
Embodiment 35. The method of any one of embodiments 30 to 32, wherein the circular template polynucleotide (e.g., non-fused circular template polynucleotide and/or the fused circular template polynucleotide) is RNA or cDNA and the fusion gene comprises an exon junction formed by alternative splicing.
Embodiment 36. The method of any one of embodiments 30 to 32, wherein the circular template polynucleotide (e.g., non-fused circular template polynucleotide and/or the fused circular template polynucleotide) is RNA or cDNA and the fusion gene comprises an exon junction formed by a splice defect.
Embodiment 37. The method of any one of embodiments 30 to 36, wherein the fusion gene comprises an inter-or intra-chromosomal translocation.
Embodiment 38. The method of embodiment 37, wherein the intrachromosomal translocation comprises a partially or fully rearranged B cell or T cell antigen receptor.
Embodiment 39. The method of any one of embodiments 30 to 38, wherein the blocking element comprises an oligonucleotide, a protein, or a combination thereof.
Embodiment 40. The method of any one of embodiments 30 to 39, wherein the blocking element binds to about 1 to 150 nucleotides upstream relative to the first primer.
Embodiment 41. The method of any one of embodiments 30 to 40, wherein the first primer hybridizes to the fusion circular template polynucleotide about 1 to 100 nucleotides downstream of a fusion junction within the fusion gene.
Embodiment 42. The method of any one of embodiments 30 to 40, wherein the first primer and the second primer hybridize to complementary sequences of the fusion circular template polynucleotide and the non-fusion circular template polynucleotide, wherein the first primer and the second primer are about 1 to about 50 nucleotides apart.
Embodiment 43. The method of any one of embodiments 30 to 42, further comprising binding a second blocking element to the non-fusion circular template polynucleotide downstream relative to the second primer.
Embodiment 44. The method of embodiment 43, wherein the second blocking element binds to about 100 to about 300 nucleotides downstream relative to the second primer.
Embodiment 45. The method of any of embodiments 30 to 44 further comprising repeating steps i), ii) and iii).
Embodiment 46. The method of any of embodiments 30 to 45, further comprising: iv) removing the blocking element and amplifying the non-fused circular template polynucleotide to produce a plurality of non-fused polynucleotide amplification products; and amplifying the fused circular template polynucleotide to produce an additional fused polynucleotide amplification product.
Embodiment 47. The method of embodiment 46, wherein amplifying the non-fused circular template polynucleotide comprises hybridizing a third primer and a fourth primer to the non-fused circular template polynucleotide and extending both primers with a polymerase, and wherein amplifying the fused circular template polynucleotide comprises hybridizing a third primer and a fourth primer to the fused circular template polynucleotide and extending both primers with a polymerase.
Embodiment 48. The method of embodiment 47, wherein the third primer hybridizes upstream of the target sequence and the fourth primer hybridizes downstream of the target sequence, wherein the target sequence comprises a single nucleotide variant, an insertion, a deletion, an internal tandem repeat, or a copy number variant.
Embodiment 49. The method of any one of embodiments 30 to 48, further comprising detecting the length of the fusion polynucleotide amplification product, detecting one or more probes that bind to the fusion polynucleotide amplification product, or sequencing the fusion polynucleotide amplification product.
Embodiment 50. The method of embodiment 49, wherein the sequencing of the fusion polynucleotide amplification product results in one or more sequencing reads.
Embodiment 51. The method of embodiment 50, further comprising aligning the substring of one or more sequencing reads to a reference sequence.
Embodiment 52. The method of embodiment 50, further comprising comparing the k-mer substring of the one or more sequencing reads to a k-mer table of a fusion gene reference.
Embodiment 53. The method of embodiment 49, further comprising: grouping one or more sequencing reads based on the barcode sequence and/or the sequence comprising the fusion gene; and within the set, aligning the reads and forming a consensus sequence of reads having the same barcode sequence and/or sequences comprising the fusion gene.
Embodiment 54. The method of embodiment 49, wherein sequencing further comprises: generating one or more sequencing reads, the one or more sequencing reads comprising a circularized junction; and quantifying the number of different circularized junction sequences comprising said fusion gene.
Embodiment 55. The method of any one of embodiments 30 to 49, wherein prior to step i), the method comprises circularizing a plurality of linear nucleic acid molecules to form a plurality of circular template polynucleotides, wherein one or more of the linear nucleic acid molecules comprises the fusion gene, thereby forming one or more fusion gene circular template polynucleotides, and wherein one or more of the linear nucleic acid molecules does not comprise the fusion gene, thereby forming one or more non-fusion gene circular template polynucleotides.
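The grouping and consensus steps recited in embodiments 27 and 53 can be sketched in Python as follows. This is an illustrative sketch only. It assumes, for simplicity, that the barcode occupies the first `barcode_len` bases of each read and that reads sharing a barcode begin at the same template position, so that a column-wise majority vote can stand in for a full alignment; the function names and toy reads are hypothetical.

```python
from collections import Counter, defaultdict


def group_by_barcode(reads, barcode_len):
    """Group reads by their leading barcode (assumed to be the first bases)."""
    groups = defaultdict(list)
    for read in reads:
        groups[read[:barcode_len]].append(read[barcode_len:])
    return groups


def consensus(seqs):
    """Column-wise majority vote over reads assumed to share a start position.

    The consensus is truncated to the shortest read in the group.
    """
    length = min(len(s) for s in seqs)
    return "".join(
        Counter(s[i] for s in seqs).most_common(1)[0][0]
        for i in range(length)
    )


def consensus_by_barcode(reads, barcode_len):
    """Return one consensus sequence per barcode group."""
    groups = group_by_barcode(reads, barcode_len)
    return {barcode: consensus(seqs) for barcode, seqs in groups.items()}


if __name__ == "__main__":
    # Toy reads: a 3-base barcode followed by the insert (hypothetical data).
    reads = ["AAACGTTGCA", "AAACGTTGCA", "AAACGATGCA", "CCCGGGTTTA"]
    print(consensus_by_barcode(reads, barcode_len=3))
```

The single mismatching base in the third read is outvoted within its barcode group, which illustrates how a consensus can suppress amplification or sequencing errors. Reads could equally be grouped by a sequence comprising the fusion gene rather than by a barcode.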
Examples
Example 1. Fusion detection by template circularization and multiplex PCR
Fusions are somatic alterations that can lead to cancer; they are associated with up to 20% of cancer incidence and play oncogenic roles in blood, soft tissue, and solid tumors (Foltz SM et al., Nature Comm 2020; 11:2666). Translocations, copy number changes, and inversions may lead to fusions, deregulation of gene expression, and novel molecular functions. Next Generation Sequencing (NGS) methods for gene fusion detection may employ non-targeted sequencing (e.g., whole genome or whole transcriptome sequencing) or targeted sequencing of the fusion gene of interest. Targeted methods for gene fusion detection can simplify analysis and reduce cost, and thus have become the leading approach in clinical applications. Popular methods for targeted sequencing of gene fusions include multiplex PCR, in which primer sets are designed to generate PCR amplicons spanning known breakpoint junctions (e.g., Maher CA et al., Nature 2009; 458(7234): 97-101 and Oncomine tests); Anchored Multiplex PCR (AMP), wherein one or more targeting primers are used in combination with a ligated universal primer adapter to enable PCR amplification of the breakpoint of interest (e.g., ArcherDx); and methods for enriching breakpoint regions of interest using hybridization capture. Among targeted methods, multiplex PCR provides high sensitivity and sequencing efficiency, but fails to identify fusions involving novel breakpoints and partners; AMP enables detection of known and novel fusions, but has relatively high input requirements and a more complex workflow, and is typically limited to RNA analysis; hybridization capture has a relatively complex workflow and reduced sensitivity compared to PCR-based methods. For both targeted and non-targeted approaches, robustness to sample degradation is often critical due to the widespread use of FFPE preserved tissue and cfDNA as input materials.
Thus, there is a need for a method that enables highly sensitive targeted analysis of gene fusion with minimal workflow complexity and input requirements and robustness to highly degraded materials.
The compositions and methods described herein provide an efficient solution for targeted sequencing of genetic variants such as SNVs, insertions/deletions, and gene fusions, including gene fusions involving novel partners and derived from novel breakpoints. The method achieves high detection sensitivity with degraded material through a simplified workflow. Importantly, the method can be applied to analyze nucleic acids extracted in bulk from a sample source (e.g., cfDNA from plasma, nucleic acids from FFPE-preserved tissue samples, or nucleic acids extracted from peripheral blood leukocytes) or material derived from common single cell library preparation systems. As described in detail herein in various embodiments, the method consists of the steps of: (1) circularizing nucleic acid derived from the sample; (2) amplifying circularized nucleic acid derived from one or more targets of interest; and (3) analyzing the amplified fragments by Next Generation Sequencing (NGS).
In one embodiment, a workflow is presented to achieve targeted amplification of nucleic acids for analysis of gene fusions, including gene fusions involving novel partners or breakpoints. Briefly, the workflow begins with bulk extraction of nucleic acids from a sample. RNA, DNA, or total nucleic acid (RNA and DNA) may be extracted using methods known in the art. If RNA is extracted, the RNA can be converted to cDNA using methods known in the art (e.g., oligo-dT cDNA synthesis, cDNA synthesis via random hexamers, or targeted cDNA synthesis via gene-specific primers). The DNA molecules may optionally be fragmented to an average length of about 150 base pairs. Fragmentation can be achieved by methods known in the art (e.g., enzymatic fragmentation or acoustic fragmentation). Next, the ssDNA fragments are circularized by enzymatic ligation of the 5' and 3' ends using methods known in the art (e.g., CircLigase™) or the methods described herein. In some embodiments, circularization is facilitated by denaturing the double-stranded nucleic acid prior to circularization. In an embodiment, the linear DNA fragments are A-tailed (e.g., using Taq DNA polymerase) prior to circularization. Residual linear DNA molecules may optionally be digested. This can be accomplished by methods known in the art (e.g., treatment with Exo I and/or Exo III).
Following circularization, nucleic acids derived from the gene fusion of interest are amplified using an outward-facing oligonucleotide primer that targets the fusion gene partner of interest adjacent to the expected breakpoint location (e.g., similar to an inverse PCR reaction), in combination with a 5' blocking element (e.g., a non-extendible oligonucleotide) that specifically binds to the sequence of the unrearranged fusion gene partner of interest adjacent to and opposite the expected breakpoint junction (FIGS. 1-3). The blocking element will not bind to a template containing a translocation at the expected breakpoint. Optionally, an additional 3' blocking element targeting the gene of interest distal to the breakpoint junction may be included (FIGS. 2 and 3). Typically, the blocking element has a Tm similar to or higher than that of the outward-facing primer to ensure that it can bind and prevent extension of the primer. The 5' blocker may be within about 50 bp of the fusion junction, and in some embodiments, the optional 3' blocker may be within about 100 bp to about 200 bp of the fusion junction. Typically, the optional 3' blocker is farther from the fusion junction than the 5' blocker. PCR thereby achieves preferential amplification of templates containing rearrangements. The resulting amplicon contains a junction derived from template circularization (the "circularization junction") and a junction corresponding to the sample breakpoint (FIG. 4). The circularization junctions can be used to quantify the number of template copies and, optionally, for error correction.
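By way of non-limiting illustration, the inverse-PCR-like geometry described above can be sketched in silico. The following Python sketch treats a linear fragment as a circle and returns the product that an outward-facing primer pair would be expected to generate, together with the position of the circularization junction within that product. The function names, the toy sequences, and the exact-match primer binding are assumptions made for illustration only; blocking elements, mismatches, and polymerase behavior are not modeled.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def inverse_pcr_amplicon(fragment: str, fwd_primer: str, rev_primer: str):
    """Hypothetical helper: predict the product of two primers on a circularized fragment.

    Returns (amplicon, junction_offset). junction_offset is the index in the
    amplicon at which the fragment's 3' end is joined to its 5' end, or None
    if the product does not cross the circularization junction (i.e., the
    primers face inward on the linear fragment). Returns None if no product
    is predicted.
    """
    n = len(fragment)
    doubled = fragment + fragment          # reading past index n crosses the junction
    start = fragment.find(fwd_primer)      # forward primer matches the top strand
    if start < 0:
        return None
    rev_site = revcomp(rev_primer)         # top-strand footprint of the reverse primer
    end = doubled.find(rev_site, start + len(fwd_primer))
    if end < 0 or end + len(rev_site) - start > n:
        return None                        # no site found within one turn of the circle
    amplicon = doubled[start:end + len(rev_site)]
    junction_offset = n - start
    if junction_offset >= len(amplicon):
        junction_offset = None             # product lies entirely within the linear fragment
    return amplicon, junction_offset


# Toy fusion fragment: gene A sequence up to the breakpoint, followed by partner gene B.
a_upstream = "TTGACCATGC"
rev_site_top = "GATTACAGGT"               # where the reverse primer anneals (top-strand view)
fwd = "ACGTTGCAAC"                        # outward-facing primer pointing at the breakpoint
gene_b = "CCCGGGAAATTTCCGG"               # unknown partner sequence
fragment = a_upstream + rev_site_top + "CC" + fwd + "TG" + gene_b

amplicon, junction = inverse_pcr_amplicon(fragment, fwd, revcomp(rev_site_top))
print(amplicon[:junction], "|", amplicon[junction:])
```

In this toy case the product begins at the forward primer, reads through the breakpoint into the partner sequence, crosses the circularization junction, and ends at the reverse primer site, so the partner need not be known in advance.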
Amplification of unfused genes: as an internal control and to further assess the relative abundance of amplified fusion gene nucleic acids, nucleic acids derived from one or more unrearranged (e.g., control) templates of interest can be amplified within the same PCR reaction using outward-facing primers, but omitting the blocking elements described. Alternatively, in some embodiments, it may be advantageous to include a positive control to avoid false negative results. Furthermore, in some embodiments, outward-facing primers are included that target a region of a human genome or cDNA in which clinically relevant SNVs, insertions/deletions, or copy number variants are known to occur. In some embodiments, the region of interest may comprise cDNA derived from a gene whose expression is deregulated in cancer, and/or from a gene whose expression is largely unchanged (e.g., a housekeeping gene), to aid in analysis of gene expression. Analysis of such targets can be performed in the same PCR reaction using outward-facing primers, but omitting the blocking oligomers described. In still other embodiments, the outward-facing primers targeting the fusion of interest are used, as part of a multiplex PCR panel, in combination with inward-facing primers targeting a region of interest of a human genome or cDNA in which clinically relevant SNVs, insertions/deletions, internal tandem duplications, or copy number variants are known to occur. FIG. 11A shows an example in which two pairs of overlapping inward-facing primers (e.g., 1F and 1R, and 2F and 2R) are used to amplify a target region, resulting in three amplification products: amplicon 1 (the amplification product of the 1F and 1R primer pair), amplicon 2 (the amplification product of the 2F and 2R primer pair), and the largest amplicon (the amplification product of the 1F and 2R primer pair). As described in U.S. Patent Publication No. 2016/0340746, which is incorporated herein by reference in its entirety, production of the minimal amplicon by the 2F and 1R primer pair is inhibited because of the low amplification efficiency caused by stable secondary structure.
By "overlapping primers" is meant, for example, that two pairs of primers (e.g., 1F and 1R, and 2F and 2R in FIG. 11A) have overlapping target regions in the target nucleic acid (e.g., the 1F and 1R amplification product will comprise a portion of sequence that is also comprised in the 2F and 2R amplification product). For example, as shown in FIG. 11A, the 2F primer is located upstream of and adjacent to the 1R primer, while the 2R primer is located downstream of the 1R primer, thereby resulting in overlapping amplification products, wherein the region covered by and lying between the 2F and 1R primers will be shared between amplicon 1 and amplicon 2.
FIG. 11B shows the expected amplification products when a template containing an internal tandem duplication is amplified with the primer pairs of FIG. 11A (e.g., 1F and 1R, and 2F and 2R) using a linear template. The amplification products are identical to those of the non-duplicated template in FIG. 11A (e.g., amplicon 1, amplicon 2, and the largest amplicon), precluding detection of the tandem duplication event. FIG. 11C shows the expected amplification products when a template containing an internal tandem duplication is amplified with the primer pairs of FIG. 11A (e.g., 1F and 1R, and 2F and 2R) using a circularized template. The amplification products now include a duplication-specific amplicon (e.g., the amplification product of the 2R and 1F primer pair). The duplication-specific amplicon is identified by the unique primer pair present in the amplicon and by the circularization junction within the amplicon (represented by dashed lines). In this scenario, an inverse PCR product can be formed that unambiguously identifies the duplication event.
Inward facing primers: while outward facing primers are particularly useful for determining novel gene fusion partners, it may also be useful to perform targeted gene sequencing to identify somatic mutations (e.g., SNPs associated with disturbed cell status). In particular, inward-facing primers (e.g., standard PCR primers) are used that target a region of interest containing known somatic alterations associated with the diseased state. In embodiments, the outward-facing primers targeting the fusion of interest are used in combination with inward-facing primers targeting regions of the human genome or cDNA, wherein clinically relevant SNVs or SNPs, insertion/deletions or Copy Number Variants (CNVs) are known to occur, e.g., as part of a multiplex PCR set (see, e.g., fig. 10). Like the outward facing primers, the inward facing primers contain target specific sequences, and optionally sequences for downstream library preparation and analysis. In embodiments, the inward-facing primers amplify the region of interest in the absence of the fusion gene (e.g., using the inward-facing primers to target regions other than exon breakpoints and/or fusion gene partners with known somatic mutations). In embodiments, the inward-facing primer targets a region of interest in the fusion gene transcript (e.g., the inward-facing primer targets one or more regions of the fusion gene transcript, wherein the one or more regions may be in different or the same genes). In embodiments, the inward-facing primer targets a different gene than the outward-facing primer (e.g., the inward-facing primer targets one gene of the fusion transcript and the outward-facing primer targets another gene of the fusion transcript). 
The inward-facing primers and the outward-facing primers may, for example, be included in the same amplification reaction, or they may be split into separate reactions (e.g., an amplification reaction containing only the inward-facing primers and an amplification reaction containing only the outward-facing primers, wherein each amplification reaction uses the same circularized template).
Modification of blocking element: the blocking element selectively binds to the unrearranged template to inhibit extension of the primer sequence by the polymerase. In some embodiments, the blocking element consists of an oligomer with an inverted 3' dT, a 3' dideoxycytidine, a reversibly terminated 3' modification, or another modification of the 3' end that prevents 3' extension by a polymerase (a "blocking oligomer"), and is used in conjunction with a non-strand-displacing polymerase. In some embodiments, the blocking oligomer contains one or more non-natural bases (e.g., LNA bases) that facilitate hybridization of the blocker to the target sequence. In some embodiments, the blocking oligomer contains additional modifications that increase resistance to exonuclease digestion (e.g., one or more phosphorothioate linkages). The blocking element need not be an oligomer; in some embodiments, for example, the blocking element is a protein that selectively binds to the target sequence and prevents polymerase extension. In embodiments, the blocking element prevents extension under suitable amplification/extension conditions.
Alternative methods for enriching templates containing fusions: under certain amplification reaction conditions, the blocking elements described herein may provide variable inhibition of unfused templates, such that a small proportion of unfused amplification products is generated. Contemplated herein are alternative methods that may be used instead of, or in addition to, blocking elements and that selectively eliminate any non-fused circular templates, or render them non-amplifiable, prior to amplification.
For example, CRISPR-mediated depletion of unwanted target sequences can be performed, wherein, for example, a CRISPR-Cas9 complex is introduced into a sample containing circularized ssDNA using a guide RNA that specifically targets a non-fusion sequence. The CRISPR-Cas9 complex then targets and cleaves the non-fusion sequences present in any circular ssDNA molecule. After linearizing the non-fused circular ssDNA molecules by CRISPR complexes, an exonuclease digestion can then be performed to digest the linear ssDNA molecules, thereby enriching the circular ssDNA molecules containing the fusion gene (e.g., lacking the non-fused gene sequence targeted by the guide RNA).
Alternatively, biotinylated blocking elements may be employed. After circularization, the biotinylated blocking element is hybridized to a non-fusion gene sequence. The circular ssDNA molecules hybridized to the biotinylated blocking elements are then pulled down using, for example, streptavidin-coated magnetic beads, thereby depleting the sample of any non-fused circular molecules prior to amplification.
As yet another alternative, blocking oligomers may be used as splints to enable restriction enzyme-mediated digestion of non-fused circular ssDNA molecules into non-amplifiable linear fragments. A methylated blocking oligomer can be used in combination with a methylation-sensitive restriction enzyme (e.g., NotI, NaeI, NsbI, SalI, HapII, or HaeII).
Sequencing of the amplified region of interest is performed on a next generation sequencing instrument. In some embodiments, sequencing is accomplished by a single read greater than about 25 base pairs in length. In other embodiments, sequencing is accomplished by paired-end reads, wherein each read within a pair is greater than about 25 bases. After sequencing, error correction can be performed, which involves creating a consensus read from sequences that share a circularization junction sequence.
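As a non-limiting sketch of the consensus-based error correction described above, reads may be grouped by a key identifying their circularization junction, and a per-position majority vote taken within each group. In the following Python example, the junction key, the assumption that reads within a group are already aligned to a common start, and the simple majority rule are all illustrative assumptions rather than a prescribed implementation.

```python
from collections import Counter, defaultdict


def consensus_by_junction(reads):
    """Build one consensus read per circularization junction.

    reads: iterable of (junction_key, sequence) tuples. Sequences sharing a
    junction_key are assumed (for this sketch) to be aligned to the same
    start position; the consensus is truncated to the shortest read.
    """
    families = defaultdict(list)
    for junction_key, sequence in reads:
        families[junction_key].append(sequence)

    consensus = {}
    for junction_key, seqs in families.items():
        length = min(len(s) for s in seqs)
        consensus[junction_key] = "".join(
            # most common base at position i among reads of this family
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


reads = [
    ("J1", "ACGTAC"),
    ("J1", "ACGTAC"),
    ("J1", "ACCTAC"),   # one read carries a sequencing error at position 2
    ("J2", "TTGGCA"),
]
result = consensus_by_junction(reads)
print(result)
```

Because each junction key corresponds to one original template molecule, the number of keys in the result also gives a simple estimate of the template count.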
Various suitable sequencing platforms can be used to carry out the methods disclosed herein (e.g., for performing sequencing reactions). Non-limiting examples include SMRT (single molecule real time) sequencing, ion semiconductor sequencing, pyrosequencing, sequencing by synthesis, combinatorial probe-anchor synthesis, SOLiD sequencing (sequencing by ligation), and nanopore sequencing. Sequencing platforms include those provided by Illumina (e.g., the HiSeq™, MiSeq™, and/or Genome Analyzer™ sequencing systems), Ion Torrent™ (e.g., the Ion PGM™ and/or Ion Proton™ sequencing systems), Pacific Biosciences (e.g., the PACBIO RS II sequencing system), Life Technologies™ (e.g., the SOLiD sequencing system), and Roche (e.g., the 454 GS FLX+ and/or GS Junior sequencing systems). See, for example, U.S. patent 7,211,390, U.S. patent 7,244,559, U.S. patent 7,264,929, U.S. patent 6,255,475, U.S. patent 6,013,445, U.S. patent 8,882,980, U.S. patent 6,664,079, and U.S. patent 9,416,409.
Next, sequence reads are analyzed to assess the presence of variants of interest. In some embodiments, this may involve using publicly available software to detect gene fusions (e.g., GeneFuse; Chen S et al. Int. J. Biol. Sci. 2018; 14(8):843-848). In other embodiments, this may be accomplished by mapping reads to the genome and analyzing the mapping positions of the reads (e.g., FIG. 5). In still other embodiments, this may include mapping-independent and/or mapping-dependent methods, such as methods involving analysis of k-mer substrings (e.g., FIG. 6). FIGS. 7 and 8 provide exemplary bioinformatic workflows for analyzing rearrangements, translocations, and CNVs using the same method.
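For illustration only, one mapping-independent approach based on k-mer substrings can be sketched as follows. Each read is decomposed into k-mers, each k-mer is looked up in an index built from reference gene sequences, and a read whose unambiguous k-mers originate from two different genes is flagged as a fusion candidate. The gene names, toy sequences, value of k, and function names below are hypothetical, and the sketch does not reproduce the workflow of FIG. 6 or of any particular tool.

```python
def kmers(seq: str, k: int):
    """All overlapping substrings of length k, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(references: dict, k: int) -> dict:
    """Map each reference k-mer to the set of genes that contain it."""
    index = {}
    for gene, seq in references.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(gene)
    return index


def gene_path(read: str, index: dict, k: int):
    """Ordered list of genes hit along the read, ignoring ambiguous or unknown k-mers."""
    path = []
    for km in kmers(read, k):
        genes = index.get(km)
        if genes and len(genes) == 1:
            gene = next(iter(genes))
            if not path or path[-1] != gene:
                path.append(gene)
    return path


def is_fusion_candidate(read: str, index: dict, k: int) -> bool:
    """A read is a candidate if its k-mers derive from at least two genes."""
    return len(set(gene_path(read, index, k))) >= 2


# Toy reference sequences (hypothetical, for illustration only).
references = {
    "GENE_A": "ATGGCCTTAGACGTCA",
    "GENE_B": "CCGTTTAAACGGGTAC",
}
K = 6
index = build_index(references, K)

fusion_read = references["GENE_A"][:10] + references["GENE_B"][6:]
wildtype_read = references["GENE_A"][2:14]
print(gene_path(fusion_read, index, K), is_fusion_candidate(wildtype_read, index, K))
```

In practice, a read-support threshold and filtering of k-mers shared between paralogous genes would likely be needed; these are omitted here for brevity.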
Additional fusion detection tools known in the art can be used to analyze sequencing reads, such as TRUP (Fernandez-Cuesta L, Sun R, Menon R, et al. Identification of novel fusion genes in lung cancer using breakpoint assembly of transcriptome sequencing data. Genome Biol. 2015; 16:7), ChimeraScan (Maher CA, Palanisamy N, Brenner JC, Cao X, Kalyana-Sundaram S, Luo S, et al. Chimeric transcript discovery by paired-end transcriptome sequencing), FusionHunter (Li Y, Chien J, Smith DI, Ma J. FusionHunter: identifying fusion transcripts in cancer using paired-end RNA-seq. Bioinformatics. 2011; 27:1708-10), FusionMap (Ge H, Liu K, Juan T, Fang F, Newman M, Hoeck W. FusionMap: detecting fusion genes from next-generation sequencing data at base-pair resolution. Bioinformatics. 2011; 27:1922-8), TopHat-Fusion (Kim D, Salzberg SL. TopHat-Fusion: an algorithm for discovery of novel fusion transcripts. Genome Biol. 2011; 12:R72), deFuse (McPherson A, Hormozdiari F, Zayed A, Giuliany R, Ha G, Sun MGF, et al. deFuse: an algorithm for gene fusion discovery in tumor RNA-Seq data. PLoS Comput. Biol. 2011; 7:e1001138), SOAPfuse (Jia W, Qiu K, He M, Song P, Zhou Q, Zhou F, et al. SOAPfuse: an algorithm for identifying fusion transcripts from paired-end RNA-Seq data. Genome Biol. 2013; 14:R12), FusionSeq (Sboner A, Habegger L, Pflueger D, Terry S, Chen DZ, Rozowsky JS, et al. FusionSeq: a modular framework for finding gene fusions by analyzing paired-end RNA-sequencing data. Genome Biol. 2010; 11:R104), and BreakFusion (Chen K, Wallis JW, Kandoth C, Kalicki-Veizer JM, Mungall KL, Mungall AJ, et al. BreakFusion: targeted assembly-based identification of gene fusions in whole transcriptome paired-end sequencing data. Bioinformatics. 2012; 28:1923-4).
IGH VDJ rearrangement and translocation analysis: as an exemplary use case, a workflow is presented to achieve targeted amplification of nucleic acids for simultaneous analysis of IGH V(D)J rearrangements and translocations involving the IGHJ genes. Unlike conventional multiplex PCR methods for amplifying VDJ rearrangements, the described method: (1) avoids clone dropout due to somatic hypermutation in the variable gene region; (2) enables detection of IGHJ translocations; (3) reduces the number of primers required; and (4) enables error correction and template quantification through analysis of circularization junctions (FIG. 7). Briefly, the workflow begins with extraction of sample gDNA using methods known in the art. The gDNA molecules can optionally be fragmented to an average length of about 200 base pairs, for example, if the gDNA is derived from peripheral blood leukocytes or fresh frozen tumor biopsies. After fragmentation, the template is circularized using CircLigase™ or a similar approach, and IGH rearrangements are then selectively amplified using IGHJ-targeting primers in combination with blocking oligomers. As an example, a suitable primer design strategy for selectively amplifying IGHJ rearrangements is presented in FIG. 8.
Analysis: FIG. 9 shows an overview of the bioinformatics workflow for analyzing B cell rearrangements by the described method. Amplification of the IGH, IGK, and IGL loci is followed by next generation sequencing. The resulting reads are filtered to remove short and off-target products, circularization junctions are identified, unique sequences are collapsed, and the sequences are then annotated for the presence of V(D)J rearrangements using IgBLAST (Ye et al., 2013, doi:10.1093/nar/gkt382) or similar tools. Reads with valid V(D)J rearrangements are used to determine the frequency of each rearrangement and to estimate the template count as the number of unique circularization junctions associated with a given rearrangement. The set of identified V(D)J rearrangements is evaluated using methods known in the art (e.g., Lay et al., Practical Laboratory Medicine, vol. 22, 2020, e00191) to identify clonal marker rearrangements consistent with the presence of a B cell malignancy. Such markers may be used for longitudinal monitoring of residual disease. Reads lacking identifiable V(D)J rearrangements are assessed for the presence of a translocation using k-mer analysis or methods known in the art (e.g., GeneFuse). Finally, a report is generated indicating the V(D)J clonality and translocation status of the sample, or in the case of residual disease monitoring, whether a marker rearrangement is detected in the sample.
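The template-count estimate described above (the number of unique circularization junctions associated with a given rearrangement) can be illustrated with a short Python sketch. The input format, in which each read has already been annotated with a rearrangement identifier and a junction key, is an assumption made for this example, and the rearrangement labels are illustrative only.

```python
from collections import defaultdict


def summarize_rearrangements(annotated_reads):
    """Estimate template counts and frequencies per V(D)J rearrangement.

    annotated_reads: iterable of (rearrangement_id, junction_key) tuples,
    one per read. Reads that share both values are treated as copies of
    the same original template.
    Returns {rearrangement_id: (template_count, frequency)}.
    """
    junctions = defaultdict(set)
    for rearrangement_id, junction_key in annotated_reads:
        junctions[rearrangement_id].add(junction_key)

    counts = {r: len(keys) for r, keys in junctions.items()}
    total = sum(counts.values())
    return {r: (c, c / total) for r, c in counts.items()}


annotated_reads = [
    ("IGHV3-23/IGHJ4", "j1"),
    ("IGHV3-23/IGHJ4", "j1"),   # PCR duplicate of the same template
    ("IGHV3-23/IGHJ4", "j2"),
    ("IGHV3-23/IGHJ4", "j3"),
    ("IGHV1-69/IGHJ6", "j4"),
]
summary = summarize_rearrangements(annotated_reads)
print(summary)
```

Here five reads collapse to four templates, three of which support the first rearrangement; a dominant rearrangement of this kind is the sort of pattern that downstream clonality assessment would examine.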
Single cell analysis: the compositions and methods described herein are compatible with common single cell barcode approaches, allowing detection of gene fusion events with single cell resolution to potentially reveal clinically relevant tumor heterogeneity. Single cell fusion assays may be part of a broader analysis pipeline to detect and report other cancer variants, such as CNV and SNV.
Single cell nucleic acid preparation: the target polynucleotides are isolated from a population of cells using methods known in the art. For example, a typical workflow comprises the following steps: 1) single cells are individually partitioned into droplets (e.g., sub-nanoliter droplets); 2) barcoded beads and amplification reagents are introduced; 3) cell lysis, protease digestion, cell barcoding, and targeted amplification occur within the droplets; 4) the droplets are then broken and the barcoded DNA is extracted for additional amplification and/or library preparation steps; and 5) the final library is purified and ready for sequencing. Other single cell library preparation schemes may also be used, including commercial solutions such as those offered by 10X Genomics and/or Mission Bio.
Circularization of nucleic acid from the sample: during circularization, the 5' end of a nucleic acid molecule is joined to the 3' end of the same molecule. In one embodiment, a ligase (e.g., CircLigase™ or T4 DNA ligase) is used to circularize the nucleic acids (DNA or RNA may be circularized). Where RNA (e.g., mRNA) is the target of circularization, the RNA is optionally converted to cDNA by reverse transcription. Optionally, after circularization, residual linear molecules can be removed by exonuclease treatment. In addition, any circularized fragment containing an undesired sequence can be depleted from a library of circularized fragments, for example, by hybridization-based pull-down using a probe targeting the undesired sequence, or by CRISPR-mediated linearization of circularized fragments containing the undesired sequence followed by exonuclease treatment (see, e.g., U.S. Patent Publication 2019/0161752). The use of circularized template material may benefit multiplex PCR even when used only in combination with conventional inward-facing PCR primers, because the circularized material lacks free 3' DNA ends that might prime non-specific amplification. Compared with linear DNA, circularized DNA may enable more specific amplification when used as a template for inward-facing and/or outward-facing primers in a PCR method.
Sequencing: the amplified nucleic acids are sequenced to determine the presence of one or more gene fusion events. The reading of the sequence may be accomplished using any suitable commercial sequencing mode, for example, in a preferred embodiment, using a next generation sequencing instrument. Reading sequences can also be accomplished using Sanger sequencing or other low throughput methods. The frequency of reads supporting the fusion gene can optionally be compared to the frequency of reads supporting unfused (i.e., wild-type or normal) copies of one or more of the donor or acceptor genes to determine the relative abundance of the gene fusion nucleic acid and whether sufficient read support is present to conclude that the sample contains gene fusion.
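As a non-limiting sketch of the comparison described above, the fraction of reads supporting the fusion can be computed relative to reads supporting the unfused gene, and a call made only when read support is sufficient. The threshold values and the function name in the following Python example are hypothetical placeholders; appropriate values would depend on the assay and are not specified herein.

```python
def fusion_call(fusion_reads: int, wildtype_reads: int,
                min_fusion_reads: int = 5, min_fraction: float = 0.01):
    """Return (fusion_fraction, detected) for one candidate fusion.

    min_fusion_reads and min_fraction are illustrative thresholds only.
    """
    total = fusion_reads + wildtype_reads
    fraction = fusion_reads / total if total else 0.0
    detected = fusion_reads >= min_fusion_reads and fraction >= min_fraction
    return fraction, detected


print(fusion_call(20, 180))    # well-supported fusion
print(fusion_call(2, 998))     # too few supporting reads to call
```

Where blocking elements suppress amplification of the unfused template in the same reaction, the computed fraction reflects relative read support rather than the true allele fraction in the sample.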
Example 2. T cell receptor convergence as a biomarker
An adaptive immune response comprises selective responses of B cells and T cells that recognize an antigen. The genes encoding antibody (Ab, in B cells) and T cell receptor (TCR, in T cells) antigen receptors comprise complex loci in which extensive receptor diversity arises from recombination of the corresponding variable (V), diversity (D), and joining (J) gene segments, and from subsequent somatic hypermutation events during early lymphoid differentiation. After engagement of the TCR by its cognate antigen, T lymphocytes upregulate many activation markers and develop a variety of effector functions, including proliferation, cytotoxicity, and cytokine production. Knowledge of TCR amino acid sequences enables tracking of specific T cell clones in the circulation and in peripheral tissues, which significantly facilitates monitoring of, e.g., virus-specific T cell immunity, and enables differential diagnosis and targeted treatment of T cell-related disorders. Thus, a comprehensive assessment of the clonal composition of antigen-specific T cells can provide important information about cellular immunity in the context of vaccination, tumor control, or viral disease, and is of great importance for clinical assessment and management (see, e.g., Dziubianau M et al. Am. J. Transplant. 2013; 13(11):2842-54).
Existing NGS methods for identifying TCR sequences include those that rely on comparing each sequencing read to, for example, Vβ and Jβ reference sequences. Alternatively, antigen-specific TCR convergence can be determined, which does not require the use of a large database to decode TCRs. This approach relies on the observation of TCRs that are similar or identical at the amino acid level but different at the nucleotide level, indicating that multiple T cell clones independently underwent VDJ recombination and expanded in response to a common antigen. Observed TCR convergence is an indication that a given TCR may respond to an antigen presented over an extended period of time, giving different T cell clones the opportunity to proliferate independently in response to the antigen. In the context of cancer, convergent TCRs may be enriched for those that recognize tumor antigens. For example, in one study of dendritic cell therapy for melanoma, the frequency of TCR convergence at baseline was observed to be highly predictive of therapeutic response (see Storkus WJ et al. J. Immunother. Cancer 2021; 9(11):e003675, which is incorporated herein by reference in its entirety). Similar findings have been reported (see Naidus E et al. Cancer Immunol. Immunother. 2021; 70(7):2095-2102), wherein peripheral blood TCR convergence following PD-L1 blockade was directly related to patient outcome in patients with advanced non-small cell lung cancer. Data from these studies indicate that TCR convergence in peripheral blood T cells may represent an actionable biomarker for: (1) identifying patients most likely to respond to an immunotherapeutic intervention that mechanistically requires a T cell response to achieve a favorable clinical outcome; and (2) effective longitudinal monitoring of therapeutically meaningful T cell responses in patients receiving treatment.
As used herein, a "convergent TCR set" is a set of T cell receptors (TCRs) that are similar in amino acid sequence and functionally equivalent, or that are identical or presumed to be identical in amino acid sequence. Because of this amino acid similarity, the members of a convergent TCR set are generally presumed to recognize the same antigen. In some embodiments, convergent TCR set members are identical or presumed to be identical in the variable gene and CDR3 amino acid sequences, despite having different nucleotide sequences. Convergent TCR set members may arise from differences in the non-templated nucleotide bases at VDJ junctions that occur during the generation of productive TCR gene rearrangements.
Provided herein are methods for performing a multiplex amplification reaction to amplify target immune receptor nucleic acid template molecules (e.g., TCR molecules) derived from a biological sample, wherein the multiplex amplification reaction comprises a plurality of amplification primer pairs comprising a plurality of joining (J) gene primers directed to a majority of the J genes of the target immune receptor, thereby generating target immune receptor amplicon molecules comprising the target immune receptor repertoire. Using the methods and the outward-facing J gene-targeting primers described herein and in Example 1, TCR development at baseline and in response to antigen can be assessed. To assess TCR convergence, for example, TCR β chains that are identical in amino acid sequence but differ in nucleotide sequence are identified.
Such methods further comprise: sequencing the target immune receptor repertoire amplicons; identifying immune receptor clones from the sequencing; identifying convergent immune receptor clones among the immune receptor clones, wherein the convergent immune receptor clones have similar or identical amino acid sequences and different nucleotide sequences; and determining the frequency of convergent immune receptor clones in the sample. Subsequent clinical decisions may combine the information obtained about TCR convergence with the potential therapeutic approaches under consideration. Additional methods of TCR convergence analysis are described elsewhere, for example, in U.S. Patent Publication 2021/0108268, which is incorporated herein by reference in its entirety. These methods provide an efficient way to determine TCR convergence using multiplexed primers, e.g., outward-facing primers as described herein, and allow identification of multiple independent T cell clones that have undergone VDJ recombination and expanded in response to a common antigen.
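By way of illustration, the identification of convergent clones and the determination of their frequency can be sketched as follows. In this Python example a clone is defined by its variable gene and CDR3 nucleotide sequence, clones sharing the same variable gene and CDR3 amino acid sequence are grouped, and the convergence frequency is taken as the fraction of clones that belong to a group with more than one nucleotide sequence. That definition of frequency, the input format, and the shortened toy CDR3 sequences are assumptions made for this sketch; other definitions (for example, read-weighted frequencies) are possible.

```python
from collections import defaultdict


def convergence_frequency(clones):
    """Identify convergent clones among (v_gene, cdr3_aa, cdr3_nt) tuples.

    Returns (convergent_groups, frequency), where convergent_groups maps
    (v_gene, cdr3_aa) to the set of distinct nucleotide sequences encoding
    it, and frequency is the fraction of distinct clones that fall in a
    convergent group.
    """
    nt_by_aa = defaultdict(set)
    for v_gene, cdr3_aa, cdr3_nt in clones:
        nt_by_aa[(v_gene, cdr3_aa)].add(cdr3_nt)

    convergent = {key: nts for key, nts in nt_by_aa.items() if len(nts) > 1}
    n_clones = sum(len(nts) for nts in nt_by_aa.values())
    n_convergent = sum(len(nts) for nts in convergent.values())
    frequency = n_convergent / n_clones if n_clones else 0.0
    return convergent, frequency


# Toy clones: the first two encode the same amino acids (CASSF) with
# different codons; the third encodes a different sequence (CASSY).
clones = [
    ("TRBV7-9", "CASSF", "TGTGCCAGCAGCTTC"),
    ("TRBV7-9", "CASSF", "TGTGCCAGCAGTTTT"),
    ("TRBV7-9", "CASSY", "TGTGCCAGCAGCTAC"),
]
groups, freq = convergence_frequency(clones)
print(groups, freq)
```

In this toy input two of the three clones are convergent, giving a frequency of two thirds.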
Example 3. Fusion detection for minimal residual disease (MRD) monitoring
The use of standardized multi-agent chemotherapy regimens with risk-adapted intensity has contributed greatly to the gradual increase in survival of children with acute lymphoblastic leukemia (ALL). Initial therapeutic response, as determined by serial quantitative measurement of minimal residual disease (MRD), has been shown to be one of the strongest independent prognostic factors in pediatric ALL and has been implemented in most treatment protocols currently in use. In the Netherlands, MRD monitoring has formed the main basis of risk group stratification since 2004, and is performed using real-time quantitative polymerase chain reaction (RQ-PCR) analysis of rearranged immunoglobulin (IG) and T cell receptor (TR) genes. The method is highly standardized within international consortia. However, in about 5% of cases, MRD classification is not feasible because no PCR-detectable target can be identified or because the targets do not reach the required sensitivity (see Pieters R et al. J. Clin. Oncol. 2016; 34(22):2591-601). In addition, IG/TR rearrangements may be oligoclonal and may therefore be lost during the course of the disease. MRD-based stratification is therefore suboptimal for these patients, with a risk of under- or over-treatment (see Szczepanski T et al. Blood 2002; 99(7):2315-23 and van der Velden VHJ et al. Leukemia 2002; 16:928-936). Fusion genes and gene deletions often act as primary drivers of leukemogenesis, and may therefore be very stable during disease progression and suitable as alternative genomic MRD PCR targets. In contrast to fusion transcripts, these genomic fusion breakpoints are independent of gene activity and therefore have quantitative kinetics comparable to those of standard IG/TR targets (see Kuiper RP et al. Br. J. Haematol. 2021; 194(5):888-892, which is incorporated herein by reference in its entirety).
The use of gene fusions or deletions for MRD monitoring requires identification of the genomic breakpoints of these structural variants, which are unique to each patient. These breakpoints can be identified in a straightforward and unbiased manner from Whole Genome Sequencing (WGS) data. As described in example 1, targeted methods for gene fusion detection can simplify analysis and reduce cost, and have therefore become the leading methods for clinical application. The compositions and methods described above and herein provide an effective solution for sequencing genetic variations such as SNVs, insertions/deletions, and gene fusions, including targeted sequencing of gene fusions involving novel partners and arising from novel breakpoints, particularly for MRD detection. As described in detail herein in various embodiments, the method consists of the steps of: (1) circularizing nucleic acid derived from the sample; (2) amplifying circularized nucleic acid derived from one or more targets of interest; and (3) analyzing the amplified fragments by Next Generation Sequencing (NGS).
A method called the well occupancy method has recently been described for estimating the absolute abundance of individual T cell clones or B cell clones, and/or of nucleic acids encoding individual TCRs and/or IGs, among a large number of cells (see U.S. patent No. 10,246,701, which is incorporated herein by reference in its entirety). Briefly, 10,000 PBMCs are allocated to each well of a 96-well plate. In each well, amplification is performed together with assignment of a well-specific barcode (which is incorporated into each amplicon by PCR with tailed primers); the amplified molecules are then sequenced together, and sequence reads are matched back to the starting well based on the barcode. It is then determined whether each unique sequence (with a specific CDR3 sequence) is present in each well, such that each unique CDR3 sequence is assigned a well occupancy pattern. A maximum likelihood estimate of the number of molecules in the original sample is obtained for each individual CDR3 sequence using an occupancy-based method; these estimates are determined based only on the number of wells in which the immunoreceptor sequence was found. Thus, for each unique adaptive immune receptor sequence observed, the number of containers (wells) in which that particular sequence is found is determined.
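The occupancy-based estimate can be illustrated with a minimal sketch. It assumes a simple model in which each molecule of a clone lands in one of the wells uniformly at random, and it inverts the expected fraction of occupied wells. The function name and the model are illustrative assumptions and are not necessarily the estimator used in the cited patent.

```python
import math


def estimate_molecules(occupied_wells: int, total_wells: int = 96) -> float:
    """Estimate how many molecules of one clone were in the original sample.

    Assumed model: each molecule falls into one of `total_wells` wells
    uniformly at random, so the chance that a given well stays empty after
    n molecules is (1 - 1/total_wells) ** n. Solving for n from the observed
    empty fraction gives the estimate returned here.
    """
    if not 0 <= occupied_wells <= total_wells:
        raise ValueError("occupied_wells must be between 0 and total_wells")
    if occupied_wells == 0:
        return 0.0
    if occupied_wells == total_wells:
        # Every well is occupied: the count cannot be bounded from above.
        return math.inf
    empty_fraction = 1.0 - occupied_wells / total_wells
    return math.log(empty_fraction) / math.log(1.0 - 1.0 / total_wells)
```

Under this model, a sequence found in half of the wells is estimated at more molecules than occupied wells, because several molecules are expected to share a well.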
The methods described herein for detecting gene fusions by circularization and inverse PCR primers can be applied using such well occupancy methods. Briefly, 10,000 PBMCs (e.g., PBMCs extracted from patients for MRD detection) are dispensed into each well of a 96-well plate. In each well, amplification is performed using inverse PCR primers as described herein, combined with 5' blocking elements (e.g., non-extendable oligonucleotides) that bind specifically to the sequence of the unrearranged fusion gene partner of interest adjacent to and opposite the expected breakpoint junction, together with assignment of well-specific barcodes (which are incorporated into each amplicon by PCR with tailed primers). The amplified molecules are then sequenced together, and sequence reads are matched back to the starting well based on the barcode. It is then determined whether each unique sequence (e.g., a specific gene fusion sequence, such as one involving the IGH locus) is present in each well, such that each unique IGH locus sequence is assigned a well occupancy pattern. MRD may be determined based on the presence and/or absence of the unique gene fusion sequence. Combining the methods described herein with occupancy-based methods can improve MRD detection, e.g., by achieving detection limits lower than those of conventional practice (most studies define MRD positivity at 0.01%, which is the detection limit of conventional methods, as described in Rocha JMC et al., Mediterr. J. Hematol. Infect. Dis. 2016;8(1):e2016024, which is incorporated herein by reference).
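The step of matching reads back to wells can be sketched as follows. This is only an illustration: it assumes reads have already been reduced to pairs of a well barcode and an identifier for the fusion or rearrangement sequence, and the names used are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def well_occupancy(reads: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count the distinct wells in which each unique sequence was observed.

    `reads` yields (well_barcode, sequence_id) pairs. Read depth within a
    well is ignored on purpose: only presence or absence per well matters.
    """
    wells_by_sequence: Dict[str, Set[str]] = defaultdict(set)
    for well_barcode, sequence_id in reads:
        wells_by_sequence[sequence_id].add(well_barcode)
    return {seq: len(wells) for seq, wells in wells_by_sequence.items()}
```

Counting wells rather than reads makes the result insensitive to differences in amplification efficiency between wells.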
Circularization of nucleic acid from the sample: during circularization, the 5' end of a nucleic acid molecule is linked to the 3' end of the same molecule. In one embodiment, a ligase (e.g., CircLigase™ or T4 DNA ligase) is used to circularize the nucleic acid (DNA or RNA may be circularized). Where RNA (e.g., mRNA) is the target of circularization, the RNA is optionally converted to cDNA by reverse transcription. Optionally, after circularization, residual linear molecules can be removed by exonuclease treatment. In addition, any circularized fragment containing an undesired sequence can be depleted from a library of circularized fragments, for example, by hybridization-based pull-down using a probe targeting the undesired sequence, or by CRISPR-mediated linearization of circularized fragments containing the undesired sequence followed by exonuclease treatment (see, e.g., U.S. patent publication 2019/0161752). The use of circularized template material may facilitate multiplex PCR even when used in combination with only conventional inward-facing PCR primers, as the circularized material lacks free 3' DNA ends that may trigger non-specific amplification. Compared with linear DNA, circularized DNA may enable more targeted amplification when used as a template for inward-facing primers and/or outward-facing primers in a PCR method.
Sequencing: the amplified nucleic acids are sequenced to determine the presence of one or more gene fusion events. Sequence readout may be accomplished using any suitable commercially available sequencing modality, for example, in a preferred embodiment, a next generation sequencing instrument. Sequence readout can also be accomplished using Sanger sequencing or other low-throughput methods. The frequency of reads supporting the fusion gene can optionally be compared to the frequency of reads supporting unfused (i.e., wild-type or normal) copies of one or more of the donor or acceptor genes to determine the relative abundance of the gene fusion nucleic acid and whether sufficient read support is present to conclude that the sample contains a gene fusion.
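The comparison of fusion-supporting and wild-type-supporting reads can be sketched as below. The function name and the minimum read threshold are hypothetical illustrations, not values taken from this disclosure. Because the blocking element suppresses amplification of unfused templates, such a ratio describes read abundance and may not equal the underlying allele fraction.

```python
from typing import Dict, Union


def fusion_read_support(
    fusion_reads: int,
    wildtype_reads: int,
    min_fusion_reads: int = 5,  # hypothetical threshold, for illustration only
) -> Dict[str, Union[float, bool]]:
    """Summarize read support for a candidate gene fusion.

    Returns the fraction of reads at the locus that support the fusion and
    whether the fusion has at least `min_fusion_reads` supporting reads.
    """
    if fusion_reads < 0 or wildtype_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = fusion_reads + wildtype_reads
    fraction = fusion_reads / total if total else 0.0
    return {
        "fusion_fraction": fraction,
        "fusion_called": fusion_reads >= min_fusion_reads,
    }
```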
Fig. 12 shows the temporal aspect of MRD testing for Acute Lymphoblastic Leukemia (ALL). Each line represents the level of residual disease over time following therapeutic intervention (e.g., radiation and/or chemotherapy) for a different hypothetical patient monitored at different time points after treatment. Response curves include DP (disease persistence), VER (very early relapse), ER (early relapse), LR (late relapse), VLR (very late relapse), and NR (no relapse). A leukemia cell proportion of 10⁻² represents the approximate lower limit of detection of VER. Sub-microscopic disease detection (i.e., MRD) can generally detect VER, ER, and LR, where the proportion of leukemia cells ranges from about 10⁻² to about 10⁻⁵. Prior art methods are largely limited to detecting leukemia cells at a proportion of about 10⁻⁶ in a sample, which may be insufficient for patients who will die from VLR. The methods described herein allow detection at proportions as low as 10⁻⁵ to 10⁻⁷, which facilitates detection in all of these treatment response scenarios.
The methods described herein enable detection, in a sequencing-efficient manner, of malignancy-associated markers at all frequencies (e.g., across the entire range of about 10⁻² to about 10⁻⁷), making them suitable for both disease diagnosis and MRD analysis. An additional advantage of the methods described herein relative to existing commercial solutions (i.e., kits provided by Adaptive Biotechnology, Inc. and kits provided by InvivoScribe, Inc.) is that the methods described herein are capable of assessing IGH, IGK, and IGL locus rearrangements simultaneously in a single reaction. Existing solutions require separate multiplex PCR reactions, e.g., one each for IGH, IGK, and IGL. The need for separate PCR reactions increases the complexity, cost, and time associated with each diagnostic test.
Example 4 determination of blocking oligomer efficiency
The efficiency of a blocking oligomer targeting the unrearranged IGH J6 region was determined according to the methods described herein and in example 1. Fig. 13 shows the blocking element efficiency results, as determined by gel electrophoresis. Synthetic oligomers were generated to represent an IGH rearrangement (fusion, F) and the unrearranged IGH J6 gene (wild type, W). Each template was PCR amplified using inverse PCR primers in the presence or absence (indicated by +/-) of a non-extendable blocking oligomer capable of hybridizing to the W template but not to the F template (blocking oligomer as shown in Fig. 1). The PCR amplification products were then visualized on an agarose gel. In the absence of the blocking oligomer, equivalent amounts of product were observed for the fusion and wild-type templates. As expected, addition of the blocker selectively reduced the product from the wild-type template.
Example 5 detection of breakpoint regions
Gene fusions are an important type of genetic variation in cancer, are relevant to therapy selection, and serve as markers for Measurable Residual Disease (MRD) monitoring. Conventional multiplex PCR (mPCR) cannot detect gene fusions with a novel partner or breakpoint. A novel mPCR technique is described herein for targeted detection of gene fusions, including gene fusions with unknown partners or breakpoints. Using the Singular Genomics G4™ sequencing platform, the methods described herein simultaneously identify clinically relevant IGH locus translocations and V(D)J rearrangements from highly degraded material.
DNA fragmentation and circularization: the method starts with efficient intramolecular ligation of DNA fragments, followed by multiplex inverse PCR that preferentially amplifies fragments containing breakpoint junctions. First, the isolated DNA of variable length is sheared to a length of about 200 bp using enzymatic fragmentation (e.g., NEBNext dsDNA Fragmentase, catalog number M0348) or mechanically sheared using a Covaris ME220, followed by cleanup with QuantaBio sparQ PureMag beads. Then, 50 ng of the fragmented and bead-purified DNA was heat denatured into single-stranded DNA and circularized using CircLigase™ ssDNA ligase (Lucigen, catalog number CL4111K/CL4115K) with 10 pmol of ssDNA per reaction according to the manufacturer's protocol. The ssDNA was incubated at 60 °C for 1 hour to circularize the ssDNA, followed by 10 minutes at 80 °C to inactivate the CircLigase™.
After circularization, some uncircularized DNA (both single and double stranded) may remain in each sample. A mixture of exonuclease I (NEB) and exonuclease III (NEB) was used to digest uncircularized DNA by incubation at 37 °C for 1 hour. The remaining circular ssDNA was then purified using the Zymo Oligo Clean & Concentrator kit.
Inverse PCR: the purified circular ssDNA template was then amplified using inverse PCR as described herein. PCR conditions were adapted from the NEB polymerase master mix reaction conditions and contained 0.2 mM dNTPs (each), 0.1 µM primers (each, e.g., a primer set of 0.1 µM first primer and 0.1 µM second primer), 0.2 U/µL Q5 polymerase, 1 µM blocking oligomer (each), and 500 ng to 2 µg template. A 2-step amplification protocol was performed in which the initial denaturation step was at 96 °C, followed by cycling between a 96 °C denaturation step and an annealing/extension step at 62 °C. Samples were then carried through library preparation. For simplicity, the data in Table 1 were generated using a single pair of gene-specific inverse PCR primers and a single blocker. The complete assay (amplifying IGH, IGK, and IGL locus rearrangements) will have about 22 primers (1F, 6R for the IGH locus; 3F, 6R for the IGK locus; 1F, 5R for the IGL locus) and 18 different blockers.
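The geometry of inverse PCR on a circularized fragment can be sketched with plain strings. This is an idealized illustration with hypothetical sequences and names: primer sites are found by exact matching on the top strand, and the fragment is assumed to circularize perfectly. It shows why outward-facing primers within the known gene yield a product that carries the unknown flanking sequence and the ligation junction.

```python
def inverse_pcr_product(linear: str, rev_site: str, fwd_site: str) -> str:
    """Return the top-strand product expected from a circularized fragment.

    `fwd_site` is where the rightward-pointing primer anneals and `rev_site`
    is where the leftward-pointing primer anneals; `rev_site` must lie
    upstream of `fwd_site`, so the primers face away from each other on the
    linear fragment. Once the 3' end is joined to the 5' end, extension from
    `fwd_site` runs to the old 3' end, crosses the ligation junction, and
    continues from the old 5' end through `rev_site`.
    """
    r = linear.find(rev_site)
    f = linear.find(fwd_site)
    if r < 0 or f < 0:
        raise ValueError("primer site not found in fragment")
    if r + len(rev_site) > f:
        raise ValueError("rev_site must end before fwd_site begins")
    return linear[f:] + linear[: r + len(rev_site)]


# Hypothetical fragment: flank + rev site + gap + fwd site + flank.
fragment = "GGATTC" + "AAAC" + "TT" + "CCCG" + "TAGCAT"
product = inverse_pcr_product(fragment, rev_site="AAAC", fwd_site="CCCG")
```

In this toy example the product contains both flanks but not the short gap between the two primer sites, which mirrors how the known primer-binding region anchors amplification of the unknown partner sequence.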
Sequencing: the amplicon library was sequenced on the G4™ platform with 2x150 bp reads and analyzed to detect translocations. Using the method, IGH V(D)J rearrangements and BCL1-JH and BCL2-JH translocations were simultaneously detected from fragmented IVS-0010 and IVS-0030 reference control gDNA (Invivoscribe cat. nos. 40880550 and 40881750) and healthy donor PBL gDNA.
Results: BCL1-JH and BCL2-JH translocations were detected from 50 ng of fragmented gDNA (average template length of 200 bp) from the IVS-0010 and IVS-0030 reference controls, respectively. Translocations were also detected from 50 ng samples consisting of fragmented reference control material spiked into a background of fragmented healthy donor PBL gDNA at a frequency of 1%. Preferential amplification of translocation-containing templates was observed, enabling detection from <1M reads/sample under all test conditions. V(D)J rearrangements were successfully detected from PBL gDNA using the same multiplex inverse PCR reaction (see, e.g., FIG. 14). A summary of pooled sequencing reads can be found in Table 1.
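How input mass bounds the achievable detection limit can be sketched with a back-of-the-envelope calculation. The sketch assumes roughly 6.6 pg of DNA per diploid human genome and one target copy per variant-carrying genome; both are assumptions for illustration rather than values from this disclosure. Fragmentation and circularization losses are ignored.

```python
import math

# Assumed approximate mass of one diploid human genome, in picograms.
PG_PER_DIPLOID_GENOME = 6.6


def genome_equivalents(input_ng: float) -> float:
    """Approximate number of diploid genome equivalents in `input_ng` of gDNA."""
    return input_ng * 1000.0 / PG_PER_DIPLOID_GENOME


def prob_at_least_one_copy(input_ng: float, variant_fraction: float) -> float:
    """Poisson probability that the input holds at least one target copy.

    Assumes one copy of the target per variant-carrying genome and random
    sampling of genomes into the reaction.
    """
    expected_copies = genome_equivalents(input_ng) * variant_fraction
    return 1.0 - math.exp(-expected_copies)
```

Under these assumptions, 50 ng of gDNA holds several thousand genome equivalents, so a variant present at 1% is sampled almost with certainty, whereas a variant at 10⁻⁶ would usually be absent from a single 50 ng input regardless of sequencing depth.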
Table 1. Limit of detection analysis from fragmented material. For simplicity, the data in Table 1 were generated using a single pair of gene-specific inverse PCR primers and a single blocker. In an example, a complete assay (amplifying IGH, IGK, and IGL locus rearrangements) will have about 22 primers (1F, 6R for the IGH locus; 3F, 6R for the IGK locus; 1F, 5R for the IGL locus) and 18 blockers. Healthy donor PBL gDNA and gDNA from IVS-0030 (catalog number: 40881750) were sheared by sonication to an average length of about 200 bp. 50 ng of fragmented PBL gDNA, or 50 ng of PBL gDNA spiked with 0.5 ng of IVS-0030, was subjected to circularization and amplification by the assay described herein. Amplicons were sequenced on the G4™ using 1x150 bp reads. Reads were aligned to the genome by bwa, and read peaks corresponding to translocation junctions were then identified by MACS2. Unique VDJ rearrangements were identified by IgBLAST. The fraction of on-target reads corresponds to reads that map at least in part to the IGH locus.
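The on-target definition used for Table 1 (reads mapping at least in part to the IGH locus) can be sketched as an interval overlap test. The sketch assumes alignments have already been reduced to chromosome, start, and end tuples with half-open coordinates; the function name and the locus coordinates in any usage are placeholders, not the actual IGH coordinates.

```python
from typing import Iterable, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open


def on_target_fraction(alignments: Iterable[Interval], locus: Interval) -> float:
    """Fraction of alignments that overlap `locus` by at least one base."""
    chrom, locus_start, locus_end = locus
    total = 0
    hits = 0
    for aln_chrom, aln_start, aln_end in alignments:
        total += 1
        # Partial overlap is enough to count an alignment as on target.
        if aln_chrom == chrom and aln_start < locus_end and aln_end > locus_start:
            hits += 1
    return hits / total if total else 0.0
```

Counting partial overlaps matters here because a read spanning a translocation junction maps only partly to the IGH locus and partly to the partner gene.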
Conclusion: the methods described herein enable mPCR-based detection of novel gene fusions from highly degraded material, with sequencing efficiency similar to that of traditional mPCR. As a first application, these methods were applied to simultaneously detect B cell V(D)J rearrangements and clinically relevant JH translocations from a limited amount of degraded gDNA. In this regard, these methods represent a significant advance over current mPCR-based approaches to antigen receptor sequencing. The methods are expected to have wide applicability in molecular diagnostics and MRD monitoring of disease states such as cancer.
Claims (35)
1. A method of differentially amplifying a polynucleotide comprising a fusion gene relative to a polynucleotide not comprising the fusion gene, the method comprising:
i) Circularizing a plurality of linear nucleic acid molecules to form a plurality of circular template polynucleotides, wherein one or more of the linear nucleic acid molecules comprises the fusion gene, thereby forming one or more fusion gene circular template polynucleotides, and wherein one or more of the linear nucleic acid molecules does not comprise the fusion gene, thereby forming one or more non-fusion gene circular template polynucleotides;
ii) binding a blocking element to the one or more non-fused circular template polynucleotides; and
iii) Hybridizing a first primer and a second primer to the one or more non-fusion circular template polynucleotides and the one or more fusion circular template polynucleotides and extending with a polymerase to produce a first amount of non-fusion polynucleotide amplification product and a second amount of fusion polynucleotide amplification product, wherein the first amount is detectably less than the second amount; whereby said polynucleotides comprising said fusion gene are differentially amplified.
2. The method of claim 1, wherein binding the blocking element comprises binding the blocking element upstream of the first primer.
3. The method of claim 1, wherein the second amount is about 1%, about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 40%, about 50%, or about 75% greater than the first amount.
4. The method of claim 1, wherein the second amount is about 2 times, at least about 1.5 times, at least about 2.0 times, at least about 2.5 times, at least about 5 times, at least about 10 times, or more than about 10 times the first amount.
5. The method of claim 1, further comprising detecting the first amount of non-fusion polynucleotide amplification product and the second amount of fusion polynucleotide amplification product.
6. The method of claim 1, wherein the one or more linear nucleic acid molecules comprise DNA, RNA, or cDNA; optionally wherein said DNA or said RNA is a cell-free nucleic acid molecule.
7. The method of claim 1, wherein the one or more linear nucleic acid molecules comprise RNA or cDNA and the fusion gene comprises an exon junction.
8. The method of claim 1, wherein the one or more linear nucleic acid molecules comprise RNA or cDNA and the fusion gene comprises an exon junction formed by alternative splicing.
9. The method of claim 1, wherein the one or more linear nucleic acid molecules comprise RNA or cDNA and the fusion gene comprises an exon junction formed by a splice defect.
10. The method of claim 1, wherein the fusion gene comprises an interchromosomal or intrachromosomal translocation.
11. The method of claim 10, wherein the intrachromosomal translocation comprises a partially or fully rearranged B cell or T cell antigen receptor.
12. The method of claim 1, wherein the blocking element comprises an oligonucleotide, a protein, or a combination thereof.
13. The method of claim 1, wherein the one or more linear nucleic acid molecules are about 20 to about 1000 nucleotides in length, about 100 to about 300 nucleotides in length, about 300 to about 500 nucleotides in length, or about 500 to about 1000 nucleotides in length.
14. The method of claim 1, wherein the blocking element binds to about 1 to 150 nucleotides upstream relative to the first primer.
15. The method of claim 1, wherein the first primer hybridizes to the one or more fusion circular template polynucleotides about 1 to 100 nucleotides downstream relative to a fusion junction within the fusion gene.
16. The method of claim 1, wherein the first primer and the second primer hybridize to complementary sequences of the one or more fused circular template polynucleotides and the one or more non-fused circular template polynucleotides, wherein the first primer and the second primer are about 1 to about 50 nucleotides apart.
17. The method of claim 1, further comprising binding a second blocking element to the one or more non-fusion circular template polynucleotides downstream relative to the second primer.
18. The method of claim 17, wherein the second blocking element binds to about 100 to about 300 nucleotides downstream relative to the second primer.
19. The method of claim 1, further comprising repeating steps ii) and iii).
20. The method as recited in claim 1, further comprising:
iv) amplifying the one or more non-fused circular template polynucleotides to produce a third amount of non-fused polynucleotide amplification products; and amplifying the one or more fusion circular template polynucleotides to produce a fourth amount of fusion polynucleotide amplification products, wherein the third amount and the fourth amount are substantially the same.
21. The method of claim 20, wherein amplifying the one or more non-fused circular template polynucleotides comprises hybridizing a third primer and a fourth primer to the one or more non-fused circular template polynucleotides and extending both primers with a polymerase, and wherein amplifying the one or more fused circular template polynucleotides comprises hybridizing a third primer and a fourth primer to the one or more fused circular template polynucleotides and extending both primers with a polymerase.
22. The method of claim 21, wherein the third primer hybridizes upstream of a target sequence and the fourth primer hybridizes downstream of a target sequence, wherein the target sequence comprises a single nucleotide variant, an insertion, a deletion, an internal tandem repeat, or a copy number variant.
23. The method as recited in claim 1, further comprising: detecting the length of the non-fusion polynucleotide amplification product and the length of the fusion polynucleotide amplification product; detecting one or more probes bound to the non-fusion polynucleotide amplification product and the fusion polynucleotide amplification product; or sequencing the non-fused polynucleotide amplification product and the fused polynucleotide amplification product.
24. The method of claim 23, wherein sequencing the non-fused polynucleotide amplification product and the fused polynucleotide amplification product produces one or more sequencing reads.
25. The method of claim 24, further comprising aligning the substring of one or more sequencing reads with a reference sequence.
26. The method of claim 24, further comprising comparing the k-mer substring of the one or more sequencing reads to a k-mer table of a fusion gene reference.
27. The method as recited in claim 24, further comprising: grouping one or more sequencing reads based on the barcode sequence and/or the sequence comprising the fusion gene; and within the set, aligning the reads and forming a consensus sequence of reads having the same barcode sequence and/or sequences comprising the fusion gene.
28. The method of claim 23, wherein sequencing further comprises: generating one or more sequencing reads comprising a circularized junction formed between the 5 'and 3' ends of the linear nucleic acid molecule; and quantifying the number of different circularized junction sequences comprising said fusion gene.
29. A kit, comprising:
a circularizing agent, wherein the circularizing agent is capable of binding the 5 'and 3' ends of a linear nucleic acid molecule;
a blocking element capable of binding to one or more circular polynucleotides;
a first primer and a second primer; and
a polymerase.
30. A method of amplifying a polynucleotide comprising a fusion gene, the method comprising:
i) Binding a blocking element to a non-fusion circular template polynucleotide, wherein the non-fusion circular template does not include the fusion gene;
ii) hybridizing a first primer and a second primer to the non-fusion circular template polynucleotide; and hybridizing the first primer and the second primer to a fusion circular template polynucleotide, wherein the fusion circular template polynucleotide comprises the fusion gene; and
iii) Extending the first primer and the second primer with a non-strand-displacing polymerase to produce a fusion polynucleotide amplification product.
31. The method of claim 30, wherein binding the blocking element comprises binding the blocking element upstream of the first primer.
32. The method of claim 31, wherein prior to step i), the method comprises circularizing a plurality of linear nucleic acid molecules to form a plurality of circular template polynucleotides, wherein one or more of the linear nucleic acid molecules comprises the fusion gene, thereby forming one or more fusion gene circular template polynucleotides, and wherein one or more of the linear nucleic acid molecules does not comprise the fusion gene, thereby forming one or more non-fusion gene circular template polynucleotides.
33. The method of claim 32, further comprising binding a second blocking element to the non-fusion circular template polynucleotide downstream relative to the second primer.
34. The method of claim 33, further comprising detecting the fusion polynucleotide amplification product.
35. The method of claim 33, further comprising sequencing the fusion polynucleotide amplification product.
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US63/218,794 | 2021-07-06 | ||
US63/297,078 | 2022-01-06 | ||
US202263348939P | 2022-06-03 | 2022-06-03 | |
US63/348,939 | 2022-06-03 | ||
PCT/US2022/035579 WO2023283090A1 (en) | 2021-07-06 | 2022-06-29 | Compositions and methods for detecting genetic features |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117897502A true CN117897502A (en) | 2024-04-16 |
Family
ID=90649373
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202280058627.5A Pending CN117897502A (en) | 2021-07-06 | 2022-06-29 | Compositions and methods for detecting genetic features |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117897502A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||