WO2023280156A1 - Modified nucleoside or nucleotide - Google Patents
Modified nucleoside or nucleotide
- Publication number
- WO2023280156A1 (PCT/CN2022/103895)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- compound
- nucleic acid
- base
- group
- alkyl
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07H—SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
- C07H19/00—Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof
- C07H19/02—Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof sharing nitrogen
- C07H19/04—Heterocyclic radicals containing only nitrogen atoms as ring hetero atom
- C07H19/06—Pyrimidine radicals
- C07H19/10—Pyrimidine radicals with the saccharide radical esterified by phosphoric or polyphosphoric acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P20/00—Technologies relating to chemical industry
- Y02P20/50—Improvements relating to the production of bulk chemicals
- Y02P20/55—Design of synthesis routes, e.g. reducing the use of auxiliary or protecting groups
Definitions
- The invention relates to the field of nucleic acid sequencing. Specifically, it relates to modified nucleosides or nucleotides and, more specifically, to non-natural nucleotide analogues containing an N or S atom at the 3' position for NGS sequencing.
- NGS sequencing overcomes the shortcomings of Sanger sequencing, such as high cost and long sequencing time, and greatly promotes the application of gene sequencing technology.
- NGS sequencing has been deeply applied in the fields of prenatal screening, tumor diagnosis, tumor treatment, animal and plant breeding, etc., driving the progress of science and technology and medicine.
- Deoxynucleoside triphosphate (dNTP) analogs with reversible blocking groups are key raw materials for NGS sequencing. Owing to the introduction of the reversible blocking group, the 3'-OH group of the dNTP can be retained, which overcomes the shortcomings of Sanger sequencing and ensures the accuracy of base recognition. dNTP analogs with reversible blocking groups can therefore be regarded as the most critical technology in NGS sequencing.
- Reversible blocking of dNTPs is mainly achieved through two approaches.
- The first approach is to introduce a reversible blocking group directly at the 3'-OH of the dNTP.
- The advantage of this type of modified dNTP is that blocking the 3'-OH ensures the blocking efficiency in sequencing.
- The other approach leaves the 3'-OH unblocked and instead blocks the polymerase through a modification on the base.
- The advantage of this strategy is that a wider range of blocking-group modifications is available, without being limited by the polymerase.
- dNTPs serve mainly as monomers from which DNA is synthesized, via their 5'-triphosphate, under polymerase catalysis. From a biochemical point of view, this is an enzyme-catalyzed esterification reaction. With the participation of polymerase and ions, and provided that the bases pair, the 3'-OH of the nucleic acid reacts with the 5'-triphosphate of the monomer dNTP to form a phosphate ester bond, releasing pyrophosphate and giving a product with extended chain length, as shown in Figure 2.
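- The chain-extension step described above can be summarized as a reaction scheme. This is only a schematic sketch in common biochemical notation; the symbols DNA_n (a strand of n nucleotides) and PP_i (pyrophosphate) are introduced here for illustration and are not taken from the original text. The arrow macro assumes the amsmath package.

```latex
\[
\mathrm{DNA}_{n}\text{-}3'\text{-}\mathrm{OH} \;+\; \mathrm{dNTP}
\;\xrightarrow{\ \text{polymerase, ions, base pairing}\ }\;
\mathrm{DNA}_{n+1}\text{-}3'\text{-}\mathrm{OH} \;+\; \mathrm{PP_i}
\]
```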
- the present invention aims to develop a class of non-natural nucleotide analogs for NGS sequencing containing N and S atoms at the 3' position.
- the general structural formula of such dNTP analogs is also shown in Figure 4, the reversible blocking group protects the 3'-S, including 2'deoxy 3'thiouridine triphosphate, 2'deoxy 3'sulfur
- the present invention provides the use of the following compounds or salts thereof for determining the sequence of a target single-stranded polynucleotide,
- the present invention provides a compound or salt thereof represented by formula (A) or (B),
- R is selected from -N 3 , -NR a R b , -SR c ;
- R a , R b are each independently selected from H, N 3 -C1-C6 alkyl (such as -CH 2 -N 3 ), C1-C6 alkyl-SS-C1-C6 alkyl (such as C1-C6 alkyl -SS-CH 2 -, specifically such as -CH 2 -SS-Me, -CH 2 -SS-Et, -CH 2 -SS-iPr or -CH 2 -SS-t-Bu), C2-C6 alkenyl- C1-C6 alkyl (such as allyl), and R a and R b are not H at the same time;
- R c is selected from N 3 -C1-C6 alkyl (such as -CH 2 -N 3 ), C1-C6 alkyl-SS-C1-C6 alkyl (such as C1-C6 alkyl-SS-CH 2 -, specifically such as -CH 2 -SS-Me, -CH 2 -SS-Et, -CH 2 -SS-iPr or -CH 2 -SS-t-Bu), C2-C6 alkenyl-C1-C6 alkyl (such as allyl);
- n is selected from 1, 2, 3, 4;
- R ' is selected from H, a monophosphate group, a diphosphate group, a triphosphate group and a tetraphosphate group;
- Each Z is independently selected from O, S, BH;
- Base 1 and Base 2 are independently selected from bases, deaza bases or tautomers thereof;
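- The definition of R above can be read as a small combinatorial rule. The following Python sketch enumerates the allowed 3' substituents at the level of substituent classes only; the function name, the string notation and the decision to count (R a, R b) as an ordered pair are assumptions made for illustration, not part of the original text.

```python
from itertools import product

# Substituent classes for R a / R b as listed in the text (H plus three blocking-group classes).
AMINE_SUBSTITUENTS = [
    "H",
    "N3-C1-C6 alkyl",
    "C1-C6 alkyl-SS-C1-C6 alkyl",
    "C2-C6 alkenyl-C1-C6 alkyl",
]
# R c is chosen from the same three blocking-group classes; H is not listed for it.
THIOL_SUBSTITUENTS = AMINE_SUBSTITUENTS[1:]


def enumerate_r_groups():
    """Return every R allowed by the definition, as class-level strings."""
    options = ["-N3"]
    for ra, rb in product(AMINE_SUBSTITUENTS, repeat=2):
        if ra == "H" and rb == "H":
            continue  # R a and R b are not H at the same time
        options.append(f"-N({ra})({rb})")
    options.extend(f"-S({rc})" for rc in THIOL_SUBSTITUENTS)
    return options


if __name__ == "__main__":
    for option in enumerate_r_groups():
        print(option)
```

- Because the pair (R a, R b) is treated as ordered here, chemically equivalent swaps are listed twice; a real enumeration would likely deduplicate them.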
- R is -N3
- -N3 is a reversible blocking group
- R a and R b are reversible blocking groups
- R c is a reversible blocking group
- R 0 is a reversible blocking group.
- any one of R a and R b is selected from N 3 -C1-C6 alkyl (such as -CH 2 -N 3 ), C1-C6 alkyl-SS-C1-C6 alkyl (such as C1-C6 alkyl-SS-CH 2 -, specifically such as -CH 2 -SS-Me, -CH 2 -SS-Et, -CH 2 -SS-iPr or -CH 2 -SS-t-Bu), C2 -C6 alkenyl-C1-C6 alkyl (such as allyl), the other is H.
- any one of R a and R b is selected from N 3 -C1-C6 alkyl, and the other is H.
- either one of R a and R b is -CH 2 -N 3 , and the other is H.
- R c is selected from N 3 -C1-C6 alkyl.
- R c is -CH 2 -N 3 .
- n is 1
- R ' is a triphosphate group
- Base 1 and Base 2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or tautomers thereof.
- Z is O.
- Base 1 is selected from
- Base 2 is selected from
- the present invention provides a compound or a salt thereof represented by formula (A-1) or (B-1),
- n is selected from 1, 2, 3, 4;
- R ' is selected from H, a monophosphate group, a diphosphate group, a triphosphate group and a tetraphosphate group;
- Each Z is independently selected from O, S, BH;
- Base 1 and Base 2 are independently selected from bases, deaza bases or tautomers thereof;
- -N 3 is a reversible blocking group
- R 0 is a reversible blocking group.
- n is 1
- R ' is a triphosphate group
- Z is O.
- Base 1 and Base 2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or tautomers thereof.
- Base 1 is selected from
- Base 2 is selected from
- the present invention provides a compound represented by formula (A-2) or (B-2) or a salt thereof,
- R a , R b are each independently selected from H, N 3 -C1-C6 alkyl (such as -CH 2 -N 3 ), C1-C6 alkyl-SS-C1-C6 alkyl (such as C1-C6 alkyl -SS-CH 2 -, specifically such as -CH 2 -SS-Me, -CH 2 -SS-Et, -CH 2 -SS-iPr or -CH 2 -SS-t-Bu), C2-C6 alkenyl- C1-C6 alkyl (such as allyl), and R a and R b are not H at the same time;
- n is selected from 1, 2, 3, 4;
- R ' is selected from H, a monophosphate group, a diphosphate group, a triphosphate group and a tetraphosphate group;
- Each Z is independently selected from O, S, BH;
- Base 1 and Base 2 are independently selected from bases, deaza bases or tautomers thereof;
- R a and R b are reversible blocking groups
- R 0 is a reversible blocking group.
- any one of R a and R b is selected from N 3 -C1-C6 alkyl (such as -CH 2 -N 3 ), C1-C6 alkyl-SS-C1-C6 alkyl (such as C1-C6 alkyl-SS-CH 2 -, specifically such as -CH 2 -SS-Me, -CH 2 -SS-Et, -CH 2 -SS-iPr or -CH 2 -SS-t-Bu), C2 -C6 alkenyl-C1-C6 alkyl (such as allyl), the other is H.
- any one of R a and R b is selected from N 3 -C1-C6 alkyl, and the other is H.
- either one of R a and R b is -CH 2 -N 3 , and the other is H.
- n is 1
- R ' is a triphosphate group
- Z is O.
- Base 1 and Base 2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or tautomers thereof.
- Base 1 is selected from
- Base 2 is selected from
- the present invention provides a compound represented by formula (A-3) or (B-3) or a salt thereof,
- R c is selected from N 3 -C1-C6 alkyl (such as -CH 2 -N 3 ), C1-C6 alkyl-SS-C1-C6 alkyl (such as C1-C6 alkyl-SS-CH 2 -, specifically such as -CH 2 -SS-Me, -CH 2 -SS-Et, -CH 2 -SS-iPr or -CH 2 -SS-t-Bu), C2-C6 alkenyl-C1-C6 alkyl (such as allyl);
- n is selected from 1, 2, 3, 4;
- R ' is selected from H, a monophosphate group, a diphosphate group, a triphosphate group and a tetraphosphate group;
- Each Z is independently selected from O, S, BH;
- Base 1 and Base 2 are independently selected from bases, deaza bases or tautomers thereof;
- R c is a reversible blocking group
- R 0 is a reversible blocking group.
- R c is selected from N 3 -C1-C6 alkyl.
- R c is -CH 2 -N 3 .
- n is 1
- R' is a triphosphate group
- Z is O.
- Base 1 and Base 2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or tautomers thereof.
- Base 1 is selected from
- Base 2 is selected from
- the present invention provides the following compounds or salts thereof:
- the aforementioned compound or salt thereof bears an additional detectable label.
- the additional detectable label carried by the compound or its salt is introduced by an affinity reagent (such as an antibody, aptamer, Affimer or Knottin) that carries the detectable label, and the affinity reagent can specifically recognize and bind to an epitope of the compound or its salt.
- the additional detectable label is attached to the compound or salt thereof, optionally via a linking group.
- the additional detectable label is optionally attached to Base 1 or R 0 of the compound or salt thereof via a linking group.
- the additional detectable label is attached to the terminal amino group in R 0 of the compound or salt thereof, optionally via a linking group.
- the linking group is a cleavable linking group or a non-cleavable linking group.
- the cleavable linking group is selected from an electrophilically cleavable linking group, a nucleophilically cleavable linking group, a photocleavable linking group, a linking group cleavable under reducing conditions, a linking group cleavable under oxidative conditions, a safety-catch linking group, a linking group cleavable by an elimination mechanism, or any combination thereof.
- compounds of formula A that differ in Base 1 carry different additional detectable labels.
- compounds of formula B that differ in Base 2 carry different additional detectable labels.
- the detectable label is a fluorescent label.
- the detectable label is selected from the following:
- the present invention provides a method for terminating nucleic acid synthesis, which comprises: incorporating the aforementioned compound or a salt thereof into the nucleic acid molecule to be terminated.
- incorporation of the compound or salt thereof is accomplished by terminal transferase, terminal polymerase, or reverse transcriptase.
- the method includes incorporating the compound, or a salt thereof, into the nucleic acid molecule to be terminated using a polymerase.
- the method comprises: using a polymerase to polymerize nucleotides under conditions that allow the polymerase to polymerize the nucleotides, thereby incorporating the compound or salt thereof into the 3' end of the nucleic acid molecule to be terminated.
- the present invention provides a method for preparing a growing polynucleotide that is complementary to a target single-stranded polynucleotide in a sequencing reaction, comprising incorporating the aforementioned compound or a salt thereof into the growing complementary polynucleotide, wherein incorporation of said compound or salt thereof prevents any subsequent incorporation of nucleotides into said growing complementary polynucleotide.
- incorporation of the compound or salt thereof is accomplished by terminal transferase, terminal polymerase, or reverse transcriptase.
- the method comprises incorporating the compound or salt thereof into the growing complementary polynucleotide using a polymerase.
- the method comprises: using a polymerase to polymerize nucleotides under conditions that allow the polymerase to polymerize the nucleotides, thereby incorporating the compound or salt thereof into the 3' end of the growing complementary polynucleotide.
- the present invention provides a nucleic acid intermediate formed in determining the sequence of a target single-stranded polynucleotide, wherein the nucleic acid intermediate is formed through the following steps:
- the nucleic acid intermediate is formed by incorporating one nucleotide complementary to the target single-stranded polynucleotide into a growing nucleic acid chain, wherein the one complementary nucleotide incorporated is the aforementioned compound or a salt thereof.
- the present invention provides a nucleic acid intermediate formed in determining the sequence of a target single-stranded polynucleotide, wherein the nucleic acid intermediate is formed through the following steps:
- the present invention provides a method for determining the sequence of a target single-stranded polynucleotide, comprising:
- nucleotide complementary to a target single-stranded polynucleotide in a growing nucleic acid strand wherein at least one complementary nucleotide incorporated is the aforementioned compound or a salt thereof, and,
- the reversible blocking group and optional detectable label are removed prior to the introduction of the next complementary nucleotide.
- the reversible blocking group and the detectable label are removed simultaneously.
- the reversible blocking group and the detectable label are removed sequentially; for example, the reversible blocking group is removed after the detectable label is removed, or the detectable label is removed after the reversible blocking group is removed.
- the method of determining the sequence of a target single-stranded polynucleotide comprises the steps of:
- nucleotide is the aforementioned compound or a salt thereof, optionally the remaining nucleotides are the aforementioned compound or a salt thereof;
- the method of determining the sequence of a target single-stranded polynucleotide comprises the steps of:
- (1) providing a first nucleotide, a second nucleotide, a third nucleotide and a fourth nucleotide, wherein at least one of the four nucleotides is the aforementioned compound or a salt thereof, and optionally the remaining nucleotides are the aforementioned compound or a salt thereof;
- (3) repeating (1)-(2) one or more times.
- the method of determining the sequence of a target single-stranded polynucleotide comprises the steps of:
- the cleavage of the reversible blocking group and the cleavage of the detectable label are performed simultaneously, or the cleavage of the reversible blocking group and the cleavage of the detectable label are performed stepwise (e.g., the reversible blocking group is cleaved first, or the detectable label is cleaved first).
- the cleavage reagent used for the cleavage of the reversible blocking group and the cleavage of the detectable label is the same reagent.
- the cleavage reagents used for the cleavage of the reversible blocking group and the cleavage of the detectable label are different reagents.
- the duplex is attached to a support.
- the growing nucleic acid strand is a primer.
- the primer forms the duplex by annealing to the nucleic acid strand to be sequenced.
- the duplex, the compound or salt thereof, and the polymerase together form a reaction system comprising a solution phase and a solid phase.
- the compound or salt thereof is incorporated into a growing nucleic acid strand using a polymerase under conditions that permit nucleotide polymerization by the polymerase, to form a nucleic acid intermediate comprising a reversible blocking group and optionally a detectable label.
- the polymerase is selected from KOD polymerase or mutants thereof (e.g., KOD POL151, KOD POL157, KOD POL171, KOD POL174, KOD POL376, KOD POL391).
- the solution phase of the reaction system in the previous step is removed, and the duplex attached to the support is retained.
- the cleavage reagent is contacted with the duplex or the growing nucleic acid strand in a reaction system comprising a solution phase and a solid phase.
- the cleavage reagent is capable of cleaving the reversible blocking group and optionally the detectable label carried by a compound incorporated into a growing nucleic acid strand, without affecting the phosphodiester bonds on the duplex backbone.
- the solution phase of the reaction system of this step is removed.
- a washing operation is performed after any one step comprising a removing operation.
- step (ii) further comprises: determining, according to the signal detected in step (ii), the type of compound incorporated into the growing nucleic acid chain in step (i), and determining, based on the principle of complementary base pairing, the type of nucleotide at the corresponding position in the nucleic acid strand to be sequenced.
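- The base-calling logic of this step (detected signal, then incorporated compound, then template nucleotide by complementary pairing) can be sketched in Python. This is a minimal illustration assuming a DNA template and one distinguishable label per base; the label names and the label-to-base assignment are hypothetical and are not specified in the text.

```python
# Hypothetical assignment of one distinguishable label to each incorporated base.
LABEL_TO_BASE = {"label_1": "A", "label_2": "T", "label_3": "C", "label_4": "G"}

# Watson-Crick pairing for a DNA template.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def call_template_base(detected_label):
    """Map a detected signal to (incorporated base, template base)."""
    incorporated = LABEL_TO_BASE[detected_label]
    return incorporated, COMPLEMENT[incorporated]


def read_template(detected_labels):
    """Return template bases in the order they are read, one per cycle.

    The growing strand is extended 5'->3', so the template positions are
    visited in the 3'->5' direction of the template strand.
    """
    return "".join(call_template_base(label)[1] for label in detected_labels)


if __name__ == "__main__":
    print(read_template(["label_1", "label_3", "label_4", "label_2"]))
```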
- the present invention provides a kit comprising at least one of the aforementioned compounds or salts thereof.
- the kit comprises first, second, third, and fourth compounds, each of the first, second, third, and fourth compounds independently being the aforementioned compound or a salt thereof.
- in the first compound, Base 1 is selected from adenine, 7-deazaadenine or tautomers thereof; in the second compound, Base 1 is selected from thymine, uracil or tautomers thereof; in the third compound, Base 1 is selected from cytosine or tautomers thereof; in the fourth compound, Base 1 is selected from guanine, 7-deazaguanine or tautomers thereof.
- in the first compound, Base 2 is selected from adenine, 7-deazaadenine or tautomers thereof; in the second compound, Base 2 is selected from thymine, uracil or tautomers thereof; in the third compound, Base 2 is selected from cytosine or tautomers thereof; in the fourth compound, Base 2 is selected from guanine, 7-deazaguanine or tautomers thereof.
- the first, second, third and fourth compounds comprise Base 1 or Base 2 that are different from each other.
- the first, second, third and fourth compounds carry additional detectable labels that are different from each other.
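- The two conditions above (four compounds with mutually different bases and mutually different labels) amount to a simple uniqueness check. The sketch below assumes, for illustration only, that each kit compound is described by a (base, label) pair; the function name and the dye names in the example are hypothetical.

```python
def kit_is_consistent(compounds):
    """Check that a kit has four compounds with distinct bases and distinct labels.

    compounds: list of (base, label) tuples, one per compound in the kit.
    """
    bases = [base for base, _ in compounds]
    labels = [label for _, label in compounds]
    return (
        len(compounds) == 4
        and len(set(bases)) == 4
        and len(set(labels)) == 4
    )


if __name__ == "__main__":
    kit = [("A", "dye_1"), ("T", "dye_2"), ("C", "dye_3"), ("G", "dye_4")]
    print(kit_is_consistent(kit))
```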
- the kit further comprises: reagents for pretreating nucleic acid molecules; supports for attaching the nucleic acid molecules to be sequenced; reagents for attaching the nucleic acid molecules to be sequenced to the support (for example, by covalent or non-covalent linkage); primers for initiating nucleotide polymerization; polymerases for performing nucleotide polymerization; one or more buffer solutions; one or more washing solutions; or any combination thereof.
- the present invention provides the use of the aforementioned compound or its salt or the aforementioned kit for determining the sequence of a target single-stranded polynucleotide.
- Figure 1 shows base-modified non-natural nucleosides for NGS sequencing.
- Figure 2 shows the biochemical reaction of NGS sequencing.
- Figure 3 shows the 3'-N-substituted reversibly blocked nucleotide analogue of an embodiment of the present invention.
- Figure 4 shows the 3'-S-substituted reversibly blocked nucleotide analogue of an embodiment of the present invention.
- C1-C6 alkyl specifically refers to independently disclosed methyl, ethyl, C3 alkyl, C4 alkyl, C5 alkyl and C6 alkyl.
- C1-C6 alkyl refers to any straight-chain or branched saturated group containing 1-6 carbon atoms, such as methyl (Me), ethyl (Et), n-propyl, isopropyl (iPr), n-butyl, isobutyl, tert-butyl (t-Bu), sec-butyl, n-pentyl, tert-amyl, n-hexyl, etc.
- C2-C6 alkenyl refers to any straight-chain or branched group containing 2-6 carbon atoms and at least one carbon-carbon double bond, such as vinyl, 1-propenyl, 2-propenyl, etc.
- Examples of salts of the compounds represented by formula A, formula A-1, formula A-2, formula A-3, formula B, formula B-1, formula B-2 or formula B-3 are organic acid addition salts formed from anion-forming organic acids, including but not limited to formate, acetate, propionate, benzoate, maleate, fumarate, succinate, tartrate, citrate, ascorbate, α-ketoglutarate, α-glycerophosphate, alkylsulfonate or arylsulfonate; preferably, the alkylsulfonate is methylsulfonate or ethylsulfonate, and the arylsulfonate is benzenesulfonate or p-toluenesulfonate.
- Suitable inorganic salts may also be formed including, but not limited to, hydrochloride, hydrobromide, hydroiodide, nitrate, bicarbonate, and carbonate, s
- the nucleic acid strand to be sequenced can be longer than the growing nucleic acid strand.
- the nucleic acid molecule to be sequenced can be any nucleic acid molecule of interest.
- the nucleic acid molecule to be sequenced comprises deoxyribonucleotides, ribonucleotides, modified deoxyribonucleotides, modified ribonucleotides, or any combination thereof.
- the nucleic acid molecule to be sequenced is not limited by its type.
- the nucleic acid molecule to be sequenced is DNA or RNA.
- the nucleic acid molecule to be sequenced may be genomic DNA, mitochondrial DNA, chloroplast DNA, mRNA, cDNA, miRNA, or siRNA.
- the nucleic acid molecule to be sequenced is linear or circular. In certain preferred embodiments, the nucleic acid molecule to be sequenced is double-stranded or single-stranded.
- the nucleic acid molecule to be sequenced may be single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), single-stranded RNA (ssRNA), double-stranded RNA (dsRNA), or a hybrid of DNA and RNA.
- the nucleic acid molecule to be sequenced is single-stranded DNA. In certain preferred embodiments, the nucleic acid molecule to be sequenced is double-stranded DNA.
- nucleic acid molecule to be sequenced is not limited by its source.
- nucleic acid molecules to be sequenced can be obtained from any source, e.g., any cell, tissue or organism (e.g., viruses, bacteria, fungi, plants and animals).
- the nucleic acid molecules to be sequenced are derived from mammals (e.g., humans, non-human primates, rodents, or canines), plants, birds, reptiles, fish, fungi, bacteria or viruses.
- Methods for extracting or obtaining nucleic acid molecules from cells, tissues or organisms are well known to those skilled in the art. Suitable methods include, but are not limited to, ethanol precipitation, chloroform extraction, and the like. A detailed description of such methods can be found, for example, in J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, 1989, and F.M. Ausubel et al., Short Protocols in Molecular Biology, 3rd ed., John Wiley & Sons, Inc., 1995. In addition, various commercial kits can be used to extract nucleic acid molecules from various sources such as cells, tissues or organisms.
- the nucleic acid molecule to be sequenced is not limited by its length.
- the length of the nucleic acid molecule to be sequenced may be at least 10bp, at least 20bp, at least 30bp, at least 40bp, at least 50bp, at least 100bp, at least 200bp, at least 300bp, at least 400bp, at least 500bp, at least 1000bp , or at least 2000bp.
- the length of the nucleic acid molecule to be sequenced can be 10-20bp, 20-30bp, 30-40bp, 40-50bp, 50-100bp, 100-200bp, 200-300bp, 300-400bp, 400-500bp, 500-1000bp, 1000-2000bp, or more than 2000bp.
- the nucleic acid molecule to be sequenced may have a length of 10-1000 bp, so as to facilitate high-throughput sequencing.
- a suitable polymerase can be used to perform nucleotide polymerization.
- the polymerase is capable of using DNA as a template to synthesize a new DNA strand (e.g., DNA polymerase).
- the polymerase is capable of using RNA as a template to synthesize a new DNA strand (e.g., reverse transcriptase).
- the polymerase is capable of using DNA or RNA as a template to synthesize a new RNA strand (e.g., RNA polymerase).
- the polymerase is selected from DNA polymerase, RNA polymerase, and reverse transcriptase. According to actual needs, an appropriate polymerase can be selected to carry out the nucleotide polymerization reaction.
- the polymerization reaction is polymerase chain reaction (PCR). In certain preferred embodiments, the polymerization reaction is a reverse transcription reaction.
- KOD polymerase or a mutant thereof may be used for nucleotide polymerization.
- KOD POL391 and KOD POL171 had acceptable polymerization efficiencies for the modified nucleotides of the invention.
- the polymerization efficiency of KOD POL391 or KOD POL171 for the modified nucleotides of the invention is above 70%, such as 70%-80%, 80%-90%, or 90%-100%.
- the polymerization reaction of nucleotides is carried out under suitable conditions.
- suitable polymerization conditions include the composition of the solution phase and the concentration of each component, the pH of the solution phase, the polymerization temperature, and the like. Polymerization under suitable conditions is beneficial to obtain acceptable or even high polymerization efficiency.
- in the compound represented by formula A or formula B, the nitrogen atom or sulfur atom at the 3' position of the deoxyribose is protected, so the compound can terminate polymerization by a polymerase (such as DNA polymerase).
- when the compound represented by formula A or formula B is introduced at the 3' end of the growing nucleic acid chain, there is no free amino group (-NH 2 ) or sulfhydryl group (-SH) at the 3' position of the deoxyribose of the compound, so the polymerase cannot proceed to the next round of polymerization and the polymerization is terminated. In this case, one and only one base is incorporated into the growing nucleic acid strand in each round of polymerization.
- the protecting group on the nitrogen atom or sulfur atom at the 3' position of the deoxyribose in the compound represented by formula A or formula B can be removed to give a free amino group (-NH 2 ) or sulfhydryl group (-SH). Subsequently, the polymerase and the compound represented by formula A or formula B can be used to carry out the next round of polymerization on the growing nucleic acid chain and introduce another base.
- the nitrogen atom or sulfur atom at the 3' position of deoxyribose in the compound represented by formula A or formula B can be reversibly blocked, specifically, in the compound represented by formula A or formula B, when R is -N3 , -N 3 is a reversible blocking group; when R is -NR a R b , R a and R b are reversible blocking groups; when R is -SR c , R c is a reversible blocking group.
- the polymerase will be able to continue to polymerize the growing nucleic acid chain and continue to extend the nucleic acid chain.
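- The block, deblock and extend cycle described above behaves like a two-state machine on the 3' end of the growing strand. The following Python sketch models only that bookkeeping, assuming idealized complete blocking and complete deblocking; the class and method names are hypothetical, and no chemistry or enzyme kinetics is represented.

```python
class GrowingStrand:
    """Toy model of a growing strand extended with reversibly blocked nucleotides."""

    def __init__(self, primer):
        self.sequence = primer
        # True while the 3' N or S atom of the terminal nucleotide carries a blocking group.
        self.blocked = False

    def incorporate(self, base):
        """Try to add one blocked nucleotide; return True if it was incorporated."""
        if self.blocked:
            return False  # no free -NH2 / -SH at the 3' end, so polymerization stops
        self.sequence += base
        self.blocked = True  # the newly added nucleotide is itself blocked
        return True

    def deblock(self):
        """Remove the blocking group, restoring a free 3' -NH2 or -SH."""
        self.blocked = False


if __name__ == "__main__":
    strand = GrowingStrand("ACGT")
    print(strand.incorporate("A"))  # first nucleotide of the round is accepted
    print(strand.incorporate("C"))  # second attempt in the same round is refused
    strand.deblock()
    print(strand.incorporate("C"))  # accepted after deblocking
    print(strand.sequence)
```

- In this model exactly one base is added per round, mirroring the statement that one and only one base is incorporated between deblocking steps.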
- in the compound represented by formula B, the base can also be protected at the same time (e.g., by R 0 ), and this protection is likewise capable of terminating polymerization by a polymerase (e.g., DNA polymerase).
- the protecting group (R 0 ) at the base of the compound represented by formula B can also be removed. Subsequently, the polymerase and the compound represented by formula B can be used to carry out the next round of polymerization reaction on the growing nucleic acid chain, and introduce a base again.
- the base of the compound represented by formula B is reversibly blocked: when the compound represented by formula B is incorporated at the 3' end of the growing nucleic acid chain, it stops the polymerase from continuing to polymerize and terminates further extension of the growing nucleic acid chain; and, after the blocking group contained in the compound represented by formula B is removed, the polymerase is able to continue polymerizing and extending the growing nucleic acid chain.
- Detection can be by any suitable method, including fluorescence spectroscopy or other optical means.
- Preferred labels are fluorescent labels, i.e., fluorophores, which emit radiation of a defined wavelength upon absorption of energy.
- a wide variety of suitable fluorescent labels are known. For example, Welch et al. (Chem. Eur. J. 5(3):951-960, 1999) disclose dansyl-functionalized fluorescent moieties that can be used in the present invention. Zhu et al. (Cytometry 28:206-211, 1997) describe the use of fluorescent labels Cy3 and Cy5, which can also be used in the present invention. (Science238:336-341,1987), Connell et al.
- fluorescent labels include, but are not limited to, fluorescein, rhodamine (including TMR, Texas Red, and Rox), Alexa, BODIPY, acridine, coumarin, pyrene, benzanthracene, and cyanines.
- Multiple labels can also be used in this application, such as dual-fluorophore FRET cassettes (Tet. Let. 46:8867-8871, 2000) and multi-fluorophore dendritic systems (J. Am. Chem. Soc. 123:8101-8108, 2001). While fluorescent labels are preferred, it will be apparent to those of ordinary skill in the art that other forms of detectable label are suitable. For example, microparticles, including quantum dots (Empodocles et al., Nature 399:126-130, 1999), gold nanoparticles (Reichert et al., Anal. Chem. 72:6025-6029, 2000) and microbeads (Lacoste et al., Proc. Natl. Acad. Sci USA 97(17): 9461-9466, 2000), can also be used.
- Multicomponent labels are labels that rely on the interaction with additional compounds for detection.
- the most commonly used multicomponent label in biology is the biotin-streptavidin system. Biotin is used as a label attached to nucleotides or modified nucleotides. Streptavidin is then added separately to allow detection to occur.
- Other multicomponent systems can be used. For example, fluorescent antibodies against dinitrophenol are commercially available and can be used for detection.
- the modified nucleotide or nucleoside molecule can be made to carry the detectable label described above through the introduction of an affinity reagent (such as an antibody, aptamer, Affimer or Knottin), which can specifically recognize and bind to an epitope of the modified nucleotide or nucleoside molecule; the specific principle can be found in WO2018129214A1. All relevant content in WO2018129214A1 is incorporated into this application.
- a modified nucleotide or nucleoside molecule can be linked to a detectable label as described above.
- the linking group employed is cleavable. The use of a cleavable linking group ensures that the label can be removed after detection if desired, avoiding any interfering signal with any subsequently incorporated labeled nucleotides or nucleosides.
- the linking group used is non-cleavable, since in each instance where a labeled nucleotide of the invention is incorporated, no nucleotide needs to be incorporated subsequently, and the label therefore need not be removed from the nucleotide.
- Cleavable linking groups are well known in the art, and conventional chemistry can be used to attach the linking group to the nucleotide or modified nucleotide and label.
- the linking group can be cleaved by any suitable method, including exposure to acids, bases, nucleophiles, electrophiles, free radicals, metals, reducing or oxidizing agents, light, temperature, enzymes, and the like.
- the linking groups discussed herein can also be cleaved using the same catalysts used to cleave protecting groups at bases.
- Suitable linking groups may be adapted from standard chemical protecting groups as disclosed in Greene & Wuts, Protective Groups in Organic Synthesis, John Wiley & Sons.
- Suitable cleavable linking groups for solid phase synthesis are also disclosed in Guillier et al. (Chem. Rev. 100:2092-2157, 2000).
- the nucleoside cleavage site may be located at a position on the linking group that ensures that a portion of the linking group remains attached to the nucleotide or modified nucleotide after cleavage.
- the linking group can be attached anywhere on the nucleotide or modified nucleotide as long as Watson-Crick base pairing is still possible.
- Electrophilic cleavage linking groups are typically cleaved by protons and include acid-sensitive cleavage.
- Suitable linking groups include modified benzyl systems such as trityl, p-oxybenzyl ester and p-oxybenzyl amide.
- Other suitable linking groups include t-butoxycarbonyl (Boc) groups and acetal systems.
- thiophilic metals such as nickel, silver or mercury in the cleavage of thioacetals or other sulfur-containing protecting groups can also be considered.
- Nucleophilic cleavage is also a well-established method in the preparation of linker molecules.
- Groups that are labile in water (i.e., readily cleaved at basic pH), such as esters, can be used, as well as groups that are labile to non-aqueous nucleophiles.
- Fluoride ions can be used to cleave silicon-oxygen bonds in groups such as triisopropylsilyl (TIPS) or tert-butyldimethylsilyl (TBDMS).
- Photolyzable linking groups are widely used in carbohydrate chemistry.
- the light required to activate cleavage does not affect other components in the modified nucleotide.
- a fluorophore is used as a label, preferably the fluorophore absorbs light of a different wavelength than that required to cleave the linker molecule.
- Suitable linking groups include those based on o-nitrobenzyl compounds and nitroveratryl compounds. Linking groups based on benzoin chemistry can also be used (Lee et al., J. Org. Chem. 64:3454-3460, 1999).
- linking groups susceptible to reductive cleavage are known.
- Catalytic hydrogenation using palladium-based catalysts has been used to cleave benzyl and benzyloxycarbonyl groups.
- Disulfide bond reduction is also known in the art.
- Oxidation based methods are well known in the art. These methods include oxidation of alkoxybenzyl groups and oxidation of sulfur and selenium linking groups. It is also within the scope of the invention to use aqueous iodine to cleave disulfides and other sulfur- or selenium-based linkers.
- Safety-catch linkers are those that are cleaved in two steps.
- the first step is the generation of a reactive nucleophilic center, followed by a second step involving intramolecular cyclization, which results in cleavage.
- levulinate linkages can be treated with hydrazine or photochemically to release reactive amines, which are then cyclized to cleave esters elsewhere in the molecule (Burgess et al., J. Org. Chem. 62:5165 -5168, 1997).
- Elimination reactions can also be used.
- Base-catalyzed elimination of groups such as fluorenylmethoxycarbonyl and cyanoethyl as well as palladium-catalyzed reductive elimination of allyl systems can be used.
- linking groups may comprise spacer units.
- the length of the linking group is not critical so long as the label is kept at a sufficient distance from the nucleotide so as not to interfere with the interaction between the nucleotide and the enzyme.
- linking groups may consist of functionality similar to that of the base protecting groups. This makes label removal and deprotection more efficient, since only a single treatment is required to remove both the label and the protecting group.
- Particularly preferred linking groups are azide-containing linking groups which are cleavable by phosphine.
- Sanger and Sanger-type methods are generally carried out by performing an experiment in which eight classes of nucleotides are provided: four classes contain a 3'-NH2 group or a 3'-SH group, and four classes lack the NH2 group or SH group and are labeled differently from one another.
- the nucleotides used that lack the 3'-NH2 group or 3'-SH group are dideoxynucleotides (ddNTPs).
- the sequence of the target oligonucleotide can be determined, as is well known to those skilled in the art.
- nucleotides in this application have utility in the Sanger method and related protocols, since the same effect achieved by using ddNTPs can be achieved by using the nucleotide analogs described herein.
- nucleotides in this application also have utility in second-generation sequencing (NGS sequencing) and third-generation sequencing (single-molecule sequencing), because the same effect achieved by using dNTPs can be achieved by using the described nucleotide analogs.
- by using radioactive 32P in the attached phosphate group, the incorporation of nucleotides protected at the base can be monitored. The 32P may be present either in the ddNTPs themselves or in the primers used for extension.
- each of the following examples lists the preparation method for the nucleotide analog of only one of the four bases; those skilled in the art can follow this method to prepare the nucleotide analogs of the remaining three bases.
- each raw material is commercially available.
- after the 3-position hydroxyl of the compound is activated as the mesylate (Ms), the configuration-inverted product is formed under basic conditions; sodium azide is added to give the azido product, the Tr protecting group is then removed, and triphosphorylation gives the product.
- the compound is azidated to give the intermediate; the TBS protecting group is removed, and triphosphorylation then gives the product.
- Example 5: on-machine evaluation of the above nucleotide analogs on the MGISEQ2000
- Nucleotide substrates: fluorescently labeled standard hot dNTPs (four types) and standard cold dNTPs (four types), both from the MGISEQ-2000RS high-throughput sequencing reagent set (FCL SE50), Shenzhen MGI Technology Co., Ltd., Cat. No. 1000012551; cold dNTP nucleotide analogs of the present invention (four types: dTTP, dATP, dCTP, and dGTP; structures as follows, provided by Okainas; only one cold dNTP of the present invention is used in each test):
- the above-mentioned nucleotide substrates and the MGISEQ-2000RS high-throughput sequencing reagent set (FCL SE50) were used for sequencing.
- DNA nanoballs were prepared from an E. coli sequencing library.
- the first round of on-machine testing: polymerize standard hot dNTP, image and record the signal value, then cleave the blocking group with thpp reagent at 65°C for 1 min.
- the fourth round of on-machine testing: polymerize the cold dNTP nucleotide analog of the present invention (only one cold dNTP is polymerized in each test), then polymerize standard hot dNTP, image and record the signal value. Then cleave the blocking group with thpp reagent at 65°C for 1 min.
- the fifth round of on-machine testing: polymerize standard hot dNTP, image and record the signal value.
- EI (incorporation efficiency) is the ratio of the polymerization efficiency of the test nucleotide to that of the comparison nucleotide
- C1 is the signal value of the first round of on-machine testing
- C2 is the signal value of the second round of on-machine testing
- C3 is the signal value of the third round of on-machine testing
- C4 is the signal value of the fourth round of on-machine testing
- Ec (cleavage efficiency) is the ratio of the cleavage efficiency of the test nucleotide to that of the comparison nucleotide
- EI is the ratio of the polymerization efficiency of the test nucleotide to that of the comparison nucleotide
- C3 is the signal value of the third round of on-machine testing
- C5 is the signal value of the fifth round of on-machine testing
- CGT is the signal of the C, G, and T bases in the third round.
Landscapes
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Biochemistry (AREA)
- Biotechnology (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Molecular Biology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Physics & Mathematics (AREA)
- Immunology (AREA)
- Microbiology (AREA)
- Biophysics (AREA)
- Analytical Chemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Saccharide Compounds (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
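Provided are modified nucleosides or nucleotides whose sugar ring contains an N or S atom at the 3' position and which can be used for NGS sequencing. Also provided are kits comprising the modified nucleosides or nucleotides, and sequencing methods based on the modified nucleosides or nucleotides.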
Description
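The present invention relates to the field of nucleic acid sequencing. Specifically, it relates to modified nucleosides or nucleotides and, more specifically, to non-natural nucleotide analogs containing an N or S atom at the 3' position for use in NGS sequencing.
The advent of NGS sequencing overcame drawbacks of Sanger sequencing such as high cost and long sequencing time, and greatly promoted the application of gene sequencing technology. NGS sequencing is now widely applied in prenatal screening, tumor diagnosis, tumor therapy, and animal and plant breeding, driving progress in science and medicine.
Nucleoside triphosphate (dNTP) analogs bearing a reversible blocking group are a key raw material for NGS sequencing. Introducing a reversible blocking group allows the 3'-OH group of the dNTP to be retained, which overcomes the shortcomings of Sanger sequencing while ensuring accurate base identification. dNTP analogs bearing reversible blocking groups can fairly be called the most critical technology in NGS sequencing.
Many dNTP compounds bearing reversible blocking groups have been reported. Reversible blocking of dNTPs is achieved mainly by two approaches. The first introduces the reversible blocking group directly at the 3'-OH of the dNTP; the advantage of such modified dNTPs is that blocking the 3'-OH ensures blocking efficiency during sequencing. The second leaves the 3'-OH unblocked and instead blocks the polymerase through a modification on the base; its advantage is a wider choice of blocking-group modifications that is not limited by the polymerase.
Both approaches use non-natural nucleoside triphosphates as probes for NGS sequencing. In existing reports, all modifications of natural nucleotides are concentrated on the base: the main idea is to introduce a linker on the base so that a fluorescent label can be attached or, in purine bases, to replace the N at position 7 with C so that the linker can be introduced more easily, as shown in Figure 1.
In NGS sequencing, dNTPs serve mainly as monomers: under polymerase catalysis, the 5'-triphosphate reacts to form a phosphate ester, and DNA is synthesized. From a biochemical point of view, this is an enzyme-catalyzed esterification. With the participation of the polymerase and ions, and provided that the bases pair, the 3'-OH of the nucleic acid reacts with the 5'-triphosphate of the dNTP monomer to form a phosphoester bond, releasing pyrophosphate and giving a chain-extended product, as shown in Figure 2.
Summary of the Invention
The present invention aims to develop a class of non-natural nucleotide analogs containing an N or S atom at the 3' position for use in NGS sequencing.
In one aspect, the general structure of these dNTP analogs is shown in Figure 3: a reversible blocking group protects the 3'-N, and the core structures include 2'-deoxy-3'-nitrogen-substituted uridine triphosphate, 2'-deoxy-3'-nitrogen-substituted cytidine triphosphate, 7-deaza-2'-deoxy-3'-nitrogen-substituted adenosine triphosphate, and 7-deaza-2'-deoxy-3'-nitrogen-substituted guanosine triphosphate.
In another aspect, the general structure of these dNTP analogs is shown in Figure 4: a reversible blocking group protects the 3'-S, and the core structures include 2'-deoxy-3'-thiouridine triphosphate, 2'-deoxy-3'-thiocytidine triphosphate, 7-deaza-2'-deoxy-3'-thioadenosine triphosphate, and 7-deaza-2'-deoxy-3'-thioguanosine triphosphate.
To this end, in a first aspect, the present invention provides the use of the following compound or a salt thereof for determining the sequence of a target single-stranded polynucleotide,
In a second aspect, the present invention provides a compound of formula (A) or (B), or a salt thereof,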
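wherein:
R is selected from -N3, -NRaRb, and -SRc;
Ra and Rb are each independently selected from H, N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, specifically -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), provided that Ra and Rb are not both H;
Rc is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, specifically -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl);
n is selected from 1, 2, 3, and 4;
each Z is independently selected from O, S, and BH;
Base1 and Base2 are each independently selected from a base, a deaza base, or a tautomer thereof;
when R is -N3, -N3 is a reversible blocking group;
when R is -NRaRb, Ra and Rb are reversible blocking groups;
when R is -SRc, Rc is a reversible blocking group;
R0 is a reversible blocking group.
In some embodiments, one of Ra and Rb is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, specifically -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), and the other is H.
In some embodiments, one of Ra and Rb is selected from N3-C1-C6 alkyl, and the other is H.
In some embodiments, one of Ra and Rb is -CH2-N3, and the other is H.
In some embodiments, Rc is selected from N3-C1-C6 alkyl.
In some embodiments, Rc is -CH2-N3.
In some embodiments, n is 1.
In some embodiments, Base1 and Base2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof.
In some embodiments, Z is O.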
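In a third aspect, the present invention provides a compound of formula (A-1) or (B-1), or a salt thereof,
wherein:
n is selected from 1, 2, 3, and 4;
each Z is independently selected from O, S, and BH;
Base1 and Base2 are each independently selected from a base, a deaza base, or a tautomer thereof;
the -N3 at the 3' position is a reversible blocking group;
R0 is a reversible blocking group.
In some embodiments, n is 1.
In some embodiments, Z is O.
In some embodiments, Base1 and Base2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof.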
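In a fourth aspect, the present invention provides a compound of formula (A-2) or (B-2), or a salt thereof,
wherein:
Ra and Rb are each independently selected from H, N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, specifically -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), provided that Ra and Rb are not both H;
n is selected from 1, 2, 3, and 4;
each Z is independently selected from O, S, and BH;
Base1 and Base2 are each independently selected from a base, a deaza base, or a tautomer thereof;
Ra and Rb are reversible blocking groups;
R0 is a reversible blocking group.
In some embodiments, one of Ra and Rb is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, specifically -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), and the other is H.
In some embodiments, one of Ra and Rb is selected from N3-C1-C6 alkyl, and the other is H.
In some embodiments, one of Ra and Rb is -CH2-N3, and the other is H.
In some embodiments, n is 1.
In some embodiments, Z is O.
In some embodiments, Base1 and Base2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof.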
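In a fifth aspect, the present invention provides a compound of formula (A-3) or (B-3), or a salt thereof,
wherein:
Rc is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, specifically -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl);
n is selected from 1, 2, 3, and 4;
each Z is independently selected from O, S, and BH;
Base1 and Base2 are each independently selected from a base, a deaza base, or a tautomer thereof;
Rc is a reversible blocking group;
R0 is a reversible blocking group.
In some embodiments, Rc is selected from N3-C1-C6 alkyl.
In some embodiments, Rc is -CH2-N3.
In some embodiments, n is 1.
In some embodiments, Z is O.
In some embodiments, Base1 and Base2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof.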
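In a sixth aspect, the present invention provides the following compounds or salts thereof:
In some embodiments, the aforementioned compound or salt thereof carries an additional detectable label.
In some embodiments, the additional detectable label carried by the compound or salt thereof is introduced via an affinity reagent (such as an antibody, aptamer, Affimer, or Knottin); the affinity reagent carries the detectable label and can specifically recognize and bind to an epitope of the compound or salt thereof.
In some embodiments, the additional detectable label is optionally attached to the compound or salt thereof via a linking group.
In some embodiments, the additional detectable label is optionally attached via a linking group to Base1 or R0 of the compound or salt thereof.
In some embodiments, the additional detectable label is optionally attached via a linking group to the terminal amino group in R0 of the compound or salt thereof.
In some embodiments, the linking group is a cleavable linking group or a non-cleavable linking group.
In some embodiments, the cleavable linking group is selected from electrophilically cleaved linking groups, nucleophilically cleaved linking groups, photocleavable linking groups, linking groups cleaved under reductive conditions, linking groups cleaved under oxidative conditions, safety-catch linking groups, linking groups cleaved by an elimination mechanism, or any combination thereof.
In some embodiments, compounds of formula A that differ in Base1 carry different additional detectable labels.
In some embodiments, compounds of formula B that differ in Base2 carry different additional detectable labels.
In some embodiments, the detectable label is a fluorescent label.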
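In a seventh aspect, the present invention provides a method of terminating nucleic acid synthesis, comprising incorporating the aforementioned compound or salt thereof into the nucleic acid molecule to be terminated.
In some embodiments, incorporation of the compound or salt thereof is achieved by a terminal transferase, a terminal polymerase, or a reverse transcriptase.
In some embodiments, the method comprises using a polymerase to incorporate the compound or salt thereof into the nucleic acid molecule to be terminated.
In some embodiments, the method comprises carrying out a nucleotide polymerization reaction with a polymerase under conditions that allow the polymerase to carry out nucleotide polymerization, thereby incorporating the compound or salt thereof at the 3' end of the nucleic acid molecule to be terminated.
In an eighth aspect, the present invention provides a method of preparing a growing polynucleotide complementary to a target single-stranded polynucleotide in a sequencing reaction, comprising incorporating the aforementioned compound or salt thereof into the growing complementary polynucleotide, wherein incorporation of the compound or salt thereof prevents any subsequent nucleotide from being introduced into the growing complementary polynucleotide.
In some embodiments, incorporation of the compound or salt thereof is achieved by a terminal transferase, a terminal polymerase, or a reverse transcriptase.
In some embodiments, the method comprises using a polymerase to incorporate the compound or salt thereof into the growing complementary polynucleotide.
In some embodiments, the method comprises carrying out a nucleotide polymerization reaction with a polymerase under conditions that allow the polymerase to carry out nucleotide polymerization, thereby incorporating the compound or salt thereof at the 3' end of the growing complementary polynucleotide.
In a ninth aspect, the present invention provides a nucleic acid intermediate formed in determining the sequence of a target single-stranded polynucleotide, wherein the nucleic acid intermediate is formed by the following step:
incorporating into a growing nucleic acid strand one nucleotide complementary to the target single-stranded polynucleotide to form the nucleic acid intermediate, wherein the one complementary nucleotide incorporated is the aforementioned compound or salt thereof.
In a tenth aspect, the present invention provides a nucleic acid intermediate formed in determining the sequence of a target single-stranded polynucleotide, wherein the nucleic acid intermediate is formed by the following step:
incorporating into a growing nucleic acid strand one nucleotide complementary to the target single-stranded polynucleotide to form the nucleic acid intermediate, wherein the one complementary nucleotide incorporated is the aforementioned compound or salt thereof, and the growing nucleic acid strand has previously incorporated at least one nucleotide complementary to the target single-stranded polynucleotide, the previously incorporated nucleotide being the aforementioned compound or salt thereof from which the reversible blocking group and the optional detectable label have been removed.
In an eleventh aspect, the present invention provides a method of determining the sequence of a target single-stranded polynucleotide, comprising:
1) monitoring the incorporation, into a growing nucleic acid strand, of nucleotides complementary to the target single-stranded polynucleotide, wherein at least one complementary nucleotide incorporated is the aforementioned compound or salt thereof, and
2) determining the type of the incorporated nucleotide.
In some embodiments, the reversible blocking group and the optional detectable label are removed before the next complementary nucleotide is introduced.
In some embodiments, the reversible blocking group and the detectable label are removed simultaneously.
In some embodiments, the reversible blocking group and the detectable label are removed sequentially; for example, the reversible blocking group is removed after the detectable label, or the detectable label is removed after the reversible blocking group.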
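In some embodiments, the method of determining the sequence of a target single-stranded polynucleotide comprises the following steps:
(a) providing a plurality of different nucleotides, wherein at least one nucleotide is the aforementioned compound or salt thereof, and optionally the remaining nucleotides are the aforementioned compound or salt thereof;
(b) incorporating the plurality of different nucleotides into the complementary sequence of the target single-stranded polynucleotide, wherein the different nucleotides can be distinguished from one another upon detection;
(c) detecting the nucleotide of (b), thereby determining the type of the incorporated nucleotide;
(d) removing the reversible blocking group, and optionally the detectable label it carries, from the nucleotide of (b); and
(e) optionally repeating steps (a)-(d) one or more times;
thereby determining the sequence of the target single-stranded polynucleotide.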
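Steps (a)-(e) can be pictured as a loop in which each cycle adds exactly one blocked nucleotide, reads its label, and then unblocks the 3' end. The following is a minimal Python sketch of that loop under simplifying assumptions: the label names, the `sequence_by_synthesis` function, and the error-free incorporation are all hypothetical, and strand polarity is ignored.

```python
# Toy model of a sequencing-by-synthesis run with reversible terminators.
# Every name below is an illustrative assumption, not taken from the source.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical label assignment: each base carries a distinguishable label.
LABEL_OF_BASE = {"A": "label_1", "T": "label_2", "C": "label_3", "G": "label_4"}
BASE_OF_LABEL = {label: base for base, label in LABEL_OF_BASE.items()}


def sequence_by_synthesis(template, cycles):
    """Return (growing_strand, inferred_template) after at most `cycles` cycles."""
    growing = []                 # bases incorporated into the growing strand
    inferred = []                # template bases deduced from detected labels
    three_prime_blocked = False  # state of the 3' end of the growing strand
    for template_base in template[:cycles]:
        if three_prime_blocked:
            raise RuntimeError("3' end still blocked; cleavage step was skipped")
        # (a)+(b): all four blocked nucleotides are offered; only the
        # complementary one is incorporated, and it blocks further extension.
        incorporated = COMPLEMENT[template_base]
        three_prime_blocked = True
        # (c): detect the label and identify the incorporated nucleotide.
        called = BASE_OF_LABEL[LABEL_OF_BASE[incorporated]]
        growing.append(called)
        inferred.append(COMPLEMENT[called])
        # (d): remove the blocking group (and label) before the next cycle.
        three_prime_blocked = False
    return "".join(growing), "".join(inferred)


if __name__ == "__main__":
    strand, read = sequence_by_synthesis("ATGCCGTA", cycles=8)
    print(strand)  # TACGGCAT
    print(read)    # ATGCCGTA
```

The explicit `three_prime_blocked` flag is there only to make the one-base-per-cycle constraint visible; a real instrument enforces it chemically rather than in software.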
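In some embodiments, the method of determining the sequence of a target single-stranded polynucleotide comprises the following steps:
(1) providing a first nucleotide, a second nucleotide, a third nucleotide, and a fourth nucleotide, wherein at least one of the four nucleotides is the aforementioned compound or salt thereof, and optionally the remaining nucleotides are the aforementioned compound or salt thereof;
(2) contacting the four nucleotides with the target single-stranded polynucleotide; removing the nucleotides not incorporated into the growing nucleic acid strand; detecting the nucleotide incorporated into the growing nucleic acid strand; and removing the reversible blocking group, and optionally the detectable label it carries, from the nucleotide incorporated into the growing nucleic acid strand;
optionally, further comprising (3): repeating (1)-(2) one or more times.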
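In some embodiments, the method of determining the sequence of a target single-stranded polynucleotide comprises the following steps:
(a) providing a mixture comprising a duplex, nucleotides comprising at least one of the aforementioned compounds or salts thereof, a polymerase, and a cleavage reagent; the duplex comprises a growing nucleic acid strand and a nucleic acid strand to be sequenced;
(b) carrying out a reaction comprising the following steps (i), (ii), and (iii), optionally repeated one or more times:
Step (i): using the polymerase to incorporate the compound or salt thereof into the growing nucleic acid strand, forming a nucleic acid intermediate comprising a reversible blocking group and an optional detectable label;
Step (ii): detecting the nucleic acid intermediate;
Step (iii): using the cleavage reagent to remove the reversible blocking group and the optional detectable label contained in the nucleic acid intermediate.
In some embodiments, removal of the reversible blocking group and removal of the detectable label are carried out simultaneously, or they are carried out stepwise (for example, the reversible blocking group is removed first, or the detectable label is removed first).
In some embodiments, the same cleavage reagent is used to remove the reversible blocking group and the detectable label.
In some embodiments, different cleavage reagents are used to remove the reversible blocking group and the detectable label.
In some embodiments, the duplex is attached to a support.
In some embodiments, the growing nucleic acid strand is a primer.
In some embodiments, the primer anneals to the nucleic acid strand to be sequenced to form the duplex.
In some embodiments, the duplex, the compound or salt thereof, and the polymerase together form a reaction system containing a solution phase and a solid phase.
In some embodiments, the polymerase is used, under conditions that allow the polymerase to carry out nucleotide polymerization, to incorporate the compound or salt thereof into the growing nucleic acid strand, forming a nucleic acid intermediate comprising a reversible blocking group and an optional detectable label.
In some embodiments, the polymerase is selected from KOD polymerase or mutants thereof (e.g., KOD POL151, KOD POL157, KOD POL171, KOD POL174, KOD POL376, KOD POL391).
In some embodiments, before any step of detecting the nucleic acid intermediate, the solution phase of the reaction system of the preceding step is removed, and the duplex attached to the support is retained.
In some embodiments, the cleavage reagent is contacted with the duplex or the growing nucleic acid strand in a reaction system containing a solution phase and a solid phase.
In some embodiments, the cleavage reagent can remove the reversible blocking group, and optionally the detectable label it carries, from the compound incorporated into the growing nucleic acid strand without affecting the phosphodiester bonds of the duplex backbone.
In some embodiments, after any step of removing the reversible blocking group and the optional detectable label contained in the nucleic acid intermediate, the solution phase of the reaction system of that step is removed.
In some embodiments, a washing operation is carried out after any step that includes a removal operation.
In some embodiments, the method further comprises, after step (ii): determining, from the signal detected in step (ii), the type of compound incorporated into the growing nucleic acid strand in step (i), and determining, based on the principle of complementary base pairing, the type of nucleotide at the corresponding position of the nucleic acid strand to be sequenced.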
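The last embodiment, going from a detected signal to the incorporated compound and then to the template nucleotide, can be sketched as a simple base call. This Python example is only an illustration: the per-base signal dictionary, the `call_template_base` name, and the `min_ratio` threshold are assumptions and are not described in the source.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def call_template_base(signals, min_ratio=2.0):
    """Call the incorporated base from per-label signal intensities.

    `signals` maps the base associated with each label to its measured
    intensity. Returns (incorporated_base, template_base), or None when the
    strongest signal is not at least `min_ratio` times the runner-up
    (a hypothetical ambiguity rule).
    """
    ranked = sorted(signals, key=signals.get, reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if signals[best] < min_ratio * signals[runner_up]:
        return None
    # Complementary base pairing gives the nucleotide on the template strand.
    return best, COMPLEMENT[best]


if __name__ == "__main__":
    print(call_template_base({"A": 950.0, "C": 40.0, "G": 35.0, "T": 60.0}))
```

Returning `None` for ambiguous cycles keeps the decision about how to treat low-confidence positions outside the calling function.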
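In a twelfth aspect, the present invention provides a kit comprising at least one of the aforementioned compounds or salts thereof.
In some embodiments, the kit comprises first, second, third, and fourth compounds, each of which is independently the aforementioned compound or salt thereof.
In some embodiments, in the first compound, Base1 is selected from adenine, 7-deazaadenine, or a tautomer thereof (e.g., structure not shown); in the second compound, Base1 is selected from thymine, uracil, or a tautomer thereof (e.g., structure not shown); in the third compound, Base1 is selected from cytosine or a tautomer thereof (e.g., structure not shown); in the fourth compound, Base1 is selected from guanine, 7-deazaguanine, or a tautomer thereof (e.g., structure not shown).
In some embodiments, in the first compound, Base2 is selected from adenine, 7-deazaadenine, or a tautomer thereof (e.g., structure not shown); in the second compound, Base2 is selected from thymine, uracil, or a tautomer thereof (e.g., structure not shown); in the third compound, Base2 is selected from cytosine or a tautomer thereof (e.g., structure not shown); in the fourth compound, Base2 is selected from guanine, 7-deazaguanine, or a tautomer thereof (e.g., structure not shown).
In some embodiments, the Base1 or Base2 contained in the first, second, third, and fourth compounds differ from one another.
In some embodiments, the additional detectable labels carried by the first, second, third, and fourth compounds differ from one another.
In some embodiments, the kit further comprises: reagents for pretreating nucleic acid molecules; a support for attaching the nucleic acid molecules to be sequenced; reagents for attaching the nucleic acid molecules to be sequenced to the support (e.g., covalently or non-covalently); primers for initiating the nucleotide polymerization reaction; a polymerase for carrying out the nucleotide polymerization reaction; one or more buffer solutions; one or more wash solutions; or any combination thereof.
In a thirteenth aspect, the present invention provides the use of the aforementioned compound or salt thereof, or of the aforementioned kit, for determining the sequence of a target single-stranded polynucleotide.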
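Figure 1 shows base-modified non-natural nucleosides used for NGS sequencing;
Figure 2 shows the biochemical reaction of NGS sequencing;
Figure 3 shows 3'-N-substituted reversibly blocked nucleotide analogs according to embodiments of the invention;
Figure 4 shows 3'-S-substituted reversibly blocked nucleotide analogs according to embodiments of the invention.
Embodiments of the invention are described in detail below by way of specific examples, which are in no way to be construed as limiting the invention.
Unless otherwise specified, the above groups and substituents have their ordinary meanings in the field of medicinal chemistry.
Throughout this specification, substituents of the disclosed compounds are disclosed by group type or range. It is specifically intended that the invention include each and every individual subcombination of the members of these group types and ranges. For example, the term "C1-C6 alkyl" specifically refers to independently disclosed methyl, ethyl, C3 alkyl, C4 alkyl, C5 alkyl, and C6 alkyl.
In addition, unless otherwise explicitly indicated, the expressions "each ... is independently selected from" and "... are each independently selected from" are used interchangeably throughout and are to be understood broadly: they may mean that, in different groups, the specific options expressed by the same or different symbols do not affect one another, or that, within the same group, the specific options expressed by the same or different symbols do not affect one another.
The term "C1-C6 alkyl" refers to any straight-chain or branched saturated group containing 1-6 carbon atoms, for example methyl (Me), ethyl (Et), n-propyl, isopropyl (iPr), n-butyl, isobutyl, tert-butyl (t-Bu), sec-butyl, n-pentyl, tert-pentyl, n-hexyl, and the like.
The term "C2-C6 alkenyl" refers to any straight-chain or branched group containing 2-6 carbon atoms and at least one carbon-carbon double bond, for example vinyl, 1-propenyl, 2-propenyl, and the like.
From all of the above it will be apparent to those skilled in the art that any group whose name is a composite name, such as "N3-C1-C6 alkyl", is to be understood as conventionally constructed from the moieties from which it derives, for example a C1-C6 alkyl substituted with an azido group (-N3), wherein C1-C6 alkyl is as defined above.
As used herein, examples of "a salt of a compound of formula A, formula A-1, formula A-2, formula A-3, formula B, formula B-1, formula B-2, or formula B-3" are organic acid addition salts formed with organic acids that form an anion, including but not limited to formate, acetate, propionate, benzoate, maleate, fumarate, succinate, tartrate, citrate, ascorbate, α-ketoglutarate, α-glycerophosphate, alkylsulfonate, or arylsulfonate; preferably, the alkylsulfonate is methanesulfonate or ethanesulfonate, and the arylsulfonate is benzenesulfonate or p-toluenesulfonate. Suitable inorganic salts may also be formed, including but not limited to hydrochloride, hydrobromide, hydroiodide, nitrate, bicarbonate and carbonate, sulfate, or phosphate.
In the methods of the invention, any entity composed of the two strands, namely the growing nucleic acid strand and the nucleic acid strand to be sequenced, is called a "duplex", irrespective of the length of either strand; the strand to be sequenced may be longer than the growing strand.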
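In the methods of the invention, the nucleic acid molecule to be sequenced may be any nucleic acid molecule of interest. In certain preferred embodiments, it comprises deoxyribonucleotides, ribonucleotides, modified deoxyribonucleotides, modified ribonucleotides, or any combination thereof. The nucleic acid molecule to be sequenced is not limited by its type. In certain preferred embodiments, it is DNA or RNA. In certain preferred embodiments, it may be genomic DNA, mitochondrial DNA, chloroplast DNA, mRNA, cDNA, miRNA, or siRNA. In certain preferred embodiments, it is linear or circular. In certain preferred embodiments, it is double-stranded or single-stranded; for example, it may be single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), single-stranded RNA (ssRNA), double-stranded RNA (dsRNA), or a DNA-RNA hybrid. In certain preferred embodiments, it is single-stranded DNA. In certain preferred embodiments, it is double-stranded DNA.
In the methods of the invention, the nucleic acid molecule to be sequenced is not limited by its source. In certain preferred embodiments, it may be obtained from any source, for example any cell, tissue, or organism (e.g., viruses, bacteria, fungi, plants, and animals). In certain preferred embodiments, it originates from a mammal (e.g., a human, non-human primate, rodent, or canine), a plant, a bird, a reptile, a fish, a fungus, a bacterium, or a virus.
Methods for extracting or obtaining nucleic acid molecules from cells, tissues, or organisms are well known to those skilled in the art. Suitable methods include, but are not limited to, ethanol precipitation and chloroform extraction. For detailed descriptions of such methods see, for example, J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, 1989, and F. M. Ausubel et al., Short Protocols in Molecular Biology, 3rd ed., John Wiley & Sons, Inc., 1995. In addition, various commercial kits can be used to extract nucleic acid molecules from various sources (e.g., cells, tissues, or organisms).
In the methods of the invention, the nucleic acid molecule to be sequenced is not limited by its length. In certain preferred embodiments, its length may be at least 10 bp, at least 20 bp, at least 30 bp, at least 40 bp, at least 50 bp, at least 100 bp, at least 200 bp, at least 300 bp, at least 400 bp, at least 500 bp, at least 1000 bp, or at least 2000 bp. In certain preferred embodiments, its length may be 10-20 bp, 20-30 bp, 30-40 bp, 40-50 bp, 50-100 bp, 100-200 bp, 200-300 bp, 300-400 bp, 400-500 bp, 500-1000 bp, 1000-2000 bp, or more than 2000 bp. In certain preferred embodiments, it may have a length of 10-1000 bp to facilitate high-throughput sequencing.
In the polynucleotide preparation method or sequencing method of the invention, a suitable polymerase may be used to carry out the nucleotide polymerization reaction. In some exemplary embodiments, the polymerase can synthesize a new DNA strand from a DNA template (e.g., a DNA polymerase). In some exemplary embodiments, the polymerase can synthesize a new DNA strand from an RNA template (e.g., a reverse transcriptase). In some exemplary embodiments, the polymerase can synthesize a new RNA strand from a DNA or RNA template (e.g., an RNA polymerase). Accordingly, in certain preferred embodiments, the polymerase is selected from DNA polymerases, RNA polymerases, and reverse transcriptases. A suitable polymerase can be selected according to actual needs. In certain preferred embodiments, the polymerization reaction is a polymerase chain reaction (PCR). In certain preferred embodiments, the polymerization reaction is a reverse transcription reaction.
In the methods of the invention, KOD polymerase or a mutant thereof may be used to carry out the nucleotide polymerization reaction. KOD polymerase or its mutants (e.g., KOD POL151, KOD POL157, KOD POL171, KOD POL174, KOD POL376, KOD POL391) have acceptable polymerization efficiency for the modified nucleosides or nucleotides of the invention. KOD POL391 and KOD POL171 have acceptable polymerization efficiency for the modified nucleotides of the invention. In certain embodiments, the polymerization efficiency of KOD POL391 or KOD POL171 for the modified nucleotides of the invention is 70% or higher, for example 70%-80%, 80%-90%, or 90%-100%.
In the polynucleotide preparation method or sequencing method of the invention, the nucleotide polymerization reaction is carried out under suitable conditions. Suitable polymerization conditions include the composition of the solution phase and the concentration of each component, the pH of the solution phase, the polymerization temperature, and so on. Carrying out the polymerization under suitable conditions helps achieve acceptable, or even high, polymerization efficiency.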
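In some embodiments of the invention, the nitrogen or sulfur atom at the 3' position of the deoxyribose in the compound of formula A or formula B is protected, so the compound can terminate polymerization by a polymerase (e.g., a DNA polymerase). For example, when a compound of formula A or formula B is introduced at the 3' end of a growing nucleic acid strand, no free amino (-NH2) or thiol (-SH) group is present at the 3' position of its deoxyribose, so the polymerase cannot proceed to the next round of polymerization and the reaction is terminated. In this case, one and only one base is incorporated into the growing nucleic acid strand in each round of polymerization.
Furthermore, the protecting group on the nitrogen or sulfur atom at the 3' position of the deoxyribose of the compound of formula A or formula B can be removed, converting it to a free amino (-NH2) or thiol (-SH) group. The polymerase and a compound of formula A or formula B can then be used to carry out the next round of polymerization on the growing nucleic acid strand, introducing one more base.
Thus, the nitrogen or sulfur atom at the 3' position of the deoxyribose of the compound of formula A or formula B can be reversibly blocked. Specifically, in the compound of formula A or formula B, when R is -N3, -N3 is the reversible blocking group; when R is -NRaRb, Ra and Rb are reversible blocking groups; when R is -SRc, Rc is the reversible blocking group. When a compound of formula A or formula B is incorporated at the 3' end of the growing nucleic acid strand, it stops the polymerase from continuing polymerization and terminates further extension of the strand; once the blocking group contained in the compound is removed, a free amino (-NH2) or thiol (-SH) group is present at the 3' position, and the polymerase can resume polymerization and continue extending the strand. It should be noted in particular that when R is -N3, -N3 is the reversible blocking group: under certain conditions, after this blocking group is removed, a free amino group (-NH2) is present at the 3' position, and the polymerase can resume polymerization and continue extending the strand.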
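In addition, in some embodiments, the base may also be protected at the same time (e.g., by R0), which can likewise terminate polymerization by a polymerase (e.g., a DNA polymerase). For example, when a compound of formula B is introduced at the 3' end of the growing nucleic acid strand, not only is the nitrogen or sulfur atom at the 3' position protected, but the steric hindrance or hydrogen-bonding interactions of R0 on the base also prevent the polymerase from proceeding to the next round of polymerization, so the reaction is terminated. In this case, one and only one base is incorporated into the growing nucleic acid strand in each round of polymerization.
The protecting group (R0) on the base of the compound of formula B can likewise be removed. The polymerase and a compound of formula B can then be used to carry out the next round of polymerization on the growing nucleic acid strand, introducing one more base.
Thus, the base of the compound of formula B is reversibly blocked: when a compound of formula B is incorporated at the 3' end of the growing nucleic acid strand, it stops the polymerase from continuing polymerization and terminates further extension of the strand; once the blocking group contained in the compound is removed, the polymerase can resume polymerization and continue extending the strand.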
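The double blocking described for formula B (one group on the 3' N or S atom, one group R0 on the base) can be modeled as two flags that must both be cleared before the strand can be extended again. The Python sketch below is a toy state model under that reading; the class and function names are hypothetical, and it does not represent any particular chemistry or reagent.

```python
from dataclasses import dataclass


@dataclass
class IncorporatedNucleotide:
    """Toy model of a formula-B-type nucleotide at the 3' end of a strand."""
    base: str
    three_prime_blocked: bool = True   # blocking group on the 3' N or S atom
    base_blocked: bool = True          # blocking group R0 on the base

    def cleave(self, remove_three_prime=True, remove_base_block=True):
        # The two groups may be removed together or one at a time.
        if remove_three_prime:
            self.three_prime_blocked = False
        if remove_base_block:
            self.base_blocked = False

    def can_extend(self):
        # Extension is possible only when no blocking group remains.
        return not (self.three_prime_blocked or self.base_blocked)


def extend(strand, base):
    """Append one blocked nucleotide if the current 3' end permits it."""
    if strand and not strand[-1].can_extend():
        return False                   # polymerization is terminated
    strand.append(IncorporatedNucleotide(base))
    return True


if __name__ == "__main__":
    strand = []
    print(extend(strand, "A"))   # True: first incorporation
    print(extend(strand, "C"))   # False: 3' end is still blocked
    strand[-1].cleave()
    print(extend(strand, "C"))   # True: both blocking groups removed
```

Keeping the two flags separate mirrors the text's statement that the 3' protecting group and R0 are distinct groups, each of which can be removed.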
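Certain embodiments described herein involve the use of conventional detectable labels. Detection can be carried out by any suitable method, including fluorescence spectroscopy or other optical means. The preferred label is a fluorescent label, i.e., a fluorophore, which emits radiation of a defined wavelength after absorbing energy. Many suitable fluorescent labels are known. For example, Welch et al. (Chem. Eur. J. 5(3):951-960, 1999) disclose dansyl-functionalized fluorescent moieties that can be used in the present invention. Zhu et al. (Cytometry 28:206-211, 1997) describe the use of the fluorescent labels Cy3 and Cy5, which can also be used in the present invention. Prober et al. (Science 238:336-341, 1987), Connell et al. (BioTechniques 5(4):342-384, 1987), Ansorge et al. (Nucl. Acids Res. 15(11):4593-4602, 1987), and Smith et al. (Nature 321:674, 1986) also disclose labels suitable for use. Other commercially available fluorescent labels include, but are not limited to, fluorescein, rhodamine (including TMR, Texas Red, and Rox), alexa, BODIPY, acridine, coumarin, pyrene, benzanthracene, and cyanine.
Multiple labels can also be used in this application, for example bi-fluorophore FRET cassettes (Tet. Let. 46:8867-8871, 2000), as can multi-fluorophore dendrimeric systems (J. Am. Chem. Soc. 123:8101-8108, 2001). Although fluorescent labels are preferred, other forms of detectable label will be apparent as useful to those of ordinary skill in the art. For example, microparticles, including quantum dots (Empodocles et al., Nature 399:126-130, 1999), gold nanoparticles (Reichert et al., Anal. Chem. 72:6025-6029, 2000), and microbeads (Lacoste et al., Proc. Natl. Acad. Sci. USA 97(17):9461-9466, 2000), can all be used.
Multi-component labels can also be used in this application. A multi-component label is one that depends on interaction with a further compound for detection. The most commonly used multi-component label in biology is the biotin-streptavidin system. Biotin is used as the label attached to the nucleotide or modified nucleotide. Streptavidin is then added separately to allow detection. Other multi-component systems can be used. For example, dinitrophenol has commercially available fluorescent antibodies that can be used for detection.
In certain embodiments described herein, the modified nucleotide or nucleoside molecule can be made to carry the detectable label described above through the introduction of an affinity reagent (such as an antibody, aptamer, Affimer, or Knottin) that can specifically recognize and bind to an epitope of the modified nucleotide or nucleoside molecule; the underlying principle is described in WO2018129214A1, the relevant content of which is incorporated into this application in its entirety.
In other embodiments described herein, the modified nucleotide or nucleoside molecule can be linked to the detectable label described above. In certain such embodiments, the linking group used is cleavable. Using a cleavable linking group ensures that the label can be removed after detection when required, which avoids any interfering signal with any labeled nucleotide or nucleoside incorporated subsequently.
In other embodiments, the linking group used is non-cleavable: in each instance where a labeled nucleotide of the invention is incorporated, no subsequent nucleotide incorporation is required, so the label need not be removed from the nucleotide.
Cleavable linking groups are well known in the art, and conventional chemistry can be applied to attach the linking group to the nucleotide or modified nucleotide and to the label. The linking group can be cleaved by any suitable method, including exposure to acids, bases, nucleophiles, electrophiles, radicals, metals, reducing or oxidizing agents, light, temperature, enzymes, and the like. The linking groups discussed herein can also be cleaved with the same catalyst used to cleave the protecting group on the base. Suitable linking groups can be adapted from standard chemical protecting groups, as disclosed in Greene & Wuts, Protective Groups in Organic Synthesis, John Wiley & Sons. Suitable cleavable linking groups used in solid-phase synthesis are also disclosed in Guillier et al. (Chem. Rev. 100:2092-2157, 2000).
Use of the term "cleavable linking group" does not imply that the entire linking group must be removed, for example from the nucleotide or modified nucleotide. When the detectable label is attached to a nucleotide or modified nucleotide, the nucleoside cleavage site can be located at a position on the linking group that ensures that part of the linking group remains attached to the nucleotide or modified nucleotide after cleavage.
When the detectable label is attached to a nucleotide or modified nucleotide, the linking group can be attached at any position on the nucleotide or modified nucleotide, provided that Watson-Crick base pairing can still take place.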
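A. Electrophilically cleaved linking groups
Electrophilically cleaved linking groups are typically cleaved by protons and include acid-sensitive cleavage. Suitable linking groups include modified benzyl systems such as trityl, p-alkoxybenzyl esters, and p-alkoxybenzyl amides. Other suitable linking groups include tert-butoxycarbonyl (Boc) groups and acetal systems.
For the preparation of suitable linker molecules, the use of thiophilic metals such as nickel, silver, or mercury in the cleavage of thioacetals or other sulfur-containing protecting groups can also be considered.
B. Nucleophilically cleaved linking groups
Nucleophilic cleavage is also a well-recognized method in the preparation of linker molecules. Groups that are labile in water (i.e., readily cleaved at basic pH), such as esters, can be used, as can groups that are labile to non-aqueous nucleophiles. Fluoride ions can be used to cleave the silicon-oxygen bond in groups such as triisopropylsilyl (TIPS) or tert-butyldimethylsilyl (TBDMS).
C. Photocleavable linking groups
Photocleavable linking groups are widely used in carbohydrate chemistry. Preferably, the light required to activate cleavage does not affect other components of the modified nucleotide. For example, if a fluorophore is used as the label, the fluorophore preferably absorbs light of a wavelength different from that required to cleave the linker molecule. Suitable linking groups include those based on o-nitrobenzyl compounds and nitroveratryl compounds. Linking groups based on benzoin chemistry can also be used (Lee et al., J. Org. Chem. 64:3454-3460, 1999).
D. Cleavage under reductive conditions
Many linking groups sensitive to reductive cleavage are known. Catalytic hydrogenation with palladium-based catalysts has been used to cleave benzyl and benzyloxycarbonyl groups. Disulfide bond reduction is also known in the art.
E. Cleavage under oxidative conditions
Oxidation-based methods are well known in the art. They include oxidation of p-alkoxybenzyl groups and oxidation of sulfur and selenium linking groups. The use of aqueous iodine to cleave disulfides and other sulfur- or selenium-based linking groups is also within the scope of the invention.
F. Safety-catch linking groups
Safety-catch linkers are those that are cleaved in two steps. In a preferred system, the first step is the generation of a reactive nucleophilic center, followed by a second step involving intramolecular cyclization, which results in cleavage. For example, a levulinate ester linkage can be treated with hydrazine or by photochemical means to release an active amine, which then cyclizes to cleave an ester elsewhere in the molecule (Burgess et al., J. Org. Chem. 62:5165-5168, 1997).
G. Cleavage by an elimination mechanism
Elimination reactions can also be used. Base-catalyzed elimination of groups such as fluorenylmethoxycarbonyl and cyanoethyl, and palladium-catalyzed reductive elimination of allyl systems, can be used.
In certain embodiments, the linking group may comprise a spacer unit. The length of the linking group is not critical, provided the label is kept far enough from the nucleotide that it does not interfere with the interaction between the nucleotide and the enzyme.
In certain embodiments, the linking group may consist of functionality similar to that of the base protecting group. This makes label removal and deprotection more efficient, because only a single treatment is needed to remove both the label and the protecting group. Particularly preferred linking groups are azide-containing linking groups cleavable by phosphines.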
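Those skilled in the art understand the utility of dideoxynucleoside triphosphates in so-called Sanger sequencing and related protocols (Sanger-type), which rely on random chain termination at a particular type of nucleotide. One example of a Sanger-type sequencing protocol is the BASS method described by Metzker.
Sanger and Sanger-type methods are generally carried out by performing an experiment in which eight classes of nucleotides are provided: four classes contain a 3'-NH2 group or a 3'-SH group, and four classes lack the NH2 group or SH group and are labeled differently from one another. The nucleotides used that lack the 3'-NH2 group or 3'-SH group are dideoxynucleotides (ddNTPs). As is well known to those skilled in the art, when the ddNTPs are labeled differently, the sequence of the target oligonucleotide can be determined by determining the positions of the incorporated terminal nucleotides and combining this information.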
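The read-out principle in the preceding paragraph, combining the positions of the terminating nucleotides with their labels, can be sketched in a few lines of Python. This is an illustration only: the `(length, terminal_base)` input format and the function name `read_sanger_ladder` are assumptions, and real data would come from fragment separation and label detection.

```python
def read_sanger_ladder(fragments):
    """Reconstruct the synthesized strand from chain-terminated fragments.

    `fragments` is an iterable of (length, terminal_base) pairs: the length of
    each terminated extension product and the base identified from the label
    on its terminating nucleotide.
    """
    by_length = {}
    for length, base in fragments:
        # Fragments of the same length must agree on the terminal base.
        if by_length.setdefault(length, base) != base:
            raise ValueError(f"conflicting calls at position {length}")
    expected = range(1, len(by_length) + 1)
    if sorted(by_length) != list(expected):
        raise ValueError("ladder has gaps; sequence cannot be read through")
    # Ordering the fragments by length gives the sequence position by position.
    return "".join(by_length[i] for i in expected)


if __name__ == "__main__":
    ladder = [(3, "G"), (1, "A"), (2, "C"), (4, "T"), (2, "C")]
    print(read_sanger_ladder(ladder))  # ACGT
```

The function returns the sequence of the newly synthesized strand; the template sequence follows from it by base complementarity.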
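It will be appreciated that the nucleotides of this application have utility in the Sanger method and related protocols, since the same effect achieved with ddNTPs can be achieved with the nucleotide analogs described herein.
It will further be appreciated that the nucleotides of this application also have utility in second-generation sequencing (NGS sequencing) and third-generation sequencing (single-molecule sequencing), since the same effect achieved with dNTPs can be achieved with the nucleotide analogs described herein.
In addition, it will be appreciated that the incorporation of nucleotides protected at the base can be monitored by using radioactive 32P in the attached phosphate group. The 32P may be present either in the ddNTPs themselves or in the primers used for extension.
The invention is further explained below with reference to specific examples. Note that each of the following examples lists the preparation method for the nucleotide analog of only one of the four bases; those skilled in the art can follow this method to prepare the nucleotide analogs of the remaining three bases. In addition, unless otherwise specified, all starting materials are commercially available.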
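Example 1
After the 3-position hydroxyl of the compound is activated as the mesylate (Ms), the configuration-inverted product is formed under basic conditions; sodium azide is added to give the azido product, the Tr protecting group is then removed, and triphosphorylation gives the product.
Example 2
The compound is azidated to give the intermediate; the TBS protecting group is removed, and triphosphorylation then gives the product.
Example 3
Example 4
Example 5: On-machine evaluation of the above nucleotide analogs on the MGISEQ2000
1. Evaluation of blocking performance
Nucleotide substrates: fluorescently labeled standard hot dNTPs (four types) and standard cold dNTPs (four types), both from the MGISEQ-2000RS high-throughput sequencing reagent set (FCL SE50), Shenzhen MGI Technology Co., Ltd., Cat. No. 1000012551; cold dNTP nucleotide analogs of the present invention (four types: dTTP, dATP, dCTP, and dGTP; structures as follows, provided by Okainas; only one cold dNTP of the present invention is used in each test):
Following the MGISEQ2000 sequencer operating procedure, sequencing was performed using the above nucleotide substrates and the MGISEQ-2000RS high-throughput sequencing reagent set (FCL SE50).
(1) Prepare DNA nanoballs from an E. coli sequencing library.
(2) Load the DNA nanoballs onto an MGISEQ2000 sequencing chip.
(3) Load the chip onto the MGISEQ2000 sequencer and set up the sequencing run.
(4) First on-machine round: polymerize standard hot dNTP, image and record the signal value, then cleave the blocking group with thpp reagent at 65°C for 1 min.
(5) Second on-machine round: polymerize standard cold dNTP, then polymerize standard hot dNTP, image and record the signal value, then cleave the blocking group with thpp reagent at 65°C for 1 min.
(6) Third on-machine round: polymerize standard hot dNTP, image and record the signal value. Then cleave the blocking group with thpp reagent at 65°C for 1 min.
(7) Fourth on-machine round: polymerize the cold dNTP nucleotide analog of the present invention (only one cold dNTP is polymerized in each test), then polymerize standard hot dNTP, image and record the signal value. Then cleave the blocking group with thpp reagent at 65°C for 1 min.
(8) Fifth on-machine round: polymerize standard hot dNTP, image and record the signal value.
(9) Evaluate the polymerization efficiency and cleavage efficiency; the results are shown in Table 1.
where: EI (incorporation efficiency) is the ratio of the polymerization efficiency of the test nucleotide to that of the comparison nucleotide;
C1 is the signal value of the first on-machine round;
C2 is the signal value of the second on-machine round;
C3 is the signal value of the third on-machine round;
C4 is the signal value of the fourth on-machine round;
where:
Ec (cleavage efficiency) is the ratio of the cleavage efficiency of the test nucleotide to that of the comparison nucleotide;
EI is the ratio of the polymerization efficiency of the test nucleotide to that of the comparison nucleotide;
C3 is the signal value of the third on-machine round;
C5 is the signal value of the fifth on-machine round;
CGT is the signal of the C, G, and T bases in the third round.
Table 1. Polymerization efficiency and cleavage efficiency of the nucleotide analogs of the present invention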
Claims (18)
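- A compound of formula (A) or (B), or a salt thereof, wherein: R is selected from -N3, -NRaRb, and -SRc; Ra and Rb are each independently selected from H, N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, specifically -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), provided that Ra and Rb are not both H; preferably, one of Ra and Rb is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, specifically -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), and the other is H; more preferably, one of Ra and Rb is selected from N3-C1-C6 alkyl, and the other is H; most preferably, one of Ra and Rb is -CH2-N3, and the other is H; Rc is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, specifically -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl); preferably, Rc is selected from N3-C1-C6 alkyl; more preferably, Rc is -CH2-N3; n is selected from 1, 2, 3, and 4; preferably, n is 1; each Z is independently selected from O, S, and BH; preferably, Z is O; Base1 and Base2 are each independently selected from a base, a deaza base, or a tautomer thereof, for example each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof; when R is -N3, -N3 is a reversible blocking group; when R is -NRaRb, Ra and Rb are reversible blocking groups; when R is -SRc, Rc is a reversible blocking group; R0 is a reversible blocking group.
- The compound or salt thereof of claim 2, wherein the compound has the structure of formula (A-2) or (B-2), wherein: Ra and Rb are each independently selected from H, N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, specifically -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), provided that Ra and Rb are not both H; preferably, one of Ra and Rb is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, specifically -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), and the other is H; more preferably, one of Ra and Rb is selected from N3-C1-C6 alkyl, and the other is H; most preferably, one of Ra and Rb is -CH2-N3, and the other is H; n is selected from 1, 2, 3, and 4; preferably, n is 1; each Z is independently selected from O, S, and BH; preferably, Z is O; Base1 and Base2 are each independently selected from a base, a deaza base, or a tautomer thereof, for example each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof;
- The compound or salt thereof of claim 2, wherein the compound has the structure of formula (A-3) or (B-3), wherein: Rc is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, specifically -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl); preferably, Rc is selected from N3-C1-C6 alkyl; more preferably, Rc is -CH2-N3; n is selected from 1, 2, 3, and 4; preferably, n is 1; each Z is independently selected from O, S, and BH; preferably, Z is O; Base1 and Base2 are each independently selected from a base, a deaza base, or a tautomer thereof, for example each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof;
- The compound or salt thereof of any one of claims 2-6, wherein the compound or salt thereof carries an additional detectable label; preferably, the additional detectable label is introduced via an affinity reagent (such as an antibody, aptamer, Affimer, or Knottin), the affinity reagent carrying the detectable label and being able to specifically recognize and bind to an epitope of the compound or salt thereof; preferably, the additional detectable label is optionally attached to the compound or salt thereof via a linking group; preferably, the additional detectable label is optionally attached via a linking group to Base1 or R0 of the compound or salt thereof; preferably, the linking group is a cleavable linking group or a non-cleavable linking group; preferably, the cleavable linking group is selected from electrophilically cleaved linking groups, nucleophilically cleaved linking groups, photocleavable linking groups, linking groups cleaved under reductive conditions, linking groups cleaved under oxidative conditions, safety-catch linking groups, linking groups cleaved by an elimination mechanism, or any combination thereof; preferably, compounds of formula A that differ in Base1 carry different additional detectable labels; preferably, compounds of formula B that differ in Base2 carry different additional detectable labels; preferably, the detectable label is a fluorescent label; preferably, the detectable label is selected from the following:
- A method of terminating nucleic acid synthesis, comprising incorporating the compound or salt thereof of any one of claims 2-7 into the nucleic acid molecule to be terminated; preferably, incorporation of the compound or salt thereof is achieved by a terminal transferase, a terminal polymerase, or a reverse transcriptase; preferably, the method comprises using a polymerase to incorporate the compound or salt thereof into the nucleic acid molecule to be terminated; preferably, the method comprises carrying out a nucleotide polymerization reaction with a polymerase under conditions that allow the polymerase to carry out nucleotide polymerization, thereby incorporating the compound or salt thereof at the 3' end of the nucleic acid molecule to be terminated.
- A method of preparing a growing polynucleotide complementary to a target single-stranded polynucleotide in a sequencing reaction, comprising incorporating the compound or salt thereof of any one of claims 2-7 into the growing complementary polynucleotide, wherein incorporation of the compound or salt thereof prevents any subsequent nucleotide from being introduced into the growing complementary polynucleotide; preferably, incorporation of the compound or salt thereof is achieved by a terminal transferase, a terminal polymerase, or a reverse transcriptase; preferably, the method comprises using a polymerase to incorporate the compound or salt thereof into the growing complementary polynucleotide; preferably, the method comprises carrying out a nucleotide polymerization reaction with a polymerase under conditions that allow the polymerase to carry out nucleotide polymerization, thereby incorporating the compound or salt thereof at the 3' end of the growing complementary polynucleotide.
- A nucleic acid intermediate formed in determining the sequence of a target single-stranded polynucleotide, wherein the nucleic acid intermediate is formed by the following step: incorporating into a growing nucleic acid strand one nucleotide complementary to the target single-stranded polynucleotide to form the nucleic acid intermediate, wherein the one complementary nucleotide incorporated is the compound or salt thereof of any one of claims 2-7; or the nucleic acid intermediate is formed by the following step: incorporating into a growing nucleic acid strand one nucleotide complementary to the target single-stranded polynucleotide to form the nucleic acid intermediate, wherein the one complementary nucleotide incorporated is the compound or salt thereof of any one of claims 2-7, and the growing nucleic acid strand has previously incorporated at least one nucleotide complementary to the target single-stranded polynucleotide, the previously incorporated nucleotide being the compound or salt thereof of any one of claims 2-7 from which the reversible blocking group and the optional detectable label have been removed.
- A method of determining the sequence of a target single-stranded polynucleotide, comprising: 1) monitoring the incorporation, into a growing nucleic acid strand, of nucleotides complementary to the target single-stranded polynucleotide, wherein at least one complementary nucleotide incorporated is the compound or salt thereof of any one of claims 2-7, and 2) determining the type of the incorporated nucleotide; preferably, the reversible blocking group and the optional detectable label are removed before the next complementary nucleotide is introduced; preferably, the reversible blocking group and the detectable label are removed simultaneously; preferably, the reversible blocking group and the detectable label are removed sequentially; for example, the reversible blocking group is removed after the detectable label, or the detectable label is removed after the reversible blocking group.
- The method of claim 11, comprising the following steps: (a) providing a plurality of different nucleotides, wherein at least one nucleotide is the compound or salt thereof of claim 7, and optionally the remaining nucleotides are the compound or salt thereof of any one of claims 2-7; (b) incorporating the plurality of different nucleotides into the complementary sequence of the target single-stranded polynucleotide, wherein the different nucleotides can be distinguished from one another upon detection; (c) detecting the nucleotide of (b), thereby determining the type of the incorporated nucleotide; (d) removing the reversible blocking group, and optionally the detectable label it carries, from the nucleotide of (b); and (e) optionally repeating steps (a)-(d) one or more times; thereby determining the sequence of the target single-stranded polynucleotide.
- The method of claim 11, comprising the following steps: (1) providing a first nucleotide, a second nucleotide, a third nucleotide, and a fourth nucleotide, wherein at least one of the four nucleotides is the compound or salt thereof of claim 7, and optionally the remaining nucleotides are the compound or salt thereof of any one of claims 2-7; (2) contacting the four nucleotides with the target single-stranded polynucleotide; removing the nucleotides not incorporated into the growing nucleic acid strand; detecting the nucleotide incorporated into the growing nucleic acid strand; and removing the reversible blocking group, and optionally the detectable label it carries, from the nucleotide incorporated into the growing nucleic acid strand; optionally, further comprising (3): repeating (1)-(2) one or more times.
- The method of claim 11, comprising the following steps: (a) providing a mixture comprising a duplex, nucleotides comprising at least one compound or salt thereof of claim 7, a polymerase, and a cleavage reagent; the duplex comprises a growing nucleic acid strand and a nucleic acid strand to be sequenced; (b) carrying out a reaction comprising the following steps (i), (ii), and (iii), optionally repeated one or more times: step (i): using the polymerase to incorporate the compound or salt thereof into the growing nucleic acid strand, forming a nucleic acid intermediate comprising a reversible blocking group and an optional detectable label; step (ii): detecting the nucleic acid intermediate; step (iii): using the cleavage reagent to remove the reversible blocking group and the optional detectable label contained in the nucleic acid intermediate; preferably, removal of the reversible blocking group and removal of the detectable label are carried out simultaneously, or they are carried out stepwise (for example, the reversible blocking group is removed first, or the detectable label is removed first); preferably, the same cleavage reagent is used to remove the reversible blocking group and the detectable label; preferably, different cleavage reagents are used to remove the reversible blocking group and the detectable label.
- The method of claim 14, wherein the duplex is attached to a support; preferably, the growing nucleic acid strand is a primer; preferably, the primer anneals to the nucleic acid strand to be sequenced to form the duplex; preferably, the duplex, the compound or salt thereof, and the polymerase together form a reaction system containing a solution phase and a solid phase; preferably, the polymerase is used, under conditions that allow the polymerase to carry out nucleotide polymerization, to incorporate the compound or salt thereof into the growing nucleic acid strand, forming a nucleic acid intermediate comprising a reversible blocking group and an optional detectable label; preferably, the polymerase is selected from KOD polymerase or mutants thereof (e.g., KOD POL151, KOD POL157, KOD POL171, KOD POL174, KOD POL376, KOD POL391); preferably, before any step of detecting the nucleic acid intermediate, the solution phase of the reaction system of the preceding step is removed, and the duplex attached to the support is retained; preferably, the cleavage reagent is contacted with the duplex or the growing nucleic acid strand in a reaction system containing a solution phase and a solid phase; preferably, the cleavage reagent can remove the reversible blocking group, and optionally the detectable label it carries, from the compound incorporated into the growing nucleic acid strand without affecting the phosphodiester bonds of the duplex backbone; preferably, after any step of removing the reversible blocking group and the optional detectable label contained in the nucleic acid intermediate, the solution phase of the reaction system of that step is removed; preferably, a washing operation is carried out after any step that includes a removal operation; preferably, the method further comprises, after step (ii): determining, from the signal detected in step (ii), the type of compound incorporated into the growing nucleic acid strand in step (i), and determining, based on the principle of complementary base pairing, the type of nucleotide at the corresponding position of the nucleic acid strand to be sequenced.
- A kit comprising at least one compound or salt thereof of any one of claims 2-7; preferably, the kit comprises first, second, third, and fourth compounds, each of which is independently the compound or salt thereof of any one of claims 2-7; preferably, in the first compound, Base1 is selected from adenine, 7-deazaadenine, or a tautomer thereof (e.g., structure not shown); in the second compound, Base1 is selected from thymine, uracil, or a tautomer thereof (e.g., structure not shown); in the third compound, Base1 is selected from cytosine or a tautomer thereof (e.g., structure not shown); in the fourth compound, Base1 is selected from guanine, 7-deazaguanine, or a tautomer thereof (e.g., structure not shown); preferably, in the first compound, Base2 is selected from adenine, 7-deazaadenine, or a tautomer thereof (e.g., structure not shown); in the second compound, Base2 is selected from thymine, uracil, or a tautomer thereof (e.g., structure not shown); in the third compound, Base2 is selected from cytosine or a tautomer thereof (e.g., structure not shown); in the fourth compound, Base2 is selected from guanine, 7-deazaguanine, or a tautomer thereof (e.g., structure not shown); preferably, the Base1 or Base2 contained in the first, second, third, and fourth compounds differ from one another; preferably, the additional detectable labels carried by the first, second, third, and fourth compounds differ from one another.
- The kit of claim 16, wherein the kit further comprises: reagents for pretreating nucleic acid molecules; a support for attaching the nucleic acid molecules to be sequenced; reagents for attaching the nucleic acid molecules to be sequenced to the support (e.g., covalently or non-covalently); primers for initiating the nucleotide polymerization reaction; a polymerase for carrying out the nucleotide polymerization reaction; one or more buffer solutions; one or more wash solutions; or any combination thereof.
- Use of the compound or salt thereof of any one of claims 2-7, or of the kit of any one of claims 16-17, for determining the sequence of a target single-stranded polynucleotide.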
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202280045385.6A CN117561269A (zh) | 2021-07-08 | 2022-07-05 | Modified nucleoside or nucleotide |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110774801 | 2021-07-08 | ||
CN202110774801.3 | 2021-07-08 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023280156A1 true WO2023280156A1 (zh) | 2023-01-12 |
Family
ID=84801271
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/103895 WO2023280156A1 (zh) | Modified nucleoside or nucleotide | 2021-07-08 | 2022-07-05 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN117561269A (zh) |
WO (1) | WO2023280156A1 (zh) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2004074503A2 (en) * | 2003-02-21 | 2004-09-02 | Hoser Mark J | Nucleic acid sequencing methods, kits and reagents |
WO2011005762A1 (en) * | 2009-07-06 | 2011-01-13 | Trilink Biotechnologies | Chemically modified ligase cofactors, donors and acceptors |
CN108239669A (zh) * | 2016-12-23 | 2018-07-03 | 深圳华大智造科技有限公司 | Method and kit for detecting the content of non-blocked impurities in reversibly blocked dNTPs |
WO2019071474A1 (zh) * | 2017-10-11 | 2019-04-18 | 深圳华大智造科技有限公司 | Modified nucleoside or nucleotide |
WO2021130151A1 (en) * | 2019-12-23 | 2021-07-01 | Baseclick Gmbh | Method of amplifying mrnas and for preparing full length mrna libraries |
-
2022
- 2022-07-05 CN CN202280045385.6A patent/CN117561269A/zh active Pending
- 2022-07-05 WO PCT/CN2022/103895 patent/WO2023280156A1/zh active Application Filing
Non-Patent Citations (7)
Title |
---|
CARROLL STEVEN S., GEIB JAMES, OLSEN DAVID B., STAHLHUT MARK, SHAFER JULES A., KUO LAWRENCE C.: "Sensitivity of HIV-1 Reverse Transcriptase and Its Mutants to Inhibition by Azidothymidine Triphosphate", BIOCHEMISTRY, vol. 33, no. 8, 31 December 1994 (1994-12-31), pages 2113 - 2120, XP093020475, ISSN: 0006-2960, DOI: 10.1021/bi00174a018 * |
FARAJ A, AGROFOGLIO L A, WAKEFIELD J K, MCPHERSON S, MORROW C D, GOSSELIN G, MATHE C, IMBACH J L, SCHINAZI R F, SOMMADOSSI J P: "Inhibition of human immunodeficiency virus type 1 reverse transcriptase by the 5'-triphosphate beta enantiomers of cytidine analogs", ANTIMICROBIAL AGENTS AND CHEMOTHERAPY, vol. 38, no. 10, 1 October 1994 (1994-10-01), US , pages 2300 - 2305, XP093020474, ISSN: 0066-4804, DOI: 10.1128/AAC.38.10.2300 * |
HERRLEIN M K, KONRAD R E, ENGELS J W: "57. 3 -Amino-Modified Nucleotides Useful as Potent Chain Terminators f", HELVETICA CHIMICA ACTA, vol. 77, no. 2, 23 March 1994 (1994-03-23), Hoboken, USA, pages 586 - 596, XP002183276, ISSN: 0018-019X, DOI: 10.1002/hlca.19940770222 * |
KIM B, HATHAWAY T R, LOEB L A: "Human Immunodeficiency Virus Reverse Transcriptase", JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 271, no. 9, 1 March 1996 (1996-03-01), US , pages 4872 - 4878, XP000942157, ISSN: 0021-9258, DOI: 10.1074/jbc.271.9.4872 * |
MAKHIJA HARSHYAA, ROY SUKI, HOON SHAWN, GHADESSY FARID JOHN, WONG DESMOND, JAISWAL RAHUL, CAMPANA DARIO, DRÖGE PETER: "A novel λ integrase-mediated seamless vector transgenesis platform for therapeutic protein expression", NUCLEIC ACIDS RESEARCH, vol. 46, no. 16, 19 September 2018 (2018-09-19), GB , pages e99 - e99, XP055969882, ISSN: 0305-1048, DOI: 10.1093/nar/gky500 * |
PENG HUANG, FARQUHAR DAVID, PLUNKETTS WILLIAM: "Selective action of 3'-azido-3'-deoxythymidine 5'-triphosphate on viral reverse transcriptases and human DNA polymerases", THE JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 265, no. 20, 15 July 1990 (1990-07-15), pages 11914 - 11918, XP055166476, DOI: 10.1016/S0021-9258(19)38487-X * |
TABOR S., RICHARDSON C. C.: "Effect of Manganese Ions on the Incorporation of Dideoxynucleotides by Bacteriophage T7 DNA Polymerase and Escherichia Coli DNA Polymerase I.", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES, vol. 86., no. 11., 1 June 1989 (1989-06-01), pages 4076 - 4081., XP000307484, ISSN: 0027-8424, DOI: 10.1073/pnas.86.11.4076 * |
Also Published As
Publication number | Publication date |
---|---|
CN117561269A (zh) | 2024-02-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10808244B2 (en) | | Method of normalizing biological samples |
US20200216891A1 (en) | | Nucleosides and nucleotides with 3'-hydroxy blocking groups |
US8212015B2 (en) | | Modified nucleosides and nucleotides and uses thereof |
JP3013156B2 (ja) | | Use of a DNA polymerase having 3'-intrinsic proofreading activity |
JPH08500722A (ja) | | Polynucleotide-immobilized support |
CN113748216B (zh) | | Single-channel sequencing method based on self-luminescence |
US20060183204A1 (en) | | Process for manufacturing morpholino-nucleotides, and use thereof for the analysis of and labelling of nucleic acid sequences |
US20180371008A1 (en) | | Methods of synthesizing and labeling nucleic acid molecules |
US20240067673A1 (en) | | Modified nucleoside or nucleotide |
WO2021031109A1 (zh) | | Method for sequencing polynucleotides based on the optical signal kinetics of a luminescent label and a secondary luminescent signal |
WO2023280156A1 (zh) | | Modified nucleoside or nucleotide |
WO2022206922A1 (zh) | | Nucleotide analogues for sequencing |
WO2024123866A1 (en) | | Nucleosides and nucleotides with 3' blocking groups and cleavable linkers |
US20230332197A1 (en) | | Nucleosides and nucleotides with 3' vinyl blocking group |
US20240240217A1 (en) | | Nucleosides and nucleotides with 3' blocking groups and cleavable linkers |
WO2023122499A1 (en) | | Periodate compositions and methods for chemical cleavage of surface-bound polynucleotides |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22836905; Country of ref document: EP; Kind code of ref document: A1 |
WWE | WIPO information: entry into national phase | Ref document number: 202280045385.6; Country of ref document: CN |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | EP: PCT application non-entry in European phase | Ref document number: 22836905; Country of ref document: EP; Kind code of ref document: A1 |