CN117561269A - Modified nucleosides or nucleotides - Google Patents

Publication number
CN117561269A
Authority
CN
China
Prior art keywords
group
compound
nucleic acid
base
alkyl
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280045385.6A
Other languages
Chinese (zh)
Inventor
滕波
徐讯
章文蔚
赵杰
廖莎
陈奥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Huada Technology Holding Group Co ltd
BGI Shenzhen Co Ltd
Original Assignee
Shenzhen Huada Technology Holding Group Co ltd
BGI Shenzhen Co Ltd

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07H SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H19/00 Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof
    • C07H19/02 Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof sharing nitrogen
    • C07H19/04 Heterocyclic radicals containing only nitrogen atoms as ring hetero atom
    • C07H19/06 Pyrimidine radicals
    • C07H19/10 Pyrimidine radicals with the saccharide radical esterified by phosphoric or polyphosphoric acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P20/00 Technologies relating to chemical industry
    • Y02P20/50 Improvements relating to the production of bulk chemicals
    • Y02P20/55 Design of synthesis routes, e.g. reducing the use of auxiliary or protecting groups

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Saccharide Compounds (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Modified nucleosides or nucleotides are provided that contain an N or S atom at the 3' position of the sugar ring of the modified nucleoside or nucleotide, which are useful for NGS sequencing. Kits comprising the modified nucleosides or nucleotides are also provided, as well as sequencing methods based on the modified nucleosides or nucleotides.

Description

Modified nucleosides or nucleotides
Technical Field
The present invention relates to the field of nucleic acid sequencing. In particular, it relates to modified nucleosides or nucleotides and, more particularly, to non-natural nucleotide analogs containing an N or S atom at the 3' position for NGS sequencing.
Background
The advent of NGS sequencing overcame the high cost and long run times of Sanger sequencing and greatly promoted the application of gene sequencing technology. NGS sequencing is now widely applied in prenatal screening, tumor diagnosis, tumor treatment, animal and plant breeding, and other fields, and has driven advances in science, technology, and medicine.
Nucleoside triphosphate (dNTP) analogs with reversible blocking groups are key raw materials in NGS sequencing. Because the blocking group is reversible, the 3'-OH group of the dNTP can be retained, which overcomes the drawback of Sanger sequencing and ensures the accuracy of base recognition. dNTP analogs with reversible blocking groups can be regarded as the most critical technology in NGS sequencing.
Numerous dNTP compounds with reversible blocking groups have been reported. Reversible blocking of dNTPs is achieved mainly through two approaches. The first introduces a reversible blocking group directly on the 3'-OH of the dNTP; blocking the 3'-OH ensures the blocking efficiency during sequencing. The second leaves the 3'-OH unblocked and instead blocks the polymerase through a modification on the base; this strategy allows a wider choice of blocking groups and is not limited by the polymerase.
In both approaches, non-natural nucleoside triphosphates are used as probes to achieve NGS sequencing. All previously reported modifications of natural nucleotides are concentrated on the base: the main idea is either to introduce a linker on the base so that a fluorescent label can be attached, or to replace the N at the 7-position of purine bases with C, which facilitates introduction of the linker, as shown in FIG. 1.
In NGS sequencing, dNTPs serve as the monomers: under polymerase catalysis, DNA is synthesized by a phosphorylation reaction involving the 5'-triphosphate. From a biochemical point of view, this is an enzyme-catalyzed esterification reaction. Given base pairing, and with the participation of the polymerase and ions, the 3'-OH of the nucleic acid reacts with the 5'-triphosphate of the dNTP monomer, releasing pyrophosphate and giving a chain-extended product, as shown in FIG. 2.
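The chain-extension step described above can be summarized as a reaction scheme. This is only a schematic restatement, assuming one nucleotide is added per step; the subscript n (strand length in nucleotides) and the abbreviation PPi for pyrophosphate are notation introduced here for illustration and do not appear in the original text. The scheme assumes the amsmath package for \text and \xrightarrow.

```latex
\[
\mathrm{DNA}_{n}\text{-}3'\text{-OH} \;+\; \mathrm{dNTP}
\;\xrightarrow{\text{polymerase, ions}}\;
\mathrm{DNA}_{n+1}\text{-}3'\text{-OH} \;+\; \mathrm{PP_i}
\]
```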
Disclosure of Invention
The present invention aims to develop a class of non-natural nucleotide analogs containing an N or S atom at the 3' position for NGS sequencing.
In one aspect, the general structural formula of the dNTP analogs is shown in FIG. 3: a reversible blocking group protects the 3'-N, and the core structures include 2'-deoxy-3'-nitrogen-substituted uridine triphosphate, 2'-deoxy-3'-nitrogen-substituted cytidine triphosphate, 7-deaza-2'-deoxy-3'-nitrogen-substituted adenosine triphosphate, and 7-deaza-2'-deoxy-3'-nitrogen-substituted guanosine triphosphate.
In another aspect, the general structural formula of the dNTP analogs is shown in FIG. 4: a reversible blocking group protects the 3'-S, and the core structures include 2'-deoxy-3'-thiouridine triphosphate, 2'-deoxy-3'-thiocytidine triphosphate, 7-deaza-2'-deoxy-3'-thioadenosine triphosphate, and 7-deaza-2'-deoxy-3'-thioguanosine triphosphate.
To this end, in a first aspect, the invention provides the use of a compound or a salt thereof for determining the sequence of a single-stranded polynucleotide of interest.
In a second aspect, the invention provides a compound represented by formula (A) or (B), or a salt thereof,
wherein:
R is selected from -N3, -NRaRb, and -SRc;
Ra and Rb are each independently selected from H, N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, in particular -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), and Ra and Rb are not both H;
Rc is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, in particular -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl);
R0 is [structure not shown];
n is selected from 1, 2, 3, and 4;
R' is selected from H, a monophosphate group, a diphosphate group, a triphosphate group, and a tetraphosphate group;
each Z is independently selected from O, S, and BH;
Base1 and Base2 are each independently selected from a base, a deammoniated base, or a tautomer thereof;
when R is -N3, -N3 is a reversible blocking group;
when R is -NRaRb, Ra and Rb are reversible blocking groups;
when R is -SRc, Rc is a reversible blocking group;
R0 is a reversible blocking group.
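The constraints on the 3' substituent R listed above can be checked mechanically. The following is a minimal sketch, assuming each group is represented by a class label rather than a full chemical structure; the constant names and the function `valid_R` are hypothetical and serve only to illustrate the selection rules, including the rule that the two substituents on nitrogen are not both H.

```python
# Hypothetical labels for the substituent classes named in the definitions above.
AZIDOALKYL = "N3-C1-C6 alkyl"                # e.g. -CH2-N3
DISULFIDE = "C1-C6 alkyl-S-S-C1-C6 alkyl"    # e.g. -CH2-SS-Me
ALKENYLALKYL = "C2-C6 alkenyl-C1-C6 alkyl"   # e.g. allyl
BLOCKING = {AZIDOALKYL, DISULFIDE, ALKENYLALKYL}


def valid_R(kind, ra=None, rb=None, rc=None):
    """Return True if a 3' substituent satisfies the stated selection rules.

    kind is one of "N3", "NRaRb", "SRc" (illustrative labels for the
    azido, amino, and thio variants of R).
    """
    if kind == "N3":
        return True
    if kind == "NRaRb":
        allowed = BLOCKING | {"H"}
        # Each nitrogen substituent is H or a blocking class, but not both H.
        return ra in allowed and rb in allowed and not (ra == "H" and rb == "H")
    if kind == "SRc":
        # The sulfur substituent has no H option in the definition.
        return rc in BLOCKING
    return False
```

For example, `valid_R("NRaRb", AZIDOALKYL, "H")` is accepted, while `valid_R("NRaRb", "H", "H")` is rejected.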
In some embodiments, one of Ra and Rb is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, in particular -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), and the other is H.
In some embodiments, one of Ra and Rb is selected from N3-C1-C6 alkyl, and the other is H.
In some embodiments, one of Ra and Rb is -CH2-N3, and the other is H.
In some embodiments, Rc is selected from N3-C1-C6 alkyl.
In some embodiments, Rc is -CH2-N3.
In some embodiments, n is 1.
In some embodiments, R' is a triphosphate group.
In some embodiments, Base1 and Base2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof.
In some embodiments, Z is O.
In some embodiments, Base1 is selected from the group consisting of [structures not shown].
In some embodiments, Base2 is selected from the group consisting of [structures not shown].
In a third aspect of the present invention, there is provided a compound represented by the formula (A-1) or (B-1) or a salt thereof,
wherein:
R0 is [structure not shown];
n is selected from 1, 2, 3, and 4;
R' is selected from H, a monophosphate group, a diphosphate group, a triphosphate group, and a tetraphosphate group;
each Z is independently selected from O, S, and BH;
Base1 and Base2 are each independently selected from a base, a deammoniated base, or a tautomer thereof;
the -N3 at the 3' position is a reversible blocking group;
R0 is a reversible blocking group.
In some embodiments, n is 1.
In some embodiments, R' is a triphosphate group.
In some embodiments, Z is O.
In some embodiments, Base1 and Base2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof.
In some embodiments, Base1 is selected from the group consisting of [structures not shown].
In some embodiments, Base2 is selected from the group consisting of [structures not shown].
In a fourth aspect of the present invention, there is provided a compound represented by the formula (A-2) or (B-2) or a salt thereof,
wherein:
Ra and Rb are each independently selected from H, N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, in particular -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), and Ra and Rb are not both H;
R0 is [structure not shown];
n is selected from 1, 2, 3, and 4;
R' is selected from H, a monophosphate group, a diphosphate group, a triphosphate group, and a tetraphosphate group;
each Z is independently selected from O, S, and BH;
Base1 and Base2 are each independently selected from a base, a deammoniated base, or a tautomer thereof;
Ra and Rb are reversible blocking groups;
R0 is a reversible blocking group.
In some embodiments, one of Ra and Rb is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, in particular -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), and the other is H.
In some embodiments, one of Ra and Rb is selected from N3-C1-C6 alkyl, and the other is H.
In some embodiments, one of Ra and Rb is -CH2-N3, and the other is H.
In some embodiments, n is 1.
In some embodiments, R' is a triphosphate group.
In some embodiments, Z is O.
In some embodiments, Base1 and Base2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof.
In some embodiments, Base1 is selected from the group consisting of [structures not shown].
In some embodiments, Base2 is selected from the group consisting of [structures not shown].
In a fifth aspect of the present invention, there is provided a compound represented by the formula (A-3) or (B-3) or a salt thereof,
wherein:
Rc is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, in particular -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl);
R0 is [structure not shown];
n is selected from 1, 2, 3, and 4;
R' is selected from H, a monophosphate group, a diphosphate group, a triphosphate group, and a tetraphosphate group;
each Z is independently selected from O, S, and BH;
Base1 and Base2 are each independently selected from a base, a deammoniated base, or a tautomer thereof;
Rc is a reversible blocking group;
R0 is a reversible blocking group.
In some embodiments, Rc is selected from N3-C1-C6 alkyl.
In some embodiments, Rc is -CH2-N3.
In some embodiments, n is 1.
In some embodiments, R' is a triphosphate group.
In some embodiments, Z is O.
In some embodiments, Base1 and Base2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof.
In some embodiments, Base1 is selected from the group consisting of [structures not shown].
In some embodiments, Base2 is selected from the group consisting of [structures not shown].
In a sixth aspect of the invention, the invention provides the following compounds or salts thereof: [structures not shown].
In some embodiments, the foregoing compounds or salts thereof carry additional detectable labels.
In some embodiments, the additional detectable label carried by the compound or salt thereof is introduced by an affinity reagent (e.g., antibody, aptamer, affimer, knottin) that carries the detectable label and that can specifically recognize and bind to an epitope of the compound or salt thereof.
In some embodiments, the additional detectable label is attached to the compound or salt thereof, optionally through a linking group.
In some embodiments, the additional detectable label is attached, optionally via a linking group, to Base1 or R0 of the compound or salt thereof.
In some embodiments, the additional detectable label is attached, optionally via a linking group, to the terminal amino group of R0 of the compound or salt thereof.
In some embodiments, the linking group is a cleavable linking group or a non-cleavable linking group.
In some embodiments, the cleavable linking group is selected from the group consisting of an electrophilically cleavable linking group, a nucleophilic cleavable linking group, a photolyzable linking group, a linking group cleaved under reducing conditions, a linking group cleaved under oxidizing conditions, a safety handle linking group, a linking group cleaved via an elimination mechanism, or any combination thereof.
In some embodiments, compounds of formula A that differ in Base1 carry different additional detectable labels.
In some embodiments, compounds of formula B that differ in Base2 carry different additional detectable labels.
In some embodiments, the detectable label is a fluorescent label.
In some embodiments, the detectable label is selected from the group consisting of: [structures not shown].
In a seventh aspect of the invention, the invention provides a method of terminating nucleic acid synthesis comprising: incorporating the aforementioned compound or salt thereof into the nucleic acid molecule to be terminated.
In some embodiments, the incorporation of the compound or salt thereof is achieved by a terminal transferase, terminal polymerase or reverse transcriptase.
In some embodiments, the method comprises: the compound or salt thereof is incorporated into the nucleic acid molecule to be terminated using a polymerase.
In some embodiments, the method comprises: the nucleotide polymerization reaction is performed using a polymerase under conditions allowing the polymerase to perform the nucleotide polymerization reaction, thereby incorporating the compound or a salt thereof into the 3' end of the nucleic acid molecule to be terminated.
In an eighth aspect of the invention, the invention provides a method of preparing a growing polynucleotide complementary to a single stranded polynucleotide of interest in a sequencing reaction, comprising incorporating into said growing complementary polynucleotide a compound or salt of the foregoing, wherein the incorporation of said compound or salt thereof prevents the introduction of any subsequent nucleotide into said growing complementary polynucleotide.
In some embodiments, the incorporation of the compound or salt thereof is achieved by a terminal transferase, terminal polymerase or reverse transcriptase.
In some embodiments, the method comprises: the compound or salt thereof is incorporated into the growing complementary polynucleotide using a polymerase.
In some embodiments, the method comprises: the compound or salt thereof is incorporated into the 3' end of the growing complementary polynucleotide by nucleotide polymerization using a polymerase under conditions that allow the polymerase to perform nucleotide polymerization.
In a ninth aspect of the invention, the invention provides a nucleic acid intermediate formed in determining the sequence of a single stranded polynucleotide of interest, wherein the nucleic acid intermediate is formed by:
incorporating into the growing nucleic acid strand a nucleotide complementary to the single stranded polynucleotide of interest, forming said nucleic acid intermediate, wherein the incorporated one complementary nucleotide is a compound of the foregoing or a salt thereof.
In a tenth aspect of the invention, the invention provides a nucleic acid intermediate formed in determining the sequence of a single stranded polynucleotide of interest, wherein the nucleic acid intermediate is formed by:
incorporating into the growing nucleic acid strand a nucleotide complementary to the target single-stranded polynucleotide, wherein the incorporated one complementary nucleotide is a compound as described above or a salt thereof, and wherein the growing nucleic acid strand is pre-incorporated with at least one nucleotide complementary to the target single-stranded polynucleotide, the pre-incorporated at least one nucleotide complementary to the target single-stranded polynucleotide being a compound as described above or a salt thereof from which the reversible blocking group and optionally the detectable label have been removed.
In an eleventh aspect of the invention, the invention provides a method of determining the sequence of a single stranded polynucleotide of interest comprising:
1) monitoring incorporation of nucleotides complementary to the target single-stranded polynucleotide into the growing nucleic acid strand, wherein at least one complementary nucleotide incorporated is a compound as described above or a salt thereof; and
2) determining the type of nucleotide incorporated.
In some embodiments, the reversible blocking group and optional detectable label are removed prior to the introduction of the next complementary nucleotide.
In some embodiments, the reversible blocking group and the detectable label are removed simultaneously.
In some embodiments, the reversible blocking group and the detectable label are sequentially removed; for example, the reversible blocking group is removed after the detectable label is removed, or the detectable label is removed after the reversible blocking group is removed.
In some embodiments, the method of determining the sequence of a single stranded polynucleotide of interest comprises the steps of:
(a) Providing a plurality of different nucleotides, wherein at least one nucleotide is a compound as described above or a salt thereof and, optionally, the remaining nucleotides are also compounds as described above or salts thereof;
(b) Incorporating the plurality of different nucleotides into a complementary sequence of a single stranded polynucleotide of interest, wherein the plurality of different nucleotides are distinguishable from one another upon detection;
(c) Detecting the nucleotides of (b) to thereby determine the type of nucleotide incorporated;
(d) Removing the reversible blocking group and optionally the detectable label carried thereby from the nucleotide of (b); and
(e) Optionally repeating steps (a) - (d) one or more times;
thereby determining the sequence of the single stranded polynucleotide of interest.
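Steps (a) to (e) above can be pictured as a loop in which each cycle adds exactly one blocked nucleotide, reads it, and then deblocks. The sketch below is a toy model under strong assumptions: detection is error-free, the template is read position by position without regard to strand orientation, and only the four DNA bases are used. The names `GrowingStrand` and `run_cycles` are hypothetical and are not taken from the patent.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


class GrowingStrand:
    """Growing strand whose 3' end may carry a reversible blocking group."""

    def __init__(self):
        self.bases = []
        self.blocked = False  # True while the terminal nucleotide is still blocked

    def incorporate(self, base):
        # A blocked 3' end prevents introduction of any subsequent nucleotide.
        if self.blocked:
            raise RuntimeError("3' end is blocked; deblock before the next incorporation")
        self.bases.append(base)
        self.blocked = True

    def deblock(self):
        # Models removal of the reversible blocking group (and label) in step (d).
        self.blocked = False


def run_cycles(template, n_cycles):
    """Return the bases called over n_cycles of incorporate/detect/deblock."""
    strand = GrowingStrand()
    calls = []
    for pos in range(min(n_cycles, len(template))):
        strand.incorporate(COMPLEMENT[template[pos]])  # steps (a)-(b)
        calls.append(strand.bases[-1])                 # step (c), idealized detection
        strand.deblock()                               # step (d)
    return "".join(calls)                              # step (e) is the loop itself
```

Calling `incorporate` twice without an intervening `deblock` raises an error, which mirrors the one-nucleotide-per-cycle behavior that the reversible blocking group is meant to enforce.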
In some embodiments, the method of determining the sequence of a single stranded polynucleotide of interest comprises the steps of:
(1) Providing a first nucleotide, a second nucleotide, a third nucleotide, and a fourth nucleotide, at least one of the four nucleotides being a compound as described above or a salt thereof and, optionally, the remaining nucleotides also being compounds as described above or salts thereof;
(2) Contacting the four nucleotides with a single-stranded polynucleotide of interest; removing the nucleotides not incorporated into the growing nucleic acid strand; detecting the nucleotide incorporated into the growing nucleic acid strand; removing the reversible blocking group and optionally the detectable label carried thereby in the nucleotide incorporated into the growing nucleic acid strand;
Optionally, further comprising (3): repeating (1) - (2) one or more times.
In some embodiments, the method of determining the sequence of a single stranded polynucleotide of interest comprises the steps of:
(a) Providing a mixture comprising a duplex, a nucleotide comprising at least one of the foregoing compounds or salts thereof, a polymerase, and an excision reagent; the duplex comprises a growing nucleic acid strand and a nucleic acid strand to be sequenced;
(b) Carrying out a reaction comprising the following steps (i), (ii) and (iii), optionally repeated one or more times:
step (i): incorporating the compound or salt thereof into the growing nucleic acid strand using the polymerase to form a nucleic acid intermediate comprising a reversible blocking group and optionally a detectable label;
step (ii): detecting the nucleic acid intermediate;
step (iii): excising the reversible blocking group and optionally the detectable label comprised by the nucleic acid intermediate using the excision reagent.
In some embodiments, the cleavage of the reversible blocking group and the cleavage of the detectable label are performed simultaneously or stepwise (e.g., the reversible blocking group is cleaved first, or the detectable label is cleaved first).
In some embodiments, the excision reagent used to cleave the reversible blocking group and the excision reagent used to cleave the detectable label are the same reagent.
In some embodiments, the excision reagent used to cleave the reversible blocking group and the excision reagent used to cleave the detectable label are different reagents.
In some embodiments, the duplex is attached to a support.
In some embodiments, the growing nucleic acid strand is a primer.
In some embodiments, the primer forms the duplex by annealing to a nucleic acid strand to be sequenced.
In some embodiments, the duplex, the compound or salt thereof, and the polymerase together form a reaction system containing a solution phase and a solid phase.
In some embodiments, the compound or salt thereof is incorporated into a growing nucleic acid strand using a polymerase under conditions that allow the polymerase to undergo nucleotide polymerization to form a nucleic acid intermediate comprising a reversible blocking group and optionally a detectable label.
In some embodiments, the polymerase is selected from KOD polymerase or a mutant thereof (e.g., KOD POL151, KOD POL157, KOD POL171, KOD POL174, KOD POL376, KOD POL 391).
In some embodiments, prior to any step of detecting the nucleic acid intermediate, the solution phase of the reaction system of the previous step is removed, leaving the duplex attached to the support.
In some embodiments, the excision reagent is contacted with the duplex or the growing nucleic acid strand in a reaction system comprising a solution phase and a solid phase.
In some embodiments, the excision reagent is capable of excising the reversible blocking group, and optionally the detectable label, carried by the compound incorporated into the growing nucleic acid strand without affecting the phosphodiester bonds of the duplex backbone.
In some embodiments, after any step of cleaving the reversible blocking group and optionally the detectable label comprised by the nucleic acid intermediate, the solution phase of the reaction system of that step is removed.
In some embodiments, a washing operation is performed after any step that includes a removal operation.
In some embodiments, after step (ii), further comprising: determining the type of compound incorporated into the growing nucleic acid strand in step (i) from the signal detected in step (ii), and determining the type of nucleotide at the corresponding position in the nucleic acid strand to be sequenced based on the base complementary pairing rules.
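The two-stage inference described above (detected signal to incorporated compound, then incorporated compound to template base) can be sketched as two lookups. The label names below are hypothetical placeholders; the text only requires that the labels be distinguishable, and the pairing table assumes standard DNA base pairing with uracil omitted for brevity.

```python
# Hypothetical label-to-base assignment for four distinguishable labels.
LABEL_TO_BASE = {"label_1": "A", "label_2": "T", "label_3": "C", "label_4": "G"}
# Base complementary pairing rules (DNA only in this sketch).
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def call_position(detected_label):
    """Return (incorporated base, base at the corresponding template position)."""
    incorporated = LABEL_TO_BASE[detected_label]
    template_base = PAIRING[incorporated]
    return incorporated, template_base


def call_read(detected_labels):
    """Translate a series of per-cycle signals into the template bases."""
    return "".join(call_position(label)[1] for label in detected_labels)
```

For instance, a signal from `label_3` implies that a C analog was incorporated, so the corresponding position in the strand to be sequenced is G.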
In a twelfth aspect of the invention, the invention provides a kit comprising at least one of the foregoing compounds or salts thereof.
In some embodiments, the kit comprises first, second, third, and fourth compounds, each of which is independently a compound of the foregoing or a salt thereof.
In some embodiments, in the first compound, Base1 is selected from adenine, 7-deazaadenine, or a tautomer thereof; in the second compound, Base1 is selected from thymine, uracil, or a tautomer thereof; in the third compound, Base1 is selected from cytosine or a tautomer thereof; and in the fourth compound, Base1 is selected from guanine, 7-deazaguanine, or a tautomer thereof.
In some embodiments, in the first compound, Base2 is selected from adenine, 7-deazaadenine, or a tautomer thereof; in the second compound, Base2 is selected from thymine, uracil, or a tautomer thereof; in the third compound, Base2 is selected from cytosine or a tautomer thereof; and in the fourth compound, Base2 is selected from guanine, 7-deazaguanine, or a tautomer thereof.
In some embodiments, the Base1 or Base2 of the first, second, third, and fourth compounds are different from each other.
In some embodiments, the additional detectable labels carried by the first, second, third, and fourth compounds are different from each other.
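When the two preceding embodiments are combined, a four-compound kit has pairwise different bases and pairwise different labels, which is what makes each incorporated compound identifiable from its signal. A minimal sketch of that check follows; the function name and the (base, label) tuple representation are assumptions made for illustration, and the label names are placeholders.

```python
def kit_is_distinguishable(compounds):
    """Check a kit given as four (base, label) pairs.

    Returns True only if there are exactly four compounds whose bases are
    all different and whose labels are all different.
    """
    bases = [base for base, _ in compounds]
    labels = [label for _, label in compounds]
    return (
        len(compounds) == 4
        and len(set(bases)) == 4
        and len(set(labels)) == 4
    )


# Example kit with placeholder label names.
example_kit = [("A", "label_1"), ("T", "label_2"), ("C", "label_3"), ("G", "label_4")]
```

A kit in which two compounds share a label would fail this check, since the shared signal could not be assigned to a single base.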
In some embodiments, the kit further comprises: reagents for pre-treating nucleic acid molecules; a support for ligating nucleic acid molecules to be sequenced; reagents for attaching (e.g., covalently or non-covalently attaching) a nucleic acid molecule to be sequenced to a support; primers for initiating nucleotide polymerization; a polymerase for performing nucleotide polymerization; one or more buffer solutions; one or more wash solutions; or any combination thereof.
In a thirteenth aspect of the invention, the invention provides the use of a compound as hereinbefore described or a salt thereof or a kit as hereinbefore described for determining the sequence of a single stranded polynucleotide of interest.
Drawings
FIG. 1 shows base modified non-natural nucleosides for NGS sequencing;
FIG. 2 shows the NGS sequencing biochemical reactions;
FIG. 3 shows a 3' -N-substituted reversible blocking nucleotide analog of an embodiment of the present invention;
FIG. 4 shows a 3' -S-substituted reversible blocking nucleotide analog of an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below by way of specific examples, but they should not be construed as limiting the invention in any way.
Unless otherwise indicated, the above groups and substituents have the usual meaning in the art of pharmaceutical chemistry.
In the various parts of the present specification, substituents of the presently disclosed compounds are disclosed in terms of the type or scope of groups. It is specifically noted that the present invention includes each individual subcombination of the individual members of these group classes and ranges. For example, the term "C1-C6 alkyl" particularly refers to independently disclosed methyl, ethyl, C3 alkyl, C4 alkyl, C5 alkyl, and C6 alkyl groups.
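The range convention above, in which "C1-C6 alkyl" stands for each individually disclosed member, can be made concrete by expanding a range term into its members. This is an illustrative sketch only: the function name and the regular expression are assumptions, and the output simply mirrors the wording of the example in the text (methyl, ethyl, then "Cn alkyl" for larger groups).

```python
import re


def expand_alkyl_range(term):
    """Expand a term such as 'C1-C6 alkyl' into its individually named members."""
    match = re.fullmatch(r"C(\d+)-C(\d+) alkyl", term)
    if match is None:
        raise ValueError(f"not a carbon-range alkyl term: {term!r}")
    low, high = int(match.group(1)), int(match.group(2))
    # Only methyl and ethyl have a single unambiguous name; larger groups
    # are listed by carbon count because they include several isomers.
    names = {1: "methyl", 2: "ethyl"}
    return [names.get(n, f"C{n} alkyl") for n in range(low, high + 1)]
```

Applied to "C1-C6 alkyl", this yields methyl, ethyl, C3 alkyl, C4 alkyl, C5 alkyl, and C6 alkyl, matching the list given in the text.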
In addition, unless explicitly stated otherwise, the expressions "each ... is independently selected from" and "... are each independently selected from" are used interchangeably throughout this document and should be construed broadly: the specific options expressed by the same or different symbols in different groups do not affect each other, and the specific options expressed by the same or different symbols in the same group do not affect each other.
The term "C1-C6 alkyl" refers to any straight or branched chain saturated group containing 1 to 6 carbon atoms, such as methyl (Me), ethyl (Et), n-propyl, isopropyl (iPr), n-butyl, isobutyl, t-butyl (t-Bu), sec-butyl, n-pentyl, t-pentyl, n-hexyl, and the like.
The term "C2-C6 alkenyl" refers to any straight or branched chain group containing 2 to 6 carbon atoms and containing at least one carbon-carbon double bond, such as vinyl, 1-propenyl, 2-propenyl, and the like.
From all of the above descriptions, it will be apparent to those skilled in the art that any group with a composite name, e.g., "N3-C1-C6 alkyl", means a moiety conventionally constructed from its parts, for example a C1-C6 alkyl substituted by an azido group (-N3), wherein C1-C6 alkyl is as defined above.
As used herein, examples of the term "salt of a compound of formula A, formula A-1, formula A-2, formula A-3, formula B-1, formula B-2, or formula B-3" are organic acid addition salts formed from anion-forming organic acids, including, but not limited to, formate, acetate, propionate, benzoate, maleate, fumarate, succinate, tartrate, citrate, ascorbate, α-ketoglutarate, α-glycerophosphate, alkyl sulfonate, or aryl sulfonate; preferably, the alkyl sulfonate is methyl sulfonate or ethyl sulfonate, and the aryl sulfonate is benzene sulfonate or p-toluene sulfonate. Suitable inorganic salts may also be formed, including, but not limited to, hydrochloride, hydrobromide, hydroiodide, nitrate, bicarbonate, carbonate, sulfate, or phosphate.
In the method of the present invention, any substance consisting of both the growing nucleic acid strand and the nucleic acid strand to be sequenced is referred to as a "duplex", regardless of the length of either strand; the nucleic acid strand to be sequenced may be longer than the growing nucleic acid strand.
In the method of the invention, the nucleic acid molecule to be sequenced may be any nucleic acid molecule of interest. In certain preferred embodiments, the nucleic acid molecule to be sequenced comprises deoxyribonucleotides, ribonucleotides, modified deoxyribonucleotides, modified ribonucleotides, or any combination thereof. In the method of the present invention, the nucleic acid molecule to be sequenced is not limited by its type. In certain preferred embodiments, the nucleic acid molecule to be sequenced is DNA or RNA. In certain preferred embodiments, the nucleic acid molecule to be sequenced may be genomic DNA, mitochondrial DNA, chloroplast DNA, mRNA, cDNA, miRNA, or siRNA. In certain preferred embodiments, the nucleic acid molecule to be sequenced is linear or circular. In certain preferred embodiments, the nucleic acid molecule to be sequenced is double-stranded or single-stranded. For example, the nucleic acid molecule to be sequenced may be single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), single-stranded RNA (ssRNA), double-stranded RNA (dsRNA), or a hybrid of DNA and RNA. In certain preferred embodiments, the nucleic acid molecule to be sequenced is single stranded DNA. In certain preferred embodiments, the nucleic acid molecule to be sequenced is double-stranded DNA.
In the method of the present invention, the nucleic acid molecule to be sequenced is not limited by its source. In certain preferred embodiments, the nucleic acid molecule to be sequenced may be obtained from any source, for example, any cell, tissue or organism (e.g., viruses, bacteria, fungi, plants and animals). In certain preferred embodiments, the nucleic acid molecule to be sequenced is derived from a mammal (e.g., human, non-human primate, rodent, or canine), plant, bird, reptile, fish, fungus, bacterium, or virus.
Methods for extracting or obtaining nucleic acid molecules from cells, tissues or organisms are well known to those skilled in the art. Suitable methods include, but are not limited to, ethanol precipitation, chloroform extraction, and the like. For a detailed description of such methods see, for example, J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor Laboratory Press, 1989, and F.M. Ausubel et al., Short Protocols in Molecular Biology, 3rd edition, John Wiley & Sons, Inc., 1995. In addition, various commercial kits can be used to extract nucleic acid molecules from various sources (e.g., cells, tissues, or organisms).
In the method of the present invention, the nucleic acid molecule to be sequenced is not limited by its length. In certain preferred embodiments, the nucleic acid molecule to be sequenced may be at least 10 bp, at least 20 bp, at least 30 bp, at least 40 bp, at least 50 bp, at least 100 bp, at least 200 bp, at least 300 bp, at least 400 bp, at least 500 bp, at least 1000 bp, or at least 2000 bp in length. In certain preferred embodiments, the nucleic acid molecule to be sequenced may be 10-20 bp, 20-30 bp, 30-40 bp, 40-50 bp, 50-100 bp, 100-200 bp, 200-300 bp, 300-400 bp, 400-500 bp, 500-1000 bp, 1000-2000 bp, or more than 2000 bp in length. In certain preferred embodiments, the nucleic acid molecule to be sequenced may have a length of 10-1000 bp to facilitate high-throughput sequencing.
In the method for producing a polynucleotide or the sequencing method of the present invention, a suitable polymerase may be used to conduct nucleotide polymerization. In some exemplary embodiments, the polymerase (e.g., a DNA polymerase) is capable of synthesizing a new DNA strand using DNA as a template. In some exemplary embodiments, the polymerase (e.g., a reverse transcriptase) is capable of synthesizing a new DNA strand using RNA as a template. In some exemplary embodiments, the polymerase (e.g., an RNA polymerase) is capable of synthesizing a new RNA strand using DNA or RNA as a template. Thus, in certain preferred embodiments, the polymerase is selected from the group consisting of DNA polymerase, RNA polymerase, and reverse transcriptase. A suitable polymerase may be selected to perform nucleotide polymerization according to actual needs. In certain preferred embodiments, the polymerization reaction is a polymerase chain reaction (PCR). In certain preferred embodiments, the polymerization reaction is a reverse transcription reaction.
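The template/product pairings described above can be summarized as a small lookup. The following Python sketch is illustrative only; the table and function names are hypothetical, and it encodes nothing beyond the three polymerase classes named in the preceding paragraph.

```python
# Hypothetical lookup: (template type, product type) -> polymerase class,
# following the pairings stated in the paragraph above.
POLYMERASE_CLASS = {
    ("DNA", "DNA"): "DNA polymerase",
    ("RNA", "DNA"): "reverse transcriptase",
    ("DNA", "RNA"): "RNA polymerase",
    ("RNA", "RNA"): "RNA polymerase",
}


def polymerase_class(template: str, product: str) -> str:
    """Return the polymerase class for a template/product pair."""
    key = (template.upper(), product.upper())
    if key not in POLYMERASE_CLASS:
        raise ValueError(f"unsupported template/product pair: {key}")
    return POLYMERASE_CLASS[key]
```

For example, `polymerase_class("RNA", "DNA")` selects the reverse transcriptase class, which corresponds to the reverse transcription embodiment.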
In the method of the present invention, nucleotide polymerization may be performed using KOD polymerase or a mutant thereof. KOD polymerase or a mutant thereof (e.g., KOD POL151, KOD POL157, KOD POL171, KOD POL174, KOD POL376, or KOD POL391) has acceptable polymerization efficiency for the modified nucleosides or nucleotides of the present invention. KOD POL391 and KOD POL171 have acceptable polymerization efficiencies for the modified nucleotides of the invention. In certain embodiments, KOD POL391 or KOD POL171 has a polymerization efficiency of greater than 70%, such as 70%-80%, 80%-90%, or 90%-100%, for the modified nucleotides of the invention.
In the method for producing a polynucleotide or the sequencing method of the present invention, the polymerization reaction of nucleotides is carried out under suitable conditions. Suitable polymerization conditions include the composition of the solution phase, the concentration of each component, the pH of the solution phase, the polymerization temperature, and the like. The polymerization is carried out under suitable conditions in order to obtain acceptable, even high polymerization efficiencies.
In some embodiments of the invention, the nitrogen or sulfur atom at the 3' position of the deoxyribose is protected in a compound of formula A or formula B, and such compounds are therefore capable of terminating polymerization by a polymerase (e.g., a DNA polymerase). For example, when a compound represented by formula A or formula B is introduced into the 3' end of a growing nucleic acid strand, no free amino group (-NH2) or mercapto group (-SH) is present at the 3' position of the deoxyribose of the compound, so the polymerase is unable to proceed to the next round of polymerization and polymerization is terminated. In this case, one and only one base is incorporated into the growing nucleic acid strand in each round of polymerization.
In addition, the protecting group on the nitrogen atom or sulfur atom at the 3' position of the deoxyribose of the compound represented by formula A or formula B can be removed, converting the position into a free amino group (-NH2) or mercapto group (-SH). Subsequently, the growing nucleic acid strand may be subjected to a next round of polymerization using a polymerase and the compound of formula A or formula B, and one base may be introduced again.
Thus, the nitrogen atom or sulfur atom at the 3' position of the deoxyribose of the compound represented by formula A or formula B is reversibly blocked. Specifically, in the compound represented by formula A or formula B, when R is -N3, -N3 is a reversible blocking group; when R is -NRaRb, Ra and Rb are reversible blocking groups; and when R is -SRc, Rc is a reversible blocking group. When compounds of formula A or formula B are incorporated into the 3' end of the growing nucleic acid strand, they prevent the polymerase from continuing polymerization and thereby terminate further extension of the growing nucleic acid strand; after the blocking group contained in the compound is removed, a free amino group (-NH2) or mercapto group (-SH) is present at the 3' position, and the polymerase is able to resume polymerization and continue to extend the growing nucleic acid strand. In particular, when R is -N3, -N3 is the reversible blocking group; when it is removed under certain conditions, a free amino group (-NH2) is present at the 3' position, and the polymerase is able to continue to extend the growing nucleic acid strand.
In addition, in some embodiments, the base can be protected at the same time (e.g., by R0), and such compounds are also capable of terminating polymerization by a polymerase (e.g., a DNA polymerase). For example, when a compound represented by formula B is introduced into the 3' end of a growing nucleic acid strand, not only is the nitrogen atom or sulfur atom at the 3' position protected, but R0 on the base of the compound also exerts steric hindrance, hydrogen-bond interactions, or the like, so that the polymerase is unable to carry out the next round of polymerization and polymerization is terminated. In this case, one and only one base is incorporated into the growing nucleic acid strand in each round of polymerization.
Meanwhile, the protecting group (R0) on the base of the compound represented by formula B can also be removed. Subsequently, the growing nucleic acid strand may be subjected to the next round of polymerization using the polymerase and the compound of formula B, and one base may be introduced again.
Thus, the base of the compound of formula B is reversibly blocked: when compounds of formula B are incorporated into the 3' end of the growing nucleic acid strand, they prevent the polymerase from continuing polymerization and thereby terminate further extension of the growing nucleic acid strand; after the blocking group contained in the compound represented by formula B is removed, the polymerase is able to resume polymerization and continue to extend the growing nucleic acid strand.
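The incorporate-then-deblock cycle described above can be illustrated with a toy simulation. The following Python sketch is a deliberately simplified model, not the method of the invention: the class and method names are hypothetical, strand orientation is ignored (the template is read position by position), and chemistry is reduced to a single `blocked` flag.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


class GrowingStrand:
    """Toy model of a growing strand extended with reversibly blocked nucleotides."""

    def __init__(self, template: str) -> None:
        self.template = template
        self.bases: list[str] = []
        self.blocked = False  # True while the blocking group is still present

    def incorporate(self) -> bool:
        """Try to add one blocked nucleotide; fail if the strand end is blocked."""
        if self.blocked or len(self.bases) >= len(self.template):
            return False
        self.bases.append(COMPLEMENT[self.template[len(self.bases)]])
        self.blocked = True  # the new nucleotide carries a blocking group
        return True

    def deblock(self) -> None:
        """Remove the blocking group so that the next round can proceed."""
        self.blocked = False


def run_rounds(template: str, rounds: int) -> str:
    """Run incorporate/deblock rounds; at most one base is added per round."""
    strand = GrowingStrand(template)
    for _ in range(rounds):
        strand.incorporate()
        strand.incorporate()  # second attempt in the same round has no effect
        strand.deblock()
    return "".join(strand.bases)
```

In this model a second incorporation attempt within the same round is rejected, which mirrors the statement that one and only one base is incorporated per round.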
Certain embodiments described herein relate to the use of conventional detectable labels. Detection may be by any suitable method, including fluorescence spectroscopy or other optical means. The preferred label is a fluorescent label, i.e., a fluorophore, which upon absorption of energy emits radiation of a defined wavelength. A number of suitable fluorescent labels are known. For example, Welch et al. (Chem. Eur. J. 5(3):951-960, 1999) disclose dansyl-functionalized fluorescent moieties that may be used in the present invention. Zhu et al. (Cytometry 28:206-211, 1997) describe the use of the fluorescent labels Cy3 and Cy5, which can also be used in the present invention. Suitable labels are also disclosed by Prober et al. (Science 238:336-341, 1987), Connell et al. (BioTechniques 5(4):342-384, 1987), Ansorge et al. (Nucl. Acids Res. 15(11):4593-4602, 1987) and Smith et al. (Nature 321:674, 1986). Other commercially available fluorescent labels include, but are not limited to, fluorescein, rhodamine (including TMR, Texas Red and Rox), Alexa, BODIPY, acridine, coumarin, pyrene, benzanthracene and the cyanines.
Multiple labels, such as the double-fluorophore FRET cassette (Tet. Let. 46:8867-8871, 2000) and multiple-fluorophore dendrimers (J. Am. Chem. Soc. 123:8101-8108, 2001), may also be used in this application. While fluorescent labels are preferred, other forms of detectable labels will be apparent to those of ordinary skill in the art. For example, microparticles, including quantum dots (Empedocles et al., Nature 399:126-130, 1999), gold nanoparticles (Reichert et al., Anal. Chem. 72:6025-6029, 2000) and microbeads (Lacoste et al., Proc. Natl. Acad. Sci. USA 97(17):9461-9466, 2000), may also be used.
Multicomponent labels may also be used herein. Multicomponent labels are labels that rely on interaction with another compound for detection. The most commonly used multicomponent label in biology is the biotin-streptavidin system. Biotin is used as a label attached to a nucleotide or modified nucleotide. Streptavidin is then added separately to allow detection to occur. Other multicomponent systems may be used. For example, dinitrophenol has a commercially available fluorescent antibody that can be used for detection.
In certain embodiments described herein, the modified nucleotide or nucleoside molecule may be provided with a detectable label as described above by the introduction of an affinity reagent (e.g., antibody, aptamer, affimer, knottin) that specifically recognizes and binds to an epitope of the modified nucleotide or nucleoside molecule, as described in detail in WO2018129214A1. The entire relevant content of WO2018129214A1 is incorporated into the present application.
In other embodiments described herein, the modified nucleotide or nucleoside molecule may be linked to a detectable label as described above. In certain such embodiments, the linking group used may be cleavable. The use of a cleavable linking group ensures that the label can be removed after detection if desired, which avoids any signal interfering with that of any subsequently incorporated labeled nucleotide or nucleoside.
In other embodiments, the linking group used is non-cleavable. Because in each case in which a labeled nucleotide of the present invention is incorporated no further nucleotide needs to be incorporated afterwards, the label does not need to be removed from the nucleotide.
Cleavable linking groups are well known in the art, and conventional chemistry can be used to attach the linking group to a nucleotide or modified nucleotide and to a label. The linking group may be cleaved by any suitable method, including exposure to acids, bases, nucleophiles, electrophiles, radicals, metals, reducing or oxidizing agents, light, temperature, enzymes, and the like. The linking groups discussed herein can also be cleaved using the same catalysts used to cleave the protecting groups at the bases. Suitable linking groups may be adapted from standard chemical protecting groups, as disclosed in Greene & Wuts, Protective Groups in Organic Synthesis, John Wiley & Sons. Suitable cleavable linking groups for solid-phase synthesis are also disclosed in Guillier et al. (Chem. Rev. 100:2092-2157, 2000).
The use of the term "cleavable linking group" does not mean that the entire linking group needs to be removed, e.g., from a nucleotide or modified nucleotide. When the detectable label is attached to a nucleotide or modified nucleotide, the cleavage site can be located at a position on the linking group that ensures that a portion of the linking group remains attached to the nucleotide or modified nucleotide after cleavage.
When the detectable label is attached to a nucleotide or modified nucleotide, the linking group can be attached at any position on the nucleotide or modified nucleotide as long as Watson-Crick base pairing is still enabled.
A. Electrophilically cleavable linking groups
Electrophilically cleavable linking groups are typically cleaved by protons and include acid-sensitive linkages. Suitable linking groups include modified benzyl systems such as trityl, p-alkoxybenzyl esters and p-alkoxybenzylamides. Other suitable linking groups include t-butoxycarbonyl (Boc) groups and acetal systems.
To prepare suitable linking molecules, the use of thiophilic metals such as nickel, silver or mercury in the cleavage of thioacetals or other sulfur-containing protecting groups is also contemplated.
B. Nucleophilically cleavable linking groups
Nucleophilic cleavage is also a well-known method in the preparation of linker molecules. Groups that are unstable in water (i.e., can be cleaved simply at alkaline pH), such as esters, and groups that are unstable to non-aqueous nucleophiles can be used. Fluoride ions can be used to cleave the silicon-oxygen bond in groups such as triisopropylsilyl (TIPS) or t-butyldimethylsilyl (TBDMS).
C. Photolyzable linking groups
Photolyzable linking groups are widely used in carbohydrate chemistry. Preferably, the light required to activate cleavage does not affect other components of the modified nucleotide. For example, if a fluorophore is used as the label, it is preferred that the fluorophore absorbs light of a different wavelength than that required to cleave the linker molecule. Suitable linking groups include those based on o-nitrobenzyl compounds and nitroveratryl compounds. Linking groups based on benzoin chemistry can also be used (Lee et al., J. Org. Chem. 64:3454-3460, 1999).
D. Cleavage under reducing conditions
A variety of linking groups are known that are susceptible to reductive cleavage. Catalytic hydrogenation using palladium-based catalysts has been used to cleave benzyl and benzyloxycarbonyl groups. Disulfide bond reduction is also known in the art.
E. Cleavage under oxidative conditions
Oxidation-based methods are well known in the art. These include oxidation of alkoxybenzyl groups and oxidation of sulfur and selenium linkages. It is also within the scope of the invention to use aqueous iodine to cleave disulfides and other sulfur- or selenium-based linking groups.
F. Safety-catch linking groups
Safety-catch linkers are those that cleave in two steps. In a preferred system, the first step is the generation of a reactive nucleophilic center, followed by a second step involving intramolecular cyclization, which results in cleavage. For example, levulinate ester linkages can be treated with hydrazine or photochemically to release an active amine, which then cyclizes to cleave an ester elsewhere in the molecule (Burgess et al., J. Org. Chem. 62:5165-5168, 1997).
G. Cleavage by elimination mechanism
Elimination reactions may also be used. Base-catalyzed elimination of groups such as fluorenylmethoxycarbonyl and cyanoethyl can be used, as can palladium-catalyzed reductive elimination of allylic systems.
In certain embodiments, the linking group may comprise a spacer unit. The length of the linking group is not critical as long as the label is kept a sufficient distance from the nucleotide to avoid interfering with the interaction between the nucleotide and the enzyme.
In certain embodiments, the linking group may contain functionality similar to that of the base protecting group. This makes deprotection more efficient, because only a single treatment is required to remove both the label and the protecting group. Particularly preferred linking groups are azide-containing linking groups cleavable by a phosphine.
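For orientation, the cleavage classes A to G above can be collected into a single lookup. The following Python sketch is only a summary of the examples named in the text; the dictionary layout, key names, and function name are hypothetical, and the lists are not exhaustive.

```python
# Hypothetical summary table: cleavage class -> trigger and example groups,
# restricted to the examples named in sections A-G above.
LINKER_CLEAVAGE = {
    "electrophilic": {
        "trigger": "protons (acid)",
        "groups": ["trityl", "p-alkoxybenzyl ester", "p-alkoxybenzylamide",
                   "Boc", "acetal"],
    },
    "nucleophilic": {
        "trigger": "alkaline pH, non-aqueous nucleophiles, or fluoride",
        "groups": ["ester", "TIPS", "TBDMS"],
    },
    "photolytic": {
        "trigger": "light",
        "groups": ["o-nitrobenzyl", "nitroveratryl", "benzoin"],
    },
    "reductive": {
        "trigger": "reducing conditions (e.g., Pd-catalyzed hydrogenation)",
        "groups": ["benzyl", "benzyloxycarbonyl", "disulfide"],
    },
    "oxidative": {
        "trigger": "oxidizing conditions (e.g., aqueous iodine)",
        "groups": ["alkoxybenzyl", "disulfide", "selenium linkage"],
    },
    "safety-catch": {
        "trigger": "two steps: unmask a nucleophile, then cyclize",
        "groups": ["levulinate ester"],
    },
    "elimination": {
        "trigger": "base or palladium catalysis",
        "groups": ["fluorenylmethoxycarbonyl", "cyanoethyl", "allyl"],
    },
}


def mechanisms_for(group: str) -> list[str]:
    """Return the cleavage classes (sorted) that list the given group."""
    return sorted(
        name for name, entry in LINKER_CLEAVAGE.items() if group in entry["groups"]
    )
```

A disulfide appears under both reductive and oxidative cleavage in the text, so `mechanisms_for("disulfide")` returns both classes.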
The utility of dideoxynucleoside triphosphates in so-called Sanger sequencing and related (Sanger-type) protocols, which rely on random chain termination at a particular type of nucleotide, is known to those skilled in the art. One example of a Sanger-type sequencing protocol is the BASS method described by Metzker.
Sanger and Sanger-type methods are generally carried out by performing an experiment in which eight types of nucleotides are provided: four contain a 3'-NH2 group or a 3'-SH group, and the other four lack the NH2 or SH group and are labeled differently from one another. The nucleotides used that lack the 3'-NH2 or 3'-SH group are dideoxynucleotides (ddNTPs). As is well known to those skilled in the art, when the ddNTPs are labeled differently, the sequence of the target oligonucleotide can be determined by determining the positions of the incorporated terminal nucleotides and combining this information.
It will be appreciated that the nucleotides herein have utility in Sanger's method and related protocols, as the same effect achieved by using ddNTPs can be achieved by using the nucleotide analogues described herein.
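The read-out principle of chain-termination sequencing, in which the position of each terminal nucleotide is determined and the information combined, can be sketched as follows. This Python example is an illustration under simplifying assumptions: the function name and the input format (pairs of fragment length and terminal base) are hypothetical, and the result is the sequence of the synthesized strand, i.e., the complement of the template.

```python
def read_sequence(terminated_fragments):
    """Combine (fragment_length, terminal_base) observations into a sequence.

    Each terminated fragment reports which labeled base sits at its 3' end;
    ordering the fragments by length yields the synthesized strand.
    """
    by_length = {}
    for length, base in terminated_fragments:
        # setdefault stores the base on first sight and returns the stored one
        if by_length.setdefault(length, base) != base:
            raise ValueError(f"conflicting terminal bases at length {length}")
    return "".join(by_length[n] for n in sorted(by_length))
```

The observations may arrive in any order and may repeat; conflicting calls for the same fragment length are rejected rather than silently resolved.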
In addition, it should also be appreciated that the nucleotides in the present application also have utility in second-generation sequencing (NGS) and third-generation sequencing (single-molecule sequencing), as the same effect achieved by using dNTPs can be achieved by using the nucleotide analogs described herein.
Furthermore, it will be appreciated that incorporation of the base-protected nucleotides can be monitored by using radioactive 32P in the attached phosphate groups. The 32P may be present either in the ddNTPs themselves or in the primers used for extension.
The invention is further illustrated below in conjunction with specific examples. The following examples merely exemplify a method for preparing a nucleotide analog of one of the four bases; one skilled in the art can prepare the nucleotide analogs of the remaining three bases by reference to this method. In addition, each raw material is commercially available unless otherwise specified.
Example 1
The hydroxyl at the 3-position of the compound is activated with Ms, and a configuration-inverted product is produced under alkaline conditions; sodium azide is then added to give the azide product, the Tr protecting group is removed, and triphosphorylation is carried out to give the product.
Example 2
The compound is azidated to give an intermediate; the TBS protecting group is removed, and triphosphorylation is then carried out to give the product.
Example 3
Example 4
Example 5 Method for evaluating the above nucleotide analogs on the MGISEQ2000
1. Blocking effect assessment
Nucleotide substrates: the fluorescently labeled standard hot dNTPs (four types) and the standard cold dNTPs (four types) were both from the MGISEQ-2000RS high-throughput sequencing kit (FCL SE50), MGI Tech Co., Ltd., Shenzhen, cat. no. 1000012551; the nucleotide analog cold dNTPs (four types: dTTP, dATP, dCTP, and dGTP; structures as follows; provided by the European Kanares company; only one cold dNTP was used in each test):
Sequencing was performed using the nucleotide substrates described above and the MGISEQ-2000RS high-throughput sequencing kit (FCL SE50) according to the MGISEQ2000 sequencer protocol.
(1) DNA nanoballs were prepared using an E. coli sequencing library.
(2) The DNA nanoballs were loaded onto an MGISEQ2000 sequencing chip.
(3) The loaded sequencing chip was mounted on the MGISEQ2000 sequencer, and the sequencing procedure was set up.
(4) First round of on-machine test: standard hot dNTPs were polymerized, the signal values were recorded by imaging, and the blocking groups were then cleaved with THPP reagent at 65 °C for 1 min.
(5) Second round of on-machine test: standard cold dNTPs and then standard hot dNTPs were polymerized, the signal values were recorded by imaging, and the blocking groups were then cleaved with THPP reagent at 65 °C for 1 min.
(6) Third round of on-machine test: standard hot dNTPs were polymerized, the signal values were recorded by imaging, and the blocking groups were then cleaved with THPP reagent at 65 °C for 1 min.
(7) Fourth round of on-machine test: the nucleotide analog cold dNTPs of the invention were polymerized (only one cold dNTP in each test), standard hot dNTPs were then polymerized, and the signal values were recorded by imaging. The blocking groups were then cleaved with THPP reagent at 65 °C for 1 min.
(8) Fifth round of on-machine test: standard hot dNTPs were polymerized, and the signal values were recorded by imaging.
(9) Polymerization efficiency and excision efficiency were evaluated, and the results are shown in Table 1.
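The five on-machine rounds of steps (4) to (8) can be written down as a small schedule, which makes it easy to see which substrates contribute to each recorded signal. The following Python sketch is only a bookkeeping aid: the class, field, and function names are hypothetical, the signal names C1 to C5 follow the definitions given with the efficiency calculations, and no efficiency formula is implemented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Round:
    """One on-machine round: substrates polymerized in order, then imaging."""
    number: int
    polymerize: tuple[str, ...]
    cleave_after: bool  # blocking groups cleaved after imaging in this round

    @property
    def signal(self) -> str:
        return f"C{self.number}"


HOT = "standard hot dNTP"
COLD = "standard cold dNTP"
TEST = "test cold dNTP analog"

ROUNDS = (
    Round(1, (HOT,), True),
    Round(2, (COLD, HOT), True),
    Round(3, (HOT,), True),
    Round(4, (TEST, HOT), True),
    Round(5, (HOT,), False),  # step (8) lists no cleavage after imaging
)


def signals_with(substrate: str) -> list[str]:
    """Return the signal names of the rounds in which a substrate is polymerized."""
    return [r.signal for r in ROUNDS if substrate in r.polymerize]
```

In this layout `signals_with(TEST)` identifies the fourth-round signal as the one recorded after the test analog was offered, and `signals_with(COLD)` identifies the second-round signal as its standard cold counterpart.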
Polymerization efficiency calculation:
wherein:
EI (incorporation efficiency) is the ratio of the polymerization efficiency of the test nucleotide to that of the control nucleotide;
C1 is the signal value of the first round of on-machine test;
C2 is the signal value of the second round of on-machine test;
C3 is the signal value of the third round of on-machine test;
C4 is the signal value of the fourth round of on-machine test;
Excision efficiency calculation:
wherein:
EC (cleavage efficiency) is the ratio of the excision efficiency of the test nucleotide to that of the control nucleotide;
EI is the ratio of the polymerization efficiency of the test nucleotide to that of the control nucleotide;
C3 is the signal value of the third round of on-machine test;
C5 is the signal value of the fifth round of on-machine test;
CGT is the signal of the C base, G base and T base in the third round.
TABLE 1 polymerization efficiency and excision efficiency of nucleotide analogs of the invention

Claims (18)

  1. The use of the following compounds or salts thereof for determining a target single-stranded polynucleotide sequence,
  2. A compound represented by the formula (A) or (B) or a salt thereof,
    wherein:
    R is selected from -N3, -NRaRb, and -SRc;
    Ra and Rb are each independently selected from H, N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, in particular -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), and Ra and Rb are not simultaneously H;
    preferably, either one of Ra and Rb is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, in particular -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), and the other is H;
    more preferably, either one of Ra and Rb is selected from N3-C1-C6 alkyl, and the other is H;
    most preferably, either one of Ra and Rb is -CH2-N3, and the other is H;
    Rc is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, in particular -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl);
    preferably, Rc is selected from N3-C1-C6 alkyl;
    more preferably, Rc is -CH2-N3;
    R0 is
    n is selected from 1, 2, 3, and 4; preferably, n is 1;
    R' is selected from H, a monophosphate group, a diphosphate group, a triphosphate group, and a tetraphosphate group;
    preferably, R' is a triphosphate group;
    each Z is independently selected from O, S, and BH; preferably, Z is O;
    Base1 and Base2 are each independently selected from a base, a deazabase, or a tautomer thereof, e.g., Base1 and Base2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof;
    preferably, Base1 is selected from the group consisting of
    preferably, Base2 is selected from the group consisting of
    when R is -N3, -N3 is a reversible blocking group;
    when R is -NRaRb, Ra and Rb are reversible blocking groups;
    when R is -SRc, Rc is a reversible blocking group;
    R0 is a reversible blocking group.
  3. The compound of claim 2 or a salt thereof, wherein the compound has a structure represented by the formula (A-1) or (B-1),
    wherein:
    R0 is
    n is selected from 1, 2, 3, and 4; preferably, n is 1;
    R' is selected from H, a monophosphate group, a diphosphate group, a triphosphate group, and a tetraphosphate group;
    preferably, R' is a triphosphate group;
    each Z is independently selected from O, S, and BH; preferably, Z is O;
    Base1 and Base2 are each independently selected from a base, a deazabase, or a tautomer thereof, e.g., Base1 and Base2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof;
    preferably, Base1 is selected from the group consisting of
    preferably, Base2 is selected from the group consisting of
  4. The compound of claim 2 or a salt thereof, wherein the compound has a structure represented by the formula (A-2) or (B-2),
    wherein:
    Ra and Rb are each independently selected from H, N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, in particular -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), and Ra and Rb are not simultaneously H;
    preferably, either one of Ra and Rb is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, in particular -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl), and the other is H;
    more preferably, either one of Ra and Rb is selected from N3-C1-C6 alkyl, and the other is H;
    most preferably, either one of Ra and Rb is -CH2-N3, and the other is H;
    R0 is
    n is selected from 1, 2, 3, and 4; preferably, n is 1;
    R' is selected from H, a monophosphate group, a diphosphate group, a triphosphate group, and a tetraphosphate group;
    preferably, R' is a triphosphate group;
    each Z is independently selected from O, S, and BH; preferably, Z is O;
    Base1 and Base2 are each independently selected from a base, a deazabase, or a tautomer thereof, e.g., Base1 and Base2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof;
    preferably, Base1 is selected from the group consisting of
    preferably, Base2 is selected from the group consisting of
  5. The compound of claim 2 or a salt thereof, wherein the compound has a structure represented by the formula (A-3) or (B-3),
    wherein:
    Rc is selected from N3-C1-C6 alkyl (e.g., -CH2-N3), C1-C6 alkyl-S-S-C1-C6 alkyl (e.g., C1-C6 alkyl-S-S-CH2-, in particular -CH2-SS-Me, -CH2-SS-Et, -CH2-SS-iPr, or -CH2-SS-t-Bu), and C2-C6 alkenyl-C1-C6 alkyl (e.g., allyl);
    preferably, Rc is selected from N3-C1-C6 alkyl;
    more preferably, Rc is -CH2-N3;
    R0 is
    n is selected from 1, 2, 3, and 4; preferably, n is 1;
    R' is selected from H, a monophosphate group, a diphosphate group, a triphosphate group, and a tetraphosphate group;
    preferably, R' is a triphosphate group;
    each Z is independently selected from O, S, and BH; preferably, Z is O;
    Base1 and Base2 are each independently selected from a base, a deazabase, or a tautomer thereof, e.g., Base1 and Base2 are each independently selected from adenine, 7-deazaadenine, thymine, uracil, cytosine, guanine, 7-deazaguanine, or a tautomer thereof;
    preferably, Base1 is selected from the group consisting of
    preferably, Base2 is selected from the group consisting of
  6. The compound of any one of claims 2-5, or a salt thereof, wherein the compound is selected from the group consisting of:
  7. The compound or salt thereof of any one of claims 2-6, wherein the compound or salt thereof carries an additional detectable label;
    preferably, the additional detectable label carried by the compound or salt thereof is introduced by an affinity reagent (e.g., antibody, aptamer, Affimer, Knottin) that carries the detectable label and that can specifically recognize and bind to an epitope of the compound or salt thereof;
    preferably, the additional detectable label is attached to the compound or salt thereof, optionally via a linking group;
    preferably, the additional detectable label is attached, optionally via a linking group, to Base1 or R0 of the compound or salt thereof;
    preferably, the linking group is a cleavable linking group or a non-cleavable linking group;
    preferably, the cleavable linking group is selected from the group consisting of an electrophilically cleavable linking group, a nucleophilically cleavable linking group, a photolyzable linking group, a linking group cleaved under reducing conditions, a linking group cleaved under oxidizing conditions, a safety-catch linking group, a linking group cleaved via an elimination mechanism, or any combination thereof;
    preferably, compounds of formula A having different Base1 carry different additional detectable labels;
    preferably, compounds of formula B having different Base2 carry different additional detectable labels;
    preferably, the detectable label is a fluorescent label;
    preferably, the detectable label is selected from the group consisting of:
  8. A method of terminating nucleic acid synthesis comprising: incorporating a compound of any one of claims 2-7, or a salt thereof, into a nucleic acid molecule to be terminated;
    preferably, the incorporation of the compound or salt thereof is effected by a terminal transferase, terminal polymerase or reverse transcriptase;
    preferably, the method comprises: incorporating the compound or salt thereof into a nucleic acid molecule to be terminated using a polymerase;
    preferably, the method comprises: performing a nucleotide polymerization reaction using a polymerase under conditions that allow the polymerase to perform nucleotide polymerization, thereby incorporating the compound or salt thereof into the 3' end of the nucleic acid molecule to be terminated.
  9. A method of preparing a growing polynucleotide complementary to a single stranded polynucleotide of interest in a sequencing reaction, comprising incorporating a compound of any one of claims 2-7 or a salt thereof into the growing complementary polynucleotide, wherein incorporation of the compound or salt thereof prevents any subsequent nucleotide incorporation into the growing complementary polynucleotide;
    preferably, the incorporation of the compound or salt thereof is effected by a terminal transferase, terminal polymerase or reverse transcriptase;
    preferably, the method comprises: incorporating the compound or salt thereof into the growing complementary polynucleotide using a polymerase;
    preferably, the method comprises: the compound or salt thereof is incorporated into the 3' end of the growing complementary polynucleotide by nucleotide polymerization using a polymerase under conditions that allow the polymerase to perform nucleotide polymerization.
  10. A nucleic acid intermediate formed in the sequencing of the single stranded polynucleotide of interest, wherein,
    the nucleic acid intermediate is formed by the steps of:
    incorporating into the growing nucleic acid strand a nucleotide complementary to the single stranded polynucleotide of interest to form the nucleic acid intermediate, wherein the one complementary nucleotide incorporated is a compound of any one of claims 2-7 or a salt thereof;
    alternatively, the nucleic acid intermediate is formed by:
    incorporating into the growing nucleic acid strand a nucleotide complementary to the single stranded polynucleotide of interest to form the nucleic acid intermediate, wherein the one complementary nucleotide incorporated is the compound of any one of claims 2 to 7 or a salt thereof, and wherein at least one nucleotide complementary to the single stranded polynucleotide of interest has been pre-incorporated into the growing nucleic acid strand, the at least one pre-incorporated nucleotide being the compound of any one of claims 2 to 7 or a salt thereof from which the reversible blocking group and optionally the detectable label have been removed.
  11. A method of determining the sequence of a single stranded polynucleotide of interest comprising:
    1) Monitoring incorporation of nucleotides complementary to a single-stranded polynucleotide of interest in a growing nucleic acid strand, wherein at least one complementary nucleotide incorporated is a compound of any one of claims 2-7 or a salt thereof, and,
    2) Determining the type of nucleotide incorporated;
    preferably, the reversible blocking group and optionally the detectable label are removed prior to the introduction of the next complementary nucleotide;
    preferably, the reversible blocking group and the detectable label are removed simultaneously;
    preferably, the reversible blocking group and the detectable label are removed sequentially; for example, the reversible blocking group is removed after the detectable label is removed, or the detectable label is removed after the reversible blocking group is removed.
  12. The method of claim 11, comprising the steps of:
    (a) Providing a plurality of different nucleotides, wherein at least one nucleotide is a compound of claim 7 or a salt thereof, optionally the remaining nucleotides are compounds of any one of claims 2-7 or salts thereof;
    (b) Incorporating the plurality of different nucleotides into a complementary sequence of a single stranded polynucleotide of interest, wherein the plurality of different nucleotides are distinguishable from one another upon detection;
    (c) Detecting the nucleotides of (b) to thereby determine the type of nucleotide incorporated;
    (d) Removing the reversible blocking group and optionally the detectable label carried thereby from the nucleotide of (b); and
    (e) Optionally repeating steps (a) - (d) one or more times;
    thereby determining the sequence of the single stranded polynucleotide of interest.
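The cycle of steps (a)-(e) can be illustrated with a minimal Python sketch. It is a toy simulation under stated assumptions, not the claimed chemistry: the names `COMPLEMENT`, `LABEL_FOR_BASE` and `sequence_by_synthesis` are hypothetical, the four label names are placeholders, incorporation and detection are assumed to be error-free, and strand orientation is ignored (bases are listed in the order in which the positions are read).

```python
from typing import Tuple

# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical assignment of one distinguishable label per base.
LABEL_FOR_BASE = {"A": "label_1", "T": "label_2", "C": "label_3", "G": "label_4"}
BASE_FOR_LABEL = {label: base for base, label in LABEL_FOR_BASE.items()}


def sequence_by_synthesis(template: str) -> Tuple[str, str]:
    """Return (growing strand, inferred template), one cycle per template position."""
    growing = []
    inferred = []
    for template_base in template:
        # (a)/(b): all four nucleotides are offered, but only the complementary
        # one is incorporated; its blocking group limits the cycle to one base.
        incorporated = COMPLEMENT[template_base]
        # (c): the label on the incorporated nucleotide identifies its type.
        detected_label = LABEL_FOR_BASE[incorporated]
        called_base = BASE_FOR_LABEL[detected_label]
        # (d): the blocking group and label are removed, so the strand can be
        # extended again in the next cycle (modeled here by simply continuing).
        growing.append(called_base)
        inferred.append(COMPLEMENT[called_base])
    # (e): the loop repeats the cycle until the template is exhausted.
    return "".join(growing), "".join(inferred)
```

For example, `sequence_by_synthesis("ATCG")` yields the growing strand `"TAGC"` and recovers the template `"ATCG"`.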
  13. The method of claim 11, comprising the steps of:
    (1) Providing a first nucleotide, a second nucleotide, a third nucleotide and a fourth nucleotide, at least one of the four nucleotides being a compound of claim 7 or a salt thereof, optionally the remaining nucleotides being a compound of any one of claims 2 to 7 or a salt thereof;
    (2) Contacting the four nucleotides with a single-stranded polynucleotide of interest; removing the nucleotides not incorporated into the growing nucleic acid strand; detecting the nucleotide incorporated into the growing nucleic acid strand; removing the reversible blocking group and optionally the detectable label carried thereby in the nucleotide incorporated into the growing nucleic acid strand;
    optionally, further comprising (3): repeating (1) - (2) one or more times.
  14. The method of claim 11, comprising the steps of:
    (a) Providing a mixture comprising a duplex, a nucleotide comprising at least one compound of claim 7 or salt thereof, a polymerase, and an excision reagent; the duplex comprises a growing nucleic acid strand and a nucleic acid strand to be sequenced;
    (b) Carrying out a reaction comprising the following steps (i), (ii) and (iii), optionally repeated one or more times:
    step (i): incorporating the compound or salt thereof into a growing nucleic acid strand using a polymerase to form a nucleic acid intermediate comprising a reversible blocking group and optionally a detectable label;
    step (ii): detecting the nucleic acid intermediate;
    step (iii): excising the reversible blocking group and optionally the detectable label comprised by the nucleic acid intermediate using the excision reagent;
    preferably, the excision of the reversible blocking group and the excision of the detectable label are performed simultaneously, or the excision of the reversible blocking group and the excision of the detectable label are performed stepwise (e.g., excision of the reversible blocking group first, or excision of the detectable label first);
    preferably, the excision reagent used for excision of the reversible blocking group and for excision of the detectable label is the same reagent;
    preferably, the excision reagent used for excision of the reversible blocking group and excision of the detectable label is a different reagent.
  15. The method of claim 14, wherein the duplex is attached to a support;
    preferably, the growing nucleic acid strand is a primer;
    preferably, the primer forms the duplex by annealing to a nucleic acid strand to be sequenced;
    preferably, the duplex, the compound or salt thereof, and the polymerase together form a reaction system containing a solution phase and a solid phase;
    preferably, the compound or salt thereof is incorporated into a growing nucleic acid strand using a polymerase under conditions that allow the polymerase to undergo nucleotide polymerization to form a nucleic acid intermediate comprising a reversible blocking group and optionally a detectable label;
    preferably, the polymerase is selected from KOD polymerase or a mutant thereof (e.g., KOD POL151, KOD POL157, KOD POL171, KOD POL174, KOD POL376, KOD POL 391);
    preferably, before any step of detecting the nucleic acid intermediate, the solution phase of the reaction system of the previous step is removed, leaving the duplex attached to the support;
    preferably, the excision reagent is contacted with the duplex or the growing nucleic acid strand in a reaction system comprising a solution phase and a solid phase;
    preferably, the excision reagent is capable of excising the reversible blocking group and optionally the detectable label carried by the compound incorporated into the growing nucleic acid strand, without affecting the phosphodiester bonds of the duplex backbone;
    preferably, after any step of excising the reversible blocking group and optionally the detectable label comprised in the nucleic acid intermediate, the solution phase of the reaction system of this step is removed;
    preferably, a washing operation is performed after any step that includes a removing operation;
    preferably, after step (ii), further comprising: determining the type of compound incorporated into the growing nucleic acid strand in step (i) from the signal detected in step (ii), and determining the type of nucleotide at the corresponding position in the nucleic acid strand to be sequenced based on the base complementary pairing rules.
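The final determination step (signal to incorporated compound to template base) can be sketched as a simple lookup. This is only an illustration under stated assumptions: the channel names, the `BASE_FOR_CHANNEL` mapping and the function `call_template_base` are hypothetical, each of the four compounds is assumed to carry a label read in its own channel, and the brightest channel is taken as the detected signal.

```python
from typing import Dict

# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical mapping from detection channel to the base of the labeled compound.
BASE_FOR_CHANNEL = {"ch1": "A", "ch2": "T", "ch3": "C", "ch4": "G"}


def call_template_base(intensities: Dict[str, float]) -> str:
    """Infer the template base at one position from per-channel signal intensities."""
    # The brightest channel is assumed to identify the incorporated compound.
    brightest = max(intensities, key=intensities.get)
    incorporated = BASE_FOR_CHANNEL[brightest]
    # Base complementary pairing gives the base in the strand being sequenced.
    return COMPLEMENT[incorporated]
```

With intensities `{"ch1": 0.05, "ch2": 0.9, "ch3": 0.1, "ch4": 0.02}`, the incorporated compound is read as T, so the template base is called as A. A real base caller would also need to handle noise, crosstalk between channels and ambiguous signals, which this sketch ignores.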
  16. A kit comprising at least one compound of any one of claims 2-7 or a salt thereof;
    preferably, the kit comprises a first, a second, a third and a fourth compound, each independently a compound of any one of claims 2-7 or a salt thereof;
    preferably, in the first compound, Base 1 is selected from adenine, 7-deazaadenine or a tautomer thereof (e.g., [structure image]); in the second compound, Base 1 is selected from thymine, uracil or a tautomer thereof (e.g., [structure image]); in the third compound, Base 1 is selected from cytosine or a tautomer thereof (e.g., [structure image]); in the fourth compound, Base 1 is selected from guanine, 7-deazaguanine or a tautomer thereof (e.g., [structure image]);
    preferably, in the first compound, Base 2 is selected from adenine, 7-deazaadenine or a tautomer thereof (e.g., [structure image]); in the second compound, Base 2 is selected from thymine, uracil or a tautomer thereof (e.g., [structure image]); in the third compound, Base 2 is selected from cytosine or a tautomer thereof (e.g., [structure image]); in the fourth compound, Base 2 is selected from guanine, 7-deazaguanine or a tautomer thereof (e.g., [structure image]);
    preferably, the Base 1 or Base 2 comprised in the first, second, third and fourth compounds are different from each other;
    preferably, the additional detectable labels carried by the first, second, third and fourth compounds are different from each other.
  17. The kit of claim 16, wherein the kit further comprises: reagents for pre-treating nucleic acid molecules; a support for attaching nucleic acid molecules to be sequenced; reagents for attaching (e.g., covalently or non-covalently attaching) a nucleic acid molecule to be sequenced to a support; primers for initiating nucleotide polymerization; a polymerase for performing nucleotide polymerization; one or more buffer solutions; one or more wash solutions; or any combination thereof.
  18. Use of a compound according to any one of claims 2 to 7 or a salt thereof or a kit according to any one of claims 16 to 17 for determining the sequence of a single stranded polynucleotide of interest.
CN202280045385.6A 2021-07-08 2022-07-05 Modified nucleosides or nucleotides Pending CN117561269A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN2021107748013 2021-07-08
CN202110774801 2021-07-08
PCT/CN2022/103895 WO2023280156A1 (en) 2021-07-08 2022-07-05 Modified nucleoside or nucleotide

Publications (1)

Publication Number Publication Date
CN117561269A true CN117561269A (en) 2024-02-13

Family

ID=84801271

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280045385.6A Pending CN117561269A (en) 2021-07-08 2022-07-05 Modified nucleosides or nucleotides

Country Status (2)

Country Link
CN (1) CN117561269A (en)
WO (1) WO2023280156A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8399196B2 (en) * 2003-02-21 2013-03-19 Geneform Technologies Limited Nucleic acid sequencing methods, kits and reagents
ES2521740T3 (en) * 2009-07-06 2014-11-13 Trilink Biotechnologies Chemically modified ligase cofactors, donors and acceptors
CN108239669B (en) * 2016-12-23 2021-03-16 深圳华大智造科技股份有限公司 Method and kit for detecting content of non-blocking impurities in reversible blocking dNTP
CN110650968B (en) * 2017-10-11 2022-07-05 深圳华大智造科技股份有限公司 Modified nucleosides or nucleotides
WO2021130151A1 (en) * 2019-12-23 2021-07-01 Baseclick Gmbh Method of amplifying mrnas and for preparing full length mrna libraries

Also Published As

Publication number Publication date
WO2023280156A1 (en) 2023-01-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination