CN110650968B - Modified nucleosides or nucleotides - Google Patents


Info

Publication number
CN110650968B
Authority
CN
China
Prior art keywords
compound
nucleic acid
group
polymerase
kod
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201780090915.8A
Other languages
Chinese (zh)
Other versions
CN110650968A (en)
Inventor
刘二凯
陈奥
章文蔚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MGI Tech Co Ltd
Original Assignee
MGI Tech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MGI Tech Co Ltd
Publication of CN110650968A
Application granted
Publication of CN110650968B
Legal status: Active

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07H: SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H19/00: Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof
    • C07H19/02: Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof sharing nitrogen
    • C07H19/04: Heterocyclic radicals containing only nitrogen atoms as ring hetero atom
    • C07H19/06: Pyrimidine radicals
    • C07H19/10: Pyrimidine radicals with the saccharide radical esterified by phosphoric or polyphosphoric acids
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids

Abstract

The present invention relates to the field of nucleic acid sequencing. In particular, the present invention provides a modified nucleoside or nucleotide that has a reversibly blocked 3'-OH and carries a detectable label. The invention also relates to a kit comprising said nucleoside or nucleotide, a method for preparing said nucleoside or nucleotide, and a sequencing method based on said nucleoside or nucleotide.

Description

Modified nucleosides or nucleotides
Technical Field
The present invention relates to the field of nucleic acid sequencing. In particular, the present invention provides a modified nucleoside or nucleotide that has a reversibly blocked 3'-OH and carries a detectable label. The invention also relates to a kit comprising said nucleoside or nucleotide, a method for preparing said nucleoside or nucleotide, and a sequencing method based on said nucleoside or nucleotide.
Background
DNA sequencing technologies include first-generation DNA sequencing, represented by the Sanger method, and second-generation DNA sequencing, represented by Illumina HiSeq 2500, Roche 454, ABI SOLiD, BGISEQ-500 and the like. The Sanger method is simple to operate, gives intuitive and accurate results and has a short experimental cycle, and it is widely used in fields that demand timely results, such as clinical gene mutation detection and genotyping. However, Sanger sequencing has low throughput and high cost, which limits its use in large-scale gene sequencing.
Compared with first-generation DNA sequencing, second-generation DNA sequencing offers high throughput, low cost, a high degree of automation and single-molecule sequencing. Taking HiSeq 2500 V2 sequencing as an example, a single run can generate 10-200 G bases of data, the average cost per base is less than 1/1000 of that of the Sanger method, and the results can be processed and analyzed directly by computer. Second-generation DNA sequencing is therefore well suited to large-scale sequencing.
The core of second-generation sequencing is sequencing-by-synthesis, which usually relies on reversibly blocked deoxyribonucleoside triphosphates (dNTPs). In brief, the process comprises: incorporating a dNTP carrying a reversible blocking group and a detectable label into a growing nucleic acid strand by polymerase catalysis, wherein the reversible blocking group protects the 3'-OH of the dNTP and prevents the dNTP already incorporated into the nucleic acid strand from reacting with free dNTPs via its 3'-OH; detecting the detectable label to identify the incorporated dNTP; and then removing the blocking group and the detectable label from the incorporated dNTP so that the next round of sequencing can begin.
At present, various groups are used to reversibly block the 3'-OH. For example, Illumina blocks the 3'-OH with azidomethylene; after polymerization and detection, the azidomethylene is cleaved with an organic phosphine to release the 3'-OH so that the next sequencing cycle can proceed. The Jingyue Ju research group at Columbia University blocks the 3'-OH with allyl and removes the blocking group by a palladium-catalyzed deallylation reaction, using palladium metal and the ligand triphenylphosphine tri-m-sulfonate trisodium salt. In addition, reversible blocking has also been achieved with a carbonate combined with a thiosulfide bond: cleaving the thiosulfide bond generates a mercapto group, which in turn cleaves the carbonate attached to the 3'-OH.
In sequencing-by-synthesis, the requirements on the blocking method are very strict. Of the blocking methods mentioned above, only Illumina's azidomethylene blocking has been used successfully for sequencing; the others have run into various problems and have not yet been applied successfully. The reasons include: 1) DNA sequencing reactions must be carried out in a near-neutral aqueous system, which makes many chemical reactions and blocking groups impractical; 2) the group formed after blocking must be very stable, yet the cleavage reaction must be rapid and efficient, which rules out many reaction types, for example the carbonate and thiosulfide bond method above, which is difficult to apply because its cleavage rate is insufficient; 3) the cleavage reaction must not affect the sequencing system; for example, when allyl is used as the blocking group, cleavage requires catalysis by a palladium metal complex, which in practice can be incompatible with the chip, so that no mature product has reached the market.
In addition, although the azidomethylene blocking technique used by Illumina is well established, it still has problems, including insufficient stability of the azide, which can reduce sequencing quality. To address these problems, Illumina optimized azidomethylene for increased stability in patent application WO2014139596A1; however, although the cleavage efficiency could reach 100%, higher temperatures were still required and the cleavage reagent had a negative effect on the DNA. Furthermore, in patent application US20130085073A1 Illumina uses phosphate-blocked dNTPs for sequencing-by-synthesis. That application clearly shows that such dNTPs can be polymerized by the polymerase only when the hydroxyl groups on the phosphate are fully protected, whereas a phosphate group whose hydroxyl groups are all protected cannot be cleaved by endonuclease IV. The method disclosed there therefore comprises: polymerizing dNTPs whose 3'-OH is blocked by a phosphate group with fully protected hydroxyl groups, followed by stepwise cleavage, i.e., first removing one hydroxyl-protecting group from the phosphate group and then cleaving the entire phosphate group with endonuclease IV, wherein the protecting group is removed by methods including photocleavage and removal of a methyl group by Ada (a methyltransferase). This method is theoretically feasible but very cumbersome in practice.
Disclosure of Invention
In order to solve the above problems, the inventors of the present application developed a novel nucleotide in which the 3'-OH is reversibly blocked, which has higher stability, can be polymerized by a polymerase efficiently and even completely, and can be cleaved efficiently, even to 100%, under conditions that do not damage DNA.
Summary of The Invention
The inventors of the present application have developed a novel modified nucleoside or nucleotide in which the 3'-OH is blocked by a blocking group that can be cleaved by a cleavage reagent to produce a free 3'-OH without destroying the nucleic acid strand. The modified nucleoside or nucleotide also carries a detectable label, which is attached to the base of the nucleotide by a linker comprising a phosphodiester bond; this bond is cleavable by a cleavage reagent (the same as or different from the one used to cleave the blocking group) to remove the detectable label without destroying the nucleic acid strand.
Thus, in one aspect, the present application provides compounds having a structure represented by formula (I),
Figure GDA0002274171490000031
wherein R1 and R3 are each independently selected from
Figure GDA0002274171490000032
Figure GDA0002274171490000033
A is selected from phenyl, naphthyl, indolyl and pyridyl;
R2 is selected from nitro, halo C1-4 alkyl, halogen, hydrogen, an aldehyde group,
Figure GDA0002274171490000034
wherein Q is independently selected from C1-4 alkyl;
R4 is selected from -H, a monophosphate group (-PO3H2), a diphosphate group (-PO3H-PO3H2), a triphosphate group (-PO3H-PO3H-PO3H2) and a tetraphosphate group (-PO3H-PO3H-PO3H-PO3H2);
R6 is hydrogen or hydroxy;
m and n are each independently selected from 0, 1, 2, 3, 4 and 5;
L is a linking group or is absent;
Base represents a base, for example a purine base or a pyrimidine base, for example one selected from A, T, U, C and G;
Label represents a detectable label, for example a fluorophore;
Blocker represents a blocking group.
In one aspect, the present application also provides a method of making a modified nucleoside or nucleotide as described above.
The inventors of the present application have also developed a method for sequencing a polynucleotide based on the modified nucleoside or nucleotide of the present invention. In the sequencing method of the present invention, sequencing is performed while synthesizing a growing polynucleotide complementary to a target single-stranded polynucleotide.
Thus, in one aspect, the application provides a method of preparing a growing polynucleotide complementary to a target single stranded polynucleotide in a sequencing reaction, comprising incorporating a compound as defined above into the growing complementary polynucleotide, wherein the incorporation of the compound prevents the introduction of any subsequent nucleotides into the growing complementary polynucleotide.
In another aspect, the present application provides a method of determining the sequence of a target single-stranded polynucleotide, comprising: monitoring the sequential incorporation of complementary nucleotides, wherein at least one of the complementary nucleotides incorporated is a compound as defined above, and detecting the detectable label carried by the compound.
In certain embodiments, the method of determining the sequence of a target single-stranded polynucleotide comprises:
(a) providing a mixture comprising a duplex, a compound of formula (I), a polymerase and a cleavage reagent; the duplex comprises a growing nucleic acid strand and a nucleic acid molecule to be sequenced;
(b) carrying out a reaction cycle comprising the following steps (i), (ii) and (iii):
step (i): incorporating the compound into a growing nucleic acid strand using a polymerase, forming a nucleic acid intermediate comprising a blocking group and a detectable label;
step (ii): detecting a detectable label on the nucleic acid intermediate;
step (iii): the blocking group on the nucleic acid intermediate is removed using a cleavage reagent.
In certain embodiments, the reaction cycle further comprises step (iv): the detectable label on the nucleic acid intermediate is removed using a cleavage reagent.
In certain embodiments, the cleavage reagent used in step (iii) and step (iv) is the same reagent. In certain embodiments, the cleavage reagents used in step (iii) and step (iv) are different reagents.
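The reaction cycle of steps (i) to (iv) can be sketched as a toy simulation. This is only an illustrative model, assuming ideal (100%) incorporation and cleavage and a DNA template; the class name, function name and label names are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import Optional

# Watson-Crick partners for a DNA template (assumption: DNA only, no U).
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical label assignment: one distinguishable label per base.
LABEL_OF = {"A": "label_A", "C": "label_C", "G": "label_G", "T": "label_T"}
BASE_OF = {label: base for base, label in LABEL_OF.items()}


@dataclass
class GrowingStrand:
    bases: str = ""               # bases incorporated so far, 5'->3'
    blocked: bool = False         # True while the 3'-OH carries a blocking group
    label: Optional[str] = None   # label on the most recently added base


def sequence_by_synthesis(template_3_to_5: str) -> str:
    """Read the template one base per cycle and return the base calls (5'->3')."""
    strand = GrowingStrand()
    calls = []
    for template_base in template_3_to_5:
        # Step (i): the polymerase incorporates the complementary compound,
        # which carries both a blocking group and a detectable label.
        if strand.blocked:
            raise RuntimeError("a blocked 3'-OH cannot be extended")
        incoming = PAIR[template_base]
        strand.bases += incoming
        strand.blocked = True
        strand.label = LABEL_OF[incoming]
        # Step (ii): detect the label to identify the incorporated base.
        calls.append(BASE_OF[strand.label])
        # Step (iii): the cleavage reagent removes the blocking group.
        strand.blocked = False
        # Step (iv): the cleavage reagent removes the detectable label.
        strand.label = None
    return "".join(calls)


print(sequence_by_synthesis("TGCA"))  # ACGT
```

Because the blocking group is only removed in step (iii), the model cannot add more than one base per cycle, which is the property the method relies on.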
In one aspect, the present invention provides a kit comprising first, second, third and fourth compounds, each of which is a compound of general formula (I) as defined above, which are derivatives of nucleotides A, (T/U), C and G, respectively, and have base complementary pairing ability.
Embodiments of the present invention will be explained in detail below with reference to the drawings and detailed description of the invention. However, those skilled in the art will appreciate that the drawings and detailed description presented below are illustrative of the invention and are not intended to limit the scope of the invention. Various objects and advantageous aspects of the present invention will become apparent to those skilled in the art from the detailed disclosure of the drawings and the detailed description of the invention.
Brief Description of Drawings
FIG. 1 is an HPLC chromatogram of intermediate compound IV-1 involved in the preparation of Cy3-labeled, 3'-OH-blocked dTTP in example 1.
FIG. 2 is an HPLC chromatogram of the Cy3-labeled, 3'-OH-blocked dTTP obtained in example 1.
FIG. 3 is an HPLC chromatogram of intermediate compound IV-2 involved in the preparation of Cy3-labeled, 3'-OH-blocked dATP in example 2.
FIG. 4 schematically illustrates the procedure for measuring dTTP polymerization efficiency in example 3.
FIG. 5 schematically illustrates a procedure for determining the cleavage efficiency of blocking groups in example 4.
FIG. 6 shows the efficiency of cleavage of the blocking group at different cleavage times in example 4.
FIG. 7 schematically illustrates a procedure for measuring the efficiency of the cleavage of the fluorescent group in example 5.
FIG. 8 shows the efficiency of fluorophore cleavage at different cleavage times in example 5.
FIG. 9 is an LC profile of dTTP blocked by methyl phosphate in example 6 after stability testing.
FIG. 10 is the LC profile of azidomethylene blocked dTTP of example 6 after stability testing.
Information on the sequences to which the invention relates is provided in the table below.
Sequence No. (SEQ ID NO:)    Description
1                            Template
2                            Primer
3                            Primer
Sequence information
Sequence 1 (SEQ ID NO:1): 33 bp
5'-CAACAGAAGGATTCTGGCGAACCGGAGGCTGAA-3'
Sequence 2 (SEQ ID NO:2): 31 bp
3'-TGTCTTCCTAAGACCGCTTGGCCTCCGACTT-5'
Sequence 3 (SEQ ID NO:3): 32 bp
3'-TTGTCTTCCTAAGACCGCTTGGCCTCCGACTT-5'
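How the two primers relate to the template can be checked with a short script. This is a minimal sketch using only the three sequences listed above; the helper names are hypothetical. It confirms that both primers are fully complementary to the template and that SEQ ID NO:3 equals SEQ ID NO:2 extended by one T at its 3' end, which is consistent with a single-base extension opposite a template A, although the source does not state that interpretation at this point.

```python
from typing import Optional

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

TEMPLATE = "CAACAGAAGGATTCTGGCGAACCGGAGGCTGAA"   # SEQ ID NO:1, written 5'->3'
PRIMER_2 = "TGTCTTCCTAAGACCGCTTGGCCTCCGACTT"     # SEQ ID NO:2, written 3'->5'
PRIMER_3 = "TTGTCTTCCTAAGACCGCTTGGCCTCCGACTT"    # SEQ ID NO:3, written 3'->5'


def complement(seq: str) -> str:
    """Base-by-base complement (no reversal, so 3'->5' becomes 5'->3' partner)."""
    return "".join(COMPLEMENT[base] for base in seq)


def anneal_start(template: str, primer_3_to_5: str) -> int:
    """Template index paired with the primer's 3'-terminal base, or -1 if no match."""
    return template.find(complement(primer_3_to_5))


def next_nucleotide(template: str, primer_3_to_5: str) -> Optional[str]:
    """Base that would be added next at the primer's 3' end, if any."""
    start = anneal_start(template, primer_3_to_5)
    if start <= 0:
        return None  # primer does not anneal, or no template remains
    return COMPLEMENT[template[start - 1]]


print(anneal_start(TEMPLATE, PRIMER_2))      # 2
print(next_nucleotide(TEMPLATE, PRIMER_2))   # T
print("T" + PRIMER_2 == PRIMER_3)            # True
```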
Detailed Description
In the present invention, unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In embodiments of the invention, methods and materials similar or equivalent to those described herein can be used, and only exemplary suitable methods and materials are described below. All publications, patent applications, patents, and other references are incorporated herein by reference. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting. To aid understanding of the present invention, definitions and explanations of related terms are provided below.
As used herein, the term "C1-4 alkyl" refers to a straight or branched chain alkyl group containing 1 to 4 carbon atoms, including but not limited to C1-2 alkyl, C1-3 alkyl and C2-4 alkyl, for example methyl, ethyl, n-propyl, isopropyl, n-butyl, sec-butyl, isobutyl, tert-butyl and the like.
As used herein, the term "halo" refers to a substitution of a hydrogen on a group or compound by one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9) halogen atoms, including perhalogenation and partial halogenation.
As used herein, the term "halogen" includes fluorine, chlorine, bromine, iodine.
As used herein, the term "halo C1-4 alkyl" refers to a group derived by replacing one or more hydrogen atoms of a C1-4 alkyl with one or more halogen atoms, where "halogen" and "C1-4 alkyl" are as defined above. Halo C1-4 alkyl includes but is not limited to halo C1-2 alkyl, halo C1-3 alkyl and halo C2-4 alkyl, such as halomethyl (e.g. fluoromethyl, chloromethyl), haloethyl (e.g. fluoroethyl, chloroethyl), halo-n-propyl, halo-isopropyl, halo-n-butyl, halo-sec-butyl, halo-isobutyl and halo-tert-butyl.
As used herein, the term "block" refers to using a particular group to protect the 3'-OH of a nucleoside or nucleotide so as to terminate polymerization by a polymerase (e.g., a DNA polymerase). The group used for blocking is referred to as a "blocking group". In certain preferred embodiments, the blocking group can be removed so that the blocked hydroxyl group is converted back to a reactive hydroxyl group; such blocking is referred to as "reversible blocking". The group used for reversible blocking is referred to as a "reversible blocking group".
As used herein, the term "support" refers to any material (solid or semi-solid) that allows for stable attachment of nucleic acids, such as latex beads, dextran beads, polystyrene, polypropylene, polyacrylamide gel, gold thin layers, glass and silicon wafers, and the like. In some exemplary embodiments, the support is optically transparent, e.g., glass. As used herein, "stably attached" means that the linkage between the nucleic acid molecule and the support is sufficiently strong so that the nucleic acid molecule does not become detached from the support due to the conditions used in various reactions or processes, such as polymerization and washing processes.
As used herein, the term "linked" is intended to encompass any form of linkage, such as covalent and non-covalent linkages. In certain exemplary embodiments, the nucleic acid molecule is attached to the support, preferably by covalent means.
As used herein, the term "fragmenting" refers to the process of converting large nucleic acid fragments (e.g., large DNA fragments) into small nucleic acid fragments (e.g., small DNA fragments). In certain embodiments, the term "large nucleic acid fragment" is intended to encompass nucleic acid molecules (e.g., DNA) greater than 5kb, greater than 10kb, greater than 25kb, such as greater than 500kb, greater than 1Mb, greater than 5Mb, or greater.
As used herein, the term "end-filling" refers to a process of filling the ends of a nucleic acid molecule having an overhanging end to form a nucleic acid molecule having a blunt end.
Herein, the terms "linker" and "linker sequence" are used interchangeably. As used herein, the terms "linker" and "linker sequence" refer to a stretch of oligonucleotide sequence artificially introduced at the 5' and/or 3' end of a nucleic acid molecule. A linker may typically contain one or more regions for performing a particular function. Thus, when linkers are artificially introduced at the 5' and/or 3' end of a nucleic acid molecule, the linkers will be able to perform the specified function, thereby facilitating subsequent use. For example, the linker may comprise one or more primer binding regions to facilitate binding of primers. In some exemplary embodiments, the linker may comprise one or more primer binding regions, for example a primer binding region capable of hybridizing to a primer for performing amplification, and/or a primer binding region capable of hybridizing to a primer for a sequencing reaction. In certain preferred embodiments, the linker comprises a universal linker sequence capable of hybridizing to a universal primer, e.g., a universal linker sequence capable of hybridizing to a universal amplification primer and/or a universal sequencing primer. Thus, nucleic acid molecules carrying linkers can be conveniently amplified and/or sequenced by using universal amplification primers and/or universal sequencing primers. In some exemplary embodiments, the linker may further comprise a tag or tag sequence.
The terms "tag" and "tag sequence" are used interchangeably herein. As used herein, the terms "tag" and "tag sequence" refer to a stretch of oligonucleotide sequence of a particular base sequence artificially introduced at the 5 'and/or 3' end of a nucleic acid molecule. Tags are commonly used to identify/distinguish the source of a nucleic acid molecule. For example, different tags may be introduced into nucleic acid molecules of different origins, respectively, whereby, when these nucleic acid molecules of different origins are mixed together, the origin of each nucleic acid molecule can be accurately determined by the unique tag sequence carried on each nucleic acid molecule. The tag sequence may have any length, e.g.2-50 bp, such as 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50bp, according to the actual requirements.
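The way a tag identifies the source of a nucleic acid molecule in a mixed pool can be illustrated with a short sketch. The tag sequences, sample names and the assumption that the tag sits at the 5' end of each read are hypothetical and chosen only for illustration; the source allows tags at either end and of any length.

```python
from collections import defaultdict

# Hypothetical 4-nt tags, one per source; not taken from the source document.
TAG_TO_SOURCE = {"ACGT": "sample_1", "TGCA": "sample_2"}
TAG_LENGTH = 4


def demultiplex(reads):
    """Group reads by the source indicated by their 5' tag and strip the tag."""
    by_source = defaultdict(list)
    for read in reads:
        tag, insert = read[:TAG_LENGTH], read[TAG_LENGTH:]
        # Reads whose tag is not recognized are kept apart rather than guessed.
        source = TAG_TO_SOURCE.get(tag, "unassigned")
        by_source[source].append(insert)
    return dict(by_source)


pooled = ["ACGTGGGA", "TGCATTTC", "CCCCAAAA"]
print(demultiplex(pooled))
# {'sample_1': ['GGGA'], 'sample_2': ['TTTC'], 'unassigned': ['AAAA']}
```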
As used herein, the term "hybridize" generally refers to hybridization under stringent conditions. Hybridization techniques are well known in the field of molecular biology. For illustrative purposes, the stringent conditions include, e.g., moderately stringent conditions (e.g., hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45 °C followed by one or more washes in 0.2× SSC/0.1% SDS at about 50-65 °C); high stringency conditions (e.g., hybridization in 6× SSC at about 45 °C followed by one or more washes in 0.1× SSC/0.2% SDS at about 68 °C); and other stringent hybridization conditions known to those of skill in the art (see, e.g., Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, Vol. 1, Green Publishing Associates, Inc., and John Wiley & Sons, Inc., New York, pp. 6.3.1-6.3.6 and 2.10.3).
As used herein, the expression "reaction system comprising a solution phase and a solid phase" means that the reaction system of the invention comprises both a support and a substance attached to the support (solid phase) and a substance dissolved in a solution/solvent (solution phase). Accordingly, the expression "removing the solution phase of the reaction system" means that the solution in the reaction system and the substance contained therein (solution phase) are removed while only the support in the reaction system and the substance attached to the support (solid phase) remain. In the context of the present invention, the substance (solid phase) attached to the support may comprise a nucleic acid molecule to be sequenced, a growing nucleic acid strand, and/or a duplex formed by the nucleic acid molecule to be sequenced and the growing nucleic acid strand.
As used herein, the term "primer" refers to an oligonucleotide that hybridizes to a complementary sequence and initiates a specific polymerization reaction. In general, primer sequences are selected/designed to have maximum hybridization activity for complementary sequences and very low non-specific hybridization activity for other sequences, thereby minimizing non-specific amplification. Methods of designing primers are well known to those skilled in the art and can be performed using commercially available software (e.g., Primer Premier version 6.0, Oligo version 7.36, etc.).
As used herein, the term "polymerase" refers to an enzyme capable of performing a nucleotide polymerization reaction. Such enzymes are capable of introducing nucleotides at the 3' end of a growing nucleic acid strand that pair with nucleotides at corresponding positions in the template nucleic acid according to the base complementary pairing rules.
As used herein, the expression "A, (T/U), C and G" is intended to cover two cases: "A, T, C and G" and "A, U, C and G". Thus, the expression "the four compounds are derivatives of nucleotides A, (T/U), C and G, respectively" is intended to indicate that the four compounds are derivatives of nucleotides A, T, C and G, respectively, or of nucleotides A, U, C and G, respectively.
As used herein, the expression "a compound has base complementary pairing ability" means that the compound is capable of pairing with a corresponding base and forming hydrogen bonds according to the base complementary pairing principle. According to the base complementary pairing rules, base A can pair with base T or U, and base G can pair with base C. Thus, when a compound having base complementary pairing ability is a derivative of nucleotide A, it will be able to pair with base T or U; when it is a derivative of nucleotide T or U, it will be able to pair with base A; when it is a derivative of nucleotide C, it will be able to pair with base G; and when it is a derivative of nucleotide G, it will be able to pair with base C.
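The pairing rules just described can be written down as a small lookup, which also shows how one member of a four-compound set is selected for a given template base. This is a minimal sketch; the function names and the tuple representation of a kit are hypothetical.

```python
# Pairing rules as stated in the text: A pairs with T or U, G pairs with C.
PAIRS_WITH = {
    "A": {"T", "U"},
    "T": {"A"},
    "U": {"A"},
    "C": {"G"},
    "G": {"C"},
}

# The two cases covered by the expression "A, (T/U), C and G".
KIT_WITH_T = ("A", "T", "C", "G")
KIT_WITH_U = ("A", "U", "C", "G")


def can_pair(derivative_base: str, template_base: str) -> bool:
    """True if a derivative of `derivative_base` can pair with `template_base`."""
    return template_base in PAIRS_WITH[derivative_base]


def select_compound(kit, template_base: str) -> str:
    """Return the single kit member able to pair with the template base."""
    matches = [base for base in kit if can_pair(base, template_base)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one match, got {matches}")
    return matches[0]


print(select_compound(KIT_WITH_T, "A"))  # T
print(select_compound(KIT_WITH_U, "A"))  # U
print(select_compound(KIT_WITH_T, "U"))  # A
```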
(I) Modified nucleoside or nucleotide and method for preparing the same
Modified nucleosides or nucleotides
The inventors of the present application have developed a novel modified nucleoside or nucleotide in which the 3'-OH is blocked by a blocking group that can be cleaved by a cleavage reagent to produce a free 3'-OH without destroying the nucleic acid strand. The modified nucleoside or nucleotide also carries a detectable label, which is attached to the base of the nucleotide by a linker comprising a phosphodiester bond; this bond is cleavable by a cleavage reagent (the same as or different from the one used to cleave the blocking group) to remove the detectable label without destroying the nucleic acid strand.
Thus, in one aspect, the present application provides a compound having a structure represented by formula (I),
Figure GDA0002274171490000101
wherein R1 and R3 are each independently selected from
Figure GDA0002274171490000102
Figure GDA0002274171490000103
A is selected from phenyl, naphthyl, indolyl and pyridyl;
R2 is selected from nitro, halo C1-4 alkyl (e.g. fluoro C1-4 alkyl), halogen (e.g., fluorine, chlorine, bromine, iodine), hydrogen, an aldehyde group,
Figure GDA0002274171490000104
wherein Q is independently selected from C1-4 alkyl (e.g. C1-2 alkyl, C1-3 alkyl, C2-4 alkyl, such as methyl, ethyl, n-propyl, isopropyl, n-butyl, sec-butyl, isobutyl, tert-butyl);
R4 is selected from -H, a monophosphate group (-PO3H2), a diphosphate group (-PO3H-PO3H2), a triphosphate group (-PO3H-PO3H-PO3H2) and a tetraphosphate group (-PO3H-PO3H-PO3H-PO3H2);
R6 is hydrogen or hydroxy;
m and n are each independently selected from 0, 1, 2, 3, 4 and 5;
L is a linking group or is absent;
Base represents a base, for example a purine base or a pyrimidine base, for example one selected from A, T, U, C and G;
Label represents a detectable label, for example a fluorophore;
Blocker represents a blocking group.
In the compound, A is an aromatic group; after the phosphodiester bond attached to the aromatic group is broken, a hydroxyl-bearing aromatic ring is formed, so that the cleavage product can exist stably. R1, R2 and R3 are electron-withdrawing groups that help the phosphodiester bond between A and R1 to be cleaved by the cleavage reagent.
In certain embodiments, R1 is selected from
Figure GDA0002274171490000111
In certain embodiments, R1 is
Figure GDA0002274171490000112
In certain embodiments, R2 is selected from nitro, trifluoromethyl, fluorine, chlorine, hydrogen and an aldehyde group. In certain embodiments, R2 is nitro.
In certain embodiments, R3 is selected from
Figure GDA0002274171490000113
In certain embodiments, R3 is
Figure GDA0002274171490000114
In certain embodiments,
Figure GDA0002274171490000115
is selected from:
Figure GDA0002274171490000116
In certain embodiments,
Figure GDA0002274171490000117
is
Figure GDA0002274171490000118
The compounds of the invention may be nucleosides or nucleotides; thus R4 may be -H, a monophosphate group (-PO3H2), a diphosphate group (-PO3H-PO3H2), a triphosphate group (-PO3H-PO3H-PO3H2) or a tetraphosphate group (-PO3H-PO3H-PO3H-PO3H2).
In certain embodiments, R4 is a monophosphate group (-PO3H2), a diphosphate group (-PO3H-PO3H2), a triphosphate group (-PO3H-PO3H-PO3H2) or a tetraphosphate group (-PO3H-PO3H-PO3H-PO3H2). In this case, the compound is a nucleotide.
In certain embodiments, R4 is a triphosphate group (-PO3H-PO3H-PO3H2). In this case, the compound is a nucleoside triphosphate.
In certain embodiments, R4 is -H. In this case, the compound is a nucleoside.
In the compounds of the present invention, Blocker may have various structures. In certain embodiments, the structure of Blocker is:
Figure GDA0002274171490000121
wherein Ra1 and Ra2 are each independently selected from H, F, -CF3, -CHF2, -CH2F, -CH2W, -COOW and -CONHW; W is selected from C1-C6 alkyl.
In certain embodiments, the structure of Blocker is:
Figure GDA0002274171490000122
wherein Rb1, Rb2, Rb3, Rb4 and Rb5 are each independently selected from H and C1-C6 alkyl.
In certain embodiments, the structure of Blocker is:
Figure GDA0002274171490000123
wherein Rc1 and Rc2 are each independently selected from H, F, Cl and -CF3.
In certain embodiments, the structure of Blocker is:
Figure GDA0002274171490000124
wherein R5 is selected from C1-4 alkyl (e.g. C1-2 alkyl, C1-3 alkyl, C2-4 alkyl, such as methyl, ethyl, n-propyl, isopropyl, n-butyl, sec-butyl, isobutyl, tert-butyl). Thus, in certain embodiments, the compounds of the present invention have the structure shown in formula (I'):
Figure GDA0002274171490000125
wherein R5 is selected from C1-4 alkyl (e.g. C1-2 alkyl, C1-3 alkyl, C2-4 alkyl, such as methyl, ethyl, n-propyl, isopropyl, n-butyl, sec-butyl, isobutyl, tert-butyl).
In certain embodiments, R5 is methyl or ethyl.
In the compounds of the invention, the detectable label (Label) is linked to the base (Base) via a phosphodiester bond; the phosphodiester bond can be cleaved by a cleavage reagent, thereby removing the detectable label from the compound. In certain embodiments, the detectable label is a fluorophore. Examples of fluorophores that can be used in the present invention include, but are not limited to, various known fluorescent markers, such as AF532, ALEX-350, FAM, VIC, TET, CAL Fluor Gold 540, JOE, HEX, CAL Fluor Orange 560, TAMRA, CAL Fluor Red 590, ROX, CAL Fluor Red 610, TEXAS RED, CAL Fluor Red 635, Quasar 670, Cy3, Cy3.5, Cy5, Cy5.5, Quasar 705 and the like. Such fluorophores and methods for their detection are well known in the art and can be selected according to actual need. In certain embodiments, the Label in a compound of the invention is Cy3, Cy3.5, Cy5 or Cy5.5.
The compounds of the invention may be deoxyribonucleotides, ribonucleotides, deoxyribonucleosides, or ribonucleosides. Thus, R6 may be hydrogen or hydroxyl. In certain embodiments, R6 is hydrogen. In this case, the compound is a deoxyribonucleotide or a deoxyribonucleoside.
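The relationship between the substituents R4 and R6 and the resulting compound class can be illustrated with a minimal Python sketch. The function name, the string encodings of the substituents, and the constant below are hypothetical conveniences for illustration and are not part of the application.

```python
# Illustrative sketch only: names and string encodings are hypothetical.
PHOSPHATE_GROUPS = {"monophosphate", "diphosphate", "triphosphate", "tetraphosphate"}


def classify_compound(r4: str, r6: str) -> str:
    """Return the compound class implied by substituents R4 and R6.

    r4: "H" (nucleoside) or one of PHOSPHATE_GROUPS (nucleotide).
    r6: "H" (deoxyribose sugar) or "OH" (ribose sugar).
    """
    if r6 not in ("H", "OH"):
        raise ValueError("R6 must be 'H' or 'OH'")
    sugar = "deoxyribo" if r6 == "H" else "ribo"
    if r4 == "H":
        return sugar + "nucleoside"
    if r4 in PHOSPHATE_GROUPS:
        return sugar + "nucleotide"
    raise ValueError("R4 must be 'H' or a phosphate group")


print(classify_compound("triphosphate", "H"))
```

Under these assumptions, a compound with a triphosphate group at R4 and hydrogen at R6 is reported as a deoxyribonucleotide.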
In the compound of the present invention,
Figure GDA0002274171490000132
l functions as a linking group, wherein L may or may not be present. In certain embodiments, m is 1. In certain embodiments, n is 1. In certain embodiments, L is
Figure GDA0002274171490000133
In certain embodiments, the compounds of the present invention have a structure represented by formula (II):
Figure GDA0002274171490000134
In certain embodiments, R2 in general formula (II) is nitro.
In certain embodiments, the Label in formula (II) is Cy3.
In certain embodiments, the compounds of the present invention have the following structure:
Figure GDA0002274171490000141
Figure GDA0002274171490000151
Method for preparing modified nucleosides or nucleotides
The present application also provides methods of making modified nucleosides or nucleotides as described above. An exemplary preparation process includes steps 1-3:
Step 1:
Figure GDA0002274171490000161
Step 1 further comprises the following steps:
Step 1-1: Compound I, as the starting material, is subjected to an oxidation reaction to produce compound II (an intermediate).
In certain embodiments, step 1-1 comprises: adding methanol and tetrazole to acetonitrile, then adding a solution of compound I, and stirring at room temperature so that the diisopropylamino group on compound I is replaced by the methoxy group of methanol; then adding iodine and stirring at room temperature to obtain compound II.
In certain embodiments, the methanol and/or tetrazole is in excess relative to compound I. In certain embodiments, the molar ratio of methanol to compound I is 2-5:1, e.g., 2:1, 3:1, 4:1, or 5:1, for example 3:1. In certain embodiments, the molar ratio of tetrazole to compound I is 2-5:1, e.g., 2:1, 3:1, 4:1, or 5:1, for example 3:1.
In certain embodiments, the iodine is added to the reaction system in the form of a solution, for example, a solution comprising water and/or an organic solvent (e.g., a solution comprising water, pyridine, and/or tetrahydrofuran).
In certain embodiments, the iodine is in excess relative to compound I. In certain embodiments, the molar ratio of compound I to iodine is 1:1.5 to 1:5, e.g., 1:1.5, 1:2, 1:2.5, 1:3, 1:3.5, 1:4, 1:4.5, or 1:5, for example 1:1.5.
In certain embodiments, step 1-1 further comprises: after the oxidation reaction of compound I is complete, adding sodium sulfite to remove unreacted iodine.
In certain embodiments, step 1-1 further comprises purifying compound II.
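As a worked illustration of the molar ratios given for step 1-1, the following Python sketch converts an amount of compound I into reagent amounts. It uses the exemplified ratios (methanol 3:1, tetrazole 3:1, iodine 1.5:1 relative to compound I) and checks them against the stated ranges. The function name, the dictionary layout, and the 2.0 mmol scale are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: helper names and the reaction scale are hypothetical.
# Allowed equivalents (relative to compound I) as stated for step 1-1.
ALLOWED_RANGE = {"methanol": (2.0, 5.0), "tetrazole": (2.0, 5.0), "iodine": (1.5, 5.0)}

# Exemplified equivalents: methanol 3:1, tetrazole 3:1, iodine 1.5:1.
STEP_1_1_EQUIV = {"methanol": 3.0, "tetrazole": 3.0, "iodine": 1.5}


def reagent_mmol(substrate_mmol: float, equivalents: dict[str, float]) -> dict[str, float]:
    """Scale each reagent by its molar equivalents relative to the substrate."""
    amounts = {}
    for name, eq in equivalents.items():
        low, high = ALLOWED_RANGE[name]
        if not low <= eq <= high:
            raise ValueError(f"{name}: {eq} equivalents is outside {low}-{high}")
        amounts[name] = substrate_mmol * eq
    return amounts


amounts = reagent_mmol(2.0, STEP_1_1_EQUIV)  # 2.0 mmol of compound I (assumed scale)
print(amounts)
```

The same helper could be reused for the other steps by supplying their respective ratio ranges.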
Step 1-2: Compound II is subjected to a deprotection reaction to produce compound III.
In certain embodiments, step 1-2 comprises: mixing compound II with trichloroacetic acid and stirring.
In certain embodiments, step 1-2 comprises: adding compound II to a solution containing trichloroacetic acid (e.g., trichloroacetic acid in dichloromethane) and stirring at room temperature to give compound III.
In certain embodiments, the trichloroacetic acid is in excess relative to compound II.
In certain embodiments, step 1-2 further comprises purifying compound III.
Step 1-3: Compound III is subjected to a triphosphorylation reaction and a deprotection reaction to produce compound IV.
In certain embodiments, step 1-3 comprises: adding 2-chloro-4H-1,3,2-benzodioxaphosphorin-4-one to a solution containing compound III under argon protection, and stirring at room temperature; adding tri-n-butylammonium pyrophosphate and n-butylamine, and stirring at room temperature; then adding iodine and stirring at room temperature.
In certain embodiments, the solution comprising compound III further comprises 1,4-dioxane and anhydrous pyridine.
In certain embodiments, the 2-chloro-4H-1,3,2-benzodioxaphosphorin-4-one is added to the solution comprising compound III in the form of a solution (e.g., a 1,4-dioxane solution).
In certain embodiments, the molar ratio of compound III to 2-chloro-4H-1,3,2-benzodioxaphosphorin-4-one is 1:1 to 1:2, e.g., 1:1.0, 1:1.1, 1:1.2, 1:1.3, 1:1.4, 1:1.5, 1:1.6, 1:1.7, 1:1.8, 1:1.9, or 1:2.0, for example 1:1.1.
In certain embodiments, the tri-n-butylammonium pyrophosphate and/or n-butylamine are added to the reaction system as a solution (e.g., an N,N-dimethylformamide solution).
In certain embodiments, the tri-n-butylammonium pyrophosphate is in excess relative to compound III. In certain embodiments, the molar ratio of compound III to tri-n-butylammonium pyrophosphate is 1:1.5 to 1:5, e.g., 1:1.5, 1:2, 1:2.5, 1:3, 1:3.5, 1:4, 1:4.5, or 1:5, for example 1:1.5.
In certain embodiments, the iodine is added to the reaction system in the form of a solution, for example, a solution comprising water and/or an organic solvent (e.g., a solution comprising water, pyridine, and/or tetrahydrofuran).
In certain embodiments, the iodine is in excess relative to compound III. In certain embodiments, the molar ratio of compound III to iodine is 1:1.5 to 1:5, e.g., 1:1.5, 1:2, 1:2.5, 1:3, 1:3.5, 1:4, 1:4.5, or 1:5, for example 1:1.5.
In certain embodiments, step 1-3 further comprises: after the triphosphorylation reaction and deprotection reaction of compound III are complete, adding sodium sulfite to remove unreacted iodine.
In certain embodiments, step 1-3 further comprises purifying compound IV.
Step 2:
Figure GDA0002274171490000181
Step 2 further comprises the steps of:
Step 2-1: 2-Hydroxyacetic acid and N-ethylenediamine trifluoroacetamide are reacted to produce compound V.
In certain embodiments, step 2-1 comprises: adding O-(N-succinimidyl)-N,N,N',N'-tetramethyluronium tetrafluoroborate and N,N-diisopropylethylamine to 2-hydroxyacetic acid and stirring at room temperature, followed by adding N-ethylenediamine trifluoroacetamide and stirring at room temperature to obtain compound V.
In certain embodiments, the molar ratio of 2-hydroxyacetic acid to N-ethylenediamine trifluoroacetamide is from 1:1 to 1:2, e.g., 1:1.0, 1:1.1, 1:1.2, 1:1.3, 1:1.4, 1:1.5, 1:1.6, 1:1.7, 1:1.8, 1:1.9, or 1:2.0, for example 1:1.2.
In certain embodiments, the molar ratio of 2-hydroxyacetic acid to O-(N-succinimidyl)-N,N,N',N'-tetramethyluronium tetrafluoroborate is from 1:1 to 1:2, e.g., 1:1.0, 1:1.1, 1:1.2, 1:1.3, 1:1.4, 1:1.5, 1:1.6, 1:1.7, 1:1.8, 1:1.9, or 1:2.0, for example 1:1.2.
In certain embodiments, the molar ratio of 2-hydroxyacetic acid to N,N-diisopropylethylamine is from 1:1 to 1:2, e.g., 1:1.0, 1:1.1, 1:1.2, 1:1.3, 1:1.4, 1:1.5, 1:1.6, 1:1.7, 1:1.8, 1:1.9, or 1:2.0, for example 1:1.5.
In certain embodiments, step 2-1 further comprises purifying compound V.
Step 2-2: compound V is reacted in the presence of N, N-diisopropylethylamine and 2-cyanoethyl N, N-diisopropylphosphoramidite to form compound VI.
In certain embodiments, step 2-2 comprises: adding 2-cyanoethyl N, N-diisopropyl phosphoramidite chloride to the solution containing the compound V and N, N-diisopropyl ethylamine at 0 ℃, stirring for a period of time at 0 ℃, slowly raising the temperature to room temperature, and continuing stirring to obtain a compound VI.
In certain embodiments, the solution comprising compound V and N, N-diisopropylethylamine also contains dichloromethane.
In certain embodiments, the 2-cyanoethyl N, N-diisopropylphosphoramidite is added as a solution (e.g., dichloromethane solution).
In certain embodiments, the molar ratio of compound V and N, N-diisopropylethylamine is from 1:1 to 1:2, e.g., 1:1.0, 1:1.1, 1:1.2, 1:1.3, 1:1.4, 1:1.5, 1:1.6, 1:1.7, 1:1.8, 1:1.9, or 1:2.0, e.g., 1: 1.5.
In certain embodiments, the molar ratio of compound V to 2-cyanoethyl N, N-diisopropylphosphorochloridite is from 1:1 to 1:2, e.g., 1:1.0, 1:1.1, 1:1.2, 1:1.3, 1:1.4, 1:1.5, 1:1.6, 1:1.7, 1:1.8, 1:1.9, or 1:2.0, e.g., 1:1.
In certain embodiments, step 2-2 further comprises purifying compound VI.
Step 2-3: the compound VI is subjected to an oxidation reaction to produce a compound VI'.
In certain embodiments, step 2-3 comprises: mixing compound VI with tetrazole and tert-butyl 2-nitro-5-hydroxy-benzoate, stirring at room temperature for a period of time, adding iodine, and continuing to stir at room temperature to yield compound VI'.
In certain embodiments, the tetrazole is in excess relative to compound VI. In certain embodiments, the molar ratio of compound VI to tetrazole is 1:1 to 1:5, e.g., 1:1, 1:2, 1:3, 1:4, or 1:5, for example 1:2.
In certain embodiments, the tert-butyl 2-nitro-5-hydroxy-benzoate is in excess relative to compound VI. In certain embodiments, the molar ratio of compound VI to tert-butyl 2-nitro-5-hydroxy-benzoate is 1:1 to 1:5, e.g., 1:1, 1:2, 1:3, 1:4, or 1:5, for example 1:2.
In certain embodiments, compound VI is dissolved in acetonitrile prior to the reaction. In certain embodiments, the tetrazole and tert-butyl 2-nitro-5-hydroxy-benzoate are combined with compound VI in the form of a solution (e.g., acetonitrile solution).
In certain embodiments, the iodine is added to the reaction system in the form of a solution, for example, a solution comprising water and/or an organic solvent (e.g., a solution comprising water, pyridine, and/or tetrahydrofuran).
In certain embodiments, the iodine is in excess relative to compound VI. In certain embodiments, the molar ratio of compound VI to iodine is 1:1.5 to 1:5, e.g., 1:1.5, 1:2, 1:2.5, 1:3, 1:3.5, 1:4, 1:4.5, or 1:5, for example 1:1.5.
In certain embodiments, step 2-3 further comprises: after the oxidation reaction of compound VI is complete, adding sodium sulfite to remove unreacted iodine.
Step 2-4: Compound VI' is subjected to a deprotection reaction to produce compound VII.
In certain embodiments, step 2-4 comprises: adding ammonia water to compound VI' to remove the trifluoroacetyl and cyanoethyl groups; then removing the ammonia water and adding potassium hydroxide to remove the tert-butyl group, giving compound VII.
In certain embodiments, step 2-4 comprises: adding excess ammonia water to compound VI' and stirring at room temperature; then removing the ammonia, adding excess potassium hydroxide, and stirring at room temperature.
In certain embodiments, the potassium hydroxide is added as a solution (e.g., an aqueous solution).
In certain embodiments, step 2-4 further comprises purifying compound VII.
Step 3:
Figure GDA0002274171490000211
step 3-1: compound VII is reacted with N-hydroxysuccinimide ester of Cy3 under weakly basic conditions to produce compound VIII.
In certain embodiments, step 3-1 comprises: to the N-hydroxysuccinimide ester of Cy3, N-diisopropylethylamine and compound VII were added, and stirred at room temperature to give compound VIII.
In certain embodiments, step 3-1 comprises: to a solution of Cy3 in DMF of N-hydroxysuccinimide ester, a solution of DMF comprising N, N-diisopropylethylamine and compound VII was added and stirred at room temperature to give compound VIII.
In certain embodiments, step 3-1 further comprises: compound VIII was purified.
Step 3-2: reacting compound IV and compound VIII to produce the modified nucleoside or nucleotide of the present invention.
In certain embodiments, step 3-2 comprises: mixing compound VIII, O-(N-succinimidyl)-N,N,N',N'-tetramethyluronium tetrafluoroborate and N,N-diisopropylethylamine, followed by adding compound IV and stirring at room temperature to obtain the compound of the present invention.
In certain embodiments, the reaction of step 3-2 is carried out in DMF.
In certain embodiments, compound IV is added to the reaction system in the form of a solution (e.g., a solution comprising sodium bicarbonate).
In certain embodiments, step 3-2 further comprises purifying the modified nucleoside or nucleotide of the present invention.
(II) Sequencing method
The inventors of the present application have also developed a method for sequencing a polynucleotide based on the modified nucleoside or nucleotide of the present invention. In the sequencing method of the present invention, sequencing is performed while synthesizing a growing polynucleotide complementary to a target single-stranded polynucleotide.
Thus, in one aspect, the application provides a method of preparing a growing polynucleotide complementary to a target single stranded polynucleotide in a sequencing reaction, comprising incorporating a compound as defined above into the growing complementary polynucleotide, wherein the incorporation of the compound prevents the introduction of any subsequent nucleotides into the growing complementary polynucleotide.
In certain embodiments, the incorporation of the compound is achieved by a terminal transferase, a terminal polymerase, or a reverse transcriptase.
In certain embodiments, the method comprises: using a polymerase, the compound is incorporated into the growing complementary polynucleotide.
In certain embodiments, the method comprises: performing a nucleotide polymerization reaction using a polymerase under conditions that allow the polymerase to perform the nucleotide polymerization reaction, thereby incorporating the compound into the 3' end of the growing complementary polynucleotide.
In another aspect, the present application provides a method of determining the sequence of a target single-stranded polynucleotide, comprising: monitoring the sequential incorporation of complementary nucleotides, wherein at least one of the complementary nucleotides incorporated is a compound as defined above, and detecting the detectable label carried by the compound.
In certain embodiments, the blocking group and the detectable label in the compound are removed prior to introducing the next complementary nucleotide.
In certain embodiments, the blocking group and the detectable label are removed simultaneously.
In certain embodiments, the blocking group and the detectable label are removed sequentially. For example, the blocking group is removed before or after the detectable label is removed.
In certain embodiments, the method of determining the sequence of a target single-stranded polynucleotide comprises:
(a) providing a mixture comprising a duplex, a compound of formula (I), a polymerase and a cleavage reagent; the duplex comprises a growing nucleic acid strand and a nucleic acid molecule to be sequenced;
(b) carrying out a reaction cycle comprising the following steps (i), (ii) and (iii):
step (i): incorporating the compound into a growing nucleic acid strand using a polymerase, forming a nucleic acid intermediate comprising a blocking group and a detectable label;
step (ii): detecting a detectable label on the nucleic acid intermediate;
step (iii): the blocking group on the nucleic acid intermediate is removed using a cleavage reagent.
In certain embodiments, the reaction cycle further comprises step (iv): the detectable label on the nucleic acid intermediate is removed using a cleavage reagent.
In certain embodiments, the cleavage reagent used in step (iii) and step (iv) is the same reagent. In certain embodiments, the cleavage reagents used in step (iii) and step (iv) are different reagents.
In certain embodiments, for example, where the at least one complementary nucleotide incorporated is a compound of formula (I'), the method of determining the sequence of a target single-stranded polynucleotide comprises:
(1) providing a duplex comprising a growing nucleic acid strand and a nucleic acid molecule to be sequenced, said duplex being attached to a support;
(2) adding a polymerase for performing a nucleotide polymerization reaction, and first, second, third, and fourth compounds, thereby forming a reaction system comprising a solution phase and a solid phase; wherein the four compounds are derivatives of nucleotides A, T/U, C and G, respectively, each having the structure shown in general formula (I') and the ability to form complementary base pairs;
(3) performing a nucleotide polymerization reaction using a polymerase under conditions that allow the polymerase to perform the nucleotide polymerization reaction, thereby incorporating one of the four compounds into the 3' end of the growing nucleic acid strand;
(4) removing the solution phase of the reaction system of the previous step, retaining the duplexes attached to the support, and detecting the signal emitted by the duplexes or the detectable labels on the growing nucleic acid strands;
(5) adding a cleaving agent to contact the duplex or the growing nucleic acid strand with the cleaving agent in a reaction system comprising a solution phase and a solid phase; wherein the cleaving agent is capable of cleaving phosphodiester bonds (1) and/or phosphodiester bonds (2) in a compound incorporated at the 3' end of a growing nucleic acid strand and does not affect the phosphodiester bonds on the duplex backbone;
(6) the solution phase of the reaction system of the previous step is removed.
In certain preferred embodiments, the method further comprises the steps of:
(7) repeating the steps (2) - (6) or the steps (2) - (4) one or more times.
Optionally, a washing operation is performed after any one of the steps comprising the removing operation. In certain preferred embodiments, between step (4) and step (5), a washing operation is performed. In certain preferred embodiments, after step (6), a washing operation is performed.
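The cycle of steps (2) to (7) can be sketched as a toy simulation in Python. The sketch below is not the claimed method; it only models the bookkeeping of one incorporation per cycle, detection, and cleavage. It assumes, for illustration, that the signal detected in step (4) identifies which of the four compounds was incorporated, that the template is given in the order in which the polymerase reads it, and that incorporation and cleavage are always complete. All names are hypothetical.

```python
# Toy simulation only: names and simplifying assumptions are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def sequence_by_synthesis(template_read_order: str, cycles: int) -> str:
    """Return the bases called for the growing strand, one per reaction cycle."""
    blocked = False  # True while the 3' end carries a blocking group
    calls = []
    for position in range(min(cycles, len(template_read_order))):
        # Steps (2)-(3): the polymerase incorporates the one compound that
        # pairs with the next template base; the blocker halts extension.
        if blocked:
            raise RuntimeError("3' end is still blocked; cleavage was skipped")
        incorporated = COMPLEMENT[template_read_order[position]]
        blocked = True
        # Step (4): remove the solution phase and detect the label
        # (assumed here to identify the incorporated base directly).
        calls.append(incorporated)
        # Steps (5)-(6): the cleavage reagent removes the blocking group and
        # label, and the solution phase is removed, so the next cycle can start.
        blocked = False
    return "".join(calls)


print(sequence_by_synthesis("ACGT", 4))
```

Each loop iteration corresponds to one repetition under step (7); the returned string is the sequence of the growing strand, which is complementary to the template.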
In certain embodiments, the duplex is obtained by a method comprising the steps of:
providing a primer that anneals to a nucleic acid molecule to be sequenced, said primer acting as an initial growing nucleic acid strand that, together with said nucleic acid molecule to be sequenced, forms a duplex attached to a support.
Nucleic acid molecules
In the method of the invention, the nucleic acid molecule to be sequenced may be any nucleic acid molecule of interest. In certain preferred embodiments, the nucleic acid molecule to be sequenced comprises deoxyribonucleotides, ribonucleotides, modified deoxyribonucleotides, modified ribonucleotides, or any combination thereof. In the method of the present invention, the nucleic acid molecule to be sequenced is not limited by its type. In certain preferred embodiments, the nucleic acid molecule to be sequenced is DNA or RNA. In certain preferred embodiments, the nucleic acid molecule to be sequenced can be genomic DNA, mitochondrial DNA, chloroplast DNA, mRNA, cDNA, miRNA, or siRNA. In certain preferred embodiments, the nucleic acid molecule to be sequenced is linear or circular. In certain preferred embodiments, the nucleic acid molecule to be sequenced is double-stranded or single-stranded. For example, the nucleic acid molecule to be sequenced may be single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), single-stranded RNA (ssRNA), double-stranded RNA (dsRNA), or a hybrid of DNA and RNA. In certain preferred embodiments, the nucleic acid molecule to be sequenced is single-stranded DNA. In certain preferred embodiments, the nucleic acid molecule to be sequenced is double-stranded DNA.
In the method of the invention, the nucleic acid molecule to be sequenced is not limited by its origin. In certain preferred embodiments, the nucleic acid molecule to be sequenced can be obtained from any source, e.g., any cell, tissue, or organism (e.g., viruses, bacteria, fungi, plants, and animals). In certain preferred embodiments, the nucleic acid molecule to be sequenced is derived from a mammal (e.g., a human, non-human primate, rodent, or canine), plant, avian, reptilian, fish, fungus, bacterium, or virus.
Methods for extracting or obtaining nucleic acid molecules from cells, tissues or organisms are well known to those skilled in the art. Suitable methods include, but are not limited to, ethanol precipitation, chloroform extraction, and the like. For a detailed description of such methods see, for example, J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor Laboratory Press, 1989, and F. M. Ausubel et al., Short Protocols in Molecular Biology, 3rd edition, John Wiley & Sons, Inc., 1995. In addition, various commercial kits can be used to extract nucleic acid molecules from various sources (e.g., cells, tissues, or organisms).
In the method of the present invention, the nucleic acid molecule to be sequenced is not limited by its length. In certain preferred embodiments, the nucleic acid molecule to be sequenced may be at least 10bp, at least 20bp, at least 30bp, at least 40bp, at least 50bp, at least 100bp, at least 200bp, at least 300bp, at least 400bp, at least 500bp, at least 1000bp, or at least 2000bp in length. In certain preferred embodiments, the length of the nucleic acid molecule to be sequenced may be 10-20bp, 20-30bp, 30-40bp, 40-50bp, 50-100bp, 100-200bp, 200-300bp, 300-400bp, 400-500bp, 500-1000bp, 1000-2000bp, or more than 2000 bp. In certain preferred embodiments, the nucleic acid molecule to be sequenced may have a length of 10-1000bp to facilitate high throughput sequencing.
In certain preferred embodiments, the nucleic acid molecules may be pretreated prior to attachment to the support. Such pre-treatments include, but are not limited to, fragmentation of the nucleic acid molecule, end filling, addition of linkers, addition of tags, repair of nicks, amplification of the nucleic acid molecule, isolation and purification of the nucleic acid molecule, and any combination thereof.
For example, in certain preferred embodiments, the nucleic acid molecules may be subjected to a fragmentation process in order to obtain nucleic acid molecules of a suitable length. In the method of the present invention, fragmentation of a nucleic acid molecule (e.g., DNA) can be performed by any method known to one of ordinary skill in the art. For example, fragmentation can be performed by enzymatic or mechanical means. The mechanical means may be ultrasonic or physical shearing. The enzymatic method can be performed by digestion with nucleases (e.g., deoxyribonuclease) or restriction endonucleases. In certain preferred embodiments, the fragmentation results in an end of unknown sequence. In certain preferred embodiments, the fragmentation results in ends of known sequence.
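As an illustration of enzymatic fragmentation that yields ends of known sequence, the following Python sketch cuts one strand of a sequence at every occurrence of a restriction site. It is a simplified model: only one strand is represented, and the example sequence is made up. The EcoRI site (GAATTC, cut after the first G) is used as a familiar example; the function name and parameters are hypothetical.

```python
# Simplified single-strand model; function and parameter names are hypothetical.
def digest(sequence: str, site: str, cut_offset: int) -> list[str]:
    """Split `sequence` at each occurrence of `site`.

    cut_offset is the number of bases of the site that stay on the
    upstream fragment (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # remainder after the last cut
    return fragments


fragments = digest("AAGAATTCTTGAATTCGG", "GAATTC", 1)
print(fragments)
```

Because every cut occurs inside a known recognition site, each internal fragment end has a known sequence, in contrast to random fragmentation by sonication or DNase I.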
In certain preferred embodiments, the enzymatic method uses DNase I to fragment nucleic acid molecules. DNase I is a general-purpose enzyme that nonspecifically cleaves double-stranded DNA (dsDNA) to release 5'-phosphorylated di-, tri- and oligonucleotide products. DNase I shows optimal activity in buffers that contain Mn2+, Mg2+ and Ca2+ but no other salts. It is commonly used to fragment a large DNA genome into small DNA fragments, which can then be used to construct DNA libraries.
The cleavage properties of DNase I lead to random digestion of DNA molecules (i.e., no sequence bias) and mainly produce blunt-ended dsDNA fragments when the enzyme is used in buffers containing manganese ions (Melgar, E. and D.A. Goldthwait. 1968. Deoxyribonucleic acid nucleases. II. The effects of metals on the mechanism of action of deoxyribonuclease I. J. Biol. Chem. 243:4409). When treating genomic DNA with DNase I, the following three factors can be considered: (i) the amount (units) of enzyme used; (ii) digestion temperature (°C); and (iii) incubation time (minutes). Typically, large DNA fragments or whole genomic DNA can be digested with DNase I for 1-2 minutes at between 10 ℃ and 37 ℃ to produce DNA molecules of appropriate length.
Thus, in certain preferred embodiments, the nucleic acid molecule of interest (the nucleic acid molecule to be sequenced) is fragmented prior to step (1'). In certain preferred embodiments, the nucleic acid molecule to be sequenced is fragmented by enzymatic or mechanical means. In certain preferred embodiments, the nucleic acid molecule to be sequenced is fragmented using DNase I. In certain preferred embodiments, the nucleic acid molecules to be sequenced are fragmented by sonication. In certain preferred embodiments, the fragmented nucleic acid molecule has a length of 50-2000bp, such as 50-100bp, 100-200bp, 200-300bp, 300-400bp, 400-500bp, 500-1000bp, 1000-2000bp, 50-1500bp, or 50-1000 bp.
Fragmentation of double-stranded nucleic acid molecules (e.g., dsDNA, genomic DNA) can produce nucleic acid fragments with blunt ends or with overhangs that are one or two nucleotides in length. For example, when genomic DNA (gDNA) is treated with sonication or DNase I, the product may comprise DNA fragments with blunt ends or overhangs. In this case, the ends of the nucleic acid molecules having overhangs may be filled in using a polymerase to form nucleic acid molecules having blunt ends, to facilitate subsequent applications (e.g., to facilitate ligation of fragmented nucleic acid molecules to linkers).
Thus, in certain preferred embodiments, after fragmentation of a nucleic acid molecule to be sequenced (e.g., dsDNA), the fragmented nucleic acid molecule is treated with a DNA polymerase to produce DNA fragments with blunt ends. In certain preferred embodiments, the DNA polymerase can be any known DNA polymerase, such as T4 DNA polymerase, Pfu DNA polymerase, or Klenow DNA polymerase. In some cases, the use of Pfu DNA polymerase may be advantageous because Pfu DNA polymerase not only fills in the overhang portion to form blunt ends, but also has 3'-5' exonuclease activity and can remove single-nucleotide and dinucleotide overhangs, thereby further increasing the number of DNA fragments having blunt ends (Costa, G.L. and M.P. Weiner. 1994a. Protocols for cloning and analysis of blunt-ended PCR-generated DNA fragments. PCR Methods Appl 3(5):S95; Costa, G.L., A. Grafsky and M.P. Weiner. 1994b. Cloning and analysis of PCR-generated DNA fragments. PCR Methods Appl 3(6):338; Costa, G.L. and M.P. Weiner. 1994c. Polishing with T4 or Pfu polymerase increases the efficiency of cloning of PCR products. Nucleic Acids Res. 22(12):2423).
In certain preferred embodiments, linkers may be introduced at the 5 'and/or 3' ends of the nucleic acid molecules to be sequenced. In general, the linker is an oligonucleotide sequence, and it can be any sequence, of any length. Linkers of appropriate length and sequence can be selected using methods well known in the art. For example, the linker attached to the end of the nucleic acid molecule to be sequenced is typically a relatively short nucleotide sequence of between 5-100 nucleotides in length (e.g., 5-10bp, 10-20bp, 20-30bp, 30-40bp, 40-50bp, 50-100 bp). In certain preferred embodiments, the linker may have a primer binding region. Such primer binding regions can anneal or hybridize to a primer and thus can be used to prime a specific polymerase reaction. In certain preferred embodiments, the linker has one or more primer binding regions. In certain preferred embodiments, the adaptor has one or more regions capable of hybridizing to primers used for amplification. In certain preferred embodiments, the linker has one or more regions capable of hybridizing to a primer for a sequencing reaction. In certain preferred embodiments, a linker is introduced at the 5' end of the nucleic acid molecule to be sequenced. In certain preferred embodiments, a linker is introduced at the 3' end of the nucleic acid molecule to be sequenced. In certain preferred embodiments, linkers are introduced at the 5 'and 3' ends of the nucleic acid molecule to be sequenced. In some embodiments, the linker comprises a universal linker sequence capable of hybridizing to a universal primer. In some embodiments, the linker comprises a universal linker sequence capable of hybridizing to a universal amplification primer and/or a universal sequencing primer.
In certain preferred embodiments, a tag sequence may be introduced into the nucleic acid molecule to be sequenced, or may be introduced into the linker described above. The tag sequence is an oligonucleotide having a specific base sequence. The tag sequence may have any length, e.g. 2-50bp, e.g. 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50bp, according to the actual need. In certain preferred embodiments, each nucleic acid molecule to be sequenced is provided with a tag sequence comprising a specific sequence to facilitate distinguishing the source of each nucleic acid molecule to be sequenced. In certain preferred embodiments, the tag sequence may be introduced directly at the 5 'and/or 3' end of the nucleic acid molecule to be sequenced. In certain preferred embodiments, the tag sequence may be introduced in the linker prior to ligation to the 5 'and/or 3' end of the nucleic acid molecule to be sequenced. The tag sequence may be located anywhere in the linker sequence, for example at the 5 'and/or 3' end of the linker sequence. In certain preferred embodiments, the linker comprises a primer binding region and a tag sequence. In certain further preferred embodiments, the primer binding region comprises a universal adaptor sequence that is recognizable by a universal primer, and preferably, the tag sequence may be located 3' to the primer binding region.
In certain preferred embodiments, different tag sequences are used to label/distinguish nucleic acid molecules from different sources. In such embodiments, preferably, the same tag sequence is introduced into nucleic acid molecules of the same origin, and one unique tag sequence is used for each nucleic acid origin. Subsequently, nucleic acid molecules from different sources can be combined together to form a library, and the source of each nucleic acid molecule in the library can be identified/distinguished by the unique tag sequence carried on each nucleic acid molecule.
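The use of tag sequences to identify the source of each nucleic acid molecule in a pooled library can be sketched in Python as follows. The sketch assumes, purely for illustration, that each read begins with a fixed-length tag and that tags are matched exactly; the tag sequences, source names, and function name are hypothetical.

```python
# Illustrative sketch only: tags, source names and the read layout are assumptions.
from collections import defaultdict


def demultiplex(reads: list[str], tag_to_source: dict[str, str], tag_length: int) -> dict[str, list[str]]:
    """Group reads by the source assigned to their leading tag sequence."""
    by_source = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        # Reads whose tag is not recognised are kept apart rather than guessed.
        source = tag_to_source.get(tag, "unassigned")
        by_source[source].append(insert)
    return dict(by_source)


tags = {"ACGT": "sample_1", "TTAG": "sample_2"}
pooled = ["ACGTGGGCCC", "TTAGAAATTT", "ACGTCCCGGG", "GGGGTTTTTT"]
result = demultiplex(pooled, tags, tag_length=4)
print(result)
```

Under these assumptions, the two reads carrying the tag ACGT are assigned to the first source, the read carrying TTAG to the second, and the read with an unknown tag is left unassigned.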
The nucleic acid molecule to be sequenced may be ligated to the linker or tag sequence by methods well known in the art (e.g., PCR or ligation reactions). For example, if a portion of the sequence of the nucleic acid molecule to be sequenced is known, the nucleic acid molecule to be sequenced can be amplified by PCR using appropriate PCR primers that contain the linker sequence and a sequence that is capable of specifically recognizing the nucleic acid molecule to be sequenced. The amplification product obtained is the nucleic acid molecule to be tested with the linker introduced at the 5 'and/or 3' end. In certain embodiments, the nucleic acid molecule may be attached to the linker using a non-specific ligase (e.g., T4 DNA ligase). In certain embodiments, the nucleic acid molecule and the linker may be treated with a restriction enzyme so that they have the same sticky ends, and then the nucleic acid molecule having the same sticky ends and the linker may be ligated together using a ligase to obtain the linker-ligated nucleic acid molecule.
In certain embodiments, after ligation of the nucleic acid molecule and the linker together using a ligase, the resulting product may present a nick at the junction. In this case, a polymerase may be used to repair the nick. For example, a DNA polymerase that loses 3'-5' exonuclease activity but exhibits 5'-3' exonuclease activity may have the ability to recognize nicks and repair nicks (Hamilton, s.c., j.w.farchass and m.c. davis.2001.DNA polymerases as enzymes for biotechnology. biotechnology 31: 370). DNA polymerases that can be used for this purpose include, for example, polI of thermoanaerobacterium hydrosulfuricus (thermoanaerobacterium thermosulfuricus), DNA polI of escherichia coli (e.coli), and bacteriophage phi 29. In a preferred embodiment, the polI of Bacillus stearothermophilus (Bacillus stearothermophilus) is used to repair the nicks in the dsDNA and form unnotched dsDNA.
In certain preferred embodiments, the nucleic acid molecule to be sequenced may also be amplified to increase its amount or copy number. Methods for amplifying nucleic acid molecules are well known to those skilled in the art, a typical example being PCR. For example, the following methods can be used to amplify nucleic acid molecules: (i) polymerase chain reaction (PCR), which requires temperature cycling (see, e.g., Saiki et al., 1985. Science 230: 1350-); (ii) isothermal amplification systems (see, e.g., Guatelli et al., 1990. Proc. Natl. Acad. Sci. USA 87: 1874-1878); (iii) Qβ replicase systems (see, e.g., Lizardi et al., 1988. Biotechnology 6: 1197-); and (iv) strand displacement amplification (Nucleic Acids Res. 1992 Apr 11; 20(7): 1691-6). In certain preferred embodiments, the nucleic acid molecule to be sequenced is amplified by PCR, and the primers used to perform the PCR amplification comprise an adaptor sequence and/or a tag sequence. The PCR product thus produced will carry the adaptor sequence and/or the tag sequence and can thus be conveniently used for subsequent applications (e.g., high throughput sequencing).
In certain preferred embodiments, the nucleic acid molecule to be sequenced may also be isolated and purified before or after various pretreatment steps are performed. Such separation and purification steps may be advantageous. For example, in certain preferred embodiments, the isolation and purification steps can be used to obtain a nucleic acid molecule to be sequenced of suitable length (e.g., 50-1000bp) to facilitate subsequent applications (e.g., high throughput sequencing). In certain preferred embodiments, agarose gel electrophoresis may be used to separate and purify the nucleic acid molecules to be sequenced. In certain preferred embodiments, the nucleic acid molecule to be sequenced may be isolated and purified by size exclusion chromatography or sucrose sedimentation.
It is to be understood that the pretreatment steps described above (e.g., fragmentation, end-filling, adaptor addition, tag addition, nick repair, amplification, isolation and purification) are exemplary only, and not limiting. The person skilled in the art can subject the nucleic acid molecules to be sequenced to various desired pretreatments according to the actual need, and the individual pretreatment steps are not limited to a particular order. For example, in certain embodiments, nucleic acid molecules may be fragmented and adaptors added prior to amplification. In other embodiments, the nucleic acid molecule may be amplified prior to fragmentation and linker addition. In certain embodiments, the nucleic acid molecule is fragmented and adaptors are added without an amplification step.
In certain exemplary embodiments, prior to step (1'), the nucleic acid molecule of interest (e.g., genomic DNA) is subjected to the following pre-treatment:
(i) fragmenting the nucleic acid molecule of interest (e.g., a large nucleic acid fragment, such as genomic DNA), thereby producing a fragmented nucleic acid molecule;
(ii) ligating the fragmented nucleic acid molecules with an adaptor sequence (e.g., comprising a primer binding region capable of hybridizing to a universal amplification primer, a primer binding region capable of hybridizing to a universal sequencing primer, and/or a tag sequence), and optionally isolating, purifying, and denaturing, thereby producing a nucleic acid molecule to be sequenced;
(iii) attaching the nucleic acid molecule to be sequenced to a support, thereby obtaining the nucleic acid molecule to be sequenced attached to the support.
Support
In general, the support used to attach the nucleic acid molecules to be sequenced is in a solid phase to facilitate handling. Thus, in the present disclosure, "support" is also sometimes referred to as "solid support" or "solid phase support". However, it should be understood that reference herein to a "support" is not limited to a solid, and it may also be a semi-solid (e.g., a gel).
In the method of the present invention, the support for attaching the nucleic acid molecules to be sequenced may be made of various suitable materials. Such materials include, for example: minerals, natural polymers, synthetic polymers, and any combination thereof. Specific examples include, but are not limited to: cellulose, cellulose derivatives (e.g., nitrocellulose), acrylic resins, glass, silica gel, polystyrene, gelatin, polyvinylpyrrolidone, copolymers of vinyl and acrylamide, polystyrene crosslinked with divinylbenzene and the like (see, for example, Merrifield, Biochemistry 1964, 3, 1385-), agarose gel (Sepharose™), and other supports known to those skilled in the art.
In certain preferred embodiments, the support for attaching nucleic acid molecules to be sequenced can be a solid support comprising an inert substrate or matrix (e.g., a glass slide, a polymer bead, etc.) that has been functionalized, for example, by application of an intermediate material containing reactive groups that allow covalent attachment of biomolecules such as polynucleotides. Examples of such supports include, but are not limited to, polyacrylamide hydrogels supported on inert substrates such as glass, in particular polyacrylamide hydrogels as described in WO 2005/065814 and US 2008/0280773, the contents of which are incorporated herein by reference in their entirety. In such embodiments, the biomolecule (e.g., polynucleotide) may be covalently attached directly to the intermediate material (e.g., hydrogel), while the intermediate material itself may be non-covalently attached to the substrate or matrix (e.g., glass substrate). In certain preferred embodiments, the support is a glass slide or silicon wafer having a surface modified with a layer of avidin, amino, acrylamide silane, or aldehyde based chemical groups.
In the present invention, the support or solid support is not limited in its size, shape and configuration. In some embodiments, the support or solid support is a planar structure, such as a slide, chip, microchip and/or array. The surface of such a support may be in the form of a planar layer.
In some embodiments, the support or surface thereof is non-planar, such as an inner or outer surface of a tube or container. In some embodiments, the support or solid support comprises a microsphere or bead. As used herein, "microsphere" or "bead" or "particle" or grammatical equivalents refers to small discrete particles. Suitable bead components include, but are not limited to, plastics, ceramics, glass, polystyrene, methylstyrene, acrylic polymers, paramagnetic materials, thoria sol, carbon graphite, titanium dioxide, latex, cross-linked dextrans such as Sepharose, cellulose, nylon, cross-linked micelles and teflon, and any other materials outlined herein for preparing solid supports. In addition, the beads may be spherical or aspherical. In some embodiments, spherical beads may be used. In some embodiments, irregular particles may be used. Furthermore, the beads may also be porous.
In certain preferred embodiments, the support for attaching the nucleic acid molecules to be sequenced is an array of beads or wells (which is also referred to as a chip). The array may be prepared using any of the materials outlined herein for preparing solid supports, and preferably the surfaces of the beads or wells on the array are functionalized to facilitate attachment of nucleic acid molecules. The number of beads or wells on the array is not limited. For example, each array may contain 10-10^2, 10^2-10^3, 10^3-10^4, 10^4-10^5, 10^5-10^6, 10^6-10^7, 10^7-10^8, 10^8-10^9, or more beads or wells. In some exemplary embodiments, the surface of each bead or well can be attached to one or more nucleic acid molecules. Accordingly, each array may have attached to it 10-10^2, 10^2-10^3, 10^3-10^4, 10^4-10^5, 10^5-10^6, 10^6-10^7, 10^7-10^8, 10^8-10^9, or more nucleic acid molecules. Such arrays may therefore be particularly advantageous for high throughput sequencing of nucleic acid molecules.
As is generally known in the art, the support may be manufactured using a variety of techniques. Such techniques include, but are not limited to, photolithography, stamping techniques, molding techniques, and microetching techniques. As will be appreciated by those skilled in the art, the technique used will depend on the composition, structure and shape of the support.
Attachment of nucleic acid molecules to be sequenced to a support
In the methods of the invention, the nucleic acid molecule to be sequenced can be attached (e.g., covalently or non-covalently) to the support by any method known to one of ordinary skill in the art. For example, the nucleic acid molecule to be sequenced may be attached to the support by covalent attachment, or by irreversible passive adsorption, or by intermolecular affinity (e.g., between biotin and avidin). Preferably, however, the linkage between the nucleic acid molecule to be sequenced and the support is sufficiently strong that the nucleic acid molecule does not become detached from the support as a result of the conditions used in the various reactions (e.g., polymerization reactions) and washing with water or buffer solutions.
For example, in certain preferred embodiments, the 5' end of the nucleic acid molecule to be sequenced carries a means, such as a chemically modified functional group, capable of covalently attaching the nucleic acid molecule to a support. Examples of such functional groups include, but are not limited to, phosphate groups, carboxylic acid molecules, aldehyde molecules, thiols, hydroxyl, Dimethoxytrityl (DMT), or amino groups.
For example, in certain preferred embodiments, the 5' end of the nucleic acid molecule to be sequenced can be modified with a chemical functional group (e.g., a phosphate, thiol, or amino group) and the support (e.g., a porous glass bead) is derivatized with an amino-alkoxysilane (e.g., aminopropyltrimethoxysilane, aminopropyltriethoxysilane, 4-aminobutyltriethoxysilane, etc.) so that the nucleic acid molecule can be covalently attached to the support through a chemical reaction between the reactive groups. In certain preferred embodiments, the 5' end of the nucleic acid molecule to be sequenced may be modified with a carboxylic acid or aldehyde group and the support (e.g., latex beads) derivatized with hydrazine, so that the nucleic acid molecule can be covalently attached to the support by chemical reaction between the reactive groups (Kremsky et al, 1987).
In addition, a cross-linking agent may be used to attach the nucleic acid molecule of interest to the support. Such cross-linking agents include, for example, succinic anhydride, phenyl diisothiocyanate (Guo et al., 1994), maleic anhydride (Yang et al., 1998), 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide hydrochloride (EDC), m-maleimidobenzoic acid-N-hydroxysuccinimide ester (MBS), N-succinimidyl [4-iodoacetyl]aminobenzoic acid (SIAB), 4-(N-maleimidomethyl)cyclohexane-1-carboxylic acid succinimide (SMCC), N-gamma-maleimidobutyryloxy-succinimide ester (GMBS), 4-(p-maleimidophenyl)butyric acid succinimide (SMPB), and the corresponding water-soluble thio compounds.
In addition, the support may be derivatized with bifunctional crosslinking agents (e.g., homobifunctional crosslinking agents and heterobifunctional crosslinking agents) to provide a modified functionalized surface. Subsequently, nucleic acid molecules having 5' -phosphate, thiol or amino groups can interact with the functionalized surface to form a covalent linkage between the nucleic acid and the support. A number of bifunctional crosslinking agents and methods for their use are well known in the art (see, e.g., Pierce Catalog and Handbook, pp. 155-200).
Primer
In the method of the present invention, the primer may be of any length and may comprise any sequence or any base so long as it is capable of specifically annealing to a region of the target nucleic acid molecule. In other words, in the method of the present invention, the primer is not limited in its length, structure and composition. For example, in some exemplary embodiments, the length of the primer may be 5-50bp, such as 5-10, 10-15, 15-20, 20-25, 25-30, 30-35, 35-40, 40-45, 45-50 bp. In some exemplary embodiments, the primer is capable of forming a secondary structure (e.g., a hairpin structure). In some exemplary embodiments, the primers do not form any secondary structure (e.g., hairpin structure). In some exemplary embodiments, the primer may comprise a naturally occurring or non-naturally occurring nucleotide. In some exemplary embodiments, the primer comprises or consists of a naturally occurring nucleotide. In some exemplary embodiments, the primer comprises a modified nucleotide, such as a Locked Nucleic Acid (LNA). In some exemplary embodiments, the primer is capable of hybridizing to a nucleic acid molecule of interest under stringent conditions (e.g., moderately stringent conditions or highly stringent conditions). In some exemplary embodiments, the primer has a sequence that is fully complementary to a target sequence in a nucleic acid molecule of interest. In some exemplary embodiments, the primer is partially complementary (e.g., there is a mismatch) to a target sequence in a nucleic acid molecule of interest. In some exemplary embodiments, the primer comprises a universal primer sequence. In some exemplary embodiments, the nucleic acid molecule to be sequenced comprises a linker, and the linker comprises a sequence capable of hybridizing to a universal primer, and the primer used is a universal primer.
Polymerase
In the method for preparing a polynucleotide or the sequencing method of the present invention, a nucleotide polymerization reaction may be performed using a suitable polymerase. In some exemplary embodiments, the polymerase is capable of synthesizing a new DNA strand (e.g., a DNA polymerase) using DNA as a template. In some exemplary embodiments, the polymerase is capable of synthesizing a new DNA strand (e.g., a reverse transcriptase) using RNA as a template. In some exemplary embodiments, the polymerase is capable of synthesizing a new RNA strand using DNA or RNA as a template (e.g., RNA polymerase). Thus, in certain preferred embodiments, the polymerase is selected from the group consisting of a DNA polymerase, an RNA polymerase, and a reverse transcriptase. The nucleotide polymerization reaction can be carried out by selecting an appropriate polymerase according to actual needs. In certain preferred embodiments, the polymerization reaction is a Polymerase Chain Reaction (PCR). In certain preferred embodiments, the polymerization reaction is a reverse transcription reaction.
In the method of the present invention, a nucleotide polymerization reaction may be performed using KOD polymerase or a mutant thereof. KOD polymerase or a mutant thereof (e.g., KOD POL151, KOD POL157, KOD POL171, KOD POL174, KOD POL376, KOD POL391) has acceptable polymerization efficiency for the modified nucleoside or nucleotide of the present invention. In particular, KOD POL391 and KOD POL171 have acceptable polymerization efficiency for the modified nucleotide of the present invention. In certain embodiments, the polymerization efficiency of KOD POL391 or KOD POL171 for the modified nucleotides of the invention is more than 70%, e.g., 70%-80%, 80%-90%, or 90%-100%.
Further, as described above, in the method of the present invention, steps (2) - (6) or steps (2) - (4) may be repeatedly performed. Thus, in certain preferred embodiments of the invention, one or more cycles of nucleotide polymerization may be performed. In other words, in certain preferred embodiments of the present invention, the nucleotide polymerization reaction may be performed in one or more steps. In this case, the same or different polymerase may be used for each round of nucleotide polymerization. For example, a first DNA polymerase may be used in a first round of nucleotide polymerization reactions and a second DNA polymerase may be used in a second round of nucleotide polymerization reactions. However, in certain exemplary embodiments, the same polymerase (e.g., the same DNA polymerase) is used in all nucleotide polymerization reactions.
Polymerization conditions
In the method for preparing a polynucleotide or the sequencing method of the present invention, the polymerization reaction of nucleotides is carried out under suitable conditions. Suitable polymerization conditions include the composition of the solution phase and the concentrations of the ingredients, the pH of the solution phase, the polymerization temperature, and the like. The polymerization is carried out under suitable conditions, which is advantageous for obtaining acceptable, even high, polymerization efficiencies.
In certain embodiments, the solution phase in which the polymerization reaction occurs comprises monovalent salt ions (e.g., sodium ions, chloride ions) and/or divalent salt ions (e.g., magnesium ions, sulfate ions). In certain embodiments, the concentration of the monovalent or divalent salt ions in the solution phase is 1-200mM, e.g., 1mM, 3mM, 10mM, 20mM, 50mM, 100mM, 150mM, or 200 mM.
In certain embodiments, the solution phase in which the polymerization reaction occurs comprises a buffered solution, such as a buffered solution comprising Tris. In certain embodiments, the concentration of Tris in the solution phase is 10mM to 200mM, e.g., 10mM, 20mM, 50mM, 100mM, 150mM, or 200 mM.
In certain embodiments, the solution phase in which the polymerization reaction occurs comprises an organic solvent, such as DMSO or glycerol. In certain embodiments, the organic solvent is present in the solution phase at a mass content of 0.01% to 10%, e.g., 0.01%, 0.02%, 0.05%, 1%, 2%, 5%, or 10%.
In certain embodiments, the pH of the solution phase in which the polymerization occurs is from 7.0 to 9.0, e.g., 7.0, 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8, 7.9, 8.0, 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, 8.9, or 9.0.
In certain embodiments, the solution phase in which the polymerization reaction occurs comprises: monovalent salt ions (e.g., sodium ions, chloride ions), divalent salt ions (e.g., magnesium ions, sulfate ions), buffer solutions (e.g., buffer solutions comprising Tris), and organic solvents (e.g., DMSO or glycerol). In certain embodiments, the pH of the solution phase is 7.8.
In certain embodiments, the polymerization reaction is carried out at 50-65 deg.C (e.g., 50 deg.C, 51 deg.C, 52 deg.C, 53 deg.C, 54 deg.C, 55 deg.C, 56 deg.C, 57 deg.C, 58 deg.C, 59 deg.C, 60 deg.C, 61 deg.C, 62 deg.C, 63 deg.C, 64 deg.C, or 65 deg.C).
The time for the polymerization reaction can be determined according to actual needs, and can be, for example, 1min-5min, 5min-10min, 10min-30min or 30min-1 h.
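The ranges given above can be collected into a simple check, sketched here in Python. The class name, field names, and the two example condition sets are illustrative assumptions; only the numeric ranges (and the pH value 7.8) come from the text, and the other example values are arbitrary picks within those ranges rather than recommended settings.

```python
from dataclasses import dataclass

# Ranges quoted in the text above; the keys are illustrative names only.
RANGES = {
    "salt_mM": (1.0, 200.0),              # monovalent or divalent salt ions
    "tris_mM": (10.0, 200.0),             # Tris buffer
    "organic_solvent_pct": (0.01, 10.0),  # DMSO or glycerol, mass content
    "pH": (7.0, 9.0),
    "temperature_C": (50.0, 65.0),
    "time_min": (1.0, 60.0),
}


@dataclass(frozen=True)
class PolymerizationConditions:
    salt_mM: float
    tris_mM: float
    organic_solvent_pct: float
    pH: float
    temperature_C: float
    time_min: float

    def out_of_range(self):
        """Return the names of parameters that fall outside the quoted ranges."""
        return [
            name
            for name, (low, high) in RANGES.items()
            if not low <= getattr(self, name) <= high
        ]


ok = PolymerizationConditions(50, 20, 5, 7.8, 55, 10)
bad = PolymerizationConditions(50, 20, 5, 9.5, 70, 10)
```

Here `ok.out_of_range()` returns an empty list, while `bad.out_of_range()` reports the pH and the temperature.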
Base derivatives
In the method of the present invention, the four compounds used in step (2) are derivatives of nucleotides A, (T/U), C and G, respectively. In certain exemplary embodiments, the four compounds are derivatives of ribonucleotides or deoxyribonucleotides A, T, C and G, respectively. In certain exemplary embodiments, the four compounds are derivatives of ribonucleotides or deoxyribonucleotides A, U, C and G, respectively. Particularly advantageously, the four compounds do not chemically react with each other during the nucleotide polymerization reaction.
In addition, the four compounds have base complementary pairing ability. For example, when the compound is a derivative of nucleotide A, it will be capable of pairing with base T or U. When the compound is a derivative of nucleotide T or U, it will be capable of pairing with base A. When the compound is a derivative of nucleotide C, it will be capable of pairing with base G. When the compound is a derivative of nucleotide G, it will be capable of pairing with base C. Thus, in step (3), a polymerase (e.g., a DNA polymerase) will incorporate compounds capable of complementary pairing with bases at corresponding positions in the template nucleic acid into the 3' end of the growing nucleic acid strand according to the base complementary pairing rules. Accordingly, after the type of compound incorporated into the 3' end of a growing nucleic acid strand is determined by the signal emitted by a detectable label (e.g., a fluorophore), the type of base at the corresponding position in the template nucleic acid can be determined by the base complementary pairing rules. For example, if a compound incorporated at the 3' end of a growing nucleic acid strand is determined to be a derivative of nucleotide A, then the base at the corresponding position in the template nucleic acid can be determined to be T or U. If the compound incorporated at the 3' end of the growing nucleic acid strand is determined to be a derivative of nucleotide T or U, then the base at the corresponding position in the template nucleic acid can be determined to be A. If the compound incorporated at the 3' end of the growing nucleic acid strand is determined to be a derivative of nucleotide C, then the base at the corresponding position in the template nucleic acid can be determined to be G. If the compound incorporated at the 3' end of the growing nucleic acid strand is determined to be a derivative of nucleotide G, then the base at the corresponding position in the template nucleic acid can be determined to be C.
In the present invention, the hydroxyl group (-OH) at the 3' position of the ribose or deoxyribose of each of the four compounds is protected. In other words, the hydroxyl groups (-OH) at the 3' positions of the ribose or deoxyribose sugars of the four compounds are protected by a protecting group, so that the compounds can terminate polymerization by a polymerase, such as a DNA polymerase. For example, when any of the four compounds is introduced into the 3' end of a growing nucleic acid strand, the polymerase will not be able to proceed with the next round of polymerization because there is no free hydroxyl group (-OH) at the 3' position of the ribose or deoxyribose of the compound, and the polymerization reaction will be terminated. In this case, one and only one base will be incorporated into the growing nucleic acid strand in each round of polymerization.
Furthermore, in certain embodiments, the protecting group at the 3' position of the ribose or deoxyribose of the four compounds can be removed. In step (5), the protecting group is removed and converted to a free hydroxyl group (-OH). Subsequently, the growing nucleic acid strand can be subjected to the next round of polymerization reaction using the polymerase and the four compounds, and one base is introduced again.
Thus, in certain embodiments, the four compounds used in step (2) are reversibly blocked: when they are incorporated into the 3' end of a growing nucleic acid strand (e.g. in step (3)), they will terminate the polymerase from continuing polymerization, terminating further extension of the growing nucleic acid strand; and, after the blocking groups they contain are removed, the polymerase will be able to continue to polymerize the growing nucleic acid strand, continuing to extend the nucleic acid strand.
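The reversible-blocking behaviour described above can be sketched as a small state model in Python. This is an illustration under simplifying assumptions (a DNA template, error-free incorporation, and a template indexed in the order it is copied); the class and function names are hypothetical.

```python
# Watson-Crick pairing, assuming a DNA template and a DNA product.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


class GrowingStrand:
    """Growing strand whose 3' end is blocked after every incorporation."""

    def __init__(self):
        self.bases = []
        self.blocked = False  # True while the 3' position carries a protecting group

    def incorporate(self, base):
        # A blocked 3' end cannot be extended: at most one base per round.
        if self.blocked:
            raise RuntimeError("3' end is blocked; remove the protecting group first")
        self.bases.append(base)
        self.blocked = True

    def cleave(self):
        # Removing the protecting group restores a free 3'-OH.
        self.blocked = False


def run_cycles(template, n_cycles):
    """Run incorporate -> detect -> cleave rounds and return the detected bases."""
    strand = GrowingStrand()
    calls = []
    for position in range(min(n_cycles, len(template))):
        strand.incorporate(COMPLEMENT[template[position]])  # polymerization
        calls.append(strand.bases[-1])                      # signal detection
        strand.cleave()                                     # removal of the block
    return "".join(calls)


extended = run_cycles("TACG", 4)
```

Calling `incorporate` twice without an intervening `cleave` raises an error, which mirrors the requirement that only one compound is added per round.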
Determination of Compounds incorporated into growing nucleic acid strands
In the sequencing method of the invention, after each round of polymerization, a detectable signal is detected in the duplex or growing nucleic acid strand. By detection, the type of compound incorporated into the growing nucleic acid strand can be determined, thereby determining the base type at the corresponding position in the nucleic acid molecule to be sequenced.
In certain preferred embodiments, the detectable label is a fluorophore.
In certain preferred embodiments, the sequencing method of the present invention further comprises, after step (4), determining the base type at the corresponding position in the nucleic acid molecule to be sequenced based on the base complementary pairing principle according to the type of compound incorporated into the 3' end of the growing nucleic acid strand in step (3). For example, if the compound incorporated at the 3' end of a growing nucleic acid strand is determined to be a derivative of nucleotide A, then the base at the corresponding position in the nucleic acid molecule to be sequenced can be determined to be a base (e.g., T or U) capable of pairing with the derivative of nucleotide A.
More specifically, if the compound incorporated at the 3' end of a growing nucleic acid strand is determined to be a derivative of nucleotide A, it is possible to determine that the base at the corresponding position in the nucleic acid molecule to be sequenced is T or U. If the compound incorporated at the 3' end of the growing nucleic acid strand is determined to be a derivative of a nucleotide T or U, then the base at the corresponding position in the nucleic acid molecule to be sequenced can be determined to be A. If the compound incorporated at the 3' end of the growing nucleic acid strand is determined to be a derivative of nucleotide C, then the base at the corresponding position in the nucleic acid molecule to be sequenced can be determined to be G. If the compound incorporated at the 3' end of the growing nucleic acid strand is determined to be a derivative of nucleotide G, then the base at the corresponding position in the nucleic acid molecule to be sequenced can be determined to be C.
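The determination described above amounts to two lookups: from the detected label to the incorporated compound, and from that compound to its pairing partner. The following Python sketch assumes a hypothetical assignment of four distinguishable labels (`dye_1` to `dye_4`) to the four compounds; these label names and the function name are not part of this disclosure.

```python
# Hypothetical assignment of four distinguishable labels to the four compounds.
LABEL_TO_COMPOUND = {"dye_1": "A", "dye_2": "T", "dye_3": "C", "dye_4": "G"}

# Bases in the molecule being sequenced that pair with each incorporated derivative.
PAIRS_WITH = {
    "A": ("T", "U"),
    "T": ("A",),
    "U": ("A",),
    "C": ("G",),
    "G": ("C",),
}


def call_template_base(label, template_is_rna=False):
    """Infer the base at the corresponding template position from the label."""
    compound = LABEL_TO_COMPOUND[label]
    candidates = PAIRS_WITH[compound]
    # A derivative of A pairs with T in a DNA template and with U in an RNA template.
    if len(candidates) == 2:
        return candidates[1] if template_is_rna else candidates[0]
    return candidates[0]


calls = [call_template_base(label) for label in ("dye_1", "dye_2", "dye_3", "dye_4")]
```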
Treatment of duplexes or growing nucleic acid strands
In the sequencing method of the invention, each round of polymerization may involve one signal detection and, except for the last round, one excision treatment of the duplex or growing nucleic acid strand. After the final round of polymerization, the duplex or growing nucleic acid strand may or may not be subjected to excision treatment.
In certain embodiments, the treatment in step (5) may be used to remove blocking groups incorporated into the compound at the 3' end of the growing nucleic acid strand (so that a new round of polymerization may begin) and to remove detectable labels that may be carried on the duplex or growing nucleic acid strand (so that interference with subsequent detection may be avoided).
In certain embodiments, the cleavage reagent used to cleave the blocking group and the cleavage reagent used to cleave the detectable label are the same reagent, or, although different reagents, the cleavage can be performed under the same conditions without interference between the two cleavage reactions. Thus, in step (5), cleavage of the blocking group and cleavage of the detectable label may be performed simultaneously.
In certain embodiments, the cleavage reagent used to cleave the blocking group and the cleavage reagent used to cleave the detectable label are different reagents and cleavage is desired under different conditions. To avoid interference between the cleavage reactions, the cleavage of the blocking group and the cleavage of the detectable label can be performed in steps.
In certain embodiments, for example, where the at least one complementary nucleotide incorporated is a compound according to formula (I'), step (5) comprises:
step (5-1): adding a first cleaving reagent to contact the duplex or the growing nucleic acid strand with the first cleaving reagent in a reaction system comprising a solution phase and a solid phase to cleave the phosphodiester bond (1) in the compound incorporated into the 3' end of the growing nucleic acid strand without affecting the phosphodiester bond on the backbone of the duplex;
step (5-2): adding a second cleaving agent, contacting said duplex or said growing nucleic acid strand with said second cleaving agent in a reaction system comprising a solution phase and a solid phase, and cleaving phosphodiester bonds (2) in the compound incorporated into the 3' end of the growing nucleic acid strand without affecting the phosphodiester bonds on the duplex backbone.
It should be noted that there is no fixed sequence between step (5-1) and step (5-2), and step (5-1) may be performed first, or step (5-2) may be performed first. The use of "first" and "second" merely distinguishes between cleavage reagents and does not indicate the order of use of the two cleavage reagents.
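The order independence of the two steps can be illustrated with a small Python model. It assumes, consistent with the description of the cleavage reagents in this disclosure, that bond (1) holds the blocking group and bond (2) holds the detectable label, and that the two cleavages do not interfere with each other; the class and function names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class IncorporatedCompound:
    blocked: bool = True  # blocking group still attached through bond (1)
    labeled: bool = True  # detectable label still attached through bond (2)


def cleave_bond_1(compound):
    """Step (5-1): the first cleaving reagent removes the blocking group."""
    return replace(compound, blocked=False)


def cleave_bond_2(compound):
    """Step (5-2): the second cleaving reagent removes the detectable label."""
    return replace(compound, labeled=False)


start = IncorporatedCompound()
order_a = cleave_bond_2(cleave_bond_1(start))  # (5-1) first, then (5-2)
order_b = cleave_bond_1(cleave_bond_2(start))  # (5-2) first, then (5-1)
```

Both orders end in the same state, with neither the blocking group nor the label remaining on the incorporated compound.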
The time for the excision reaction can be determined according to actual needs, and can be, for example, 1min-5min, 5min-10min, 10min-30min or 30min-1 h.
Excision reagent
In the present invention, the blocking group and/or detectable label in the compound of formula (I) is removed using a cleavage reagent. When the compound represented by the general formula (I ') is used, the cleavage reagent of the present invention is capable of cleaving the phosphodiester bond (1) and/or the phosphodiester bond (2) in the compound of the general formula (I') and does not cleave the phosphodiester bond in the backbone of the nucleic acid chain to maintain the integrity of the backbone of the nucleic acid chain.
In the present invention, cleavage reagents useful for removing blocking groups include, but are not limited to, endonuclease IV. The endonuclease IV can selectively cleave the phosphodiester bond (1) in the compound represented by the general formula (I') without affecting the phosphodiester bond in the backbone of the nucleic acid strand.
In the present invention, cleavage reagents that can be used to remove the detectable label include, but are not limited to, alkaline phosphatase. The alkaline phosphatase can selectively cleave the phosphodiester bond (2) in the compound represented by the general formula (I') without affecting the phosphodiester bond in the backbone of the nucleic acid strand.
The conditions of the cleavage reaction depend on the cleavage reagent. For example, when the cleavage reagent used is a commercially available enzyme, the conditions of the cleavage reaction can be determined according to the use conditions recommended by the supplier (e.g., recommended buffer solution, temperature, etc.).
Washing step
In the sequencing method of the present invention, a washing step may be added as necessary. The washing step may be added at any desired stage, and optionally, the washing step may be performed one or more times.
For example, in step (4), after the solution phase of the reaction system is removed, one or more washes may be performed to sufficiently remove the remaining solution phase. Such a washing step may be advantageous, as it can sufficiently remove free compounds carrying the detectable label (i.e., compounds not incorporated into the growing nucleic acid strand), thereby minimizing non-specific signals.
Similarly, in step (6), after removing the solution phase of the reaction system, one or more washes may be performed to sufficiently remove the remaining solution phase. Such a washing step may be advantageous, which may serve to sufficiently remove the cleavage reagent applied in step (5), thereby minimizing adverse effects on subsequent reactions.
The washing step can be carried out using various suitable washing solutions. Examples of such wash solutions include, but are not limited to, phosphate buffer, citrate buffer, Tris-HCl buffer, acetate buffer, carbonate buffer, and the like. It is within the ability of the skilled person to select a suitable washing solution (including suitable ingredients, concentrations, ionic strength, pH etc.) according to the actual need.
(III) Kit
In one aspect, the invention provides a kit comprising first, second, third and fourth compounds, each of which is a compound of general formula (I) as defined above, said four compounds being derivatives of nucleotides A, (T/U), C and G, respectively, and having base complementary pairing ability.
In certain embodiments, the labels in the four compound structures are different, e.g., different fluorophores.
In certain embodiments, the kits of the invention further comprise: reagents and/or devices for extracting nucleic acid molecules from a sample; reagents for pretreating nucleic acid molecules; a support for attaching nucleic acid molecules to be sequenced; reagents for attaching (e.g., covalently or non-covalently attaching) a nucleic acid molecule to be sequenced to a support; a primer for initiating a nucleotide polymerization reaction; a polymerase for performing a nucleotide polymerization reaction; one or more buffer solutions; one or more wash solutions; or any combination thereof.
In certain embodiments, the kits of the invention further comprise reagents and/or devices for extracting nucleic acid molecules from a sample. Methods for extracting nucleic acid molecules from a sample are well known in the art. Thus, various reagents and/or devices for extracting nucleic acid molecules, such as a reagent for disrupting cells, a reagent for precipitating DNA, a reagent for washing DNA, a reagent for solubilizing DNA, a reagent for precipitating RNA, a reagent for washing RNA, a reagent for solubilizing RNA, a reagent for removing protein, a reagent for removing DNA (e.g., when the nucleic acid molecule of interest is RNA), a reagent for removing RNA (e.g., when the nucleic acid molecule of interest is DNA), and any combination thereof, may be provided in the kit of the present invention, as desired.
In certain embodiments, the kits of the invention further comprise reagents for pretreating the nucleic acid molecules. In the kit of the present invention, the reagent for pretreating nucleic acid molecules is not particularly limited and may be selected according to actual needs. The reagent for pretreating a nucleic acid molecule includes, for example, a reagent for fragmenting a nucleic acid molecule (e.g., DNase I), a reagent for filling in the ends of a nucleic acid molecule (e.g., a DNA polymerase such as T4 DNA polymerase, Pfu DNA polymerase, Klenow DNA polymerase), a linker molecule, a tag molecule, a reagent for linking a linker molecule to a nucleic acid molecule of interest (e.g., a ligase such as T4 DNA ligase), a reagent for repairing a nick in a nucleic acid (e.g., a DNA polymerase that lacks 3'-5' exonuclease activity but exhibits 5'-3' exonuclease activity), a reagent for amplifying a nucleic acid molecule (e.g., DNA polymerase, primers, dNTPs), a reagent for isolating and purifying a nucleic acid molecule (e.g., a chromatography column), and any combination thereof.
In certain embodiments, the kits of the invention further comprise a support for attaching nucleic acid molecules to be sequenced. The support may have any of the technical features described in detail above for the support, as well as any combination thereof.
For example, in the present invention, the support may be made of various suitable materials. Such materials include, for example: minerals, natural polymers, synthetic polymers, and any combination thereof. Specific examples include, but are not limited to: cellulose, cellulose derivatives (e.g., nitrocellulose), acrylic resins, glass, silica gel, polystyrene, gelatin, polyvinylpyrrolidone, copolymers of vinyl and acrylamide, polystyrene crosslinked with divinylbenzene and the like (see, e.g., Merrifield, Biochemistry 1964, 3, 1385), agarose gel (Sepharose™), and other supports known to those skilled in the art.
In certain preferred embodiments, the support for attaching nucleic acid molecules to be sequenced can be a solid support comprising an inert substrate or matrix (e.g., a glass slide, a polymer bead, etc.) that has been functionalized, for example, by application of an intermediate material containing reactive groups that allow covalent attachment of biomolecules such as polynucleotides. Examples of such supports include, but are not limited to, polyacrylamide hydrogels supported on an inert substrate such as glass, particularly polyacrylamide hydrogels described in WO 2005/065814 and US 2008/0280773, the contents of which are incorporated herein by reference in their entirety. In such embodiments, the biomolecule (e.g., polynucleotide) can be directly covalently attached to the intermediate material (e.g., hydrogel), while the intermediate material itself can be non-covalently attached to a substrate or matrix (e.g., glass substrate). In certain preferred embodiments, the support is a glass slide or silicon wafer having a surface modified with a layer of avidin, amino, acrylamide silane, or aldehyde based chemical groups.
In the present invention, the support or solid support is not limited in its size, shape and configuration. In some embodiments, the support or solid support is a planar structure, such as a slide, chip, microchip and/or array. The surface of such a support may be in the form of a planar layer. In some embodiments, the support or surface thereof is non-planar, such as an inner or outer surface of a tube or container. In some embodiments, the support or solid support comprises a microsphere or bead. In certain preferred embodiments, the support for attaching the nucleic acid molecules to be sequenced is an array of beads or wells.
In certain preferred embodiments, the kits of the invention further comprise reagents for attaching (e.g., covalently or non-covalently) the nucleic acid molecule to be sequenced to a support. Such reagents include, for example, reagents that activate or modify a nucleic acid molecule (e.g., at its 5' end), such as a phosphate, thiol, amine, carboxylic acid, or aldehyde; reagents for activating or modifying the surface of the support, such as amino-alkoxysilanes (e.g., aminopropyltrimethoxysilane, aminopropyltriethoxysilane, 4-aminobutyltriethoxysilane, etc.); crosslinkers, such as succinyl anhydride, phenyl diisothiocyanate (Guo et al, 1994), maleic anhydride (Yang et al, 1998), 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide hydrochloride (EDC), N-hydroxysuccinimide ester of m-maleimidobenzoic acid (MBS), N-succinimidyl [4-iodoacetyl]aminobenzoic acid (SIAB), 4-(N-maleimidomethyl)cyclohexane-1-carboxylic acid succinimide (SMCC), N-gamma-maleimidobutyryloxy-succinimide ester (GMBS), 4-(p-maleimidophenyl)butyric acid succinimide (SMPB); and any combination thereof.
In certain preferred embodiments, the kits of the invention further comprise primers for initiating a nucleotide polymerization reaction. In the present invention, the primer is not additionally limited as long as it can specifically anneal to a region of the target nucleic acid molecule. In some exemplary embodiments, the length of the primer may be 5-50bp, such as 5-10, 10-15, 15-20, 20-25, 25-30, 30-35, 35-40, 40-45, 45-50 bp. In some exemplary embodiments, the primer may comprise a naturally occurring or non-naturally occurring nucleotide. In some exemplary embodiments, the primer comprises or consists of a naturally occurring nucleotide. In some exemplary embodiments, the primer comprises a modified nucleotide, such as a Locked Nucleic Acid (LNA). In certain preferred embodiments, the primer comprises a universal primer sequence.
In certain preferred embodiments, the kits of the invention further comprise a polymerase for performing a nucleotide polymerization reaction. In the present invention, polymerization can be carried out using various suitable polymerases. In some exemplary embodiments, the polymerase is capable of synthesizing a new DNA strand using DNA as a template (e.g., a DNA polymerase). In some exemplary embodiments, the polymerase is capable of synthesizing a new DNA strand using RNA as a template (e.g., a reverse transcriptase). In some exemplary embodiments, the polymerase is capable of synthesizing a new RNA strand using DNA or RNA as a template (e.g., an RNA polymerase). Thus, in certain preferred embodiments, the polymerase is selected from the group consisting of a DNA polymerase, an RNA polymerase, and a reverse transcriptase.
In certain preferred embodiments, the kits of the present invention comprise KOD polymerase or a mutant thereof. In certain preferred embodiments, the mutant is selected from the group consisting of KOD POL151, KOD POL157, KOD POL171, KOD POL174, KOD POL376, and KOD POL391. In certain preferred embodiments, the kits of the invention comprise KOD POL391, KOD POL171, or a combination thereof.
In certain preferred embodiments, the kits of the invention further comprise a cleavage reagent that is capable of cleaving the phosphodiester bond (1) and/or phosphodiester bond (2) in formula (I') and does not affect the phosphodiester bond on the duplex backbone.
In certain preferred embodiments, the cleavage agent is selected from endonuclease IV and alkaline phosphatase.
In certain preferred embodiments, the kits of the invention further comprise one or more buffer solutions. Such buffers include, but are not limited to, buffer solutions for DNase I, buffer solutions for DNA polymerase, buffer solutions for ligase, buffer solutions for eluting nucleic acid molecules, buffer solutions for solubilizing nucleic acid molecules, buffer solutions for performing nucleotide polymerization reactions (e.g., PCR), and buffer solutions for performing ligation reactions. The kit of the present invention may comprise any one or more of the above-described buffer solutions.
In certain embodiments, the buffer solution for a DNA polymerase comprises monovalent salt ions (e.g., sodium ions, chloride ions) and/or divalent salt ions (e.g., magnesium ions, sulfate ions). In certain embodiments, the concentration of the monovalent or divalent salt ion in the buffer solution is 1-200mM, e.g., 1mM, 3mM, 10mM, 20mM, 50mM, 100mM, 150mM, or 200 mM.
In certain embodiments, the buffer solution for a DNA polymerase comprises Tris. In certain embodiments, the concentration of Tris in the buffer solution is 10mM-200mM, e.g., 10mM, 20mM, 50mM, 100mM, 150mM, or 200 mM.
In certain embodiments, the buffer solution for a DNA polymerase comprises an organic solvent, such as DMSO or glycerol. In certain embodiments, the organic solvent is present in the buffer solution in an amount of 0.01% to 10% by mass, e.g., 0.01%, 0.02%, 0.05%, 1%, 2%, 5%, or 10%.
In certain embodiments, the pH of the buffer solution for a DNA polymerase is 7.0-9.0, e.g., 7.0, 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8, 7.9, 8.0, 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, 8.9, or 9.0.
In certain embodiments, the buffer solution for a DNA polymerase comprises: monovalent salt ions (e.g., sodium ions, chloride ions), divalent salt ions (e.g., magnesium ions, sulfate ions), Tris, and organic solvents (e.g., DMSO or glycerol). In certain embodiments, the pH of the buffer solution phase is 7.8.
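The ranges stated above can be collected into a simple consistency check. The following is a minimal sketch only: the class name `PolymeraseBuffer`, the function `out_of_range` and the field names are hypothetical, the salt concentration is treated as the concentration of the corresponding ion, and the percentage of organic solvent is assumed to be on the same basis as the range given above. The example recipe uses the optimized conditions reported in Example 3.

```python
from dataclasses import dataclass, field


@dataclass
class PolymeraseBuffer:
    """Hypothetical container for a DNA polymerase buffer recipe."""
    ph: float
    tris_mM: float
    organic_solvent_pct: float            # e.g. DMSO or glycerol, in percent
    salts_mM: dict = field(default_factory=dict)  # salt name -> concentration in mM


def out_of_range(buf: PolymeraseBuffer) -> list:
    """Return a list of range violations; an empty list means all values fit."""
    problems = []
    if not 7.0 <= buf.ph <= 9.0:
        problems.append(f"pH {buf.ph} outside 7.0-9.0")
    if not 10 <= buf.tris_mM <= 200:
        problems.append(f"Tris {buf.tris_mM} mM outside 10-200 mM")
    if not 0.01 <= buf.organic_solvent_pct <= 10:
        problems.append(f"organic solvent {buf.organic_solvent_pct}% outside 0.01-10%")
    for name, conc in buf.salts_mM.items():
        if not 1 <= conc <= 200:
            problems.append(f"{name} {conc} mM outside 1-200 mM")
    return problems


# Optimized conditions from Example 3: pH 7.8, 20 mM NaCl, 3 mM MgSO4,
# 50 mM Tris, 5% DMSO (Tween-20 is not covered by the ranges above).
optimized = PolymeraseBuffer(
    ph=7.8,
    tris_mM=50,
    organic_solvent_pct=5,
    salts_mM={"NaCl": 20, "MgSO4": 3},
)
print(out_of_range(optimized))
```

A recipe with, for instance, pH 9.5 would return a single violation message rather than raising an error, which keeps the check usable for screening several candidate recipes at once.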
In certain preferred embodiments, the kits of the invention comprise one or more cleavage reagents that are capable of cleaving the phosphodiester bonds (1) and/or phosphodiester bonds (2) in the compound of formula (I') and that do not cleave the phosphodiester bonds in the backbone of the nucleic acid strand to maintain the integrity of the backbone of the nucleic acid strand.
In certain embodiments, the cleavage reagent is selected from the group consisting of endonuclease IV and alkaline phosphatase.
In certain preferred embodiments, the kits of the invention further comprise one or more wash solutions. Examples of such wash solutions include, but are not limited to, phosphate buffer, citrate buffer, Tris-HCl buffer, acetate buffer, carbonate buffer, and the like. The kit of the invention may comprise any one or more of the above-described wash solutions.
Use of the modified nucleoside or nucleotide, and of the kit
The modified nucleoside or nucleotide of the present invention can be used for determining the sequence of a target single-stranded polynucleotide.
Accordingly, the invention also provides the use of a compound as defined in any one of the above, and a kit as defined in any one of the above, for determining the sequence of a target single-stranded polynucleotide.
Advantageous effects of the invention
Compared with the prior art, the technical scheme of the invention has the following beneficial effects:
The modified nucleoside or nucleotide provided by the invention has high stability in sequencing. The blocking group and the detectable label that it carries can be excised under mild conditions without damaging the DNA, with high excision efficiency and even complete excision. The modified nucleoside or nucleotide provided by the invention can be polymerized by commercially available DNA polymerases with acceptable polymerization efficiency, and can even be completely polymerized under suitable conditions. In addition, the ester protecting groups on the phosphate groups do not need to be removed beforehand and can be excised directly, which simplifies the excision method.
Embodiments of the present invention will be described in detail below with reference to the drawings and examples, but those skilled in the art will understand that the following drawings and examples are only for illustrating the present invention and do not limit the scope of the present invention. Various objects and advantageous aspects of the present invention will become apparent to those skilled in the art from the accompanying drawings and the following detailed description of the preferred embodiments.
Detailed Description
The invention will now be described with reference to the following examples, which are given by way of illustration and are not intended to limit the scope of the invention as claimed.
Example 1 Using base T as an example and commercially available compound I-1 as the starting material, dTTP carrying a Cy3 fluorophore and having a blocked 3'-OH was prepared.
(1) Conversion of Compound I-1 to Compound II-1
Figure GDA0002274171490000461
First, 3 mmol of methanol and 3 mmol of tetrazole were added to 10 mL of anhydrous acetonitrile, then 5 mL of an anhydrous acetonitrile solution in which 0.88 g (1 mmol) of compound I-1 was dissolved was slowly added, and the reaction was stirred at room temperature for 1 hour. Then, 10 mL of a 0.15 M iodine solution (containing 1.5 mmol of iodine; solvent: water/pyridine/tetrahydrofuran, volume ratio 2/20/78) was added, and the mixture was stirred at room temperature for 30 minutes. Thereafter, a 5% sodium sulfite solution was slowly added until the color of elemental iodine in the solution disappeared. 100 mL of dichloromethane and 100 mL of saturated sodium chloride solution were added and the organic phase was separated; another 100 mL of dichloromethane was added to the aqueous phase and the organic phase was separated again. The organic phases were combined and dried over magnesium sulfate, and the solvent was removed by rotary evaporation. The product was separated and purified by silica gel column chromatography using ethyl acetate/n-hexane (volume ratio 1:1) as the solvent to give compound II-1. Yield: 95% (0.78 g).
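The iodine amounts quoted in the oxidation steps follow directly from concentration times volume. The short sketch below (the function name `millimoles` and the dictionary layout are illustrative, not part of the original disclosure) recomputes the iodine content for the three oxidation steps of this example, namely steps (1), (3) and (6), and expresses it as equivalents relative to the substrate amounts stated in those steps.

```python
def millimoles(conc_molar: float, volume_ml: float) -> float:
    """Amount of solute in mmol: (mol/L) x (mL) = mmol."""
    return conc_molar * volume_ml


IODINE_CONC_M = 0.15  # 0.15 M iodine in water/pyridine/THF (2/20/78)

# step -> (volume of iodine solution in mL, substrate amount in mmol)
oxidation_steps = {
    "(1)": (10.0, 1.0),   # compound I-1
    "(3)": (1.0, 0.1),    # compound III-1
    "(6)": (5.0, 0.5),    # compound VI-1
}

for step, (volume_ml, substrate_mmol) in oxidation_steps.items():
    iodine_mmol = millimoles(IODINE_CONC_M, volume_ml)
    equivalents = iodine_mmol / substrate_mmol
    print(f"step {step}: {iodine_mmol:.2f} mmol iodine, "
          f"{equivalents:.1f} equivalents")
```

Under these figures each oxidation uses the same ratio of iodine to substrate, which is a convenient sanity check when scaling a step up or down.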
Nuclear magnetic and mass spectral data for compound II-1: 1H NMR (300 MHz, CDCl3): δ (ppm) 7.84 (s, 1H), 7.3-7.18 (m, 9H), 6.77 (m, 4H), 6.40 (dd, 1H), 5.05 (s, 1H), 4.21 (s, 1H), 4.09 (m, 3H), 3.75 (s, 6H), 3.47 (m, 1H), 3.32 (d, 1H), 2.67 (m, 1H), 2.60 (m, 2H), 2.36 (m, 1H); 31P NMR (163 MHz, CDCl3): δ (ppm) -2.48, -2.66; LC-MS (ESI): m/z 827.2 (M+H+).
(2) Conversion of Compound II-1 to Compound III-1
Figure GDA0002274171490000471
To a solution of 3% trichloroacetic acid in dichloromethane (5 mL) was added 0.82 g (1 mmol) of compound II-1, and the mixture was stirred at room temperature for 3 hours. The solution was concentrated by rotary evaporation to a volume of about 1 mL and purified by silica gel column chromatography (eluent ratio 4:1) to give compound III-1. Yield: 92% (0.48 g).
Nuclear magnetic and mass spectral data for compound III-1: 1H NMR (300 MHz, CDCl3): δ (ppm) 8.24 (s, 1H), 6.26 (dd, 1H), 5.39–5.34 (m, 1H), 4.23–4.13 (m, 3H), 4.09 (m, 3H), 3.96 (dd, 1H), 3.89 (dd, 1H), 3.80-3.70 (s, 6H), 2.54–2.47 (m, 1H), 2.33–2.42 (m, 1H′); LC-MS (ESI): m/z 525.4 (M+H+).
(3) Conversion of Compound III-1 to Compound IV-1
Figure GDA0002274171490000472
Pyridine was added to 52.4 mg (0.1 mmol) of compound III-1 and removed under high vacuum; this was repeated several times with fresh anhydrous pyridine. To the dried solid were added 800 μL of 1,4-dioxane and 300 μL of anhydrous pyridine. Under argon protection, 100 μL of 1,4-dioxane in which 0.11 mmol (22 mg) of 2-chloro-4H-1,3,2-benzodioxaphosphorin-4-one was dissolved was added, and the mixture was stirred at room temperature for 10 minutes. Then, 300 μL of N,N-dimethylformamide in which 1.5 equivalents (0.15 mmol, 71.6 mg) of tri-n-butylammonium pyrophosphate and 100 μL of n-butylamine were dissolved was added, and the mixture was stirred at room temperature for 15 minutes. Then, 1 mL of a 0.15 M iodine solution (containing 0.15 mmol of iodine; solvent: water/pyridine/tetrahydrofuran, 2/20/78) was added, and the mixture was stirred at room temperature for 30 minutes. A 5% sodium sulfite solution was then slowly added until the color of elemental iodine in the solution disappeared. All solvent was removed by rotary evaporation, 5 mL of water was added to dissolve the residual solid, 5 mL of 30% aqueous ammonia was added, and the mixture was stirred at room temperature for 3 hours. All solvent was again removed by rotary evaporation, the residual solid was dissolved in a minimal amount of water, and the product was isolated by preparative thin layer chromatography using 6:4:1 1,4-dioxane/water/ammonia as the solvent to give compound IV-1. Yield: 30% (18 mg).
Mass spectrometry data for compound IV-1: MS (MALDI-): m/z 614.6 (M-1).
The compound was analyzed by high performance liquid chromatography (HPLC) using 1:1 acetonitrile/water as the mobile phase; the elution time is shown in FIG. 1.
(4) Synthesis of Compound V-1
Figure GDA0002274171490000481
To 380 mg (5 mmol) of 2-hydroxyacetic acid was added 50 mL of DMF, followed by 1.8 g (6 mmol) of O-(N-succinimidyl)-N,N,N',N'-tetramethyluronium tetrafluoroborate and 1.04 mL of N,N-diisopropylethylamine, and the mixture was stirred at room temperature for 30 minutes. Then, 780 mg of N-ethylenediamine trifluoroacetamide was added, and the mixture was stirred at room temperature for 2 hours. All solvent was removed by rotary evaporation, and the remaining solid was dissolved in 100 mL of ethyl acetate. The solution was extracted with 0.1 M hydrochloric acid (2 × 100 mL) and then with saturated sodium carbonate solution (2 × 150 mL), dried over magnesium sulfate and filtered, and the solvent was removed by rotary evaporation to give compound V-1. Yield: 75% (0.8 g).
Nuclear magnetic and mass spectral data for compound V-1: 1H NMR (300 MHz, CDCl3): δ (ppm) 4.05 (s, 2H), 3.40-3.30 (m, 4H); LC-MS (ESI): m/z 215.2 (M+H+).
(5) Conversion of Compound V-1 to Compound VI-1
Figure GDA0002274171490000491
214 mg (1 mmol) of compound V-1 and 205 μL of N,N-diisopropylethylamine were added to 10 mL of anhydrous dichloromethane, then 2 mL of anhydrous dichloromethane in which 283 mg (1.2 mmol) of 2-cyanoethyl N,N-diisopropylphosphoramidite was dissolved was added at 0℃, and stirring was continued at 0℃ for 10 minutes. The temperature was slowly raised to room temperature and the reaction was continued for 2 hours. Then, 20 mL of dichloromethane and 50 mL of saturated sodium bicarbonate solution were added; after extraction, the organic phase was dried over magnesium sulfate and the solvent was removed. The product was separated on a silica gel column using 1:1 ethyl acetate/n-hexane as the eluent to give compound VI-1. Yield: 85% (352 mg).
Nuclear magnetic and Mass Spectrometry data for Compound VI-1:1H NMR(300MHz,CDCl3):δ(ppm)4.16(s,2H),3.8(m,2H),3.40-3.30(m,6H),3.10(t,2H),1.16(d,12H)LC-MS(ESI):m/z 415.3(M+H+).
(6) Conversion of Compound VI-1 to Compound VII-1
Figure GDA0002274171490000492
207mg (0.5mmol) of the compound VI-1 was dissolved in 3mL of anhydrous acetonitrile, and the solution was slowly added to anhydrous acetonitrile (7mL) containing 70mg (1mmol) of tetrazole and 239mg (1mmol) of tert-butyl 2-nitro-5-hydroxy-benzoate, and the mixture was stirred at room temperature for 30 minutes. Then 5mL of iodine solution (water/pyridine/tetrahydrofuran (2/20/78)) with iodine concentration of 0.15M and iodine content of 0.75mmol was added, and the mixture was stirred at room temperature for 30 minutes. Then, a 5% sodium sulfite solution was slowly added until the color of elemental iodine in the solution disappeared. All solvents were removed by rotary evaporation, 5mL of water was added to dissolve the remaining solid, 5mL of 2M potassium hydroxide was added, and the mixture was stirred at room temperature for 2 hours. The pH of the solution was adjusted to be slightly acidic, the solids were dissolved using a large amount of methanol, and then the solvent was removed by rotary evaporation. The remaining solid was recrystallized using water as solvent to give compound VII-1 in yield: 36% (131 mg).
Nuclear magnetic and mass spectral data for compound VII-1: 1H NMR (300 MHz, CDCl3): δ (ppm) 8.23-8.11 (m, 2H), 7.64 (m, 1H), 4.45 (s, 2H), 3.40-3.30 (t, 2H), 2.90 (t, 2H); LC-MS (ESI): m/z 362.1 (M-H-).
(7) Fluorescent labeling of Compound VII-1 to give Compound VIII-1
Figure GDA0002274171490000501
12.8mg (20. mu. mol) of N-hydroxysuccinimide ester of Cy3 was dissolved in 1mL of DMF, and 4.2. mu.L (24. mu. mol) of N, N-diisopropylethylamine and 8.7mg (24. mu. mol) of compound VII-1 in DMF (1mL) were added thereto, stirred at room temperature, reacted for 2 hours, and then the solvent was removed by rotary evaporation. The remaining solid was dissolved by adding as little DMF as possible and then purified on HPLC using 100% acetonitrile as mobile phase to give compound VIII-1, yield: 95% (18.6 mg).
Mass spectrometry data for compound VIII-1: LC-MS (ESI): m/z 975.2 (M-H-).
(8) Linking of Compound IV-1 to Compound VIII-1 to give dTTP carrying a Cy3 fluorophore and a reversibly blocked 3'-OH
Figure GDA0002274171490000502
To 1 mL of DMF containing 9.8 mg (10 μmol) of compound VIII-1 were added 2 μL (12 μmol) of N,N-diisopropylethylamine and 3.6 mg (12 μmol) of O-(N-succinimidyl)-N,N,N',N'-tetramethyluronium tetrafluoroborate. After 30 minutes of reaction, 1 mL of a 0.1 M sodium bicarbonate solution containing 6.2 mg (10 μmol) of compound IV-1 was added, and the mixture was stirred at room temperature for 2 hours. All solvent was removed under high vacuum, the residue was dissolved in as little water as possible, and the product was purified by HPLC using 1:1 water/acetonitrile and 0.05 M triethylamine solution (pH 7.0) as the mobile phase to give Cy3-labeled dTTP with a blocked 3'-OH. Yield: 90% (14.1 mg).
Mass spectral data of the final product: LC-MS (ESI): m/z 1573.2 (M-H-); the HPLC chromatogram is shown in FIG. 2.
Example 2 Using base A as an example, a nucleotide (dATP) having a Cy3 fluorophore and 3' -OH blocked was prepared.
Referring to the procedures of example 1, (1) - (3), an intermediate (compound IV-2) for synthesizing dATP was prepared. The structure of compound IV-2 is as follows:
Figure GDA0002274171490000511
Mass spectrometry data for compound IV-2: MS (MALDI-): m/z 636.1 (M-1).
The compound was analyzed by high performance liquid chromatography using 1:1 acetonitrile/water as the mobile phase; the elution time is shown in FIG. 3.
Referring to the procedure of example 1, dATP blocked at 3' -OH with a Cy3 fluorophore was prepared.
Example 3 polymerization of modified dTTP
In this example, modified dTTP of the invention was polymerized using exemplary templates and primers, the polymerization efficiency was tested, and the effect of polymerase and polymerization conditions on polymerization efficiency was examined.
(1) Template and primer sequences
Template: 5'-CAACAGAAGGATTCTGGCGAACCGGAGGCTGAA-3' (SEQ ID NO:1)
Primer: 3'-TGTCTTCCTAAGACCGCTTGGCCTCCGACTT-5' (SEQ ID NO:2)
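The following sketch illustrates why dTTP is the first nucleotide incorporated on this template. It assumes, as an interpretation of the two sequences as written, that the 5' end of the primer pairs with the 3' end of the template, so that the unpaired bases at the 5' end of the template are copied next. The function name `next_incorporated_bases` is illustrative only.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

TEMPLATE_5_TO_3 = "CAACAGAAGGATTCTGGCGAACCGGAGGCTGAA"  # SEQ ID NO:1, written 5'->3'
PRIMER_3_TO_5 = "TGTCTTCCTAAGACCGCTTGGCCTCCGACTT"      # SEQ ID NO:2, written 3'->5'


def next_incorporated_bases(template_5_to_3: str, primer_3_to_5: str) -> str:
    """Return the bases added to the primer's 3' end, in order of incorporation.

    Assumption: the primer's 5' end is aligned with the template's 3' end.
    """
    overhang = len(template_5_to_3) - len(primer_3_to_5)
    if overhang < 0:
        raise ValueError("primer is longer than template")
    # Check that every primer base pairs with the template base opposite it.
    paired_region = template_5_to_3[overhang:]
    for position, (t, p) in enumerate(zip(paired_region, primer_3_to_5)):
        if COMPLEMENT[t] != p:
            raise ValueError(f"mismatch at paired position {position}")
    # The polymerase reads the template 3'->5', i.e. the overhang right to left.
    unpaired = template_5_to_3[:overhang]
    return "".join(COMPLEMENT[base] for base in reversed(unpaired))


print(next_incorporated_bases(TEMPLATE_5_TO_3, PRIMER_3_TO_5))  # prints "TG"
```

Under this alignment the first incorporated base is T and the second is G, which is consistent with testing dTTP in Example 3 and with adding Cy3-modified ddGTP as the next base in Example 4.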
(2) Polymerase enzyme
Therminator (NEB), Taq (NEB), BST 2.0 (NEB), BST 3.0 (NEB), 9°Nm (NEB), KOD (Merck Millipore)
KOD polymerase mutants: KOD POL151, KOD POL157, KOD POL171, KOD POL174, KOD POL376, and KOD POL391
The above polymerases or mutants thereof are all obtained by purchase.
(3) Sequencing system
BGISEQ-500 sequencing system (operation method refer to BGISEQ-500 user manual)
(4) Test procedure
Fig. 4 illustrates the testing process.
The chip used for the test comprises two polymerization reaction areas: a region 1 and a region 2, wherein the reaction of the reference group is performed in the region 1, and the reaction of the test group is performed in the region 2. It is known that Cy3 modified dideoxy dTTP can be 100% polymerized by Taq DNA polymerase under the test conditions mentioned hereinafter, and thus serves as a reference group.
Cy3-modified dideoxy dTTP (triethylamine salt) was purchased from Jena Bioscience and has the following chemical structure:
Figure GDA0002274171490000521
The chip is loaded. First, the chip is photographed and background signal values are collected. Then, reaction solution 1, containing Taq DNA polymerase and Cy3-modified dideoxy dTTP, is added to region 1, and reaction solution 2, containing the polymerase to be tested and the reversibly blocked dTTP carrying the Cy3 fluorophore prepared in Example 1, is added to region 2. The polymerization reaction is carried out for 10 minutes at the temperature appropriate for each polymerase, with a dTTP concentration of 10 μM and the buffer system recommended in the product instructions of the polymerase as the buffer solution. After the reaction is finished, both regions of the chip are washed repeatedly with 1X phosphate buffer to remove unreacted dTTP; 1X phosphate buffer containing 10 mM vitamin C is then pumped in, and both regions of the chip are photographed to collect signal values.
Each test gave 4 values: the blank background values of the reference group and the test group (a and b, respectively), and the signal values of the reference group and the test group after polymerization (A and B, respectively). The polymerization efficiency of the reference group is known to be 100%, so the polymerization efficiency of the test group can be calculated by the following equation:
Polymerization efficiency of the test group = (B - b)/(A - a) × 100%.
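The calculation can be sketched as a small helper that subtracts each group's own background before taking the ratio of test signal to reference signal. The function and parameter names are illustrative, and the numbers in the usage line are hypothetical values chosen only to show the arithmetic; they are not measurements from Table 1.

```python
def polymerization_efficiency(bg_ref: float, bg_test: float,
                              sig_ref: float, sig_test: float) -> float:
    """Background-corrected test signal as a percentage of the reference signal.

    bg_ref / bg_test:   blank background values of reference and test regions
    sig_ref / sig_test: signal values of the two regions after polymerization
    The reference region is assumed to be 100% polymerized.
    """
    corrected_ref = sig_ref - bg_ref
    if corrected_ref <= 0:
        raise ValueError("reference signal must exceed its background")
    return (sig_test - bg_test) / corrected_ref * 100


# Hypothetical readings, for illustration only.
print(round(polymerization_efficiency(400, 420, 6000, 5000), 2))
```

Correcting each region with its own background matters because, as noted for Table 1, background values differ slightly between batches and between regions.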
The polymerization was performed using different polymerases and the test results are shown in table 1 (slight differences between background values for different batches of tests and between signals of the reference group).
TABLE 1
Figure GDA0002274171490000531
The test results showed that the polymerases in Table 1 were able to polymerize the dTTP prepared in example 1, but the polymerases other than KOD had low polymerization efficiency. The polymerization efficiency using KOD as a polymerase was acceptable.
Further, the test was performed using KOD polymerase mutants KOD POL151, KOD POL157, KOD POL171, KOD POL174, KOD POL376, and KOD POL391 as polymerases. These mutants have poor polymerization effects when polymerizing certain reversibly blocked dNTPs in the prior art.
Taq DNA polymerase and Cy3-modified dideoxy dTTP were used as the reference group, and the polymerization reaction was carried out for 10 minutes using the template of SEQ ID NO:1 and the primer of SEQ ID NO:2, following the same experimental procedure as above. The test results are shown in Table 2.
TABLE 2
Figure GDA0002274171490000541
As can be seen from Table 2, the polymerases KOD POL391 and KOD POL171 have higher polymerization efficiency with respect to dTTP prepared in example 1.
Further, the inventors optimized the reaction conditions (including pH of the reaction solution, salt concentration, buffer system concentration, additives, and reaction temperature) for polymerization using KOD POL391.
The initial reaction conditions were: the reaction solution had a pH of 9.0 and contained 50mM sodium chloride, 1mM magnesium sulfate, 50mM Tris, 0.05% Tween-20 at a reaction temperature of 55 ℃. In optimizing the reaction conditions, the reference set was not used, but comparative tests were carried out on two reaction areas of the chip, the polymerization time being 5 minutes. The better conditions that were optimized first are used in subsequent tests. The test results are shown in tables 3-1 to 3-6.
TABLE 3-1
pH | 9.0 | 8.5 | 8.0 | 7.8 | 7.5 | 7.0
Polymerization efficiency (%) | 55.12 | 76.65 | 89.90 | 93.76 | 80.21 | 37.38
Other conditions are as follows: the reaction solution contained 50mM sodium chloride, 1mM magnesium sulfate, 50mM Tris, 0.05% Tween-20 at a reaction temperature of 55 ℃.
TABLE 3-2
Sodium chloride concentration | 20 mM | 100 mM
Polymerization efficiency (%) | 96.33 | 92.50
Other conditions are as follows: the reaction solution was pH7.8, and contained 1mM magnesium sulfate, 50mM Tris, 0.05% Tween-20, and the reaction temperature was 55 ℃.
TABLE 3-3
Magnesium sulfate concentration | 10 mM | 3 mM
Polymerization efficiency (%) | 93.94 | 97.40
Other conditions are as follows: the reaction solution, pH7.8, contained 20mM sodium chloride, 50mM Tris, 0.05% Tween-20, and the reaction temperature was 55 ℃.
TABLE 3-4
Tris concentration | 100 mM | 20 mM
Polymerization efficiency (%) | 94.67 | 93.28
Other conditions are as follows: the reaction solution was pH7.8, and contained 20mM sodium chloride, 3mM magnesium sulfate, and 0.05% Tween-20, and the reaction temperature was 55 ℃.
TABLE 3-5
Figure GDA0002274171490000551
Other conditions are as follows: the reaction solution was pH7.8, and contained 20mM sodium chloride, 3mM magnesium sulfate, 50mM Tris, 0.05% Tween-20, and the reaction temperature was 55 ℃.
TABLE 3-6
Reaction temperature | 57℃ | 60℃ | 63℃
Polymerization efficiency (%) | 99.02 | 98.43 | 93.77
Other conditions are as follows: the reaction solution, pH7.8, contained 20mM sodium chloride, 3mM magnesium sulfate, 50mM Tris, 0.05% Tween-20, 5% DMSO.
As can be seen from Tables 3-1 to 3-6, changing the pH to 7.8 and adding 5% DMSO had the greatest effect on the polymerization efficiency, while changes to the other conditions gave only small improvements. The finally optimized reaction conditions were: the reaction solution had a pH of 7.8 and contained 20 mM sodium chloride, 3 mM magnesium sulfate, 50 mM Tris, 5% DMSO and 0.05% Tween-20, and the reaction temperature was 57℃.
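The one-factor-at-a-time selection described above can be retraced from the tabulated values. The sketch below is an illustration under two stated assumptions: first, that the best efficiency of one step can serve as the baseline for the unchanged condition in the next step (the source notes that values differ slightly between batches, so this is an approximation); second, that the additive step (Table 3-5, available only as a figure) is skipped, so the temperature step is evaluated from Table 3-6 alone. The function name `optimize` and the data layout are hypothetical.

```python
# (factor, value carried over from the previous conditions or None,
#  {tested value: polymerization efficiency in %})
STEPS = [
    ("pH", None,
     {9.0: 55.12, 8.5: 76.65, 8.0: 89.90, 7.8: 93.76, 7.5: 80.21, 7.0: 37.38}),
    ("NaCl (mM)", 50, {20: 96.33, 100: 92.50}),
    ("MgSO4 (mM)", 1, {10: 93.94, 3: 97.40}),
    ("Tris (mM)", 50, {100: 94.67, 20: 93.28}),
    # Baseline at 55 C with DMSO is only in Table 3-5 (figure), so none is used.
    ("Temperature (C)", None, {57: 99.02, 60: 98.43, 63: 93.77}),
]


def optimize(steps):
    """Pick the best value per factor, one factor at a time."""
    chosen = {}
    best_so_far = None
    for factor, carried_value, results in steps:
        candidates = dict(results)
        if carried_value is not None and best_so_far is not None:
            # Assumption: the unchanged condition performs like the previous best.
            candidates[carried_value] = best_so_far
        best_value = max(candidates, key=candidates.get)
        chosen[factor] = best_value
        best_so_far = candidates[best_value]
    return chosen


print(optimize(STEPS))
```

With these inputs the selection keeps 50 mM Tris, because neither tested alternative exceeds the efficiency already reached at 3 mM magnesium sulfate, and it arrives at the same pH, salt and temperature values as the finally optimized conditions.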
Under the finally optimized conditions, the polymerization efficiency of the polymerases KOD POL391 and KOD POL171 with respect to dTTP obtained in example 1 was measured, and the polymerization reaction was carried out for 10 minutes. A reaction in which Cy 3-modified dideoxy dTTP was polymerized with Taq DNA polymerase was used as a reference group. The results are shown in Table 4.
TABLE 4
Figure GDA0002274171490000561
As shown in Table 4, the Cy3-modified reversibly blocked dTTP prepared in Example 1 was efficiently polymerized by KOD POL391 or KOD POL171, and the polymerization efficiency was close to 100%.
Example 4 removal of blocking groups
In this example, a reversibly blocked dTTP without a fluorophore was first polymerized. The reversibly blocked dTTP without a fluorophore used in this example was compound IV-1 of example 1, having the following structure. Under optimized conditions, it can be 100% polymerized.
Figure GDA0002274171490000562
After polymerization, the blocking group was cleaved using endonuclease IV (NEB), and the efficiency of cleavage was tested by adding the next base after cleavage.
Fig. 5 illustrates the testing process.
First, a chip having two reaction regions (region 1 and region 2) is loaded, wherein the reaction of the reference group is performed in region 1 and the reaction of the test group is performed in region 2. Two different primers are added to the two regions respectively. The primer of the reference group, shown in SEQ ID NO:3, has one more T base at its 3' end than the primer of the test group (shown in SEQ ID NO:2). Thus, after one T base is added to the primer of the test group and its blocking group is excised, the reference group and the test group have the same sequence, and the excision efficiency of the test group can be determined by comparing the signals of the next base.
In both regions 1 and 2, compound IV-1 was polymerized for 10 minutes using the polymerase KOD POL391 under the finally optimized reaction conditions obtained in Example 3, with the concentration of compound IV-1 being 10 μM. Unreacted reagents were washed away, and the excision reaction was carried out in NEB buffer 3 at 37℃ for 10 min, 20 min, 40 min or 60 min using endonuclease IV (100 U/mL) as the excision reagent. The chip was washed, and the signals of the chip were then collected to obtain the background values a and b of the reference group and the test group. Taq DNA polymerase and Cy3-modified ddGTP were added and the polymerization reaction was carried out for 10 minutes; the unreacted ddGTP was then washed off and the chip was photographed to collect the signal values A and B of the reference group and the test group. The excision efficiency can be calculated using the following equation:
Excision efficiency = (B - b)/(A - a) × 100%.
Table 5 and FIG. 6 show the excision efficiency at different excision times.
TABLE 5
Figure GDA0002274171490000571
As can be seen from Table 5 and FIG. 6, the excision efficiency reached a plateau after 20 minutes, and the final excision efficiency reached 100%.
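The calculation used in this example can be sketched in a few lines of Python. The function name and the input numbers below are hypothetical and serve only to illustrate the background-corrected ratio of the test-group signal to the reference-group signal; they are not values from Table 5.

```python
def excision_efficiency(A, a, B, b):
    """Percent excision efficiency of the test group relative to the reference.

    A, a: signal and background of the reference group (region 1).
    B, b: signal and background of the test group (region 2).
    """
    reference_net = A - a
    if reference_net <= 0:
        raise ValueError("reference signal must exceed its background")
    return 100.0 * (B - b) / reference_net


# Hypothetical intensities, NOT measurements from Table 5.
example = excision_efficiency(A=5000, a=400, B=4540, b=400)
print(f"{example:.0f}%")
```

Normalizing to the reference group means that the result reflects only the fraction of test-group strands whose blocking group was removed, since both groups undergo the same ddGTP incorporation step.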
EXAMPLE 5 removal of fluorescent groups
In this example, the phosphodiester bond (2) in the modified nucleotide of the present invention was cleaved to remove the fluorescent group, and the efficiency of cleavage was tested.
Fig. 7 exemplarily illustrates the testing process.
The Cy3-modified reversibly blocked dTTP prepared in example 1 was polymerized on a chip using KOD POL391 as the polymerase; the reaction conditions were the final optimized reaction conditions of example 1, and the concentration of dTTP was 10 μM. Signal values were collected before and after polymerization: the signal value before polymerization was the background a1 and the signal value after polymerization was a. Excision was then performed using alkaline phosphatase (NEB, cat. no. M0371S) for 2, 5, 10 or 20 minutes; the reaction reagents were washed away after excision and the signal value a2 was collected. The excision efficiency was calculated according to the following equation:
excision efficiency = (a − a2)/(a − a1) × 100%.
The test results are shown in table 6 and fig. 8.
TABLE 6
Excision time (min)  a1  a  a2  Excision efficiency
2 434 6012 1382 83%
5 333 4567 672 92%
10 282 4718 370 98%
20 426 5840 412 100%
As can be seen from Table 6 and FIG. 8, the fluorophores on the nucleic acid strand were completely removed after 20 min of excision.
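The efficiencies reported in Table 6 can be recomputed from the tabulated signal values. The short Python sketch below does this; the function and variable names are illustrative and not part of the patent, and rounding to the nearest whole percent is an assumption about how the reported figures were obtained.

```python
# Rows of Table 6: (excision time in min, a1, a, a2, reported efficiency in %)
TABLE_6 = [
    (2, 434, 6012, 1382, 83),
    (5, 333, 4567, 672, 92),
    (10, 282, 4718, 370, 98),
    (20, 426, 5840, 412, 100),
]


def removal_efficiency(a1, a, a2):
    """(a - a2) / (a - a1) x 100%, as defined in Example 5."""
    return 100.0 * (a - a2) / (a - a1)


# Assumption: reported values are rounded to the nearest whole percent.
recomputed = [round(removal_efficiency(a1, a, a2)) for _, a1, a, a2, _ in TABLE_6]

for (minutes, *_, reported), value in zip(TABLE_6, recomputed):
    print(f"{minutes:>2} min: recomputed {value}%, reported {reported}%")
```

For the 20-minute row the unrounded ratio is slightly above 100%, because the post-excision signal a2 (412) is lower than the pre-polymerization background a1 (426) in that run.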
Example 6 Stability test of the azidomethylene blocking group and the methyl phosphate blocking group
Test compounds:
(1) the dTTP prepared in example 1, carrying a Cy3 fluorophore and with the 3'-OH blocked by methyl phosphate;
(2) a dTTP (structure below, purchased from mychem lab) carrying a Cy3 fluorophore and with the 3'-OH blocked by azidomethylene.
Figure GDA0002274171490000581
Testing process:
1 M Tris buffer (pH 8.0, 0.3 mL) and human serum albumin were added to 1 mg of each of the two compounds, and the mixtures were stirred at 65 °C for 36 hours. Both samples were then analyzed by liquid chromatography-mass spectrometry (LC-MS), using 1:1 acetonitrile : aqueous triethylammonium acetate (0.05 M, pH 7) as the mobile phase, to detect whether either sample showed mass-spectral peaks of molecules that had lost the 3'-terminal protecting group or had lost the fluorescent modifying group from the base.
The LC result for the methyl phosphate-blocked dTTP is shown in FIG. 9. One very small new peak appears in the LC trace; mass spectrometry identified it as having a molecular weight of 780.1 (M-H)⁻, with the following structure.
Figure GDA0002274171490000591
The LC result for the azidomethylene-blocked dTTP is shown in FIG. 10. The LC trace has two larger new peaks and one smaller new peak. Their structures and molecular weights, as identified by mass spectrometry, are as follows (in order from left to right in the figure):
Figure GDA0002274171490000592
622.0 (M-H)⁻,
Figure GDA0002274171490000593
677.4 (M-H)⁻, and
the nucleotide carrying the fluorescent label on the base and lacking the protecting group at the 3' end, 1482.3 (M-H)⁻.
While specific embodiments of the invention have been described in detail, those skilled in the art will understand that: various modifications and changes in detail can be made in light of the overall teachings of the disclosure, and such changes are intended to be within the scope of the present invention. The full scope of the invention is given by the appended claims and any equivalents thereof.
Sequence listing
<110> Shenzhen Huada Zhizao Technology Co., Ltd.
<120> modified nucleoside or nucleotide
<130> IDC190377
<160> 3
<170> PatentIn version 3.5
<210> 1
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> template
<400> 1
caacagaagg attctggcga accggaggct gaa 33
<210> 2
<211> 31
<212> DNA
<213> Artificial Sequence
<220>
<223> primer
<400> 2
tgtcttccta agaccgcttg gcctccgact t 31
<210> 3
<211> 32
<212> DNA
<213> Artificial Sequence
<220>
<223> primer
<400> 3
ttgtcttcct aagaccgctt ggcctccgac tt 32

Claims (79)

1. A compound having a structure represented by the general formula (I'),
Figure FDA0003637949990000011
wherein R1 and R3 are each independently selected from
Figure FDA0003637949990000012
Figure FDA0003637949990000013
A is selected from phenyl, naphthyl, indolyl and pyridyl;
R2 is selected from nitro, halo-C1-4 alkyl, halogen, hydrogen, an aldehyde group,
Figure FDA0003637949990000014
wherein Q is independently selected from C1-4 alkyl;
R4 is a triphosphate group (-PO3H-PO3H-PO3H2);
R5 is selected from C1-4 alkyl;
R6 is hydrogen or hydroxy;
m and n are each independently selected from 0, 1, 2, 3, 4 and 5;
L is a linking group or is absent;
Base represents a base;
Label represents a fluorophore.
2. The compound of claim 1, wherein Base represents a purine Base or a pyrimidine Base.
3. The compound of claim 1, wherein Base is selected from one of A, T, U, C and G.
4. The compound of claim 1, wherein R1 is selected from
Figure FDA0003637949990000015
5. The compound of claim 1, wherein R1 is
Figure FDA0003637949990000021
6. The compound of claim 1, wherein R2 is selected from nitro, trifluoromethyl, fluorine, chlorine, hydrogen and an aldehyde group.
7. The compound of claim 1, wherein R2 is nitro.
8. The compound of claim 1, wherein R3 is selected from
Figure FDA0003637949990000022
9. The compound of claim 1, wherein R3 is
Figure FDA0003637949990000023
10. The compound of claim 1, wherein,
Figure FDA0003637949990000024
selected from:
Figure FDA0003637949990000025
11. The compound of claim 1, wherein
Figure FDA0003637949990000026
is
Figure FDA0003637949990000027
12. The compound of claim 1, wherein R6 is hydrogen.
13. The compound of claim 1, wherein Label is selected from Cy3, Cy3.5, Cy5, and Cy5.5.
14. The compound of claim 1, wherein m is 1.
15. The compound of claim 1, wherein n is 1.
16. The compound of claim 1, wherein L is
Figure FDA0003637949990000031
17. The compound of claim 1, wherein R5 is methyl or ethyl.
18. The compound of claim 1, having the structure of formula (II)
Figure FDA0003637949990000032
19. The compound of claim 18, wherein R2 is nitro.
20. The compound of claim 18, wherein Label is Cy3.
21. The compound of any one of claims 1-20, selected from the group consisting of:
Figure FDA0003637949990000033
Figure FDA0003637949990000041
22. A method of preparing a growing polynucleotide complementary to a target single-stranded polynucleotide in a sequencing reaction, comprising incorporating a compound of any one of claims 1-21 into the growing complementary polynucleotide, wherein the incorporation of the compound prevents the introduction of any subsequent nucleotides into the growing complementary polynucleotide.
23. The method of claim 22, wherein the incorporation of the compound is achieved by a terminal transferase, a terminal polymerase, or a reverse transcriptase.
24. The method of claim 22, comprising: using a polymerase, the compound is incorporated into the growing complementary polynucleotide.
25. The method of claim 22, comprising: performing a nucleotide polymerization reaction using a polymerase under conditions that allow the polymerase to perform the nucleotide polymerization reaction, thereby incorporating the compound into the 3' end of the growing complementary polynucleotide.
26. A method of determining the sequence of a target single-stranded polynucleotide, comprising: monitoring the sequential incorporation of complementary nucleotides, wherein at least one of the complementary nucleotides incorporated is a compound according to any one of claims 1 to 21, and detecting the fluorophore carried by said compound.
27. The method of claim 26, wherein, prior to the introduction of the next complementary nucleotide, the blocking group in the compound
Figure FDA0003637949990000051
and the fluorophore are removed.
28. The method of claim 27, wherein the blocking group and the fluorescent group are removed simultaneously.
29. The method of claim 27, wherein said blocking group and said fluorescent group are removed sequentially.
30. The method of claim 27, wherein the blocking group is removed before or after the fluorescent group is removed.
31. The method of claim 26, comprising the steps of:
(a) providing a mixture comprising a duplex, a compound of any one of claims 1-21, a polymerase, and a cleavage reagent; the duplex comprises a growing nucleic acid strand and a nucleic acid molecule to be sequenced;
(b) carrying out a reaction cycle comprising the following steps (i), (ii) and (iii):
step (i): incorporating the compound into a growing nucleic acid strand using a polymerase to form a nucleic acid intermediate comprising the blocking group
Figure FDA0003637949990000061
and a fluorophore;
step (ii): detecting a fluorophore on the nucleic acid intermediate;
step (iii): the blocking group on the nucleic acid intermediate is removed using a cleavage reagent.
32. The method of claim 31, wherein the reaction cycle further comprises step (iv): removing the fluorescent group on the nucleic acid intermediate using a cleavage reagent.
33. The method of claim 32, wherein the cleavage reagent used in step (iii) and step (iv) is the same reagent.
34. The method of claim 32, wherein the cleavage reagents used in step (iii) and step (iv) are different reagents.
35. The method of claim 26, comprising the steps of:
(1) providing a duplex comprising a growing nucleic acid strand and a nucleic acid molecule to be sequenced, said duplex being attached to a support;
(2) adding a polymerase for performing a nucleotide polymerization reaction, and first, second, third and fourth compounds, thereby forming a reaction system comprising a solution phase and a solid phase; wherein the four compounds are derivatives of nucleotides A, (T/U), C and G, respectively, each has the structure shown in general formula (I'), and each is capable of complementary base pairing;
(3) performing a nucleotide polymerization reaction using a polymerase under conditions that allow the polymerase to perform the nucleotide polymerization reaction, thereby incorporating one of the four compounds into the 3' end of the growing nucleic acid strand;
(4) removing the solution phase of the reaction system of the previous step, retaining the duplexes attached to the support, and detecting the signal emitted by the fluorophores on said duplexes or said growing nucleic acid strands;
(5) adding a cleaving agent to contact the duplex or the growing nucleic acid strand with the cleaving agent in a reaction system comprising a solution phase and a solid phase; wherein the cleaving agent is capable of cleaving phosphodiester bonds (1) and/or phosphodiester bonds (2) in a compound incorporated at the 3' end of a growing nucleic acid strand and does not affect the phosphodiester bonds on the duplex backbone;
(6) the solution phase of the reaction system of the previous step is removed.
36. The method of claim 35, further comprising the steps of:
(7) repeating the steps (2) - (6) or the steps (2) - (4) one or more times.
37. The method of claim 35, wherein after any step comprising a removal operation, a washing operation is performed.
38. The method of claim 35, wherein the duplex in step (1) is obtained by a method comprising the steps of:
providing a primer that anneals to a nucleic acid molecule to be sequenced, said primer acting as an initial growing nucleic acid strand that, together with said nucleic acid molecule to be sequenced, forms a duplex attached to a support.
39. The method of claim 35, wherein the polymerase in step (3) is selected from KOD polymerase or a mutant thereof.
40. The method of claim 35, wherein the polymerase in step (3) is selected from the group consisting of KOD POL151, KOD POL157, KOD POL171, KOD POL174, KOD POL376, and KOD POL391.
41. The method of claim 35, wherein the polymerase in step (3) is selected from the group consisting of KOD POL391 and KOD POL171.
42. The method of claim 35, wherein the solution phase in step (2) comprises monovalent salt ions and/or divalent salt ions.
43. The method of claim 42, wherein said monovalent salt ions are selected from the group consisting of sodium ions and chloride ions.
44. The method of claim 42, wherein said divalent salt ions are selected from the group consisting of magnesium ions and sulfate ions.
45. The method of claim 42, wherein the concentration of said monovalent or divalent salt ions in said solution phase is between 1 and 200 mM.
46. The method of claim 35, wherein the solution phase in step (2) comprises a buffer solution.
47. The method of claim 46, wherein the buffer solution is a Tris buffer solution.
48. The method of claim 47, wherein the concentration of Tris in the solution phase is between 10 mM and 200 mM.
49. The method of claim 35, wherein the solution phase in step (2) comprises an organic solvent.
50. The method of claim 49, wherein the organic solvent is DMSO or glycerol.
51. The method of claim 49, wherein the organic solvent is present in the solution phase at 0.01% to 10% by mass.
52. The method of claim 35, wherein the pH of the solution phase in step (2) is from 7.0 to 9.0.
53. The method of claim 35, wherein the polymerization in step (3) is carried out at 50-65 ℃.
54. The method of claim 35, further comprising, after step (4), determining the base type at the corresponding position in the nucleic acid molecule to be sequenced based on the principle of base-complementary pairing based on the type of compound incorporated into the 3' end of the growing nucleic acid strand in step (3).
55. The method of claim 35, wherein in step (5) the cleavage of the blocking group and the cleavage of the fluorophore are performed simultaneously, or the cleavage of the blocking group and the cleavage of the fluorophore are performed in steps.
56. The method of claim 35, wherein said step (5) comprises:
step (5-1): adding a first cleaving reagent to contact the duplex or the growing nucleic acid strand with the first cleaving reagent in a reaction system comprising a solution phase and a solid phase to cleave the phosphodiester bond (1) in the compound incorporated into the 3' end of the growing nucleic acid strand without affecting the phosphodiester bond on the backbone of the duplex;
step (5-2): adding a second cleaving agent, contacting said duplex or said growing nucleic acid strand with said second cleaving agent in a reaction system comprising a solution phase and a solid phase, and cleaving phosphodiester bonds (2) in the compound incorporated into the 3' end of the growing nucleic acid strand without affecting the phosphodiester bonds on the duplex backbone.
57. The method of claim 56, wherein said first cleaving reagent is endonuclease IV.
58. The method of claim 56, wherein said second cleaving agent is alkaline phosphatase.
59. A kit comprising first, second, third and fourth compounds, each of which is a compound of any one of claims 1-21, said four compounds being derivatives of nucleotides A, (T/U), C and G, respectively, and being capable of complementary base pairing.
60. The kit of claim 59, wherein the Labels in the structures of the four compounds are different fluorophores.
61. The kit of claim 59, comprising a polymerase for performing a nucleotide polymerization reaction.
62. The kit of claim 59, comprising KOD polymerase or a mutant thereof.
63. The kit of claim 62, wherein said mutant is selected from the group consisting of one or more of KOD POL151, KOD POL157, KOD POL171, KOD POL174, KOD POL376, and KOD POL391.
64. The kit of claim 59, comprising KOD POL391, KOD POL171, or a combination thereof.
65. The kit of claim 59, further comprising a cleavage reagent that is capable of cleaving the phosphodiester bond (1) and/or phosphodiester bond (2) in formula (I') and does not affect the phosphodiester bond on the duplex backbone.
66. The kit of claim 65, wherein said cleavage reagent is selected from the group consisting of endonuclease IV and alkaline phosphatase.
67. The kit of claim 59, further comprising: reagents and/or devices for extracting nucleic acid molecules from a sample; reagents for pretreating nucleic acid molecules; a support for attaching nucleic acid molecules to be sequenced; reagents for attaching a nucleic acid molecule to be sequenced to a support; a primer for initiating a nucleotide polymerization reaction; a polymerase for performing a nucleotide polymerization reaction; one or more buffer solutions; one or more wash solutions; or any combination thereof.
68. The kit of claim 67, wherein said attachment is covalent or non-covalent.
69. The kit of claim 59, comprising a buffer solution.
70. The kit of claim 69, wherein the buffer solution comprises monovalent salt ions and/or divalent salt ions.
71. The kit of claim 70, wherein said monovalent salt ions are selected from the group consisting of sodium ions and chloride ions.
72. The kit of claim 70, wherein said divalent salt ions are selected from the group consisting of magnesium ions and sulfate ions.
73. The kit of claim 70, wherein the concentration of said monovalent salt ion or divalent salt ion in said buffer solution is between 1 and 200 mM.
74. The kit of claim 69, wherein the buffer solution comprises Tris.
75. The kit of claim 74, wherein the concentration of Tris in the buffer solution is between 10 mM and 200 mM.
76. The kit of claim 69, wherein the buffer solution has a pH of 7.0 to 9.0.
77. The kit of claim 69, wherein the buffer solution comprises an organic solvent.
78. The kit of claim 77, wherein the organic solvent is DMSO or glycerol.
79. Use of a compound according to any one of claims 1 to 21 or a kit according to any one of claims 59 to 78 for determining the sequence of a target single-stranded polynucleotide.
CN201780090915.8A 2017-10-11 2017-10-11 Modified nucleosides or nucleotides Active CN110650968B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/105734 WO2019071474A1 (en) 2017-10-11 2017-10-11 Modified nucleoside or nucleotide

Publications (2)

Publication Number Publication Date
CN110650968A CN110650968A (en) 2020-01-03
CN110650968B true CN110650968B (en) 2022-07-05

Family

ID=66100193

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201780090915.8A Active CN110650968B (en) 2017-10-11 2017-10-11 Modified nucleosides or nucleotides

Country Status (2)

Country Link
CN (1) CN110650968B (en)
WO (1) WO2019071474A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220298201A1 (en) * 2018-02-21 2022-09-22 Singular Genomics Systems, Inc. Fluorinated nucleotides and uses thereof
BR112021022809A2 (en) * 2019-05-15 2022-01-25 Egi Tech Shen Zhen Co Ltd Autoluminescence-based single-channel sequencing method
CN114040983A (en) * 2019-06-26 2022-02-11 南京金斯瑞生物科技有限公司 Oligonucleotide containing blocker
CA3199278A1 (en) * 2020-10-21 2022-04-28 Bgi Shenzhen Modified nucleoside or nucleotide
CN115197291A (en) * 2021-04-01 2022-10-18 深圳华大生命科学研究院 Nucleotide analogs for sequencing
WO2023280156A1 (en) * 2021-07-08 2023-01-12 深圳华大生命科学研究院 Modified nucleoside or nucleotide

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102030792A (en) * 2009-09-29 2011-04-27 韩国科学技术研究院 3'-o-fluorescently modified nucleotides and uses thereof
WO2012083249A2 (en) * 2010-12-17 2012-06-21 The Trustees Of Columbia University In The City Of New York Dna sequencing by synthesis using modified nucleotides and nanopore detection
WO2016020691A1 (en) * 2014-08-08 2016-02-11 Illumina Cambridge Limited Modified nucleotide linkers

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10378051B2 (en) * 2011-09-29 2019-08-13 Illumina Cambridge Limited Continuous extension and deblocking in reactions for nucleic acids synthesis and sequencing

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115820380A (en) * 2023-01-04 2023-03-21 深圳赛陆医疗科技有限公司 Microfluidic chip and preparation method and application thereof
CN115820380B (en) * 2023-01-04 2024-01-30 深圳赛陆医疗科技有限公司 Microfluidic chip and preparation method and application thereof

Also Published As

Publication number Publication date
WO2019071474A1 (en) 2019-04-18
CN110650968A (en) 2020-01-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 518083 the comprehensive building of Beishan industrial zone and 11 2 buildings in Yantian District, Shenzhen, Guangdong.

Applicant after: Shenzhen Huada Zhizao Technology Co.,Ltd.

Address before: 518083 the comprehensive building of Beishan industrial zone and 11 2 buildings in Yantian District, Shenzhen, Guangdong.

Applicant before: MGI TECH Co.,Ltd.

GR01 Patent grant