WO2020093261A1 - Method for sequencing polynucleotides - Google Patents

Method for sequencing polynucleotides

Info

Publication number
WO2020093261A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleotide
marker
labeled
label
luminescent
Prior art date
Application number
PCT/CN2018/114281
Other languages
English (en)
French (fr)
Inventor
赵杰
廖莎
章文蔚
陈奥
徐崇钧
傅德丰
Original Assignee
深圳华大智造极创科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大智造极创科技有限公司
Priority to US17/292,400 (US20220010370A1)
Priority to JP2021518956A (JP7332235B2)
Priority to KR1020217017101A (KR20210088637A)
Priority to SG11202104099VA (SG11202104099VA)
Priority to CA3118607A (CA3118607A1)
Priority to AU2018448937A (AU2018448937A1)
Priority to CN201880098581.3A (CN112840035B)
Priority to EP18939391.1A (EP3878968A4)
Priority to PCT/CN2018/114281 (WO2020093261A1)
Publication of WO2020093261A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2563/00: Nucleic acid detection characterized by the use of physical, structural and functional properties
    • C12Q2563/103: Nucleic acid detection characterized by the use of physical, structural and functional properties: luminescence

Definitions

  • the invention relates to a method for sequencing polynucleotides, in which the sequential incorporation of different nucleotides is detected by the same luminescence signal, thereby realizing the determination of the polynucleotide sequence.
  • Sanger sequencing method has the characteristics of simple experimental operation, intuitive and accurate results, and short experimental period. It has a wide range of applications in clinical gene mutation detection and genotyping that require high timeliness of test results.
  • the low throughput and high cost of the Sanger sequencing method limit its application in large-scale gene sequencing.
  • the second generation sequencing technology came into being. Compared with the first-generation sequencing technology, the second-generation sequencing technology has the advantages of high throughput, low cost, and high degree of automation, and is suitable for large-scale sequencing.
  • the second-generation sequencing technologies developed so far mainly involve sequencing by ligation (SBL) and sequencing by synthesis (SBS).
  • Typical examples of these sequencing technologies include the Roche 454 sequencing method, the SOLiD sequencing method developed by Applied Biosystems, the combinatorial probe-anchor ligation (cPAL) method independently developed by Complete Genomics, the combinatorial probe-anchor synthesis (cPAS) method developed by BGI, and the Illumina sequencing method jointly developed by Illumina and Solexa Technology.
  • Sequencing detection methods mainly include electrochemical methods and optical signal detection methods, among which the more mainstream detection method is optical signal detection.
  • In order to identify and distinguish the 4 bases (A, T/U, C and G), it is necessary to use 4 fluorescent dyes to label the 4 bases respectively.
  • At present, there are also reports on the use of 2 fluorescent dyes to label 4 bases, with the 4 bases identified and distinguished through different combinations of the 2 fluorescent dyes.
  • the Roche 454 sequencing method uses the principle of bioluminescence: the pyrophosphate released when a dNTP is incorporated opposite the sequence to be tested is converted into ATP, and the generated ATP and luciferase oxidize luciferin to emit light. The presence and intensity of the light signal are detected to distinguish the 4 kinds of bases and to determine the number of bases incorporated. Because of their hardware requirements, second-generation sequencing instruments are generally relatively large, which makes them inconvenient to carry and handle.
  • sequencing technology has since developed to a third generation, which overcomes the large instrument size of second-generation sequencing technology.
  • the Oxford Nanopore sequencer is greatly reduced in size because it uses a different sequencing principle, and it can even be carried into space for sequencing experiments.
  • however, current third-generation sequencing technology has a high error rate, which limits its large-scale adoption.
  • the NextSeq and MiniSeq sequencing systems developed by Illumina and the BGISEQ-50 sequencing system of BGI use 2 fluorescent dyes to label 4 bases, and identify and distinguish the 4 bases through different combinations of the 2 fluorescent dyes. For example, the four bases are distinguished by labeling base A with the first fluorescent dye, base G with the second fluorescent dye, and base C with both the first and second fluorescent dyes, while leaving base T unlabeled. See, for example, U.S. Patent No. 9453258 B2.
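The two-dye labeling scheme described above amounts to a four-entry lookup table. The following is a minimal sketch of that decoding, not any vendor's base caller: the function names, the normalized intensity scale and the `threshold` value are illustrative assumptions.

```python
# Two-dye code as described in the text: A carries dye 1, G carries dye 2,
# C carries both dyes, and T is unlabeled.
TWO_DYE_CODE = {
    (True, False): "A",
    (False, True): "G",
    (True, True): "C",
    (False, False): "T",
}


def call_base(dye1_signal, dye2_signal, threshold=0.5):
    """Call one base from two normalized dye intensities.

    The threshold is a hypothetical cutoff separating "dye present"
    from "dye absent"; real instruments calibrate this per run.
    """
    return TWO_DYE_CODE[(dye1_signal > threshold, dye2_signal > threshold)]


def call_read(signals):
    """Call a read from a list of (dye1, dye2) intensity pairs, one per cycle."""
    return "".join(call_base(s1, s2) for s1, s2 in signals)
```

For example, `call_read([(0.9, 0.1), (0.1, 0.8), (0.9, 0.9), (0.0, 0.1)])` decodes four cycles into one base each.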
  • a deoxyribonucleotide (dNTP) is introduced in sequence; if the dNTP can pair with the sequence to be tested, pyrophosphate is released when the dNTP is incorporated, and the pyrophosphate is converted into ATP by the ATP sulfurylase in the sequencing reaction system.
  • the generated ATP and the luciferase in the system jointly oxidize luciferin to emit light.
  • the fluorescent signal is captured by the detector and converted into sequencing results by computer analysis. See, for example, Martin Kircher and Janet Kelso. High-throughput DNA sequencing-concepts and limits. Bioessays, 2010, 32: 524-536.
  • the Ion Torrent sequencing system is similar to the Roche 454 sequencing method: a deoxyribonucleotide (dNTP) is introduced in sequence, and if the dNTP can pair with the sequence to be tested, hydrogen ions are released when the dNTP is incorporated, and the released hydrogen ions change the pH.
  • the electrical components integrated on the sequencing chip convert the change in pH into an electrical signal and transmit it to a computer, where it is converted into a sequencing result by analysis. See, for example, Sara Goodwin, John D. McPherson and W. Richard McCombie. Coming of age: ten years of next-generation sequencing technologies. Nature Reviews Genetics, 2016, 17: 333-351.
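Both the Roche 454 and Ion Torrent methods described above introduce one dNTP type at a time and read a signal whose strength reflects how many bases were incorporated in that flow. The following is a minimal sketch of decoding such a flow series. It assumes the signals are already normalized so that 1.0 corresponds to one incorporated base; the function name and flow order are illustrative, not taken from either platform.

```python
from typing import Sequence


def decode_flowgram(flow_order: str, signals: Sequence[float]) -> str:
    """Turn per-flow signal strengths into the synthesized sequence.

    flow_order: the repeating order in which dNTPs are introduced, e.g. "TACG".
    signals:    one normalized signal per flow (assumption: 1.0 == one base).
    """
    synthesized = []
    for i, signal in enumerate(signals):
        base = flow_order[i % len(flow_order)]
        # Rounding to the nearest integer estimates the homopolymer length;
        # a signal near 0 means the dNTP did not pair and nothing was added.
        count = max(0, round(signal))
        synthesized.append(base * count)
    return "".join(synthesized)
```

The result is the strand synthesized against the template, so the template sequence itself is its reverse complement.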
  • the sequencing equipment is equipped with at least 2 monochromatic excitation light sources and 2 cameras, which makes the sequencing device expensive to manufacture and bulky.
  • the invention relates to a method for determining the sequence of a target polynucleotide, which includes:
  • the nucleotide is selected from one or more of the following: a first nucleotide, a second nucleotide, a third nucleotide, and a fourth nucleotide, wherein the first nucleotide comprises a first nucleotide labeled with a first marker and optionally an unlabeled first nucleotide; the second nucleotide comprises a second nucleotide labeled with a second marker and optionally an unlabeled second nucleotide; the third nucleotide is selected from: (1) a third nucleotide labeled with the first marker and a third nucleotide labeled with the second marker, or (2) a third nucleotide simultaneously labeled with the first marker and the second marker; and the fourth nucleotide comprises an unlabeled fourth nucleotide,
  • each of the nucleotides contains a protecting group attached via a 2' or 3' oxygen atom
  • step (d) detecting the presence of the first marker on the partial duplex in step (c),
  • step (e) detecting the presence of the second marker on the partial duplex in step (c),
  • step (f) optionally removing the protecting group and label on the nucleotide incorporated in the partial duplex of step (c),
  • the first marker is a luminescent marker.
  • step (d) includes contacting the partial duplex of step (c) with a ligand that is labeled with a luminescent label and specifically binds to the first label, and then detecting the presence of the luminescent marker on the partial duplex.
  • the ligand is removed together when the protecting group on the nucleotide incorporated in the partial duplex of step (c) and the label are removed.
  • step (e) includes contacting the partial duplex of step (c) with a ligand that is labeled with a luminescent label and specifically binds to the second label, and then detecting the presence of the luminescent marker on the partial duplex.
  • step (e) is performed after step (d).
  • the luminescent markers are the same luminescent markers.
  • the luminescent label is a fluorescent label, such as a fluorophore, for example selected from coumarin, AlexaFluor, Bodipy, fluorescein, tetramethylrhodamine, Cy5, Cy3, Texas Red and derivatives thereof.
  • the ratio of the first nucleotide labeled with the first marker to the unlabeled first nucleotide in the first nucleotide is 4:1 to 3:2.
  • the ratio of the second nucleotide labeled with the second marker to the unlabeled second nucleotide in the second nucleotide is 4:1 to 3:2.
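The labeling scheme in the method above can be read as a presence/absence code over the two markers: first marker only, second marker only, both, or neither. The following is a minimal sketch of that decoding. The second function is an assumption about how marker presence might be inferred from two images taken with the same luminescent signal: it supposes that the ligand bound in step (d) stays bound during step (e), so the second image is cumulative and an increase in intensity indicates the second marker. The names, the intensity scale and the `threshold` are hypothetical.

```python
# Presence/absence code implied by the labeling scheme in the text.
MARKER_CODE = {
    (True, False): "first nucleotide",
    (False, True): "second nucleotide",
    (True, True): "third nucleotide",
    (False, False): "fourth nucleotide",
}


def call_nucleotide(has_marker1, has_marker2):
    """Map detected marker presence to the incorporated nucleotide type."""
    return MARKER_CODE[(bool(has_marker1), bool(has_marker2))]


def call_from_images(intensity_d, intensity_e, threshold=0.3):
    """Call a nucleotide from the intensities seen in steps (d) and (e).

    Assumption (not stated in the source): the step (e) image is cumulative,
    so the second marker shows up as a gain over the step (d) intensity.
    """
    has_marker1 = intensity_d > threshold
    has_marker2 = (intensity_e - intensity_d) > threshold
    return call_nucleotide(has_marker1, has_marker2)
```

If each detection instead reported only its own marker, `call_nucleotide` could be applied to the two thresholded images directly.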
  • the present invention also relates to a kit for sequencing polynucleotides, which comprises: (a) one or more nucleotides selected from the group consisting of: a first nucleotide, a second nucleotide, a third nucleotide and a fourth nucleotide, wherein the first nucleotide comprises a first nucleotide labeled with a first marker and optionally an unlabeled first nucleotide; the second nucleotide comprises a second nucleotide labeled with a second marker and optionally an unlabeled second nucleotide; the third nucleotide is selected from: (1) a third nucleotide labeled with the first marker and a third nucleotide labeled with the second marker, or (2) a third nucleotide simultaneously labeled with the first marker and the second marker; and the fourth nucleotide comprises an unlabeled fourth nucleotide; and (a) one
  • the first marker is a luminescent marker.
  • the kit further comprises a luminescent label-labeled ligand that specifically binds to the first label.
  • the kit further comprises a luminescent label-labeled ligand that specifically binds to the second label.
  • the luminescent markers are the same luminescent markers.
  • the luminescent label is a fluorescent label, such as a fluorophore, for example selected from coumarin, AlexaFluor, Bodipy, fluorescein, tetramethylrhodamine, Cy5, Cy3, Texas Red and derivatives thereof.
  • the kit also contains an enzyme and a buffer suitable for the enzyme to function.
  • FIG. 1 shows the signal extraction diagram of the first base in sequencing the E. coli barcode sequence in Example 1.
  • FIG. 2 shows the signal extraction diagram of the 10th base in sequencing the E. coli barcode sequence in Example 1.
  • FIG. 3 shows the signal extraction diagram of the first base in sequencing the E. coli barcode sequence in Example 2.
  • FIG. 4 shows the signal extraction diagram of the 50th base in sequencing the E. coli barcode sequence in Example 2.
  • FIG. 5 shows the signal extraction diagram of the first base in the experiment without added unlabeled nucleotides in Example 1.
  • FIG. 6 shows the signal extraction diagram of the first base in the experiment without added unlabeled nucleotides in Example 2.
  • polynucleotide refers to deoxyribonucleic acid (DNA), ribonucleic acid (RNA), or the like.
  • the polynucleotide may be single-stranded, double-stranded, or contain both single-stranded and double-stranded sequences.
  • the polynucleotide molecule may be derived from a double-stranded DNA (dsDNA) form (e.g., genomic DNA, PCR and amplification products), or from a single-stranded DNA (ssDNA) or RNA form, and it may be converted to the dsDNA form, and vice versa.
  • examples of polynucleotides include genes or gene fragments (e.g., probes, primers, EST or SAGE tags), genomic DNA, genomic DNA fragments, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, synthetic polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, and nucleic acid probes, primers or amplified copies of any of the above sequences.
  • Polynucleotides may include nucleotides or nucleotide analogs. Nucleotides usually contain a sugar (such as ribose or deoxyribose), a base and at least one phosphate group. Nucleotides can be abasic (i.e., lack a base). Nucleotides include deoxyribonucleotides, modified deoxyribonucleotides, ribonucleotides, modified ribonucleotides, peptide nucleotides, modified peptide nucleotides, nucleotides with modified phosphate-sugar backbones, and mixtures thereof.
  • nucleotides include, for example, adenosine monophosphate (AMP), adenosine diphosphate (ADP), adenosine triphosphate (ATP), thymidine monophosphate (TMP), thymidine diphosphate (TDP), thymidine triphosphate (TTP), cytidine monophosphate (CMP), cytidine diphosphate (CDP), cytidine triphosphate (CTP), guanosine monophosphate (GMP), guanosine diphosphate (GDP), guanosine triphosphate (GTP), uridine monophosphate (UMP), uridine diphosphate (UDP), uridine triphosphate (UTP), deoxyadenosine monophosphate (dAMP), deoxyadenosine diphosphate (dADP), deoxyadenosine triphosphate (dATP), deoxythymidine monophosphate (dTMP), deoxythymidine diphosphate (
  • Nucleotide analogs containing modified bases can also be used in the methods described herein. Whether having a natural backbone or an analogous structure, exemplary modified bases that can be included in the polynucleotide include, for example, inosine, xanthine, hypoxanthine, isocytosine, isoguanine, 2-aminopurine, 5-methylcytosine, 5-hydroxymethylcytosine, 2-aminoadenine, 6-methyladenine, 6-methylguanine, 2-propylguanine, 2-propyladenine, 2-thiouracil, 2-thiothymine, 2-thiocytosine, 15-halouracil, 15-halocytosine, 5-propynyluracil, 5-propynylcytosine, 6-azouracil, 6-azocytosine, 6-azothymine, 5-uracil, 4-thiouracil, 8-halogenated adenine or guanine, 8-aminoaden
  • nucleotides include nucleotides A, C, G, T or U.
  • nucleotide A refers to a nucleotide containing adenine (A) or a modification or analogue thereof, such as ATP, dATP.
  • Nucleotide G refers to a nucleotide containing guanine (G) or a modification or analogue thereof, such as GTP, dGTP.
  • Nucleotide C refers to a nucleotide containing cytosine (C) or a modification or analogue thereof, such as CTP, dCTP.
  • Nucleotide T refers to a nucleotide containing thymine (T) or a modification or analogue thereof, such as TTP, dTTP.
  • Nucleotide U refers to a nucleotide containing uracil (U) or a modification or analogue thereof, such as UTP, dUTP.
  • the present invention relates to labeling nucleotides with different markers, alone or in combination, so that different nucleotides can be distinguished, wherein the different markers can be detected by the same luminescence signal.
  • detection of different labels by the same luminescence signal is achieved by specifically binding different labels to respective ligands labeled with luminescence labels that can produce the same luminescence signal.
  • the luminescent labels that can produce the same luminescent signal are the same luminescent labels.
  • the label used to label a nucleotide and the ligand specifically bound thereto may be any molecule capable of specifically binding to each other, and the binding pair thereof is referred to herein as an anti-ligand pair.
  • the binding between the members of the anti-ligand pair may be non-covalent.
  • Anti-ligand pairs are not necessarily limited to paired single molecules.
  • a single ligand can be bound by a synergistic effect of two or more anti-ligands.
  • the binding between the members of the anti-ligand pair results in the formation of a binding complex, sometimes referred to as a ligand / anti-ligand complex or simply as a ligand / anti-ligand.
  • anti-ligand pairs include, but are not limited to: (a) haptens or antigenic compounds combined with corresponding antibodies or binding portions or fragments thereof, such as digoxin-digoxin antibody, N3G-N3G antibody, FITC-FITC antibody; (b) nucleic acid aptamers and proteins; (c) non-immune binding pairs (e.g.
  • one of the different markers can be a luminescent marker so that it can be directly detected. Other markers are still detected by specific binding to the respective ligands labeled with luminescent markers that produce the same luminescent signal.
  • the luminescent markers associated with the different markers are the same luminescent markers.
  • the term "luminescent marker" refers to any substance capable of emitting fluorescence at a specific emission wavelength when excited by a suitable excitation wavelength.
  • a luminescent label may be, for example, a fluorophore, for example selected from coumarin, AlexaFluor, Bodipy, fluorescein, tetramethylrhodamine, phenoxazine, acridine, Cy5, Cy3, AF532, Texas Red and derivatives thereof.
  • nucleotides labeled with different markers of the present invention alone or in combination can be used in various nucleic acid sequencing methods.
  • the nucleotides labeled with different markers of the present invention, alone or in combination are suitable for sequencing by synthesis.
  • Sequencing by synthesis as used herein refers to the various sequencing-by-synthesis methods well known in the art. Basically, sequencing by synthesis involves first hybridizing the nucleic acid molecule to be sequenced with a sequencing primer, and then, in the presence of a polymerase, polymerizing labeled nucleotides at the 3' end of the sequencing primer using the nucleic acid molecule to be sequenced as a template. After polymerization, the labeled nucleotide is identified by detecting the label. After the label (i.e., the chemiluminescent label as described herein) is removed from the labeled nucleotide, the next polymerization sequencing cycle begins.
  • the nucleotides described herein can also be used to perform the nucleic acid sequencing method disclosed in US Patent No. 5302509.
  • the method for determining the target polynucleotide sequence can be performed by denaturing the target polynucleotide sequence, contacting the target polynucleotide with the different nucleotides respectively so as to form a complement of the target polynucleotide, and detecting the incorporation of the nucleotides.
  • the method utilizes polymerization so that the polymerase extends the complementary strand by incorporating the correct nucleotide complementary to the target.
  • the polymerization reaction also requires special primers to initiate polymerization.
  • the incorporation of the labeled nucleotide is performed by polymerase, and the incorporation event is subsequently determined.
  • There are many different polymerases, and it is easy for those of ordinary skill in the art to determine the most suitable polymerase.
  • Preferred enzymes include DNA polymerase I, Klenow fragment, DNA polymerase III, T4 or T7 DNA polymerase, Taq polymerase or vent polymerase. It is also possible to use polymerases engineered to have specific properties.
  • the sequencing method is preferably performed on target polynucleotides arranged on a solid support.
  • Multiple target polynucleotides can be immobilized on the solid support via a linker molecule, or can be attached to particles such as microspheres, which can also be attached to the solid support material.
  • the polynucleotide can be attached to the solid support by a variety of methods, including the use of biotin-streptavidin interactions.
  • Methods for immobilizing polynucleotides on solid supports are well known in the art, and include lithography techniques and spotting each polynucleotide on a specific position on the solid support.
  • Suitable solid supports are well known in the art and include glass slides and beads, ceramic and silicon surfaces, and plastic materials.
  • the support is usually flat, although microbeads (microspheres) can also be used, and the latter can also be attached to other solid supports by known methods.
  • the microspheres can have any suitable size, and their diameter is usually 10-100 nm.
  • the polynucleotide is attached directly on a plane, preferably on a flat glass surface.
  • the connection is preferably made in the form of a covalent bond.
  • the array used is preferably a single molecule array, which includes polynucleotides located in unique optically distinguishable regions, such as described in International Application No. WO00 / 06770.
  • the conditions necessary to carry out the polymerization are well known to those skilled in the art.
  • In order to carry out the polymerase reaction, it is usually necessary first to anneal a primer sequence to the target polynucleotide; the primer sequence is recognized by the polymerase and serves as the starting site for the subsequent extension of the complementary strand.
  • the primer sequence may be added as an independent component with respect to the target polynucleotide.
  • the primer and the target polynucleotide may each be part of a single-stranded molecule, and the primer part and the target part form an intramolecular duplex, that is, a hairpin loop structure.
  • the structure can be fixed on the solid support at any position of the molecule.
  • Other conditions necessary to perform the polymerase reaction are well known to those skilled in the art, and these conditions include temperature, pH, and buffer composition.
  • the labeled nucleotide of the present invention is brought into contact with the target polynucleotide to enable polymerization.
  • the nucleotides can be added sequentially, that is, each type of nucleotide (A, C, G, or T / U) is added separately, or simultaneously.
  • the polymerization step is allowed to proceed for a time sufficient to incorporate one nucleotide.
  • the unincorporated nucleotides are then removed, for example, by performing a washing step on the array, and then detection of the incorporation label can be performed.
  • the detection can be performed by conventional methods. For example, methods of detecting fluorescent labels or signals are well known in the art. For example, it can be realized by a device that detects the wavelength of fluorescence. Such devices are well known in the art. For example, such a device may be a confocal scanning microscope, which scans the surface of the solid support with a laser in order to image the fluorophore on the nucleic acid molecule directly bound to be sequenced. In addition, each signal generated can be observed, for example, with a sensitive 2-D detector, such as a charge coupled detector (CCD). Other techniques such as scanning near-field optical microscopy (SNOM) can also be used, for example.
  • the label can be removed with suitable conditions.
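The cycle described above (incorporate one blocked, labeled nucleotide; wash; detect; remove the label) can be sketched as a simple loop. This is an idealized illustration only: it assumes error-free incorporation and detection, a DNA template written in the order the polymerase reads it, and hypothetical names throughout.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def sequence_by_synthesis(template: str, cycles: int) -> str:
    """Simulate idealized sequencing-by-synthesis cycles on a DNA template.

    Returns the bases detected, one per cycle, i.e. the strand synthesized
    against the template.
    """
    read = []
    position = 0  # index of the next unpaired template base
    for _ in range(cycles):
        if position >= len(template):
            break  # template fully copied
        # Polymerization: exactly one 3'-blocked, labeled nucleotide is added,
        # because the protecting group prevents further extension.
        incorporated = COMPLEMENT[template[position]]
        # Washing removes unincorporated nucleotides (nothing to model here).
        # Detection: the label identifies the incorporated nucleotide.
        read.append(incorporated)
        # Removing the label and protecting group frees the 3'-OH,
        # so the next cycle can extend by one more base.
        position += 1
    return "".join(read)
```

For instance, three cycles on the template `"TACG"` yield the first three complementary bases.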
  • use of the nucleotides of the present invention is not limited to DNA sequencing technology; other forms of polynucleotide synthesis, DNA hybridization analysis, and single nucleotide polymorphism studies can also be implemented using the nucleotides of the present invention.
  • Any technique involving the interaction between nucleotides and enzymes can utilize the molecules of the invention.
  • the molecule can be used as a substrate for reverse transcriptase or terminal transferase.
  • the labeled nucleotides of the present invention also have a 3 'protecting group.
  • the protecting group and the label are typically two different groups on the 3'-blocked labeled nucleotide, but in other embodiments, the protecting group and the label can also be the same group.
  • protecting group means a group that prevents a polymerase (which incorporates a nucleotide containing the group into the polynucleotide chain being synthesized) from continuing to catalyze the incorporation of another nucleotide after the nucleotide containing the group has been incorporated into the polynucleotide chain being synthesized.
  • Such protecting groups are also referred to herein as 3'-OH protecting groups. Nucleotides containing such protecting groups are also referred to herein as 3'-blocked nucleotides.
  • the protecting group may be any suitable group that can be added to the nucleotide, as long as it prevents additional nucleotide molecules from being added to the polynucleotide chain and can be easily removed from the sugar portion of the nucleotide without destroying the polynucleotide chain.
  • the nucleotide modified by the protecting group needs to be tolerated by the polymerase or other suitable enzyme used to incorporate the modified nucleotide into the polynucleotide chain.
  • the ideal protecting group exhibits long-term stability, can be efficiently incorporated by polymerase, prevents secondary or further incorporation of nucleotides, and can be removed under mild conditions that do not damage the structure of the polynucleotide, preferably under aqueous conditions.
  • WO 91/06678 discloses 3'-OH protecting groups including esters and ethers, -F, -NH2, -OCH3, -N3, -OPO3, -NHCOCH3, 2-nitrobenzene carbonate, 2,4-dinitrobenzenesulfenyl and tetrahydrofuranyl ether.
  • Metzker et al. disclose the synthesis and application of eight 3'-modified 2-deoxyribonucleoside 5'-triphosphates (3'-modified dNTPs).
  • WO2002/029003 describes the use of allyl protecting groups to cap 3'-OH groups on the growing DNA strand in polymerase reactions.
  • various protecting groups reported in international application publications WO2014139596 and WO2004 / 018497 can be used, including, for example, those exemplified in FIG. Protective groups), and those such as those exemplified in Figures 3 and 4 of WO2004 / 018497 and those defined in the claims.
  • the above references are incorporated by reference in their entirety.
  • the protecting group may be directly attached to the 3 'position, or may be attached to the 2' position (the protecting group has a sufficient size or charge to block the interaction at the 3 'position).
  • the protecting group can be attached at the 3 'and 2' positions, and can be cleaved to expose the 3'OH group.
  • the sequencing protocol needs to remove the protecting group to produce a usable 3'-OH site for continuous strand synthesis.
  • an agent that can remove a protecting group from a modified nucleotide depends largely on the protecting group used. For example, removal of the ester protecting group from the 3 'hydroxyl functional group is usually achieved by alkaline hydrolysis. The ease of removing protective groups varies greatly; in general, the greater the electronegativity of the substituent on the carbonyl carbon, the greater the ease of removal.
  • a highly electronegative trifluoroacetate group can be rapidly cleaved from 3' hydroxyl groups in methanol at pH 7 (Cramer et al., 1963), so it is unstable during polymerization at this pH.
  • the phenoxyacetate group is cleaved in less than 1 minute, but requires a significantly higher pH, for example with NH3/methanol (Reese and Steward, 1968).
  • a variety of hydroxyl protecting groups can be selectively cleaved using chemical methods other than alkaline hydrolysis.
  • 2,4-Dinitrophenylthio groups can be rapidly cleaved by treatment with nucleophiles such as thiophenol and thiosulfate (Letsinger et al., 1964). Allyl ether is cleaved by treatment with Hg (II) in acetone / water (Gigg and Warren, 1968). Tetrahydrothiopyranyl ethers were removed under neutral conditions using Ag (I) or Hg (II) (Cohen and Steele, 1966; Cruse et al., 1978). Photochemical deblocking can be used with photochemically cleavable protecting groups. There are several protecting groups available for this method.
  • o-nitrobenzyl ether as the protective group for the 2'-hydroxy functionality of ribonucleosides is known and proven (Ohtsuka et al., 1978); it is removed by irradiation at 260 nm.
  • the alkyl-o-nitrobenzyl carbonate protecting group was also removed by irradiation at pH 7 (Cama and Christensen, 1978). Enzymatic cleavage of the 3'-OH protecting group is also possible. It has been shown that T4 polynucleotide kinase can convert the 3'-phosphate end to the 3'-hydroxyl end, which can then be used as a primer for DNA polymerase I (Henner et al., 1983). This 3'-phosphatase activity is used to remove the 3 'protecting groups of those dNTP analogs that contain phosphates as protecting groups.
  • reagents that can remove protecting groups from 3 'blocked nucleotides include, for example, phosphines (such as tris (hydroxymethyl) phosphine (THP)), which can, for example, replace azide-containing 3'-OH protecting groups
  • phosphines such as tris (hydroxymethyl) phosphine (THP)
  • THP tris (hydroxymethyl) phosphine
  • the mass is removed from the nucleotide (for this application of phosphines, see for example the description in WO2014139596, the entire contents of which are incorporated herein by reference).
  • reagents that can remove the protecting group from 3'-blocked nucleotides also include, for example, the corresponding reagents described on pages 114-116 of the specification of WO2004/018497 for removing 3'-allyl, 3,4-dimethoxybenzyloxymethyl or fluoromethoxymethyl used as 3'-OH protecting groups.
  • the label of the nucleotide is preferably removed together with the protecting group after detection.
  • the label can be incorporated into a protecting group, allowing it to be removed together with the protecting group after incorporating the 3'-blocked nucleotide into the nucleic acid strand.
  • the label can be attached to the nucleotide separately from the protecting group using a linking group.
  • a label may, for example, be linked to a purine or pyrimidine base of a nucleotide.
  • the linking group used is cleavable. The use of a cleavable linking group ensures that the label can be removed after detection, which avoids any signal interference with any labeled nucleotides that are subsequently incorporated.
  • a non-cleavable linking group can be used when no subsequent nucleotide incorporation is required after the labeled nucleotide has been incorporated into the nucleic acid strand, so that there is no need to remove the label from the nucleotide.
  • the label and/or linking group may have a size or structure sufficient to block the incorporation of other nucleotides into the polynucleotide chain (that is, the label itself may serve as a protecting group).
  • the blocking may be due to steric hindrance, or may be due to a combination of size, charge, and structure.
  • Cleavable linking groups are well known in the art, and conventional chemical methods can be used to link the linking group to the nucleotide base and the label.
  • the linking group can be connected at any position of the nucleotide base, provided that Watson-Crick base pairing can still be performed.
  • for purine bases, it is preferred that the linking group be attached through position 7 of the purine or the preferred deazapurine analogue, through an 8-modified purine, or through an N-6 modified adenine or an N-2 modified guanine.
  • for pyrimidines, the attachment is preferably via position 5 on cytosine, thymine and uracil and the N-4 position on cytidine.
  • the nucleoside cleavage site can be located at a position on the linking group that ensures that a part of the linking group remains connected to the nucleotide base after cleavage.
  • Suitable linking groups include but are not limited to disulfide linking groups, acid-labile linking groups (including dialkoxybenzyl linking groups, Sieber linking groups, indole linking groups and tert-butyl Sieber linking groups), electrophilically cleavable linking groups, nucleophilically cleavable linking groups, photocleavable linking groups, linking groups cleaved under reducing or oxidizing conditions, safety-catch linking groups, and linking groups that are cleaved by elimination mechanisms.
  • Suitable linking groups can be modified with standard chemical protecting groups, as disclosed in Greene & Wuts, Protective Groups in Organic Synthesis, John Wiley & Sons. Guillier et al. disclose other suitable cleavable linking groups for solid-phase synthesis (Chem. Rev. 100:2092-2157, 2000).
  • the linking group can be cleaved by any suitable method, including contact with acids, bases, nucleophiles, electrophiles, free radicals, metals, reducing or oxidizing reagents, light, temperature, enzymes, etc.
  • the cleavable linking group can be cleaved under the same conditions as the protecting group, so that only one treatment is required to remove the label and protecting group.
  • Electrophilic cleavage linking groups are typically cleaved by protons and include acid-sensitive cleavage.
  • Suitable electrophilic cleavage linking groups include modified benzyl systems, such as trityl, p-hydrocarbyloxybenzyl ester and p-hydrocarbyloxybenzylamide.
  • Other suitable linking groups include tert-butoxycarbonyl (Boc) groups and acetal systems.
  • to prepare suitable linker molecules, the use of thiophilic metals such as nickel, silver or mercury in the cleavage of thioacetals or other sulfur-containing protecting groups can also be considered.
  • Nucleophilic cleavage linking groups include groups that are unstable in water (ie, can be easily cleaved at alkaline pH), such as esters, and groups that are unstable to non-aqueous nucleophiles. Fluoride ions can be used to cleave silicon-oxygen bonds in groups such as triisopropylsilane (TIPS) or tert-butyldimethylsilane (TBDMS). Photodegradable linking groups are widely used in sugar chemistry. Preferably, the light required to activate cleavage does not affect other components in the modified nucleotide.
  • Suitable linking groups include those based on o-nitrobenzyl compounds and nitroveratryl compounds.
  • a linking group based on benzoin chemistry can also be used (Lee et al., J. Org. Chem. 64: 3454-3460, 1999).
  • Various linking groups sensitive to reductive cleavage are known. Catalytic hydrogenation using palladium-based catalysts has been used to cleave benzyl and benzyloxycarbonyl groups. Disulfide bond reduction is also known in the art.
  • Oxidation-based methods are well known in the art. These methods include the oxidation of hydrocarbyloxybenzyl groups and the oxidation of sulfur and selenium linking groups. It is also within the scope of the present invention to use an iodine solution to cleave disulfides and other sulfur or selenium-based linking groups.
  • Safety-catch linkers are those that are cleaved in two steps. In the preferred system, the first step is the generation of a reactive nucleophilic center, and the subsequent second step involves intramolecular cyclization, which results in cleavage.
  • the levulinate linkage can be treated with hydrazine or photochemical methods to release the active amine, which then cyclizes to cleave an ester elsewhere in the molecule (Burgess et al., J. Org. Chem. 62:5165-5168, 1997). Elimination reactions can also be used to cleave the linking group. Base-catalyzed elimination of groups such as fluorenylmethoxycarbonyl and cyanoethyl and palladium-catalyzed reductive elimination of allyl systems can be used.
  • the linking group may include spacer units.
  • the length of the linking group is not important, as long as the label maintains a sufficient distance from the nucleotide so as not to interfere with the interaction between the nucleotide and the enzyme.
  • the linking group may be composed of functional groups similar to the 3'-OH protecting group. This allows the label and protecting groups to be removed in a single process.
  • a particularly preferred linking group is an azide-containing linking group cleavable by phosphine.
  • the reagent that can remove the label from the modified nucleotide depends to a large extent on the label used.
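  • where the label is incorporated into the protecting group, the label is removed using the reagent for removing the protecting group described above.
  • where the label is attached to the base of the nucleotide through a cleavable linking group, the reagent that cleaves the linking group as described above is used to remove the label.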
  • the same reagents are used to remove the label and the protecting group from the modified nucleotide, for example in the case where the linking group consists of functional groups similar to the 3'-OH protecting group.
  • the invention relates to a method for determining the sequence of a target polynucleotide, which comprises:
  • the nucleotides are selected from one or more of the following: a first nucleotide, a second nucleotide, a third nucleotide, and a fourth nucleotide, wherein the first nucleotide comprises a first nucleotide labeled with a first marker, the second nucleotide comprises a second nucleotide labeled with a second marker, the third nucleotide is selected from: (1) a third nucleotide labeled with the first marker and a third nucleotide labeled with the second marker, or (2) a third nucleotide labeled with both the first marker and the second marker, and the fourth nucleotide comprises an unlabeled fourth nucleotide
  • the ribose or deoxyribose moiety of each of the nucleotides contains a protecting group attached via the 2' or 3' oxygen atom
  • the first marker is a luminescent marker
  • step (e): the partial duplex of step (c) is then brought into contact with a ligand that specifically binds to the second marker and is labeled with a luminescent marker, and then the presence of the luminescent marker on the partial duplex is detected,
  • step (f): optionally removing the protecting group and marker on the nucleotide incorporated in the partial duplex of step (c),
  • wherein the luminescent markers are the same luminescent marker.
  • the invention relates to a method for determining the sequence of a target polynucleotide, which comprises:
  • the nucleotides are selected from one or more of the following: a first nucleotide, a second nucleotide, a third nucleotide, and a fourth nucleotide, wherein the first nucleotide comprises a first nucleotide labeled with a first marker, the second nucleotide comprises a second nucleotide labeled with a second marker, the third nucleotide is selected from: (1) a third nucleotide labeled with the first marker and a third nucleotide labeled with the second marker, or (2) a third nucleotide labeled with both the first marker and the second marker, and the fourth nucleotide comprises an unlabeled fourth nucleotide
  • the ribose or deoxyribose moiety of each of the nucleotides contains a protecting group attached via the 2' or 3' oxygen atom
  • step (d): contacting the partial duplex of step (c) with a ligand that specifically binds to the first marker and is labeled with a luminescent marker, and then detecting the presence of the luminescent marker on the partial duplex,
  • the ligand is then removed from the partial duplex,
  • step (e): contacting the partial duplex of step (c) with a ligand that specifically binds to the second marker and is labeled with a luminescent marker, and then detecting the presence of the luminescent marker on the partial duplex,
  • step (f): optionally removing the protecting group and marker on the nucleotide incorporated in the partial duplex of step (c),
  • wherein the luminescent markers are the same luminescent marker.
  • the inventors also found that adding a portion of unlabeled nucleotides makes it possible to control the signal value generated by a singly labeled nucleotide, which facilitates the differentiation of different nucleotides and the subsequent data analysis and significantly improves sequencing results.
  • the first nucleotide in addition to the first nucleotide labeled with the first marker, may also include the unlabeled first nucleotide. In addition to the second nucleotide labeled with the second marker, the second nucleotide may also include an unlabeled second nucleotide.
  • the ratio of the first nucleotide labeled with the first marker and the unlabeled first nucleotide in the first nucleotide is 4: 1 to 3: 2. In a specific embodiment, the ratio of the second nucleotide labeled with the second marker and the unlabeled second nucleotide in the second nucleotide is 4: 1 to 3: 2.
  • the invention performs sequencing based on detection of fluorescence from a single excitation only. Compared with detection methods that use 4 or 2 fluorescent dyes to label the 4 nucleotides, the sequencing method requires only a single excitation light source and a single camera, which can reduce both the size and the manufacturing cost of sequencing equipment.
  • the invention only generates one kind of fluorescence during the sequencing process, which can avoid interference between different fluorescent signals caused by labeling different fluorescent dyes. Compared with the detection of two fluorescent dyes, the mutual interference of two-color fluorescence and single-color fluorescence is also avoided.
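  • the 3' terminal hydroxyl group of the nucleotides used in the present invention is modified and blocked.
  • during the sequencing process only one deoxyribonucleotide can be incorporated per reaction, so the incorporation of several deoxyribonucleotides in one reaction, which happens at repeated-base sequences when natural deoxyribonucleotides are used, does not occur.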
  • the present invention helps to improve the accuracy of sequencing.
  • provide a nucleic acid molecule to be sequenced that is attached to a support, or attach the nucleic acid molecule to be sequenced to a support;
  • the first nucleotide (such as nucleotide A) is linked to a first molecular label (such as biotin, N3G or other small molecules)
  • the second nucleotide (such as nucleotide T) is linked to a second molecular label (such as digoxin, FITC, etc.)
  • the third nucleotide (such as nucleotide C) is partly linked to the first molecular label and partly to the second molecular label
  • the fourth nucleotide (such as nucleotide G) carries no molecular label
  • to facilitate the differentiation of different nucleotides and the subsequent data analysis, the signal value generated by a singly labeled nucleotide is controlled by adding a portion of the corresponding unlabeled nucleotides, such as A-cold and T-cold.
  • the ratio of labeled nucleotide A to A-cold ranges from 4: 1 to 3: 2; the ratio of labeled nucleotide T to T-cold ranges from 4: 1 to 3: 2.
  • the final concentration of the four types of nucleotides in the reaction solution is between 0.5 and 5 μM.
  • elution reagent: 300-400 μl of PBS or TBS at a rate of 150-350 μl/min to elute the free fluorescently labeled ligand; the emitted fluorescent signal is then detected in imaging buffer under exposure conditions of 50-1000 ms.
  • the nucleotides were labeled with biotin and digoxin, and streptavidin and digoxin antibodies were used as their corresponding ligands.
  • C-biotin + C-digoxin
  • T-digoxin + T-cold
  • the four groups of nucleotide analogs are formulated into a mixed solution at the above concentration and ratio.
  • the first group A-Biotin (1 μM)
  • C-biotin + C-digoxin
  • the four groups of nucleotide analogs are formulated into a mixed solution at the above concentration and ratio.
  • PBS Phosphate buffer solution
  • This reagent is both antibody ligand buffer and elution reagent
  • the circular single-stranded DNA is replicated by rolling circle replication to prepare DNA nanoballs. The prepared DNA nanoballs are then loaded onto the sequencing chip following the instructions of the BGISEQ-500 High-throughput Sequencing Kit (SE100).
  • the barcode splitting efficiency is 82%.
  • Figure 1 is the signal extraction plot for the first base of the barcode sequence to be tested. The plot shows that the four deoxyribonucleotides are separated into four signal groups according to the detection rules: the group in the lower left corner is the G base signal group; the horizontal signal arm is the A base signal group; the vertical signal arm is the T base signal group; and the signal group on the diagonal between the A and T signal arms is the C base signal group.
  • FIG. 2 is a signal extraction diagram of the tenth base of the barcode sequence to be tested, and the signal arm is distinguished from the signal extraction diagram of the first base.
  • FIG. 5 is a signal extraction diagram of the first base in the experiment without adding unlabeled nucleotides, and the signal arm is distinguished from the signal extraction diagram in the experiment of adding unlabeled nucleotides as described above.
  • mapping rate was 70%
  • error rate was 2%
  • provide a nucleic acid molecule to be sequenced that is attached to a support, or attach the nucleic acid molecule to be sequenced to a support;
  • to facilitate the differentiation of different nucleotides and the subsequent data analysis, the signal value generated by a singly labeled nucleotide is controlled by adding a portion of the corresponding unlabeled nucleotides, such as A-cold and T-cold.
  • the ratio of labeled nucleotide A to A-cold ranges from 4: 1 to 3: 2; the ratio of labeled nucleotide T to T-cold ranges from 4: 1 to 3: 2.
  • the final concentration of the four types of deoxyribonucleotide analogs in the reaction solution is between 0.5 and 5 μM.
  • elution reagent: PBS or TBS
  • the four groups of nucleotide analogs are formulated into a mixed solution at the above concentration and ratio.
  • the first group A-AF532 (1 μM)
  • the four groups of nucleotide analogs are formulated into a mixed solution at the above concentration and ratio.
  • PBS Phosphate buffer solution
  • This reagent is both antibody ligand buffer and elution reagent
  • the circular single-stranded DNA is replicated by rolling circle replication to prepare DNA nanoballs. The prepared DNA nanoballs are then loaded onto the sequencing chip following the instructions of the BGISEQ-500 High-throughput Sequencing Kit (SE100).
  • mapping rate was 67% and the error rate was 2%.
  • Figure 3 is the signal extraction plot for the first base of the sequence to be tested. The plot shows that the four deoxyribonucleotides are separated into four signal groups according to the detection rules: the group in the lower left corner is the G base signal group; the horizontal signal arm is the A base signal group; the vertical signal arm is the T base signal group; and the signal group on the diagonal between the A and T signal arms is the C base signal group.
  • FIG. 4 is a signal extraction diagram of the 50th base of the barcode sequence to be tested, and the signal arm is distinguished from the signal extraction diagram of the first base.
  • FIG. 6 is a signal extraction diagram of the first base in the experiment without adding unlabeled nucleotides, and the signal arms are distinguished from the signal extraction diagram in the experiment of adding unlabeled nucleotides as described above.
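
The signal extraction plots described for Figures 1 and 3 place each spot by two intensities: G near the origin, A along the horizontal arm, T along the vertical arm and C on the diagonal. A minimal sketch of how such a spot might be assigned to a base is given below. It assumes intensities normalized to the range 0 to 1 and a single fixed cutoff; the function name `classify_spot` and the threshold value are hypothetical and are not taken from the patent, and a real base caller would more likely fit the clusters than apply a fixed threshold.

```python
def classify_spot(x: float, y: float, threshold: float = 0.5) -> str:
    """Assign a base from two normalized intensities.

    x: signal along the horizontal (A) arm, y: signal along the vertical (T) arm.
    The threshold is an illustrative assumption, not a value from the source.
    """
    high_x = x >= threshold
    high_y = y >= threshold
    if high_x and high_y:
        return "C"  # diagonal group: both signals present
    if high_x:
        return "A"  # horizontal arm
    if high_y:
        return "T"  # vertical arm
    return "G"      # lower-left group: no signal


spots = [(0.9, 0.1), (0.05, 0.8), (0.7, 0.75), (0.1, 0.05)]
print("".join(classify_spot(x, y) for x, y in spots))
```

The closer the A and T arms sit to the diagonal group, the more calls near the cutoff become ambiguous, which is why the separation of the arms is reported for each plot.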

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

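The present invention provides a method for sequencing a polynucleotide, in which the sequential incorporation of different nucleotides is detected by the same luminescent signal, thereby determining the polynucleotide sequence.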

Description

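Method for Sequencing Polynucleotides
Field of the Invention
The present invention relates to a method for sequencing a polynucleotide, in which the sequential incorporation of different nucleotides is detected by the same luminescent signal, thereby determining the polynucleotide sequence.
Background of the Invention
In 1977, Sanger invented the dideoxy chain-termination sequencing method, which became the representative first-generation sequencing technology. In 2001, the draft of the human genome was completed on the basis of first-generation sequencing technology. Sanger sequencing is characterized by simple experimental operation, intuitive and accurate results and a short experimental cycle, and it is widely used in fields such as clinical gene mutation detection and genotyping, where timely results are required. However, Sanger sequencing has low throughput and high cost, which limits its application in large-scale gene sequencing.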
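To overcome the shortcomings of Sanger sequencing, second-generation sequencing technologies emerged. Compared with first-generation sequencing, second-generation sequencing offers high throughput, low cost and a high degree of automation, and is suitable for large-scale sequencing. The second-generation technologies developed so far mainly involve sequencing by ligation (SBL) and sequencing by synthesis (SBS). Typical examples include Roche 454 sequencing, SOLiD sequencing developed by Applied Biosystems, combinatorial probe-anchor ligation (cPAL) developed by Complete Genomics, combinatorial probe-anchor synthesis (cPAS) developed by BGI, and Illumina sequencing developed jointly by Illumina and Solexa technology. Detection methods used in sequencing mainly include electrochemical methods and optical signal detection, of which optical signal detection is the more mainstream. To identify and distinguish the four bases (A, T/U, C and G), four fluorescent dyes are needed to label the four bases separately. There are also reports of labeling the four bases with two fluorescent dyes and distinguishing them through different combinations of the two dyes. Roche 454 sequencing instead relies on self-generated light: the pyrophosphate released when a dNTP is incorporated against the sequence to be determined is converted into ATP, and the ATP, together with luciferase, oxidizes luciferin to produce fluorescence; the four bases and the number of bases incorporated are distinguished by detecting the presence and intensity of the fluorescent signal. Owing to their hardware requirements, second-generation sequencing instruments are generally bulky, which makes them inconvenient to carry and transport.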
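Sequencing technology has now reached its third generation, which overcomes the bulkiness of second-generation instruments. For example, the different sequencing principle of the Oxford Nanopore sequencer greatly reduces its size, so that it can even be carried into space for sequencing experiments. However, third-generation sequencing currently has a relatively high error rate, which limits its large-scale adoption.
Taking the sequencers developed by Illumina, Complete Genomics and BGI as examples, the four bases are each labeled with one of four fluorescent dyes, and the different fluorescent signals are collected under laser excitation to distinguish the bases. See, for example, Sara Goodwin, John D. McPherson and W. Richard McCombie, Coming of age: ten years of next-generation sequencing technologies. Nature Reviews, 2016, 17:333-351.
The NextSeq and Mini-Seq sequencing systems developed by Illumina and the BGISEQ-50 sequencing system of BGI all label the four bases with two fluorescent dyes and distinguish the four bases through different combinations of the two dyes. For example, base A is labeled with a first fluorescent dye, base G with a second fluorescent dye, base C with both the first and the second fluorescent dye, and base T is left unlabeled, so that the four bases can be distinguished. See, for example, US patent US 9453258 B2.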
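In Roche 454 sequencing, one kind of deoxyribonucleotide (dNTP) is introduced at a time, in turn. If the dNTP pairs with the sequence to be determined, pyrophosphate is released after the dNTP is incorporated; the pyrophosphate reacts with the ATP sulfurylase in the sequencing reaction system to generate ATP, and the ATP, together with the luciferase in the system, oxidizes luciferin to emit fluorescence. The fluorescent signal is captured by a detector and converted into sequencing results by computer analysis. See, for example, Martin Kircher and Janet Kelso. High-throughput DNA sequencing – concepts and limitations. Bioessays, 2010, 32:524-536.
The Ion Torrent sequencing system is similar to Roche 454 sequencing: one kind of deoxyribonucleotide (dNTP) is introduced at a time, in turn. If the dNTP pairs with the sequence to be determined, hydrogen ions are released after the dNTP is incorporated; the hydrogen ions change the pH of the reaction system, and the electrical components integrated on the sequencing chip convert the pH change into an electrical signal that is transmitted to a computer and converted into sequencing results by computer analysis. See, for example, Sara Goodwin, John D. McPherson and W. Richard McCombie, Coming of age: ten years of next-generation sequencing technologies. Nature Reviews, 2016, 17:333-351.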
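These technologies have the following drawbacks:
1. When four fluorescent dyes are used to label the four bases, the sequencing instrument must be equipped with at least two monochromatic excitation light sources and two cameras in order to distinguish the different fluorescent signals, which makes the sequencing apparatus expensive to manufacture and bulky.
2. Labeling the four bases with two fluorescent dyes reduces equipment cost and size compared with four-dye labeling. However, experiments show that, because one of the dNTPs in this scheme carries both fluorophores and the laser excites both at the same time, the two fluorophores can no longer be excited in a balanced way as the read length increases and the state of the template deteriorates (in all existing second-generation sequencing technologies, whatever their principle, sequencing quality declines as read length increases): one fluorophore emits markedly more strongly than the other. The signal of the dual-labeled dNTP then tends to merge with the signal of a singly labeled dNTP, the different dNTPs can no longer be distinguished, and sequencing quality is therefore significantly lower than with four-dye detection.
3. Whether four or two fluorescent dyes are used to label the four bases, signals from the different fluorophores may interfere with one another and affect sequencing quality.
4. Roche sequencing and Ion Torrent sequencing do not require excitation light sources, cameras and similar equipment, but the deoxyribonucleotides they use are in their natural state. When the sequence to be determined contains repeated bases, for example 5'-ATTTG-3', it can be distinguished from the sequence 5'-ATG-3' only by signal intensity (in theory the signal value for 5'-ATTTG-3' is about 3 times that for 5'-ATG-3'). This way of discriminating is strongly affected by the sequencing conditions and is hard to control; at longer read lengths in particular, the two are difficult to tell apart.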
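
To make the homopolymer problem in item 4 concrete, the sketch below models the idealized signal of a flow-based method that uses natural, unblocked nucleotides: each productive flow yields a signal proportional to the length of the run of identical bases. The function name `run_length_signals` and the unit signal of 1.0 are illustrative assumptions, and noise is ignored entirely.

```python
from itertools import groupby


def run_length_signals(read: str, unit_signal: float = 1.0):
    """Return (base, idealized signal) for each run of identical bases.

    With unblocked nucleotides a whole run is incorporated in one flow,
    so the signal scales with the run length.
    """
    return [(base, unit_signal * len(list(run))) for base, run in groupby(read)]


print(run_length_signals("ATTTG"))  # T flow carries 3x the unit signal
print(run_length_signals("ATG"))    # T flow carries 1x the unit signal
```

The two reads differ only in the magnitude of the T signal, so any drift in signal scale directly affects how many T bases are called.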
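Therefore, there remains a need in the art for sequencing methods that cost less and perform better.
Summary of the Invention
The present invention relates to a method for sequencing a polynucleotide, in which the sequential incorporation of different nucleotides is detected by the same luminescent signal, thereby determining the polynucleotide sequence.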
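In one aspect, the present invention relates to a method for determining the sequence of a target polynucleotide, comprising:
(a) providing a target polynucleotide,
(b) contacting the target polynucleotide with a primer so that the primer hybridizes to the target polynucleotide, thereby forming a partial duplex of the target polynucleotide and the primer,
(c) contacting the partial duplex with a polymerase and nucleotides under conditions that allow the polymerase to carry out nucleotide polymerization, so that a nucleotide is incorporated onto the primer,
wherein the nucleotides are selected from one or more of the following: a first nucleotide, a second nucleotide, a third nucleotide and a fourth nucleotide, wherein the first nucleotide comprises a first nucleotide labeled with a first label and optionally an unlabeled first nucleotide, the second nucleotide comprises a second nucleotide labeled with a second label and optionally an unlabeled second nucleotide, the third nucleotide is selected from: (1) a third nucleotide labeled with the first label and a third nucleotide labeled with the second label, or (2) a third nucleotide labeled with both the first label and the second label, and the fourth nucleotide comprises an unlabeled fourth nucleotide,
wherein the ribose or deoxyribose moiety of each of the nucleotides comprises a protecting group attached via the 2' or 3' oxygen atom,
(d) detecting the presence of the first label on the partial duplex of step (c),
(e) detecting the presence of the second label on the partial duplex of step (c),
(f) optionally removing the protecting group and the label from the nucleotide incorporated into the partial duplex of step (c),
(g) optionally repeating steps (c)-(f) one or more times, thereby obtaining sequence information for the target polynucleotide,
wherein the presence of the first label and the presence of the second label are detected by the same luminescent signal.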
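
Steps (d) and (e) yield two presence/absence results per cycle, which is enough to tell four nucleotides apart even though both results come from the same luminescent signal. The sketch below shows that decoding as a lookup table. The assignment of A, T, C and G to the first, second, third and fourth nucleotide is an assumption borrowed from the worked example later in the text, and the names `CALL_TABLE`, `call_base` and `call_read` are hypothetical.

```python
# (first label detected, second label detected) -> called base.
# The base assignment is an illustrative assumption, not fixed by the method.
CALL_TABLE = {
    (True, False): "A",   # first nucleotide: first label only
    (False, True): "T",   # second nucleotide: second label only
    (True, True): "C",    # third nucleotide: both labels
    (False, False): "G",  # fourth nucleotide: unlabeled
}


def call_base(first_detected: bool, second_detected: bool) -> str:
    """Decode one cycle from the results of detection steps (d) and (e)."""
    return CALL_TABLE[(first_detected, second_detected)]


def call_read(cycles) -> str:
    """Decode a list of (step d result, step e result) pairs, one per cycle."""
    return "".join(call_base(d, e) for d, e in cycles)


print(call_read([(True, False), (True, True), (False, False), (False, True)]))
```

The inputs are taken as already thresholded; turning raw intensities into these booleans is a separate step that this sketch does not cover.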
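In a specific embodiment, the first label is a luminescent label.
In a specific embodiment, step (d) comprises contacting the partial duplex of step (c) with a ligand that specifically binds the first label and is labeled with a luminescent label, and then detecting the presence of the luminescent label on the partial duplex.
In a specific embodiment, the ligand is removed together with the protecting group and the label when these are removed from the nucleotide incorporated into the partial duplex of step (c).
In a specific embodiment, step (e) comprises contacting the partial duplex of step (c) with a ligand that specifically binds the second label and is labeled with a luminescent label, and then detecting the presence of the luminescent label on the partial duplex.
In a specific embodiment, step (e) is carried out after step (d).
In a specific embodiment, the luminescent labels are the same luminescent label.
In a specific embodiment, the luminescent label is a fluorescent label, for example a fluorophore, for example one selected from coumarin, AlexaFluor, Bodipy, fluorescein, tetramethylrhodamine, Cy5, Cy3, Texas Red and derivatives thereof.
In a specific embodiment, the ratio of the first nucleotide labeled with the first label to the unlabeled first nucleotide in the first nucleotide is from 4:1 to 3:2.
In a specific embodiment, the ratio of the second nucleotide labeled with the second label to the unlabeled second nucleotide in the second nucleotide is from 4:1 to 3:2.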
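In other aspects, the present invention also relates to a kit for sequencing a polynucleotide, comprising: (a) one or more nucleotides selected from the following: a first nucleotide, a second nucleotide, a third nucleotide and a fourth nucleotide, wherein the first nucleotide comprises a first nucleotide labeled with a first label and optionally an unlabeled first nucleotide, the second nucleotide comprises a second nucleotide labeled with a second label and optionally an unlabeled second nucleotide, the third nucleotide is selected from: (1) a third nucleotide labeled with the first label and a third nucleotide labeled with the second label, or (2) a third nucleotide labeled with both the first label and the second label, and the fourth nucleotide comprises an unlabeled fourth nucleotide; and (b) packaging material for them, wherein the ribose or deoxyribose moiety of each of the nucleotides comprises a protecting group attached via the 2' or 3' oxygen atom.
In a specific embodiment, the first label is a luminescent label.
In a specific embodiment, the kit further comprises a ligand that specifically binds the first label and is labeled with a luminescent label.
In a specific embodiment, the kit further comprises a ligand that specifically binds the second label and is labeled with a luminescent label.
In a specific embodiment, the luminescent labels are the same luminescent label.
In a specific embodiment, the luminescent label is a fluorescent label, for example a fluorophore, for example one selected from coumarin, AlexaFluor, Bodipy, fluorescein, tetramethylrhodamine, Cy5, Cy3, Texas Red and derivatives thereof.
In a specific embodiment, the kit further comprises an enzyme and a buffer suitable for the enzyme to function.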
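Brief Description of the Drawings
Figure 1 shows the signal extraction plot for the 1st base in the sequencing of E. coli barcode sequences in Example 1.
Figure 2 shows the signal extraction plot for the 10th base in the sequencing of E. coli barcode sequences in Example 1.
Figure 3 shows the signal extraction plot for the 1st base in the sequencing of E. coli barcode sequences in Example 2.
Figure 4 shows the signal extraction plot for the 50th base in the sequencing of E. coli barcode sequences in Example 2.
Figure 5 shows the signal extraction plot for the first base in the Example 1 experiment in which no unlabeled nucleotides were added.
Figure 6 shows the signal extraction plot for the first base in the Example 2 experiment in which no unlabeled nucleotides were added.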
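Detailed Description of the Invention
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. All patents, applications and other publications mentioned herein are incorporated by reference in their entirety. If a definition set forth herein conflicts or is inconsistent with a definition in the patents, applications and other publications incorporated herein by reference, the definition set forth herein prevails.
As used herein, the term "polynucleotide" refers to deoxyribonucleic acid (DNA), ribonucleic acid (RNA) or analogs thereof. A polynucleotide can be single-stranded, double-stranded, or contain both single-stranded and double-stranded sequences. Polynucleotide molecules can originate in double-stranded DNA (dsDNA) form (for example, genomic DNA, PCR and amplification products), or can originate as single-stranded DNA (ssDNA) or RNA and be converted to dsDNA form, and vice versa. The exact sequence of a polynucleotide molecule may be known or unknown. The following are illustrative examples of polynucleotides: a gene or gene fragment (for example, a probe, primer, EST or SAGE tag), genomic DNA, a genomic DNA fragment, an exon, an intron, messenger RNA (mRNA), transfer RNA, ribosomal RNA, a ribozyme, cDNA, a recombinant polynucleotide, a synthetic polynucleotide, a branched polynucleotide, a plasmid, a vector, isolated DNA of any sequence, isolated RNA of any sequence, and a nucleic acid probe, primer or amplified copy of any of the foregoing.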
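A polynucleotide can include nucleotides or nucleotide analogs. A nucleotide typically contains a sugar (such as ribose or deoxyribose), a base and at least one phosphate group. A nucleotide can be abasic (i.e., lacking a base). Nucleotides include deoxyribonucleotides, modified deoxyribonucleotides, ribonucleotides, modified ribonucleotides, peptide nucleotides, modified peptide nucleotides, modified phosphate-sugar backbone nucleotides and mixtures thereof. Examples of nucleotides include adenosine monophosphate (AMP), adenosine diphosphate (ADP), adenosine triphosphate (ATP), thymidine monophosphate (TMP), thymidine diphosphate (TDP), thymidine triphosphate (TTP), cytidine monophosphate (CMP), cytidine diphosphate (CDP), cytidine triphosphate (CTP), guanosine monophosphate (GMP), guanosine diphosphate (GDP), guanosine triphosphate (GTP), uridine monophosphate (UMP), uridine diphosphate (UDP), uridine triphosphate (UTP), deoxyadenosine monophosphate (dAMP), deoxyadenosine diphosphate (dADP), deoxyadenosine triphosphate (dATP), deoxythymidine monophosphate (dTMP), deoxythymidine diphosphate (dTDP), deoxythymidine triphosphate (dTTP), deoxycytidine diphosphate (dCDP), deoxycytidine triphosphate (dCTP), deoxyguanosine monophosphate (dGMP), deoxyguanosine diphosphate (dGDP), deoxyguanosine triphosphate (dGTP), deoxyuridine monophosphate (dUMP), deoxyuridine diphosphate (dUDP) and deoxyuridine triphosphate (dUTP). Nucleotide analogs comprising modified bases can also be used in the methods described herein. Exemplary modified bases that can be included in a polynucleotide, whether it has a native backbone or an analogous structure, include, for example, inosine, xanthine, hypoxanthine, isocytosine, isoguanine, 2-aminopurine, 5-methylcytosine, 5-hydroxymethylcytosine, 2-aminoadenine, 6-methyladenine, 6-methylguanine, 2-propylguanine, 2-propyladenine, 2-thiouracil, 2-thiothymine, 2-thiocytosine, 15-halouracil, 15-halocytosine, 5-propynyluracil, 5-propynylcytosine, 6-azouracil, 6-azocytosine, 6-azothymine, 5-uracil, 4-thiouracil, 8-halo adenine or guanine, 8-amino adenine or guanine, 8-thiol adenine or guanine, 8-thioalkyl adenine or guanine, 8-hydroxyl adenine or guanine, 5-halo-substituted uracil or cytosine, 7-methylguanine, 7-methyladenine, 8-azaguanine, 8-azaadenine, 7-deazaguanine, 7-deazaadenine, 3-deazaguanine, 3-deazaadenine and the like. As is known in the art, certain nucleotide analogs cannot be incorporated into a polynucleotide, for example nucleotide analogs such as adenosine 5'-phosphosulfate.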
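Generally, nucleotides include nucleotides A, C, G, T or U. As used herein, the term "nucleotide A" refers to a nucleotide containing adenine (A) or a modification or analog thereof, for example ATP or dATP. "Nucleotide G" refers to a nucleotide containing guanine (G) or a modification or analog thereof, for example GTP or dGTP. "Nucleotide C" refers to a nucleotide containing cytosine (C) or a modification or analog thereof, for example CTP or dCTP. "Nucleotide T" refers to a nucleotide containing thymine (T) or a modification or analog thereof, for example TTP or dTTP. "Nucleotide U" refers to a nucleotide containing uracil (U) or a modification or analog thereof, for example UTP or dUTP.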
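Labeling of Nucleotides
The present invention relates to labeling nucleotides with different labels, alone or in combination, so that the different nucleotides can be distinguished, wherein the different labels can be detected by the same luminescent signal.
In a specific embodiment, detection of the different labels by the same luminescent signal is achieved by specifically binding each label to its own ligand, the ligands being labeled with luminescent labels that produce the same luminescent signal. In a preferred embodiment, the luminescent labels that produce the same luminescent signal are the same luminescent label.
As used herein, the label used to label a nucleotide and the ligand that specifically binds to it can be any molecules capable of specifically binding to each other; such a binding pair is referred to herein as an anti-ligand pair. Binding between the members of an anti-ligand pair can be non-covalent. An anti-ligand pair is not necessarily limited to a pair of single molecules. For example, a single ligand can be bound by the cooperative action of two or more anti-ligands. Binding between the members of an anti-ligand pair results in the formation of a binding complex, sometimes called a ligand/anti-ligand complex or simply ligand/anti-ligand. Exemplary anti-ligand pairs include, but are not limited to: (a) a hapten or antigenic compound in combination with a corresponding antibody or a binding portion or fragment thereof, for example digoxin and a digoxin antibody, N3G and an N3G antibody, FITC and a FITC antibody; (b) a nucleic acid aptamer and a protein; (c) non-immunological binding pairs (for example biotin-avidin, biotin-streptavidin, biotin-neutravidin); (d) a hormone and a hormone-binding protein; (e) a receptor and a receptor agonist or antagonist; (f) a lectin and a carbohydrate; (g) an enzyme and an enzyme cofactor; (h) an enzyme and an enzyme inhibitor; and (i) complementary oligonucleotide or polynucleotide pairs capable of forming a nucleic acid duplex.
In another specific embodiment, one of the different labels can be a luminescent label, so that it can be detected directly. The other labels are still detected by specific binding to their respective ligands, which are labeled with luminescent labels that produce the same luminescent signal. In a preferred embodiment, the luminescent labels associated with the different labels are the same luminescent label.
As used herein, the term "luminescent label" refers to any substance capable of emitting fluorescence at a specific emission wavelength when excited at a suitable excitation wavelength. Such a luminescent label can be, for example, a fluorophore, for example one selected from coumarin, AlexaFluor, Bodipy, fluorescein, tetramethylrhodamine, phenoxazine, acridine, Cy5, Cy3, AF532, Texas Red and derivatives thereof.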
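Sequencing of Polynucleotides
The nucleotides of the present invention, labeled with different labels alone or in combination, can be used in various nucleic acid sequencing methods. Preferably, they are suitable for sequencing by synthesis. Sequencing by synthesis as used herein refers to the various sequencing-by-synthesis methods well known in the art. Basically, sequencing by synthesis involves first hybridizing the nucleic acid molecule to be sequenced with a sequencing primer, and then, in the presence of a polymerase, polymerizing a labeled nucleotide as described herein at the 3' end of the sequencing primer, using the nucleic acid molecule to be sequenced as the template. After polymerization, the labeled nucleotide is identified by detecting the label. After the label (i.e., the chemiluminescent label as described herein) is removed from the labeled nucleotide, the next polymerization and sequencing cycle begins.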
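
The cycle described above (incorporate one labeled nucleotide, detect it, remove the label, repeat) can be sketched as a loop. This is a toy model under stated assumptions: every cycle adds exactly one nucleotide, detection is error-free, and label removal is represented only by a comment. The function name `sequence_by_synthesis` and the convention of passing the template in 3'-to-5' order are choices made for this illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def sequence_by_synthesis(template_3_to_5: str, n_cycles: int) -> str:
    """Return the bases called over n_cycles of idealized sequencing by synthesis.

    The template is given 3'->5', so the returned read is the growing
    strand written 5'->3'.
    """
    calls = []
    for position in range(min(n_cycles, len(template_3_to_5))):
        # Polymerization: one complementary labeled nucleotide is added.
        incorporated = COMPLEMENT[template_3_to_5[position]]
        # Detection: the label identifies the incorporated nucleotide.
        calls.append(incorporated)
        # Label removal would happen here before the next cycle begins.
    return "".join(calls)


print(sequence_by_synthesis("TAAAC", 5))
```

Because each cycle reports a single base, a run of identical template bases is read out over several cycles rather than as one stronger signal.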
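In addition, the nucleotides described herein can also be used to carry out the nucleic acid sequencing method disclosed in US Patent No. 5302509.
The method for determining the sequence of a target polynucleotide can be carried out by denaturing the target polynucleotide sequence, contacting the target polynucleotide with the different nucleotides separately so as to form the complement of the target nucleotide, and detecting the incorporation of the nucleotides. The method makes use of polymerization, in which a polymerase extends the complementary strand by incorporating the correct nucleotide complementary to the target. The polymerization reaction also requires a specific primer to initiate polymerization.
In each round of the reaction, the labeled nucleotide is incorporated by the polymerase, and the incorporation event is then determined. Many different polymerases exist, and one of ordinary skill in the art can readily determine the most suitable one. Preferred enzymes include DNA polymerase I, the Klenow fragment, DNA polymerase III, T4 or T7 DNA polymerase, Taq polymerase and vent polymerase. Polymerases engineered to have specific properties can also be used.
The sequencing method is preferably carried out on target polynucleotides arrayed on a solid support. Multiple target polynucleotides can be immobilized on the solid support through linker molecules, or can be attached to particles such as microspheres, which can in turn be attached to a solid support material.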
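The polynucleotides can be attached to the solid support by a variety of methods, including the use of biotin-streptavidin interactions. Methods for immobilizing polynucleotides on a solid support are well known in the art and include lithographic techniques and spotting each polynucleotide at a specific position on the solid support. Suitable solid supports are well known in the art and include glass slides and beads, ceramic and silicon surfaces, and plastic materials. The support is usually flat, although microbeads (microspheres) can also be used, and the latter can in turn be attached to other solid supports by known methods. The microspheres can be of any suitable size, with a diameter typically of 10-100 nanometers. In a preferred embodiment, the polynucleotides are attached directly to a flat surface, preferably a flat glass surface. Attachment is preferably through a covalent bond. The array used is preferably a single-molecule array comprising polynucleotides located in unique, optically resolvable regions, for example as described in international application No. WO00/06770.
The conditions necessary for polymerization are well known to those skilled in the art. To carry out the polymerase reaction, it is usually first necessary to anneal a primer sequence to the target polynucleotide; the primer sequence is recognized by the polymerase and serves as the initiation site for the subsequent extension of the complementary strand. The primer sequence can be added as a component separate from the target polynucleotide. Alternatively, the primer and the target polynucleotide can each be part of one single-stranded molecule, in which the primer portion forms an intramolecular duplex with a part of the target, i.e., a hairpin loop structure. This structure can be immobilized on the solid support at any point on the molecule. The other conditions necessary for carrying out the polymerase reaction, including temperature, pH and buffer composition, are well known to those skilled in the art.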
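The labeled nucleotides of the present invention are then brought into contact with the target polynucleotide so that polymerization can take place. The nucleotides can be added sequentially, i.e., each type of nucleotide (A, C, G or T/U) is added separately, or simultaneously.
The polymerization step is allowed to proceed for a time sufficient to incorporate one nucleotide.
Unincorporated nucleotides are then removed, for example by subjecting the array to a washing step, after which the incorporated label can be detected.
Detection can be carried out by conventional methods. For example, ways of detecting fluorescent labels or signals are well known in the art, and detection can be achieved with a device that detects the wavelength of the fluorescence. Such devices are well known in the art. For example, such a device can be a confocal scanning microscope, which scans the surface of the solid support with a laser in order to image the fluorophores bound directly to the nucleic acid molecule being sequenced. In addition, each signal produced can be observed, for example, with a sensitive 2-D detector such as a charge-coupled detector (CCD). Other techniques such as scanning near-field optical microscopy (SNOM) can also be used.
After detection, the label can be removed under suitable conditions.
The use of the labeled nucleotides of the present invention is not limited to DNA sequencing techniques; the nucleotides of the present invention can also be used in other formats, including polynucleotide synthesis, DNA hybridization assays and single nucleotide polymorphism studies. Any technique that involves an interaction between a nucleotide and an enzyme can make use of the molecules of the present invention. For example, the molecules can be used as substrates for reverse transcriptase or terminal transferase.
In a specific embodiment, the labeled nucleotides of the present invention also have a 3' protecting group. In some embodiments of the present invention, the protecting group and the label are usually two different groups on the 3'-blocked labeled nucleotide, but in other embodiments the protecting group and the label can be the same group.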
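As used herein, the term "protecting group" means a group that prevents the polymerase (which incorporates the nucleotide containing the group onto the polynucleotide chain being synthesized) from continuing to catalyze the incorporation of another nucleotide after the nucleotide containing the group has been incorporated onto the chain being synthesized. Such a protecting group is also referred to herein as a 3'-OH protecting group. Nucleotides comprising such a protecting group are also referred to herein as 3'-blocked nucleotides. The protecting group can be any suitable group that can be added to a nucleotide, provided that it prevents additional nucleotide molecules from being added to the polynucleotide chain while being easily removable from the sugar moiety of the nucleotide without damaging the polynucleotide chain. In addition, nucleotides modified with the protecting group need to be tolerated by the polymerase or other suitable enzyme used to incorporate the modified nucleotide into the polynucleotide chain. An ideal protecting group therefore exhibits long-term stability, can be efficiently incorporated by the polymerase, prevents secondary or further incorporation of nucleotides, and can be removed under mild conditions, preferably aqueous conditions, that do not damage the polynucleotide structure.
The prior art has described a variety of protecting groups that fit the above description. For example, WO 91/06678 discloses 3'-OH protecting groups including esters and ethers, -F, -NH2, -OCH3, -N3, -OPO3, -NHCOCH3, 2-nitrophenyl carbonate, 2,4-dinitrobenzenesulfenyl and tetrahydrofuranyl ether. Metzker et al. (Nucleic Acids Research, 22(20):4259-4267, 1994) disclose the synthesis and use of eight 3'-modified 2-deoxyribonucleoside 5'-triphosphates (3'-modified dNTPs). WO2002/029003 describes the use of an allyl protecting group to cap the 3'-OH group on a growing DNA strand in a polymerase reaction. Preferably, the various protecting groups reported in international application publications WO2014139596 and WO2004/018497 can be used, including, for example, the protecting groups illustrated in Figure 1A of WO2014139596 and the 3' hydroxyl protecting groups (i.e., protecting groups) defined in its claims, and the protecting groups illustrated in Figures 3 and 4 of WO2004/018497 and those defined in its claims. All of the above references are incorporated herein by reference in their entirety.
Those skilled in the art will understand how to attach a suitable protecting group to the ribose ring in order to block interactions with the 3'-OH. The protecting group can be attached directly at the 3' position, or it can be attached at the 2' position (the protecting group being of sufficient size or charge to block interactions at the 3' position). Alternatively, the protecting group can be attached at both the 3' and 2' positions and can be cleaved to expose the 3'-OH group.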
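After a 3'-blocked nucleotide has been successfully incorporated into the nucleic acid strand, the sequencing protocol requires removal of the protecting group to produce a usable 3'-OH site for continued strand synthesis. As used herein, the reagent that can remove the protecting group from a modified nucleotide depends to a large extent on the protecting group used. For example, removal of an ester protecting group from the 3' hydroxyl function is usually achieved by alkaline hydrolysis. The ease of removing protecting groups varies widely; in general, the greater the electronegativity of the substituent on the carbonyl carbon, the easier the removal. For example, the highly electronegative trifluoroacetate group is cleaved rapidly from the 3' hydroxyl in methanol at pH 7 (Cramer et al., 1963), and is therefore unstable during polymerization at that pH. The phenoxyacetate group is cleaved in less than 1 minute, but requires a significantly higher pH, achieved for example with NH-/methanol (Reese and Steward, 1968). A wide variety of hydroxyl protecting groups can be selectively cleaved using chemical methods other than alkaline hydrolysis. 2,4-Dinitrophenylthio groups can be rapidly cleaved by treatment with nucleophiles such as thiophenol and thiosulfate (Letsinger et al., 1964). Allyl ethers are cleaved by treatment with Hg(II) in acetone/water (Gigg and Warren, 1968). Tetrahydrothiopyranyl ethers are removed under neutral conditions using Ag(I) or Hg(II) (Cohen and Steele, 1966; Cruse et al., 1978). Photochemical deblocking can be used with photochemically cleavable protecting groups, and several protecting groups are available for this approach. The use of o-nitrobenzyl ether as the protecting group for the 2'-hydroxyl functionality of ribonucleosides is known and proven (Ohtsuka et al., 1978); it is removed by irradiation at 260 nm. The alkyl o-nitrobenzyl carbonate protecting group is also removed by irradiation at pH 7 (Cama and Christensen, 1978). Enzymatic deblocking of the 3'-OH protecting group is also possible. It has been shown that T4 polynucleotide kinase can convert a 3'-phosphate end to a 3'-hydroxyl end, which can then be used as a primer for DNA polymerase I (Henner et al., 1983). This 3'-phosphatase activity is used to remove the 3' protecting groups of those dNTP analogs that contain a phosphate as the protecting group.
Other reagents that can remove protecting groups from 3'-blocked nucleotides include, for example, phosphines (such as tris(hydroxymethyl)phosphine (THP)), which can, for example, remove azide-containing 3'-OH protecting groups from the nucleotide (for this application of phosphines, see for example the description in WO2014139596, the entire contents of which are incorporated herein by reference). Other reagents that can remove protecting groups from 3'-blocked nucleotides also include, for example, the corresponding reagents described on pages 114-116 of the specification of WO2004/018497 for removing 3'-allyl, 3,4-dimethoxybenzyloxymethyl or fluoromethoxymethyl used as 3'-OH protecting groups.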
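In embodiments of the present invention, the label of the nucleotide is preferably removed together with the protecting group after detection.
In certain embodiments, the label can be incorporated into the protecting group, so that it can be removed together with the protecting group after the 3'-blocked nucleotide has been incorporated into the nucleic acid strand.
In other embodiments, the label can be attached to the nucleotide separately from the protecting group by means of a linking group. Such a label can, for example, be attached to the purine or pyrimidine base of the nucleotide. In certain embodiments, the linking group used is cleavable. The use of a cleavable linking group ensures that the label can be removed after detection, which avoids any signal interference with any labeled nucleotide incorporated subsequently. In other embodiments, a non-cleavable linking group can be used, because no subsequent nucleotide incorporation is required after the labeled nucleotide has been incorporated into the nucleic acid strand, and the label therefore does not need to be removed from the nucleotide.
In further embodiments, the label and/or linking group can have a size or structure sufficient to block the incorporation of other nucleotides into the polynucleotide chain (that is, the label itself can serve as the protecting group). The blocking may be due to steric hindrance, or may be due to a combination of size, charge and structure.
Cleavable linking groups are well known in the art, and conventional chemistry can be used to attach the linking group to the nucleotide base and to the label. The linking group can be attached at any position on the nucleotide base, provided that Watson-Crick base pairing can still take place. For purine bases, it is preferred that the linking group be attached through position 7 of the purine or of the preferred deazapurine analog, through an 8-modified purine, or through an N-6 modified adenine or an N-2 modified guanine. For pyrimidines, attachment is preferably through position 5 on cytosine, thymine and uracil and the N-4 position on cytidine.
Use of the term "cleavable linking group" does not mean that the whole linking group needs to be removed (for example, from the nucleotide base). When the label is attached to the base, the nucleoside cleavage site can be located at a position on the linking group that ensures that part of the linking group remains attached to the nucleotide base after cleavage.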
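Suitable linking groups include, but are not limited to, disulfide linking groups, acid-labile linking groups (including dialkoxybenzyl linking groups, Sieber linking groups, indole linking groups and tert-butyl Sieber linking groups), electrophilically cleavable linking groups, nucleophilically cleavable linking groups, photocleavable linking groups, linking groups cleaved under reducing or oxidizing conditions, safety-catch linking groups, and linking groups cleaved by elimination mechanisms. Suitable linking groups can be modified with standard chemical protecting groups, as disclosed in Greene & Wuts, Protective Groups in Organic Synthesis, John Wiley & Sons. Guillier et al. disclose other suitable cleavable linking groups for solid-phase synthesis (Chem. Rev. 100:2092-2157, 2000).
The linking group can be cleaved by any suitable method, including exposure to acids, bases, nucleophiles, electrophiles, free radicals, metals, reducing or oxidizing reagents, light, temperature, enzymes and so on; suitable ways of cleaving the various cleavable linking groups are described by way of example below. Typically, the cleavable linking group can be cleaved under the same conditions as the protecting group, so that only one treatment is needed to remove the label and the protecting group.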
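Electrophilically cleaved linking groups are typically cleaved by protons and include acid-sensitive cleavage. Suitable electrophilically cleaved linking groups include modified benzyl systems such as trityl, p-hydrocarbyloxybenzyl esters and p-hydrocarbyloxybenzyl amides. Other suitable linking groups include tert-butoxycarbonyl (Boc) groups and acetal systems. To prepare suitable linker molecules, the use of thiophilic metals such as nickel, silver or mercury in the cleavage of thioacetals or other sulfur-containing protecting groups can also be considered. Nucleophilically cleaved linking groups include groups that are labile in water (i.e., can be cleaved simply at alkaline pH), such as esters, and groups that are labile to non-aqueous nucleophiles. Fluoride ions can be used to cleave silicon-oxygen bonds in groups such as triisopropylsilane (TIPS) or tert-butyldimethylsilane (TBDMS). Photolyzable linking groups are widely used in carbohydrate chemistry. Preferably, the light required to activate cleavage does not affect the other components of the modified nucleotide. For example, if a fluorophore is used as the label, the fluorophore preferably absorbs light of a different wavelength from that required to cleave the linker molecule. Suitable linking groups include those based on o-nitrobenzyl compounds and nitroveratryl compounds. Linking groups based on benzoin chemistry can also be used (Lee et al., J. Org. Chem. 64:3454-3460, 1999). A variety of linking groups sensitive to reductive cleavage are known. Catalytic hydrogenation using palladium-based catalysts has been used to cleave benzyl and benzyloxycarbonyl groups. Disulfide bond reduction is also known in the art. Oxidation-based methods are well known in the art. These include oxidation of hydrocarbyloxybenzyl groups and oxidation of sulfur and selenium linking groups. The use of aqueous iodine to cleave disulfides and other sulfur- or selenium-based linking groups is also within the scope of the present invention. Safety-catch linkers are linking groups that are cleaved in two steps. In a preferred system, the first step is the generation of a reactive nucleophilic center, and the subsequent second step involves an intramolecular cyclization that results in cleavage. For example, a levulinate linkage can be treated with hydrazine or by photochemical methods to release an active amine, which then cyclizes to cleave an ester elsewhere in the molecule (Burgess et al., J. Org. Chem. 62:5165-5168, 1997). Elimination reactions can also be used to cleave linking groups. Base-catalyzed elimination of groups such as fluorenylmethoxycarbonyl and cyanoethyl, and palladium-catalyzed reductive elimination of allyl systems, can be used.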
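In certain embodiments, the linking group can comprise spacer units. The length of the linking group is not important, as long as the label is kept far enough from the nucleotide that it does not interfere with the interaction between the nucleotide and the enzyme.
In certain embodiments, the linking group can consist of functional groups similar to the 3'-OH protecting group. This allows the label and the protecting group to be removed in a single treatment. A particularly preferred linking group is an azide-containing linking group cleavable by phosphine.
As used herein, the reagent that can remove the label from a modified nucleotide depends to a large extent on the label used. For example, where the label is incorporated into the protecting group, the label is removed using the reagent for removing the protecting group described above. Alternatively, where the label is attached to the base of the nucleotide through a cleavable linking group, the label is removed using a reagent that cleaves the linking group as described above. In a preferred embodiment, the same reagent is used to remove both the label and the protecting group from the modified nucleotide, for example where the linking group consists of functional groups similar to the 3'-OH protecting group.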
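Exemplary Embodiments of the Invention
In one specific embodiment, the present invention relates to a method for determining the sequence of a target polynucleotide, comprising:
(a) providing a target polynucleotide,
(b) contacting the target polynucleotide with a primer so that the primer hybridizes to the target polynucleotide, thereby forming a partial duplex of the target polynucleotide and the primer,
(c) contacting the partial duplex with a polymerase and nucleotides under conditions that allow the polymerase to carry out nucleotide polymerization, so that a nucleotide is incorporated onto the primer,
wherein the nucleotides are selected from one or more of the following: a first nucleotide, a second nucleotide, a third nucleotide and a fourth nucleotide, wherein the first nucleotide comprises a first nucleotide labeled with a first label, the second nucleotide comprises a second nucleotide labeled with a second label, the third nucleotide is selected from: (1) a third nucleotide labeled with the first label and a third nucleotide labeled with the second label, or (2) a third nucleotide labeled with both the first label and the second label, and the fourth nucleotide comprises an unlabeled fourth nucleotide,
wherein the ribose or deoxyribose moiety of each of the nucleotides comprises a protecting group attached via the 2' or 3' oxygen atom,
wherein the first label is a luminescent label,
(d) detecting the presence of the luminescent label on the partial duplex of step (c),
(e) then contacting the partial duplex of step (c) with a ligand that specifically binds the second label and is labeled with a luminescent label, and then detecting the presence of the luminescent label on the partial duplex,
(f) optionally removing the protecting group and the label from the nucleotide incorporated into the partial duplex of step (c),
(g) optionally repeating steps (c)-(f) one or more times, thereby obtaining sequence information for the target polynucleotide,
wherein the luminescent labels are the same luminescent label.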
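In another specific embodiment, the invention relates to a method for determining the sequence of a target polynucleotide, comprising:
(a) providing a target polynucleotide,
(b) contacting the target polynucleotide with a primer so that the primer hybridizes to the target polynucleotide, thereby forming a partial duplex of the target polynucleotide and the primer,
(c) contacting the partial duplex with a polymerase and nucleotides under conditions that allow the polymerase to carry out nucleotide polymerization, so that a nucleotide is incorporated onto the primer,
wherein the nucleotides are selected from one or more of the following: a first nucleotide, a second nucleotide, a third nucleotide, and a fourth nucleotide, wherein the first nucleotide comprises a first nucleotide labeled with a first label, the second nucleotide comprises a second nucleotide labeled with a second label, the third nucleotide is selected from: (1) a third nucleotide labeled with the first label and a third nucleotide labeled with the second label, or (2) a third nucleotide labeled with both the first label and the second label, and the fourth nucleotide comprises an unlabeled fourth nucleotide,
wherein the ribose or deoxyribose moiety of each of the nucleotides comprises a protecting group attached via the 2' or 3' oxygen atom,
(d) contacting the partial duplex of step (c) with a ligand that specifically binds the first label and is labeled with a luminescent label, and then detecting the presence of the luminescent label on the partial duplex,
and subsequently removing the ligand from the partial duplex,
(e) contacting the partial duplex of step (c) with a ligand that specifically binds the second label and is labeled with a luminescent label, and then detecting the presence of the luminescent label on the partial duplex,
(f) optionally removing the protecting group and the label from the nucleotide incorporated into the partial duplex of step (c),
(g) optionally repeating steps (c)-(f) one or more times, thereby obtaining sequence information for the target polynucleotide,
wherein the luminescent labels are the same luminescent label.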
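Both embodiments above reduce to the same decoding rule: each cycle yields two yes/no observations (first label present, second label present), and the four combinations identify the incorporated nucleotide. The following Python sketch illustrates that rule only. The function names are hypothetical, the A/T/C/G assignment simply follows the "e.g." choices used in Example 1, and the sketch assumes that upstream image analysis has already decided, for each spot, whether each label is present.

```python
# Hypothetical decoder for the two-label, single-dye scheme described above.
# Keys are (first label detected, second label detected).
DECODE_TABLE = {
    (True, False): "first",    # only the first label is present
    (False, True): "second",   # only the second label is present
    (True, True): "third",     # both labels are present
    (False, False): "fourth",  # unlabeled nucleotide
}

# Illustrative assignment only, following the "e.g." choices of Example 1.
EXAMPLE_BASES = {"first": "A", "second": "T", "third": "C", "fourth": "G"}


def call_nucleotide(label1_present, label2_present):
    """Return which of the four nucleotide classes was incorporated."""
    return DECODE_TABLE[(bool(label1_present), bool(label2_present))]


def call_read(cycles, bases=EXAMPLE_BASES):
    """Decode a list of per-cycle (label1, label2) observations into bases."""
    return "".join(bases[call_nucleotide(l1, l2)] for l1, l2 in cycles)


if __name__ == "__main__":
    observations = [(True, False), (False, True), (True, True), (False, False)]
    print(call_read(observations))  # ATCG
```

Keeping the rule in a lookup table makes it easy to swap in a different base assignment without touching the decoding logic.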
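Improved embodiments of the invention
In the course of making the invention, the inventors also found that adding a proportion of unlabeled nucleotides makes it possible to control the signal level produced by singly labeled nucleotides, which helps distinguish the different nucleotides, facilitates subsequent data analysis, and significantly improves sequencing performance.
Accordingly, in specific embodiments, the first nucleotide may comprise an unlabeled first nucleotide in addition to the first nucleotide labeled with the first label, and the second nucleotide may comprise an unlabeled second nucleotide in addition to the second nucleotide labeled with the second label.
In specific embodiments, the ratio of the first nucleotide labeled with the first label to the unlabeled first nucleotide in the first nucleotide is from 4:1 to 3:2. In specific embodiments, the ratio of the second nucleotide labeled with the second label to the unlabeled second nucleotide in the second nucleotide is from 4:1 to 3:2.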
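As a worked illustration of the 4:1 to 3:2 range, the sketch below splits a total nucleotide concentration into its labeled and unlabeled parts. The helper name `split_by_ratio` is hypothetical, and the 1 μM total is borrowed from the examples later in this document. The two endpoint ratios correspond to labeled fractions of 80% and 60%.

```python
from fractions import Fraction


def split_by_ratio(total_um, labeled_parts, unlabeled_parts):
    """Split a total concentration (in μM) according to labeled:unlabeled parts.

    Returns (labeled, unlabeled) as exact fractions so that no rounding
    error is introduced by the arithmetic itself.
    """
    total = Fraction(str(total_um))
    labeled = total * labeled_parts / (labeled_parts + unlabeled_parts)
    return labeled, total - labeled


if __name__ == "__main__":
    # Endpoints of the range stated above, applied to an assumed 1 μM total.
    for ratio in [(4, 1), (3, 2)]:
        labeled, unlabeled = split_by_ratio(1, *ratio)
        print(ratio, float(labeled), float(unlabeled))
```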
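Beneficial technical effects of the invention
The invention performs sequencing based on fluorescence detection with a single excitation only. Compared with detection methods that label the four nucleotides with four or two fluorescent dyes, this sequencing method requires only a single excitation light source and a single camera, which can reduce the size of the sequencing instrument and lower its manufacturing cost.
Because only one kind of fluorescence is generated during sequencing, the invention avoids interference between different fluorescence signals caused by labeling with different fluorescent dyes. Compared with detection using two fluorescent dyes, it also avoids mutual interference between two-color and single-color fluorescence.
Compared with the Roche sequencing method and the Ion Torrent sequencing method, the 3' hydroxyl of the nucleotides used in the invention is blocked by modification, so that only one deoxyribonucleotide can be added in each reaction during sequencing. This avoids the situation, encountered when sequencing with natural deoxyribonucleotides, in which several deoxyribonucleotides are added in a single reaction at a run of repeated bases. The invention therefore helps improve sequencing accuracy.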
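Examples
Example 1:
Method overview
(1) providing a nucleic acid molecule to be sequenced that is attached to a support, or attaching the nucleic acid molecule to be sequenced to a support;
(2) adding a primer for initiating nucleotide polymerization and annealing the primer to the nucleic acid molecule to be sequenced, the primer serving as the initial growing nucleic acid strand and forming, together with the nucleic acid molecule to be sequenced, a duplex attached to the support;
(3) adding a polymerase for nucleotide polymerization and four nucleotides, thereby forming a reaction system comprising a solution phase and a solid phase; wherein the four nucleotides are derivatives of nucleotides A, (T/U), C, and G, respectively, and are capable of complementary base pairing; the hydroxyl group (-OH) at the 3' position of the ribose or deoxyribose of the four compounds is protected by a protecting group; and the first nucleotide (e.g., nucleotide A) carries a first molecular label (e.g., a small molecule such as biotin or N3G), the second nucleotide (e.g., nucleotide T) carries a second molecular label (e.g., digoxin or FITC), the third nucleotide (e.g., nucleotide C) is partly linked to the first molecular label and partly to the second molecular label, and the fourth nucleotide (e.g., nucleotide G) carries no molecular label. To help distinguish the different nucleotides and to facilitate subsequent data analysis, the signal level produced by the singly labeled nucleotides is controlled by adding a proportion of the corresponding unlabeled nucleotides, such as A-cold and T-cold. The ratio of labeled nucleotide A to A-cold ranges from 4:1 to 3:2; the ratio of labeled nucleotide T to T-cold ranges from 4:1 to 3:2. The four classes of nucleotides are present in the reaction solution at final concentrations between 0.5 and 5 μM.
(4) under conditions that allow the polymerase to carry out nucleotide polymerization, adding 150-200 μl of polymerization reaction solution at a rate of 150-350 μl/min, with a reaction temperature of 40-60°C and a reaction time of 1-2 min, so that one of the four nucleotides is incorporated at the 3' end of the growing nucleic acid strand;
(5) removing the solution phase of the reaction system of the preceding step with 300-400 μl of wash reagent (PBS or TBS) at a rate of 150-350 μl/min, while retaining the duplex attached to the support. Adding, at a rate of 150-350 μl/min, 150-200 μl of a ligand (e.g., streptavidin (SA) or an N3G antibody) that specifically binds the first molecular label (biotin, N3G, etc.), the ligand being labeled with a fluorophore (e.g., AF532 or CY3), and incubating at 30-55°C for 1-5 min. Then washing away free fluorescently labeled ligand with 300-400 μl of wash reagent (PBS or TBS) at a rate of 150-350 μl/min, and detecting the emitted fluorescence signal in imaging buffer with an exposure of 50-1000 ms.
(6) displacing the imaging buffer with 300-400 μl of wash reagent (PBS or TBS) at a rate of 150-350 μl/min, then adding, at a rate of 150-350 μl/min, 150-200 μl of a ligand (an anti-digoxin antibody, an anti-FITC antibody, etc.) that specifically binds the second molecular label (digoxin, FITC, etc.), the ligand being labeled with a fluorophore (e.g., AF532 or CY3), and incubating at 30-55°C for 1-5 min. Then washing away free fluorescently labeled antibody with 300-400 μl of wash reagent (PBS or TBS) at a rate of 150-350 μl/min, and detecting the emitted fluorescence signal in imaging buffer with an exposure of 10-200 ms.
(7) after detection, passing in 300-400 μl of cleavage buffer at a rate of 150-200 μl/min and incubating at 50-60°C for 1-2 min, thereby simultaneously removing the small-molecule label attached to the deoxyribonucleotide analog and the protecting group on the hydroxyl group (-OH) at the 3' position.
(8) repeating steps (3)-(7).
(9) converting the collected signals into sequence information by software analysis.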
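Determination and analysis of E. coli barcode sequences
Nucleotides were labeled with biotin and digoxin, with streptavidin and an anti-digoxin antibody as the corresponding ligands.
1. Experimental materials
1). Escherichia coli
2). BGISEQ-500 high-throughput sequencing kit (SE100)
MGIEasy™ DNA library preparation kit
3). Deoxyribonucleotide analogs and polymerization reaction mixture
(1) Biotin-modified adenine deoxyribonucleotide analog
Figure PCTCN2018114281-appb-000001
(2) Biotin-modified cytosine deoxyribonucleotide analog
Figure PCTCN2018114281-appb-000002
(3) Digoxin-modified cytosine deoxyribonucleotide analog
Figure PCTCN2018114281-appb-000003
(4) Digoxin-modified thymine deoxyribonucleotide analog
Figure PCTCN2018114281-appb-000004
(5) Guanine deoxyribonucleotide analog
Figure PCTCN2018114281-appb-000005
Deoxyribonucleotide analog reaction mixture 1:
Group 1: A-biotin + A-cold (A-biotin:A-cold = 4:1; A-biotin + A-cold = 1 μM)
Group 2: C-biotin + C-digoxin (C-biotin:C-digoxin = 2:1; C-biotin + C-digoxin = 2 μM)
Group 3: T-digoxin + T-cold (T-digoxin:T-cold = 4:1; T-digoxin + T-cold = 1 μM)
Group 4: G-cold (1 μM)
The four groups of nucleotide analogs were combined into a mixture at the concentrations and ratios given above.
Deoxyribonucleotide analog reaction mixture 2:
Group 1: A-biotin (1 μM)
Group 2: C-biotin + C-digoxin (C-biotin:C-digoxin = 2:1; C-biotin + C-digoxin = 2 μM)
Group 3: T-digoxin (1 μM)
Group 4: G-cold (1 μM)
The four groups of nucleotide analogs were combined into a mixture at the concentrations and ratios given above.
4). Phosphate-buffered saline (PBS) (Sangon Biotech)
This reagent served both as the buffer for the antibody and ligand and as the wash reagent.
5). 2 μg/ml CY3-labeled streptavidin (manufacturer: Thermo Fisher Scientific; catalog no. 434315); 2 μg/ml CY3-labeled anti-digoxin antibody (manufacturer: Jackson ImmunoResearch; catalog no. 200-162-156).
The fluorescently labeled streptavidin and antibody above were both prepared in PBS.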
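2. Experimental procedure
1) E. coli genomic DNA was extracted as described in the following reference.
So A, Pel J, Rajan S, Marziali A. Efficient genomic DNA extraction for low target concentration bacterial cultures using SCODA DNA extraction technology. Cold Spring Harb Protoc. 2010(10):pdb.prot5506.
2) Circular single-stranded DNA was prepared according to the MGIEasy™ DNA library preparation kit and its instructions. The prepared single-stranded circular DNA already carried barcode sequences.
3) Following the instructions of the BGISEQ-500 high-throughput sequencing kit (SE100), the circular single-stranded DNA was converted into DNA nanoballs by rolling-circle replication. Again following the kit instructions, the prepared DNA nanoballs were loaded onto a sequencing chip.
4) Phosphate-buffered saline (Sangon) was passed through the chip loaded with DNA nanoballs, at a volume of 300 μl and a flow rate of 200 μl/min.
5) Sequencing reaction solution was prepared following the instructions of the BGISEQ-500 high-throughput sequencing kit (SE100), with the deoxyribonucleotides replaced by the four groups of deoxyribonucleotide analogs of mixture 1 or mixture 2 from the experimental materials above, at the concentrations given there. The freshly prepared sequencing reaction solution was passed through the chip (300 μl at 200 μl/min) and incubated at 55°C for 1 min. Phosphate-buffered saline (Sangon) was then passed through (300 μl at 200 μl/min).
6) CY3-labeled streptavidin (2 μg/ml, Thermo Fisher) was passed through the sequencing chip (150 μl at 150 μl/min) so that the fluorescently labeled streptavidin bound the biotin, and the chip was incubated at 35°C for 3 min. Phosphate-buffered saline (Sangon) was then passed through (300 μl at 200 μl/min) to remove free CY3-labeled streptavidin.
7) Signal acquisition buffer (supplied in the BGISEQ-500 high-throughput sequencing kit (SE100)) was passed through the sequencing chip (300 μl at 200 μl/min); the fluorescence bound to the sequence being read was then excited by laser (exposure time 100 ms) and the signal was recorded.
8) Phosphate-buffered saline (Sangon) was passed through the sequencing chip (300 μl at 200 μl/min). CY3-labeled anti-digoxin antibody (2 μg/ml, Jackson ImmunoResearch) was then passed through (150 μl at 150 μl/min) and incubated at 35°C for 5 min. Phosphate-buffered saline (Sangon) was then passed through (300 μl at 200 μl/min) to remove free CY3-labeled anti-digoxin antibody.
9) Signal acquisition buffer was passed through the sequencing chip (300 μl at 200 μl/min); the fluorescence bound to the sequence being read was then excited by laser (exposure time 20 ms) and the signal was recorded.
10) Cleavage reaction solution (supplied in the BGISEQ-500 high-throughput sequencing kit (SE100)) was passed through (300 μl at 200 μl/min) and incubated at 57°C for 1 min.
11) Steps 4-9 were repeated cyclically.
12) The fluorescence signal information recorded in each reaction cycle was converted into deoxyribonucleotide information by analysis software.
13) A total of 10 sequencing reaction cycles were run (to read the barcode). The software supplied with the 500 platform was used to split all reads by barcode, and the split rate of each barcode was calculated.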
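Step 13 splits the reads by barcode and reports a split rate. The example relies on the platform's own software and does not describe its algorithm, so the following is only a minimal sketch under stated assumptions: the barcode is the first 10 bases of each read (one base per cycle), matching is exact, and the overall split rate is the fraction of reads assigned to a known barcode. The function name and the toy sequences are hypothetical.

```python
from collections import Counter


def split_by_barcode(reads, barcodes, barcode_len=10):
    """Assign reads to known barcodes by exact match on the leading bases.

    Returns (per-barcode read counts, overall fraction of assigned reads).
    This is an illustrative stand-in, not the platform software.
    """
    known = set(barcodes)
    counts = Counter()
    for read in reads:
        tag = read[:barcode_len]
        if tag in known:
            counts[tag] += 1
    assigned = sum(counts.values())
    rate = assigned / len(reads) if reads else 0.0
    return counts, rate


if __name__ == "__main__":
    toy_barcodes = ["ACGTACGTAC", "TTTTTTTTTT"]
    toy_reads = ["ACGTACGTAC", "ACGTACGTAC", "TTTTTTTTTT", "GGGGGGGGGG"]
    counts, rate = split_by_barcode(toy_reads, toy_barcodes)
    print(dict(counts), rate)
```

Real demultiplexers usually tolerate one or more mismatches per barcode; exact matching is used here only to keep the sketch short.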
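3. Experimental results
Analysis with the barcode sequence analysis software gave a barcode split rate of 82%.
Figure 1 shows the signal extraction plot for the first base of the barcode sequence being read. The plot shows that the four deoxyribonucleotides are separated into four signal clusters according to the detection rule: the cluster at the lower left is the G signal; the horizontal signal arm is the A signal; the vertical signal arm is the T signal; and the cluster on the diagonal between the A and T arms is the C signal.
Figure 2 shows the signal extraction plot for the tenth base of the barcode sequence being read; the signal arms are assigned as in the plot for the first base.
Figure 5 shows the signal extraction plot for the first base in the experiment without added unlabeled nucleotides; the signal arms are assigned as in the plots for the experiment with added unlabeled nucleotides.
In addition, 50 sequencing reaction cycles were run by the same experimental method; analysis showed a mapping rate of 70% and an error rate of 2%.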
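The cluster layout described for Figure 1 suggests a simple geometric reading of each spot: plot the intensity from one detection against the intensity from the other, then classify by distance from the origin and by angle. The Python sketch below illustrates that idea only; it is not the analysis software used in the example. The axis assignment (x for the signal that forms the horizontal A arm, y for the signal that forms the vertical T arm), the normalization of intensities to a 0-1 scale, and both thresholds are assumptions.

```python
import math


def classify_spot(x, y, background=0.2, arm_half_angle_deg=20.0):
    """Classify one spot from its two normalized intensities.

    x: intensity along the horizontal (A) arm, y: intensity along the
    vertical (T) arm. Both thresholds are hypothetical tuning values.
    """
    # Spots close to the origin carry no label: the G cluster.
    if math.hypot(x, y) < background:
        return "G"
    angle = math.degrees(math.atan2(y, x))
    if angle < arm_half_angle_deg:           # near the horizontal arm
        return "A"
    if angle > 90.0 - arm_half_angle_deg:    # near the vertical arm
        return "T"
    return "C"                               # between the arms, on the diagonal


if __name__ == "__main__":
    for spot in [(0.05, 0.05), (1.0, 0.1), (0.1, 1.0), (0.8, 0.9)]:
        print(spot, classify_spot(*spot))
```

Classifying by angle rather than by fixed intensity cutoffs keeps the call stable when overall brightness drifts between cycles, which is one plausible reason for plotting the two signals against each other.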
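Example 2:
Method overview
(1) providing a nucleic acid molecule to be sequenced that is attached to a support, or attaching the nucleic acid molecule to be sequenced to a support;
(2) adding a primer for initiating nucleotide polymerization and annealing the primer to the nucleic acid molecule to be sequenced, the primer serving as the initial growing nucleic acid strand and forming, together with the nucleic acid molecule to be sequenced, a duplex attached to the support;
(3) adding a polymerase for nucleotide polymerization and four nucleotides, thereby forming a reaction system comprising a solution phase and a solid phase; wherein the four nucleotides are derivatives of nucleotides A, (T/U), C, and G, respectively, and are capable of complementary base pairing; the hydroxyl group (-OH) at the 3' position of the ribose or deoxyribose of the four nucleotides is protected by a protecting group; and the first nucleotide (e.g., nucleotide A) carries a first molecular label (any excitable fluorophore, e.g., AF532 or CY3), the second nucleotide (e.g., nucleotide T) carries a second molecular label (e.g., a small molecule such as biotin or digoxin), the third nucleotide (e.g., nucleotide C) is partly linked to the first molecular label and partly to the second molecular label, and the fourth nucleotide (e.g., nucleotide G) carries no molecular label. To help distinguish the different nucleotides and to facilitate subsequent data analysis, the signal level produced by the singly labeled nucleotides is controlled by adding a proportion of the corresponding unlabeled nucleotides, such as A-cold and T-cold. The ratio of labeled nucleotide A to A-cold ranges from 4:1 to 3:2; the ratio of labeled nucleotide T to T-cold ranges from 4:1 to 3:2. The four classes of deoxyribonucleotide analogs are present in the reaction solution at final concentrations between 0.5 and 5 μM.
(4) under conditions that allow the polymerase to carry out nucleotide polymerization, adding 150-200 μl of polymerization reaction solution at a rate of 150-350 μl/min, with a reaction temperature of 40-60°C and a reaction time of 1-2 min, so that one of the four nucleotides is incorporated at the 3' end of the growing nucleic acid strand;
(5) removing the solution phase of the reaction system of the preceding step with 300-400 μl of wash reagent (PBS or TBS) at a rate of 150-350 μl/min, while retaining the duplex attached to the support. Detecting the emitted fluorescence signal in imaging buffer with an exposure of 50-1000 ms.
(6) displacing the imaging buffer with 300-400 μl of wash reagent (PBS or TBS) at a rate of 150-350 μl/min, then adding, at a rate of 150-350 μl/min, 150-200 μl of a ligand (streptavidin (SA), an anti-digoxin antibody, etc.) that specifically binds the second molecular label (biotin, digoxin, etc.), the ligand being labeled with a fluorophore (the same fluorophore as the first molecular label), and incubating at 30-55°C for 1-5 min. Then washing away free fluorescently labeled antibody with 300-400 μl of wash reagent (PBS or TBS) at a rate of 150-350 μl/min, and detecting the emitted fluorescence signal in imaging buffer with an exposure of 10-200 ms.
(7) after detection, passing in 300-400 μl of cleavage buffer at a rate of 150-200 μl/min and incubating at 50-60°C for 1-2 min, thereby simultaneously removing the small-molecule label attached to the deoxyribonucleotide analog and the protecting group on the hydroxyl group (-OH) at the 3' position.
(8) repeating steps (3)-(7).
(9) converting the collected signals into sequence information by software analysis.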
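Determination and analysis of E. coli SE50
1. Experimental materials
1) Escherichia coli
2) BGISEQ-500 high-throughput sequencing kit (SE100)
MGIEasy™ DNA library preparation kit
3) Deoxyribonucleotide analogs and polymerization reaction mixture
(1) AF532 fluorophore-modified adenine deoxyribonucleotide analog
Figure PCTCN2018114281-appb-000006
(2) AF532 fluorophore-modified cytosine deoxyribonucleotide analog
Figure PCTCN2018114281-appb-000007
(3) Biotin-modified cytosine deoxyribonucleotide analog
Figure PCTCN2018114281-appb-000008
(4) Biotin-modified thymine deoxyribonucleotide analog
Figure PCTCN2018114281-appb-000009
(5) Guanine deoxyribonucleotide analog
Figure PCTCN2018114281-appb-000010
Deoxyribonucleotide analog reaction mixture 1:
Group 1: A-AF532 + A-cold (A-AF532:A-cold = 4:1; A-AF532 + A-cold = 1 μM)
Group 2: C-biotin + C-AF532 (C-biotin:C-AF532 = 2:1; C-biotin + C-AF532 = 2 μM)
Group 3: T-biotin + T-cold (T-biotin:T-cold = 4:1; T-biotin + T-cold = 1 μM)
Group 4: G-cold (1 μM)
The four groups of nucleotide analogs were combined into a mixture at the concentrations and ratios given above.
Deoxyribonucleotide analog reaction mixture 2:
Group 1: A-AF532 (1 μM)
Group 2: C-biotin + C-AF532 (C-biotin:C-AF532 = 2:1; C-biotin + C-AF532 = 2 μM)
Group 3: T-biotin (1 μM)
Group 4: G-cold (1 μM)
The four groups of nucleotide analogs were combined into a mixture at the concentrations and ratios given above.
4) Phosphate-buffered saline (PBS) (Sangon Biotech)
This reagent served both as the buffer for the ligand and as the wash reagent.
5) 2 μg/ml AF532-labeled streptavidin (manufacturer: Thermo Fisher Scientific; catalog no. 434315).
The fluorescently labeled streptavidin above was prepared in PBS.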
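2. Experimental procedure
1) E. coli genomic DNA was extracted as described in the following reference.
So A, Pel J, Rajan S, Marziali A. Efficient genomic DNA extraction for low target concentration bacterial cultures using SCODA DNA extraction technology. Cold Spring Harb Protoc. 2010(10):pdb.prot5506.
2) Circular single-stranded DNA was prepared according to the MGIEasy™ DNA library preparation kit and its instructions. The prepared single-stranded circular DNA already carried barcode sequences.
3) Following the instructions of the BGISEQ-500 high-throughput sequencing kit (SE100), the circular single-stranded DNA was converted into DNA nanoballs by rolling-circle replication. Again following the kit instructions, the prepared DNA nanoballs were loaded onto a sequencing chip.
4) Phosphate-buffered saline (Sangon) was passed through the chip loaded with DNA nanoballs, at a volume of 300 μl and a flow rate of 200 μl/min.
5) Sequencing reaction solution was prepared following the instructions of the BGISEQ-500 high-throughput sequencing kit (SE100), with the deoxyribonucleotides replaced by the four groups of deoxyribonucleotide analogs of mixture 1 or mixture 2 from the experimental materials above, at the concentrations given there. The freshly prepared sequencing reaction solution was passed through the chip (300 μl at 200 μl/min) and incubated at 55°C for 1 min. Phosphate-buffered saline (Sangon) was then passed through (300 μl at 200 μl/min).
6) Signal acquisition buffer (supplied in the BGISEQ-500 high-throughput sequencing kit (SE100)) was passed through the sequencing chip (300 μl at 200 μl/min); the fluorescence bound to the sequence being read was then excited by laser (exposure time 100 ms) and the signal was recorded.
7) Phosphate-buffered saline (Sangon) was passed through the sequencing chip (300 μl at 200 μl/min). AF532-labeled streptavidin (2 μg/ml, Thermo Fisher Scientific) was then passed through (150 μl at 150 μl/min) and incubated at 35°C for 5 min. Phosphate-buffered saline (Sangon) was then passed through (300 μl at 200 μl/min) to remove free AF532-labeled streptavidin.
8) Signal acquisition buffer was passed through the sequencing chip (300 μl at 200 μl/min); the fluorescence bound to the sequence being read was then excited by laser (exposure time 20 ms) and the signal was recorded.
9) Cleavage reaction solution (supplied in the BGISEQ-500 high-throughput sequencing kit (SE100)) was passed through (300 μl at 200 μl/min) and incubated at 57°C for 1 min.
10) Steps 4-8 were repeated cyclically.
11) The fluorescence signal information recorded in each reaction cycle was converted into deoxyribonucleotide information by analysis software.
12) A total of 50 sequencing reaction cycles were run (to read the barcode). The software supplied with the 500 platform was used to split all reads by barcode, and the split rate of each barcode was calculated.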
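For planning purposes, the fluidics time of one cycle can be estimated from the volumes, flow rates, and incubation times quoted in steps 4) to 9) above. The sketch below does that arithmetic. The step labels and the data layout are illustrative, and imaging time and instrument overhead are not included.

```python
# One entry per liquid delivery in steps 4) to 9) of the procedure above:
# (label, volume in μl, flow rate in μl/min, incubation afterwards in min)
CYCLE_STEPS = [
    ("step 4: PBS", 300, 200, 0),
    ("step 5: sequencing reaction solution", 300, 200, 1),
    ("step 5: PBS", 300, 200, 0),
    ("step 6: signal acquisition buffer", 300, 200, 0),
    ("step 7: PBS", 300, 200, 0),
    ("step 7: AF532-labeled streptavidin", 150, 150, 5),
    ("step 7: PBS", 300, 200, 0),
    ("step 8: signal acquisition buffer", 300, 200, 0),
    ("step 9: cleavage reaction solution", 300, 200, 1),
]


def cycle_minutes(steps):
    """Return (pumping minutes, incubation minutes) for one cycle."""
    pumping = sum(volume / rate for _label, volume, rate, _inc in steps)
    incubation = sum(inc for _label, _volume, _rate, inc in steps)
    return pumping, incubation


if __name__ == "__main__":
    pumping, incubation = cycle_minutes(CYCLE_STEPS)
    print(pumping, incubation, pumping + incubation)
```

With the values quoted above this comes to 13 min of pumping plus 7 min of incubation, so roughly 20 min per cycle before imaging.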
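3. Experimental results
Analysis with the barcode sequence analysis software gave a barcode split rate of 83.6%.
Based on the 50 sequencing reaction cycles, analysis showed a mapping rate of 67% and an error rate of 2%.
Figure 3 shows the signal extraction plot for the first base of the sequence being read. The plot shows that the four deoxyribonucleotides are separated into four signal clusters according to the detection rule: the cluster at the lower left is the G signal; the horizontal signal arm is the A signal; the vertical signal arm is the T signal; and the cluster on the diagonal between the A and T arms is the C signal.
Figure 4 shows the signal extraction plot for the 50th base of the barcode sequence being read; the signal arms are assigned as in the plot for the first base.
Figure 6 shows the signal extraction plot for the first base in the experiment without added unlabeled nucleotides; the signal arms are assigned as in the plots for the experiment with added unlabeled nucleotides.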

Claims (15)

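  1. A method for determining the sequence of a target polynucleotide, comprising
    (a) providing a target polynucleotide,
    (b) contacting the target polynucleotide with a primer so that the primer hybridizes to the target polynucleotide, thereby forming a partial duplex of the target polynucleotide and the primer,
    (c) contacting the partial duplex with a polymerase and nucleotides under conditions that allow the polymerase to carry out nucleotide polymerization, so that a nucleotide is incorporated onto the primer,
    wherein the nucleotides are selected from one or more of the following: a first nucleotide, a second nucleotide, a third nucleotide, and a fourth nucleotide, wherein the first nucleotide comprises a first nucleotide labeled with a first label and optionally an unlabeled first nucleotide, the second nucleotide comprises a second nucleotide labeled with a second label and optionally an unlabeled second nucleotide, the third nucleotide is selected from: (1) a third nucleotide labeled with the first label and a third nucleotide labeled with the second label, or (2) a third nucleotide labeled with both the first label and the second label; and the fourth nucleotide comprises an unlabeled fourth nucleotide;
    wherein the ribose or deoxyribose moiety of each of the nucleotides comprises a protecting group attached via the 2' or 3' oxygen atom,
    (d) detecting the presence of the first label on the partial duplex of step (c),
    (e) detecting the presence of the second label on the partial duplex of step (c),
    (f) optionally removing the protecting group and the label from the nucleotide incorporated into the partial duplex of step (c),
    (g) optionally repeating steps (c)-(f) one or more times, thereby obtaining sequence information for the target polynucleotide,
    wherein the presence of the first label and of the second label is detected by the same luminescent signal.
  2. The method of claim 1, wherein the first label is a luminescent label.
  3. The method of claim 1, wherein step (d) comprises contacting the partial duplex of step (c) with a ligand that specifically binds the first label and is labeled with a luminescent label, and then detecting the presence of the luminescent label on the partial duplex,
    optionally, the ligand is removed together with the protecting group and the label when these are removed from the nucleotide incorporated into the partial duplex of step (c).
  4. The method of any one of claims 1-3, wherein step (e) comprises contacting the partial duplex of step (c) with a ligand that specifically binds the second label and is labeled with a luminescent label, and then detecting the presence of the luminescent label on the partial duplex,
    for example, step (e) is carried out after step (d).
  5. The method of any one of claims 2-4, wherein the luminescent labels are the same luminescent label.
  6. The method of any one of claims 2-5, wherein the luminescent label is a fluorescent label, for example a fluorophore, for example selected from coumarin, AlexaFluor, Bodipy, fluorescein, tetramethylrhodamine, phenoxazine, acridine, Cy5, Cy3, AF532, Texas Red, and derivatives thereof.
  7. The method of any one of claims 1-6, wherein the ratio of the first nucleotide labeled with the first label to the unlabeled first nucleotide in the first nucleotide is from 4:1 to 3:2.
  8. The method of any one of claims 1-7, wherein the ratio of the second nucleotide labeled with the second label to the unlabeled second nucleotide in the second nucleotide is from 4:1 to 3:2.
  9. A kit for sequencing a polynucleotide, comprising: (a) one or more nucleotides selected from the following: a first nucleotide, a second nucleotide, a third nucleotide, and a fourth nucleotide, wherein the first nucleotide comprises a first nucleotide labeled with a first label and optionally an unlabeled first nucleotide, the second nucleotide comprises a second nucleotide labeled with a second label and optionally an unlabeled second nucleotide, the third nucleotide is selected from: (1) a third nucleotide labeled with the first label and a third nucleotide labeled with the second label, or (2) a third nucleotide labeled with both the first label and the second label, and the fourth nucleotide comprises an unlabeled fourth nucleotide; and (b) packaging material therefor,
    wherein the ribose or deoxyribose moiety of each of the nucleotides comprises a protecting group attached via the 2' or 3' oxygen atom.
  10. The kit of claim 9, wherein the first label is a luminescent label.
  11. The kit of claim 9, further comprising a ligand that specifically binds the first label and is labeled with a luminescent label.
  12. The kit of any one of claims 9-11, further comprising a ligand that specifically binds the second label and is labeled with a luminescent label.
  13. The kit of any one of claims 9-12, wherein the luminescent labels are the same luminescent label.
  14. The kit of any one of claims 9-13, wherein the luminescent label is a fluorescent label, for example a fluorophore, for example selected from coumarin, AlexaFluor, Bodipy, fluorescein, tetramethylrhodamine, phenoxazine, acridine, Cy5, Cy3, AF532, Texas Red, and derivatives thereof.
  15. The kit of any one of claims 9-14, further comprising an enzyme and a buffer suitable for the enzyme to function.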
PCT/CN2018/114281 2018-11-07 2018-11-07 对多核苷酸进行测序的方法 WO2020093261A1 (zh)

Priority Applications (9)

Application Number Priority Date Filing Date Title
US17/292,400 US20220010370A1 (en) 2018-11-07 2018-11-07 Method for sequencing polynucleotides
JP2021518956A JP7332235B2 (ja) 2018-11-07 2018-11-07 ポリヌクレオチドを配列決定する方法
KR1020217017101A KR20210088637A (ko) 2018-11-07 2018-11-07 폴리뉴클레오티드의 시퀀싱 방법
SG11202104099VA SG11202104099VA (en) 2018-11-07 2018-11-07 Method for sequencing polynucleotides
CA3118607A CA3118607A1 (en) 2018-11-07 2018-11-07 Method for sequencing polynucleotides
AU2018448937A AU2018448937A1 (en) 2018-11-07 2018-11-07 Method for sequencing polynucleotides
CN201880098581.3A CN112840035B (zh) 2018-11-07 2018-11-07 对多核苷酸进行测序的方法
EP18939391.1A EP3878968A4 (en) 2018-11-07 2018-11-07 POLYNUCLEOTIDE SEQUENCING METHOD
PCT/CN2018/114281 WO2020093261A1 (zh) 2018-11-07 2018-11-07 对多核苷酸进行测序的方法

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/114281 WO2020093261A1 (zh) 2018-11-07 2018-11-07 对多核苷酸进行测序的方法

Publications (1)

Publication Number Publication Date
WO2020093261A1 true WO2020093261A1 (zh) 2020-05-14

Family

ID=70610779

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/114281 WO2020093261A1 (zh) 2018-11-07 2018-11-07 对多核苷酸进行测序的方法

Country Status (9)

Country Link
US (1) US20220010370A1 (zh)
EP (1) EP3878968A4 (zh)
JP (1) JP7332235B2 (zh)
KR (1) KR20210088637A (zh)
CN (1) CN112840035B (zh)
AU (1) AU2018448937A1 (zh)
CA (1) CA3118607A1 (zh)
SG (1) SG11202104099VA (zh)
WO (1) WO2020093261A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020167574A1 (en) * 2019-02-14 2020-08-20 Omniome, Inc. Mitigating adverse impacts of detection systems on nucleic acids and other biological analytes
US20230383342A1 (en) * 2022-05-31 2023-11-30 Illumina Cambridge Limited Compositions and methods for nucleic acid sequencing
WO2024067674A1 (zh) * 2022-09-28 2024-04-04 深圳华大智造科技股份有限公司 测序方法

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1991006678A1 (en) 1989-10-26 1991-05-16 Sri International Dna sequencing
US5302509A (en) 1989-08-14 1994-04-12 Beckman Instruments, Inc. Method for sequencing polynucleotides
WO2000006770A1 (en) 1998-07-30 2000-02-10 Solexa Ltd. Arrayed biomolecules and their use in sequencing
WO2002029003A2 (en) 2000-10-06 2002-04-11 The Trustees Of Columbia University In The City Of New York Massive parallel method for decoding dna and rna
WO2004018497A2 (en) 2002-08-23 2004-03-04 Solexa Limited Modified nucleotides for polynucleotide sequencing
CN102858995A (zh) * 2009-09-10 2013-01-02 森特瑞隆技术控股公司 靶向测序方法
CN103502474A (zh) * 2011-05-06 2014-01-08 凯杰有限公司 包含经由连接基连结的标记的寡核苷酸
CN103602719A (zh) * 2013-04-07 2014-02-26 北京迈基诺基因科技有限责任公司 一种基因测序方法
WO2014130388A1 (en) * 2013-02-20 2014-08-28 Emory University Methods of sequencing nucleic acids in mixtures and compositions related thereto
WO2014139596A1 (en) 2013-03-15 2014-09-18 Illumina Cambridge Limited Modified nucleosides or nucleotides
US9453258B2 (en) 2011-09-23 2016-09-27 Illumina, Inc. Methods and compositions for nucleic acid sequencing

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1993019205A1 (en) * 1992-03-19 1993-09-30 The Regents Of The University Of California Multiple tag labeling method for dna sequencing
WO2000058507A1 (en) * 1999-03-30 2000-10-05 Solexa Ltd. Polynucleotide sequencing
CA3049667A1 (en) 2016-12-27 2018-07-05 Bgi Shenzhen Single fluorescent dye-based sequencing method
HUE059673T2 (hu) * 2017-01-04 2022-12-28 Mgi Tech Co Ltd Nukleinsav-szekvenálás affinitási reagensek felhasználásával
KR102246285B1 (ko) * 2017-03-07 2021-04-29 일루미나, 인코포레이티드 단일 광원, 2-광학 채널 서열분석

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
BURGESS ET AL., J. ORG. CHEM., vol. 62, 1997, pages 5165 - 5168
CHEM. REV., vol. 100, 2000, pages 2092 - 2157
LEE ET AL., J. ORG. CHEM., vol. 64, 1999, pages 3454 - 3460
MARTIN KIRCHER, JANET KELSO: "High-throughput DNA sequencing - concepts and limitations", BIOESSAYS, vol. 32, 2010, pages 524 - 536, XP055103847, DOI: 10.1002/bies.200900181
METZKER ET AL., NUCLEIC ACIDS RESEARCH, vol. 22, no. 20, 1994, pages 4259 - 4267
SARA GOODWIN, JOHN D. MCPHERSON, W. RICHARD MCCOMBIE: "Coming of age: ten years of next-generation sequencing technologies", NATURE REVIEWS, vol. 17, 2016, pages 333 - 351c
SARA GOODWIN, JOHN D. MCPHERSON, W. RICHARD MCCOMBIE: "Coming of age: ten years of next-generation sequencing technologies", NATURE, vol. 17, 2016, pages 333 - 351, XP055544186, DOI: 10.1038/nrg.2016.49
SO A, PEL J, RAJAN S, MARZIALI A.: "Efficient genomic DNA extraction for low target concentration bacterial cultures using SCODA DNA extraction technology", COLD SPRING HARB PROTOC, vol. 10, 2010

Also Published As

Publication number Publication date
AU2018448937A1 (en) 2021-05-27
CN112840035B (zh) 2024-01-30
JP2022513574A (ja) 2022-02-09
KR20210088637A (ko) 2021-07-14
CN112840035A (zh) 2021-05-25
US20220010370A1 (en) 2022-01-13
CA3118607A1 (en) 2020-05-14
EP3878968A4 (en) 2022-08-17
JP7332235B2 (ja) 2023-08-23
EP3878968A1 (en) 2021-09-15
SG11202104099VA (en) 2021-05-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 18939391; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2021518956; Country of ref document: JP; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 3118607; Country of ref document: CA)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2018448937; Country of ref document: AU; Date of ref document: 20181107; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 20217017101; Country of ref document: KR; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2018939391; Country of ref document: EP; Effective date: 20210607)