WO2007142608A1 - Nucleic acid concatenation - Google Patents

Publication number
WO2007142608A1
Authority
WO
WIPO (PCT)
Prior art keywords
oligonucleotide
ligatable
nucleotide
fragment
concatenated
Prior art date
Application number
PCT/SG2007/000159
Other languages
French (fr)
Inventor
Yijun Ruan
Patrick Wei Pern Ng
Melissa Jane Fullwood
Yen Ling Lee
Original Assignee
Agency For Science, Technology And Research
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agency For Science, Technology And Research
Priority to EP07748704.9A (patent EP2032721B1)
Publication of WO2007142608A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6804 Nucleic acid analysis using immunogens
    • C12Q1/6809 Methods for determination or identification of nucleic acids involving differential detection

Definitions

  • the present invention generally relates to the field of nucleic acids. Specifically, the present invention relates to concatenation of nucleic acids.
  • EST expressed sequence tag
  • cDNA clones are sequenced from 5' and/or 3' nucleotides (Adams, M., et al., 1991, Science, 252, 1651-1656). Each EST sequence read would generate, on average, a 500 bp tag per transcript. The number of identical or overlapping ESTs would indicate the relative level of gene expression activity. Though this is an effective approach to identifying genes, it is prohibitively expensive to tag every transcript in a transcriptome. In practice, sequencing usually ceases after 10,000 or fewer ESTs are obtained from a cDNA library where millions of transcripts might be cloned.
  • SAGE serial analysis of gene expression
  • multiple tags are concatenated into long DNA fragments and cloned for sequencing.
  • Each SAGE sequence readout can usually reveal 20-30 SAGE tags.
  • a modest SAGE sequencing effort of less than 10,000 reads will have significant coverage of a transcriptome. Transcript abundance is measured by simply counting the numerical frequency of the SAGE tags.
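  • The tag-counting idea above can be sketched in a few lines of Python. This is only an illustration: the tag sequences and the helper name `count_tags` are hypothetical and do not come from the patent.

```python
from collections import Counter


def count_tags(tags):
    """Return (tag, count) pairs, most frequent first."""
    return Counter(tags).most_common()


# Hypothetical tags read from sequenced concatemers (made-up sequences).
observed = ["CATGAAAC", "CATGTTTG", "CATGAAAC", "CATGCCGA", "CATGAAAC"]

for tag, n in count_tags(observed):
    # The relative frequency of a tag stands in for transcript abundance.
    print(tag, n, f"{n / len(observed):.0%}")
```

  • In this toy input the most frequent tag would be taken to mark the most abundant transcript.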
  • the present invention solves the problems mentioned above by providing a new method of manipulating nucleic acids. More specifically, the present invention relates to manipulation of nucleic acids. In particular, the invention relates to methods for the preparation of nucleotide fragments by concatenation. In one aspect, the present invention provides a new method of concatenation of nucleotide fragments. In particular, there is provided a length-controlled concatenation of nucleotide fragments such that concatemers having a desired number of nucleotide fragments, or having a particular length, may be prepared. The present invention also provides molecules and components prepared by the method.
  • the nucleotide fragment according to the invention may comprise at least one ditag and/or at least one tag. Accordingly, a concatenation of at least two nucleotide fragments may comprise at least two concatenated ditags and/or tags
  • a method of length-controlled concatenating nucleotide fragments comprising: (a) providing at least two nucleotide fragments, wherein each fragment has one ligatable end and one non-ligatable end; and (b) allowing the two fragments to ligate at the ligatable ends to form at least one oligonucleotide comprising at least two concatenated nucleotide fragments.
  • the method may further comprise the steps of treating the at least one oligonucleotide to produce at least one oligonucleotide having one ligatable end and one non-ligatable end, and allowing the oligonucleotide to ligate with a further oligonucleotide or a nucleotide fragment to form an oligonucleotide comprising more than two concatenated nucleotides.
  • the method may be repeated one or more times to make at least one oligonucleotide with an increasing number of nucleotide fragments. According to one aspect, the repetition of concatenation yields a doubling of the number of concatenated nucleotide fragments.
  • the method may further comprise treating the at least one oligonucleotide to produce at least one oligonucleotide with two ligatable ends and allowing the oligonucleotide to self-circularize at the ligatable ends.
  • the method may further comprise selecting the at least one circularized oligonucleotide and/or amplifying the oligonucleotide.
  • the method may further comprise treating the circularized and/or amplified oligonucleotide to produce at least one oligonucleotide having one ligatable end and one non-ligatable end, and allowing at least two oligonucleotides to ligate at the ligatable ends to form a concatemer comprising at least two concatenated oligonucleotides.
  • the method may further comprise: (a) treating the at least one oligonucleotide to produce at least one oligonucleotide having two ligatable ends compatible with each other, and allowing the oligonucleotide to self-circularize at its ligatable ends; (b) selecting the at least one self-circularized oligonucleotide; (c) optionally amplifying the selected oligonucleotide; (d) treating the oligonucleotide (from step b or c) to produce at least one oligonucleotide with one ligatable end and one non-ligatable end; and (e) allowing two oligonucleotides to ligate at the ligatable ends to form a concatemer comprising at least two concatenated oligonucleotides.
  • the method comprises the steps of (a) providing at least one oligonucleotide, wherein the oligonucleotide has ligatable ends; (b) allowing the at least one oligonucleotide to self-circularize at its ligatable ends; (c) selecting the at least one self-circularized oligonucleotide; (d) treating the selected circularized oligonucleotide with at least one restriction enzyme to obtain at least one oligonucleotide with one ligatable end and one non-ligatable end; and (e) concatenating at least two oligonucleotides at the ligatable ends to form a concatemer of at least two oligonucleotides.
  • the nucleotide fragments, oligonucleotide(s) and/or concatemer(s) may be amplified.
  • the amplification may be by bacterial amplification, by rolling circle amplification, and/or by polymerase chain reaction.
  • the method may comprise repeating the steps one or more times to obtain concatemers having desired lengths and/or number of oligonucleotides or nucleotide fragments. The repeating may result in a doubling of the number of oligonucleotides in the concatemers.
  • the ligatable end of each fragment may be a palindromic cohesive end.
  • the ligatable end and/or the non-ligatable end may be located in at least one adaptor.
  • the adaptor may be part of a plasmid or vector.
  • the nucleotide fragment may comprise at least one ditag, the ditag comprising at least one first tag comprising a 5' terminus and at least one second tag comprising a 3' terminus of a polynucleotide.
  • Each nucleotide fragment of the concatemer may have an orientation opposite to the orientation of a nucleotide fragment positioned upstream and/or downstream.
  • the method may further comprise sequencing the concatemer.
  • the sequencing may be by any suitable method, for example, by pyrosequencing.
  • the present invention provides an isolated concatemer (one oligonucleotide or at least two concatenated oligonucleotides) comprising at least two nucleotide fragments, wherein each fragment has at least one ligatable end and one non-ligatable end, and the fragments are ligated at the ligatable ends to form the concatemer.
  • the ligatable ends may be palindromic cohesive ends.
  • the fragment may comprise at least one ditag, the ditag comprising at least one first tag comprising a 5' terminus and at least one second tag comprising a 3' terminus of a polynucleotide.
  • Each nucleotide fragment of the concatemer may have orientation opposite to the orientation of a nucleotide fragment positioned upstream and/or downstream.
  • the concatemer may be inserted into a plasmid or vector.
  • the polynucleotide may be DNA or RNA.
  • FIG. 1 illustrates the overview of one embodiment of the method of the present invention for preparing concatenated nucleotide fragments (concatemers).
  • Each nucleotide fragment comprises a ditag.
  • the ditag comprises a first tag comprising a 5' terminus (gray in the figure) and a second tag comprising a 3' terminus (black in the figure) of a polynucleotide (for example, a full-length cDNA).
  • the ditag may have one end sticky or ligatable and the other end non-sticky (FIG. 1A) or blunt-ended (may be either ligatable or non-ligatable) (FIG. 1B).
  • the two concatenated nucleotide fragments obtained comprise two concatenated ditags.
  • the ditags are suitable for sequencing by large scale parallel sequencing methods, such as "454" sequencing.
  • the ditags may be prepared by using single paired-end ditag (PET) plasmids or from insertion of other DNA sequences wherein the 5' and 3' termini are of interest.
  • the oligonucleotides in the concatemers may be added in a precise manner as the capacity of the sequencing technology used increases.
  • Two different types of adaptors are ligated to the ends of the diPET. These adaptors will contain the different restriction sites necessary for subsequent restriction digestion.
  • the adaptors are ligated, such that only those diPETs with different adaptors ligated on will be circularized, and thus selected by an optional exonuclease treatment.
  • Rolling circle amplification is performed to amplify the DNA.
  • the DNA is then cut with the appropriate restriction enzymes to generate a sticky end and an end that is not sticky, such that ligation may be used to form an n-PET.
  • the cycle may then be repeated as desired to generate larger n-PETs. Amplification and selection by PCR is also possible.
  • FIG. 3 illustrates electroeluted PETs in a PAGE gel.
  • Lanes 1 and 9 Invitrogen 25 bp ladder.
  • Lane 2 Invitrogen 100 bp ladder.
  • Lane 3 BseR1 and BamH1 cut PETs from control library.
  • Lane 4 BseR1 and BamH1 cut PETs from experimental library.
  • Lane 5 2 ul Invitrogen Low Mass ladder.
  • Lane 6 4 ul Invitrogen Low Mass ladder.
  • Lane 7 Blunted BseR1 and BamH1 cut PETs from control library.
  • Lane 8 Blunted BseR1 and BamH1 cut PETs from experimental library.
  • FIG. 4 illustrates electroeluted diPETs in a PAGE gel.
  • Lane 1 Invitrogen 25 bp ladder.
  • Lane 2 Invitrogen 100 bp ladder.
  • Lane 3 Ligation product of control library; as expected, this library formed concatemers.
  • Lane 4 Ligation product of experimental library - this library formed length-controlled diPETs, as can be seen by the single clear, sharp band.
  • Lane 5 2 ul of Invitrogen Low Mass ladder.
  • Lane 6 4 ul of Invitrogen Low Mass ladder.
  • FIG. 5 illustrates two examples of vectors used in the method of the present invention.
  • FIG. 5A is the pGIS4a2 vector and
  • FIG. 5B is the pGIS3h vector.
  • FIG. 6 illustrates concatenation of ditags to obtain a concatemer of a desired length for sequencing.
  • SEQ ID NO:1 GsuI-oligo dT primer: 5'-GAGCTAGTTCTGGAGTTTTTTTTTTTTTTTTVN-3'
  • SEQ ID NO:2 GIS-(N)6 adapter upper strand: 5'-CTAAACTCGAGGCGGCCGCGGATCCGACNNNNNN-3'
  • SEQ ID NO:3 GIS-(N)6 adapter lower strand: 5'-p-GTCGGATCCGCGGCCGCCTCGAGTTT-3'
  • SEQ ID NO:4 GIS-(N)5 adapter upper strand: 5'-CTAAACTCGAGGCGGCCGCGGATCCGACGNNNNN-3'
  • SEQ ID NO:5 GIS-(N)5 adapter lower strand: 5'-p-GTCGGATCCGCGGCCGCCTCGAGTTT-3'
  • SEQ ID NO:6 palindromic upper strand: 5'-GTCGGATCCGAC-3'
  • SEQ ID NO:7 palindromic lower strand: 5'-GTCGGATCCGAC-3'
  • SEQ ID NO:8 n-PET TT-tailed adaptor (PMR 011) - upper strand:
  • SEQ ID NO:10 n-PET TT-tailed adaptor (PMR 012) - upper strand:
  • SEQ ID NO:11 n-PET TT-tailed adaptor (PMR 012) - lower strand:
  • SEQ ID NO:14 pGIS4a2 sense (FIG. 5A) (3531 bp) See sequence listing.
  • SEQ ID NO:15 pGIS4a2 antisense (FIG. 5A) (3531 bp) See sequence listing.
  • SEQ ID NO:16 pGIS3h sense (FIG. 5B) (2765 bp) See sequence listing.
  • SEQ ID NO:17 pGIS3h antisense (FIG. 5B) (2765 bp) See sequence listing.
  • SEQ ID NO:18 diPET from pGIS4a diPETting sense (82 bp)
  • SEQ ID NO:19 diPET from pGIS4a diPETting antisense (82 bp)
  • SEQ ID NO:20 diPET from pGIS3h diPETting sense (90 bp)
  • SEQ ID NO:21 diPET from pGIS3h diPETting antisense (90 bp)
  • Detailed Description of the Invention
  • Restriction enzyme - an enzyme that cuts double-stranded DNA.
  • the enzyme makes two incisions, one through each of the phosphate backbones of the double helix without damaging the bases.
  • the chemical bonds cleaved by the enzymes may be reformed by other enzymes known as ligases, enabling restriction fragments obtained from different chromosomes or genes to be joined or spliced together, provided their ends are complementary or compatible.
  • Type II enzymes recognize specific nucleic acid sequences (recognition sites) and cut DNA at defined positions close to or within their recognition sites. They produce discrete restriction fragments and distinct gel banding patterns.
  • Type IIs enzymes cleave outside of their recognition sequence to one side.
  • Type III enzymes are also large combination restriction-and-modification enzymes. They cleave outside of their recognition sequences and require two such sequences in opposite orientations within the same DNA molecule to accomplish cleavage.
  • Homing endonucleases are rare double-stranded DNases that have large, asymmetric recognition sites (12-40 base pairs) and coding sequences that are usually embedded in either introns (DNA) or inteins (proteins).
  • Restriction enzymes may make cuts that leave either non-sticky (blunt) ends or sticky (ligatable) ends with overhangs.
  • a sticky-end fragment can be ligated not only to the fragment from which it was originally cleaved, but also to any other fragment with a complementary, compatible, cohesive or sticky end.
  • ends produced by different enzymes may also be compatible.
  • Many type II restriction enzymes cut palindromic DNA sequences. If a restriction enzyme cuts a non-degenerate palindromic cleavage site, all the ends produced are compatible.
  • a "palindromic" sequence is one in which the sequence on one strand reads the same in the opposite direction on the complementary strand, so that nucleic acid sequences cleaved to give palindromic cohesive ends can self-circularize when the two ends on the same strand mate.
  • the meaning of "palindromic" in this context is different from its linguistic usage.
  • the sequence GTAATG is not a palindromic DNA sequence, while the sequence GTATAC is.
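  • The definition above amounts to saying that a palindromic DNA sequence equals its own reverse complement. A minimal sketch of that check follows; the function names are illustrative, and only the unambiguous bases A, C, G and T are assumed.

```python
# Complement table for the four unambiguous DNA bases (assumption: no N, V, etc.).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the sequence reads the same 5' to 3' on both strands."""
    return seq.upper() == reverse_complement(seq)


print(is_palindromic("GTATAC"))        # True
print(is_palindromic("GTAATG"))        # False
print(is_palindromic("GTCGGATCCGAC"))  # True (SEQ ID NO:6 and NO:7 are identical)
```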
  • restriction enzymes leaving cohesive or sticky ends include BamH1, EcoR1 and HindIII.
  • An example of restriction enzymes leaving blunt, non-cohesive or non-sticky ends is AluI.
  • an end of a nucleic acid strand is said to be ligatable or capable of being ligated if it has a complementary, compatible, cohesive or sticky end or phosphorylated blunt end.
  • An end of a nucleic acid is said not to be ligatable or not capable of being ligated if it and the other strand of nucleic acid both have dephosphorylated ends, or if it does not have an end that another strand of nucleic acid is complementary, compatible, cohesive or sticky to.
  • a restriction enzyme name (such as EcoR1) can also refer to the nucleic acid sequence or recognition site recognized by the enzyme as readily understood in the context in which the enzyme name or recognition site appears.
  • Nucleotide - a phosphoric ester of nucleoside; the basic structural unit of nucleic acids (DNA or RNA). Nucleotides form base pairs - one of the pairs of chemical bases joined by hydrogen bonds that connect the complementary strands of a DNA molecule or of an RNA molecule that has two strands; the base pairs are adenine with thymine and guanine with cytosine in DNA and adenine with uracil and guanine with cytosine in RNA. Nucleotides may be joined with or concatenated with other nucleotides. The term nucleotide may be used interchangeably with the term nucleic acid.
  • a strand of nucleic acids may also possess a 5' end and a 3' end.
  • the end regions of a strand of nucleic acids may be referred to as the 5' terminus and the 3' terminus respectively.
  • Nucleic acid sequences are conventionally read in the 5' to 3' direction which gives the orientation of the nucleotides. Short strands of nucleotides are referred to as oligonucleotides while longer strands are referred to as polynucleotides.
  • an oligonucleotide can comprise at least one nucleotide fragment, tag or ditag.
  • a fragment is a length of nucleic acids obtained, derived or prepared from a longer length of nucleic acids.
  • a fragment can comprise at least one tag or ditag and can represent a larger nucleic acid molecule.
  • a polynucleotide can refer to a gene, a message RNA transcript of a gene, parts of a gene or a cDNA sequence representing a gene.
  • relative to a first oligonucleotide, a second oligonucleotide may be referred to as being "upstream" if it is positioned nearer to the 5' end of the first oligonucleotide, or "downstream" if it is nearer to the 3' end of the first oligonucleotide.
  • Concatemer - A concatemer is composed of at least two nucleotide monomer sequences linked end to end, optionally separated by a linker or spacer.
  • a concatemer comprises at least two tags, two ditags, two nucleotide fragments or two oligonucleotides prepared according to the method of the invention.
  • two oligonucleotides may be concatenated such that the 5' to 3' orientation of one nucleotide fragment in an oligonucleotide is opposite to the orientation of an adjacent nucleotide fragment positioned upstream or downstream of it.
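  • The opposite orientation of adjacent fragments follows from joining two fragments at the same palindromic cohesive end: one fragment has to be flipped, so its reverse complement continues the top strand. The sketch below models this under simplifying assumptions: each fragment is given as its top strand up to the cut, the default overhang is the 4-base 5' overhang GATC left by cutting GGATCC after the first G, and the function names and fragment sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def ligate_head_to_head(a_top, b_top, overhang="GATC"):
    """Top strand of the dimer formed when fragments a and b join at their
    ligatable ends. Each argument is a fragment's top strand written 5' to 3',
    ending at the cut; the overhang must be palindromic to pair with itself."""
    if overhang != reverse_complement(overhang):
        raise ValueError("overhang is not self-complementary")
    # Fragment b is flipped, so it contributes its reverse complement.
    return a_top + overhang + reverse_complement(b_top)


dimer = ligate_head_to_head("AAAAG", "CCCCG")  # made-up fragments ending in G
print(dimer)
```

  • In this model the junction restores the full GGATCC site, and swapping the two fragments gives the reverse complement of the same dimer, that is, the same double-stranded molecule.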
  • Plasmid - With the term vector or recombinant vector it is intended a plasmid, virus or other vehicle known in the art that has been manipulated by insertion or incorporation of the ditag genetic sequences. Such vectors contain a promoter sequence that facilitates the efficient transcription of the inserted sequence.
  • the vector typically contains an origin of replication, a promoter, as well as specific genes which allow phenotypic selection of the transformed cells.
  • Vectors suitable for use in the present invention include for example, pBlueScript (Stratagene, La Jolla, CA); pBC, pZErO-1 (Invitrogen, Carlsbad, CA) and pGEM3z (Promega, Madison, WI) or modified vectors thereof, as well as other similar vectors known to those of skill in the art.
  • the pGEM vectors have also been disclosed in US 4,766,072, herein incorporated by reference.
  • the plasmid pGIS4a2 (clone B4-1) (FIG. 5A) was used.
  • Amplification - increasing the copy number of nucleic acids.
  • One method commonly used is that of polymerase chain reaction (PCR).
  • PCR polymerase chain reaction
  • Other amplification methods known to a skilled person such as bacterial amplification or rolling circle amplification may also be used.
  • Tag - A tag or signature is an identifiable sequence of nucleic acids. It may refer to either the 5'- or 3'-most terminal nucleic acid sequence (terminus; of any length but usually 18-20 bp) derived from any contiguous DNA region.
  • the terms tag and signature may be used interchangeably under the present invention.
  • a single tag signature (about 20 bp) from each of two nucleotide fragments may be ligated to form a "tag1-linker-tag2" (also referred to as "first tag-linker-second tag") paired end ditag (PET) structure.
  • Linker-tag-tag-linker structure where a linker flanks a tag (that is, a linker is positioned upstream and/or downstream to at least one of the tags).
  • Linker - A linker is an artificial sequence of nucleic acids, usually containing one or more restriction enzyme recognition sites.
  • Ditag A short (usually 12-60 bp) strand of nucleotides comprising at least one tag or signature derived from a longer strand of nucleotides.
  • a ditag may be prepared according to US 20050255501 and/or US 20050059022, the contents of which are herein incorporated by reference.
  • a ditag may comprise either or both the 5' end region (also indicated as 5' tag) and 3' end region (also indicated as 3' tag) of a nucleic acid molecule.
  • a single tag signature (about 20 bp) from each of two nucleotide fragments may be ligated to form a "tag1-linker-tag2" (also referred to as "first tag-linker-second tag") paired end ditag (PET) structure.
  • tag1-linker-tag2 also referred to as "first tag-linker-second tag"
  • PET paired end ditag
  • the present invention relates to a new method of manipulating nucleic acids. More specifically, the present invention relates to manipulation of nucleic acids by concatenating them. In particular, the invention relates to methods for the preparation of ditags and/or tags representing polynucleotides by concatenation. In one aspect, the present invention provides a method for length-controlled concatenation of signature tags representing polynucleotides such that concatemers having a desired number of ditags and/or tags or having a particular length may be prepared. The present invention also provides molecules and components prepared by the method.
  • a method of length-controlled concatenating nucleotide fragments comprising: (a) providing at least two nucleotide fragments, wherein each fragment has one ligatable end and one non-ligatable end; and (b) allowing the two fragments to ligate at the ligatable ends to form at least one oligonucleotide comprising at least two concatenated nucleotide fragments (FIG. 1). This method may be repeated one or more times.
  • the method above may further comprise treating the at least one obtained oligonucleotide to produce at least one oligonucleotide with two ligatable ends and allowing the oligonucleotide to self-circularize at the ligatable ends.
  • the method may further comprise selecting the at least one circularized oligonucleotide and/or amplifying the oligonucleotide.
  • the method may further comprise treating the circularized and/or amplified oligonucleotide to produce at least one oligonucleotide having one ligatable end and one non-ligatable end, and allowing at least two oligonucleotides to ligate at the ligatable ends to form a concatemer comprising at least two oligonucleotides.
  • This method may be repeated one or more times. In one aspect, the repetition of concatenating results in a doubling of the number of concatenated nucleotide fragments.
  • a method comprising the steps of (a) providing at least one oligonucleotide comprising at least one nucleotide fragment, preferably comprising at least two nucleotide fragments, wherein the oligonucleotide has ligatable ends; (b) allowing the at least one oligonucleotide to self-circularize at its ligatable ends; (c) selecting the at least one self-circularized oligonucleotide; (d) treating the selected circularized oligonucleotide with at least one restriction enzyme to obtain at least one oligonucleotide with one ligatable end and one non-ligatable end; and (e) concatenating at least two oligonucleotides at the ligatable ends to form a concatemer of at least two oligonucleotides or at least two nucleotide fragments.
  • the provided oligonucleotide in step (a) may comprise at least two concatenated nucleotide fragments
  • the obtained concatenated oligonucleotide in step (e) comprises at least four concatenated nucleotide fragments (as shown in FIG. 2).
  • Each nucleotide fragment may comprise at least one ditag, the ditag comprising at least one first tag comprising a 5' terminus and at least one second tag comprising a 3' terminus of a polynucleotide.
  • the first ditag of nucleotide fragment has opposite orientation to the second ditag of the same nucleotide fragment.
  • each nucleotide fragment of the oligonucleotide has the opposite orientation to a nucleotide fragment positioned upstream and/or downstream.
  • This method may be repeated one or more times.
  • each repeating of concatenating results in a doubling of the number of concatenated nucleotide fragments.
  • each read can only read 100 bp. This restriction places a limit on the number of tags that can be used. When increasing the number of tags that can be accommodated, the length of each tag will necessarily have to be shortened, concomitantly decreasing specificity and increasing ambiguity. In contrast, if ditags are used in the method of the present invention to prepare length-controlled concatemers, each read will reveal hundreds or even thousands of base pairs of information as demarcated by the ditags. Furthermore, using diPETs as templates, the sequencing throughput and capacity of "454" sequencing are increased two-fold.
  • the method of the present invention when applied to preparing concatemers for sequencing and coupled with multiplex sequencing methods, is at least 500-fold more efficient than any of the currently existing methods for DNA sequencing analysis.
  • This method of length-controlled concatenation can be extended for a number of cycles to prepare concatemers having multiple ditags and may be applied to generate desired lengths of concatemers of any kind of tag fragment such as SAGE or "454" multiplex sequencing technology.
  • the method of the present invention may be applied to cDNA sequencing and ChIP DNA fragment sequencing as well as other sequencing technologies such as SAGE or MPSS.
  • the present invention provides, in one embodiment, a method for length-controlled concatenation of signature tags representing polynucleotides such that concatemers having desired number of tags or ditags and having a particular length may be prepared.
  • concatemers may be used in various sequencing technologies.
  • the ditags of the present invention may be prepared or obtained from the paired end ditagging (PET) strategy (US 20050255501).
  • PET paired end ditagging
  • a tag is a fragment obtained from a nucleic acid molecule and represents the polynucleotide from which the tag was obtained or derived from.
  • the polynucleotide which the tag is intended to shrink or represent may be RNA, mRNA, genomic DNA, full-length cDNA, or cDNA.
  • two tags or fragments that are present in an oligonucleotide of the present invention may also be called a ditag.
  • a ditag is shorter than the original nucleic acid molecule from which it originates or which it represents.
  • the ditag must be much shorter than the original nucleic acid molecule.
  • the ditag may essentially comprise either or both the 5' end region (also indicated as 5' tag) and 3' end region (also indicated as 3' tag) of the original nucleic acid molecule.
  • the portion of the original nucleic acid molecule that is between or inside the 5' tag and 3' tag is not included in the ditag.
  • the ditag according to the invention retains the most informative features of the original nucleic acid molecule, namely: the start and the end signatures of the nucleic acid.
  • the 5' tag and 3' tag forming the ditag may have the same or different size. Preferably, they have the same number of nucleotides.
  • the ditag may be of any size, but needs to be meaningful and advantageous over the size of the parental sequence from which it is derived.
  • the preferred size of a tag or ditag is determined by genome complexity. For a bacterial genome a tag from about 8 bp to about 16 bp may be sufficient whereas for a complex genome like the human genome, a 16-20 bp tag (which results in a 32-40bp ditag) may be considered. In general, the size of the ditag is from about 12-60 bp.
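  • One way to see why tag length scales with genome complexity is to estimate how often a tag of a given length would occur by chance. The sketch below is a back-of-the-envelope calculation, not part of the patent: it assumes a uniform random base composition, counts one strand only, and uses round illustrative genome sizes (about 5 Mb for a bacterium, about 3.2 Gb for human).

```python
def expected_random_hits(tag_length, genome_size_bp):
    """Expected number of chance occurrences of one specific tag sequence
    in a random genome of the given size (single strand, uniform bases)."""
    return genome_size_bp / 4 ** tag_length


# Illustrative, assumed genome sizes; not figures taken from the patent.
genomes = {"bacterial (~5 Mb)": 5_000_000, "human (~3.2 Gb)": 3_200_000_000}

for name, size in genomes.items():
    for length in (8, 12, 16, 20):
        hits = expected_random_hits(length, size)
        print(f"{name}: {length} bp tag, about {hits:.3g} chance matches")
```

  • Under these assumptions a tag becomes close to unique once the expected number of chance matches falls well below one, which happens at shorter tag lengths for small genomes than for the human genome.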
  • each 5'-end and 3'-end represents a region or portion closest to the extremity, farthest from the middle region of the nucleic acid molecule or polynucleotide.
  • each ditag comprises sufficient information to characterize a specific polynucleotide. Hence, the ditag is representative of the structure and identity of the polynucleotide.
  • Concatenating oligonucleotides, nucleotide fragments, ditags or tags involves technical difficulties.
  • Current methods for tag concatenation randomly generate concatemers with a range of lengths: monomers, dimers, trimers and so forth.
  • concatemers have to be run in an electrophoresis gel to separate them by size, and concatemers of the desired size are excised from the gel, which is a laborious task. This technique is inefficient and requires large amounts of input DNA.
  • the present invention provides a method of length-controlled concatenation to generate oligonucleotides of a predetermined length.
  • the present invention achieves this by preparing fragments, ditags or tags with a ligatable end and a non-ligatable end. Using this technique, a compatible, cohesive or sticky end on one fragment or tag will join or ligate to another sticky end on another fragment or tag. When this happens, the non-sticky ends will not permit further ligation and concatenation stops. This technique is further illustrated in the examples below. Should ligatable ends not be found readily in the fragments or tags, suitable adaptors possessing the appropriate restriction enzyme recognition sites may be ligated to the fragments or tags.
  • the ligatable ends may be palindromic ends.
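  • The contrast between random and length-controlled concatenation can be illustrated with a toy simulation. It is a deliberately crude model, not the patent's protocol: molecules are reduced to a fragment count and a number of free ligatable ends, pairs are joined at random, and circularization is ignored. The function name and parameters are hypothetical.

```python
import random


def simulate_ligation(n_fragments, sticky_ends_per_fragment, rounds=1000, seed=0):
    """Randomly join molecules that still have a free ligatable end.
    Returns the sorted list of product sizes, in fragments per molecule."""
    rng = random.Random(seed)
    # Each molecule is (number of fragments, number of free ligatable ends).
    pool = [(1, sticky_ends_per_fragment)] * n_fragments
    for _ in range(rounds):
        reactive = [i for i, m in enumerate(pool) if m[1] > 0]
        if len(reactive) < 2:
            break  # nothing left that can ligate
        i, j = rng.sample(reactive, 2)
        a, b = pool[i], pool[j]
        # Joining consumes one ligatable end from each partner.
        merged = (a[0] + b[0], a[1] + b[1] - 2)
        pool = [m for k, m in enumerate(pool) if k not in (i, j)] + [merged]
    return sorted(m[0] for m in pool)


print(simulate_ligation(10, 1))            # one ligatable end per fragment
print(simulate_ligation(10, 2, rounds=6))  # two ligatable ends, stopped early
```

  • With one ligatable end per fragment every product is a dimer and the reaction stops on its own; with two ligatable ends the products have mixed sizes that depend on when the reaction is stopped.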
  • the enzymes may be used sequentially, and after the first restriction digest, one end may be "blocked" by dephosphorylation or other means, such as attachment to a solid substrate.
  • tags with the single PET structures in the plasmids are flanked by the restriction enzyme recognition sites for a cohesive palindromic enzyme and an enzyme leaving a blunt end.
  • the two sites may be BamH1 (B) at one side and BseR1 (Bs) at the other side (FIG. 1A), such that the BamH1 cut leaves palindromic cohesive ends that are compatible with each other, while the BseR1 cut is designed to leave an AA residual or any non-palindromic sequence, which does not match itself.
  • the PETs may be amplified, whether by bacterial amplification, rolling circle amplification, or other amplification methods (FIGS. 1A and 1B).
  • the PETs are then first cut with one restriction enzyme, in this embodiment BseR1, followed by cutting with a different restriction enzyme, in this embodiment BamH1.
  • Released PETs may be purified by any suitable method such as gel purification.
  • any two of the BamH1 cohesive ends will find each other and mate, resulting in oligonucleotide concatemers having a dimer PET (diPET) structure with a non-palindromic end on each side of the oligonucleotide (FIG. 1B). These non-palindromic ends prevent further ligation with other PETs, stopping concatenation.
  • This embodiment gives rise to a diPET oligonucleotide concatemer made of two PETs, of about 80 bp, which is below the maximum capacity of the current "454" sequencing system.
  • any tag can also be turned into "diPETs" by this method. It is also preferable to use at least one type IIs restriction enzyme, such as BseR1, as this will minimize the length of the border sequences. As long as the cut sites of the type IIs restriction enzyme are different, just one type IIs restriction enzyme site may be used.
  • two different types of adaptors (labeled as A and B) are ligated to the ends of the diPET. These adaptors contain the different restriction sites necessary for restriction digest later.
  • the adaptors are ligated, such that only those diPETs with different adaptors ligated will be circularized by self-circularization, and thus selected by an exonuclease treatment. Rolling circle amplification is performed to amplify the DNA. Alternatively, amplification and selection by PCR is also possible as the adaptor sequences are known.
  • the DNA is then cut with the appropriate restriction enzymes to generate a palindromic end and a non-palindromic end, such that ligation may be used to form a 4-PET oligonucleotide.
  • the cycle may then be repeated as desired to generate larger oligonucleotide concatemers comprising n-PETs.
  • Adaptors which are compatible will snap together, preventing PCR from taking place, allowing only adaptors which are different to be amplified.
  • the DNA may then be cut with the appropriate restriction enzymes, and the cycle repeated if desired.
  • the method of the present invention can also make use of fragments that do not have ligatable ends by adding suitable adaptors to them.
  • suitable adaptors may also be ligated to oligonucleotide concatemers to allow them to ligate to another nucleotide fragment, tag or oligonucleotide with a compatible cohesive end.
  • the method may further comprise the steps of treating the at least one oligonucleotide to produce at least one oligonucleotide having one ligatable end and one non-ligatable end, and allowing the oligonucleotide to ligate with a further oligonucleotide or nucleotide fragment to form a concatemer comprising at least two oligonucleotides or at least one oligonucleotide and at least one nucleotide fragment.
  • the method may further comprise: (a) treating the at least one oligonucleotide to produce at least one oligonucleotide having two ligatable ends compatible with each other, and allowing the oligonucleotide to self-circularize; (b) selecting the at least one self-circularized oligonucleotide; (c) optionally amplifying the selected oligonucleotide; (d) treating the oligonucleotide from the previous step to produce at least one oligonucleotide with one ligatable end and one non-ligatable end; and (e) allowing two oligonucleotides to ligate at the ligatable ends to form a concatemer comprising at least two oligonucleotides.
  • an isolated oligonucleotide comprising at least two nucleotide fragments, wherein each fragment has at least one ligatable end and one non-ligatable end, and the fragments are ligated at the ligatable ends to form the oligonucleotide.
  • the ligatable ends are palindromic cohesive ends.
  • the concatemer or concatenated oligonucleotide(s) according to the invention comprises at least one nucleotide fragment, the fragment comprising at least one ditag, the ditag comprising at least one first tag comprising a 5' terminus and at least one second tag comprising a 3' terminus of a polynucleotide.
  • the polynucleotide may be a full-length cDNA or one or more exons. Accordingly, the ditag may be representative of the full-length cDNA.
  • the concatemer or oligonucleotide according to the invention has each nucleotide fragment (or ditag) in an orientation opposite to the orientation of a nucleotide fragment (or ditag) positioned upstream and/or downstream.
  • the concatemer, oligonucleotide, nucleotide fragment, ditag or tag according to the invention may be inserted into a plasmid or vector.
  • plasmid or vector comprising at least one concatemer, oligonucleotide, nucleotide fragment, ditag or tag according to the invention.
  • the plasmid or vector may be inserted in a host cell.
  • kits for concatenating oligonucleotides, nucleotide fragments, ditags and/or tags, comprising at least one restriction enzyme, at least one nucleotide fragment, ditag or tag, optionally a vector, and any reagents as herein disclosed (for instance as described in the examples) for the reaction of concatenation.
  • the kit may further comprise illustration and/or information pertaining to the use of the kit.
  • a library comprising at least a concatemer, concatenated oligonucleotides, concatenated nucleotide fragments, concatenated ditags and/or concatenated tags according to any embodiment of the invention.
  • the tags or nucleotide fragments, and/or oligonucleotides or concatemers may be amplified.
  • the amplification may be by bacterial amplification, by rolling circle amplification, and/or by polymerase chain reaction.
  • the method may comprise repeating the steps one or more times to obtain concatemers of desired lengths or number of oligonucleotides. The repeating may result in a doubling of the number of oligonucleotides in the concatemers.
  • the ligatable end of each fragment, tag or ditag may be a palindromic cohesive end.
  • the ligatable end and/or the non-ligatable end may be located in at least one adaptor.
  • the adaptor may be part of a plasmid or vector.
  • the nucleotide fragment may comprise at least one ditag, the ditag comprising at least one first tag comprising a 5' terminus and at least one second tag comprising a 3' terminus of a polynucleotide.
  • Each nucleotide fragment or tag of the concatemer may have an orientation opposite to the orientation of a nucleotide fragment positioned upstream and/or downstream.
  • the method may further comprise sequencing the concatemer.
  • the sequencing may be by any suitable method, for example, by pyrosequencing.
  • any suitable vector may be used to clone and amplify the sequence of interest, for example, pGIS4a2 (FIG. 5A; SEQ ID NO:14) or pGIS3h (FIG. 5B; SEQ ID NO:15)
  • the yield is about 1.2 mg from 10 Q-trays, with a concentration of about 400 ng/ul
  • phase-lock gel: minimum volume 5 ml, maximum volume 20 ml. [0071] Isopropanol precipitation was performed next due to the large volume involved:
  • the 1 M MgCl2 is to aid in the precipitation of small fragments of DNA. Incubate at -20°C for an hour. Centrifuge at 13,000 RPM for 30 min at 4°C. Wash once with 75% ethanol. Resuspend the PETs in a final volume of 12 ul EB.
  • PET DNA can be quantified on a mini 4-20% PAGE gel (Invitrogen). Run 0.2 ul of PET DNA together with a 25 bp ladder and a Low Mass DNA ladder (both from Invitrogen). The latter will help in the estimation of the PET DNA amount. Currently, 2 ul and 4 ul of Low Mass ladder are used in the quantification. The preparation of the ladders per loading is as follows:
  • PET DNA (at least 5 ug) 7 ul
  • 10x ligation buffer with spermidine is made up of: 60 mM Tris-HCl pH 7.5 (Ambion); 60 mM MgCl2 (Ambion); 50 mM NaCl (Ambion); 1 mg/ml BSA (New England Biolabs); 70 mM beta-mercaptoethanol (Sigma); 1 mM ATP (Invitrogen); 20 mM DTT (Invitrogen); 10 mM spermidine (Sigma)
  • Example 2 The n-PET embodiment as example
  • the adaptors may contain source-identifying tags if desired.
  • the adaptors should preferably have one end that is complementary to each other, and ideally, on the other end, the adaptors have sticky ends complementary to sticky ends of the DNA such that ligation will be easy and the adaptors will not ligate to themselves. It is best if the DNA already contains sticky ends, like diPETs. If the DNA is blunt, however, the DNA may be A-tailed with DNA polymerase, and then T-tailed adaptors may be used.
  • the total volume is 25 ul. Incubate in a PCR machine with the following program: 72°C for 30 minutes, followed by a hold at 4°C.
  • the DNA should be diluted to a concentration of approximately 2 ng/ul in the final ligation solution, as the dilute solution will favour intramolecular ligation. Incubate at 16°C for 16 hours.
  • Restriction enzyme digest, e.g. with HindIII, according to the manufacturer's protocols
  • the total volume is 15 ul. Incubate at 16°C for 16 hours. Adjust the volume to 200 ul with nuclease-free water. Perform phenol chloroform pH 7.9 extraction with phase lock gel. Ethanol precipitate with GlycoBlue.
  • the BD PCR-Select Bacterial Genome Subtraction Kit may be used instead.
  • Example 3 Variation of diPETs or n-PETs having cohesive and non-palindromic, or blunt, ends
  • This variation is possible by dividing the sample after amplification (Maxiprep or rolling circle amplification or other) into two lots.
  • One lot may be treated with a combination of restriction enzymes, phosphatases and kinases that produce tags with one end that is ligatable, for example a blunt phosphorylated end or a phosphorylated end with an overhang.
  • the other end is not ligatable, for example, the other end is a blunt, dephosphorylated end or has either a phosphorylated or dephosphorylated end with an overhang that is not complementary to the first end.
  • the other lot may be treated with a combination of restriction enzymes, phosphatases and kinases to produce one end that may be ligated, for example a blunt phosphorylated end or a phosphorylated end with an overhang, and another end that cannot be ligated, for example a blunt dephosphorylated end or a phosphorylated or dephosphorylated end with an overhang that is not complementary to the first end, wherein the ligatable end, may be ligated to the ligatable end of the first lot.
  • phosphorylated, blunt ends may be ligated to each other, and phosphorylated, cohesive ends complementary to each other may be ligated to each other.
  • a digest may produce AA tails in one half, and another digest may produce TT tails in the other half, which can then be mixed in a 1:1 or other suitable ratio to result in dimerization to produce length-controlled diPETs.
  • n-PET consisting of 7 PETs
  • first separate the sample into 3 aliquots. Then prepare diPETs from two aliquots, and leave one aliquot as it is. With one of the aliquots of diPETs, prepare a 4-PET according to the n-PET method. Finally, combine aliquots: add the remaining diPET aliquot to the single PET aliquot to form a 3-PET, and then combine the 3-PET with the 4-PET.
  • Example 5 - diPETs that have 5' to 3' directionality in different orders
  • Treat the other lot with a combination of restriction enzymes, phosphatases, and kinases such that the 3' end is a cohesive, non-palindromic end or a blunt end, which may be ligated, while the 5' end cannot be ligated.
  • ChIP-PET is a method to identify DNA regions that interact with proteins such as those found in chromatin structures. This variation requires a different vector, pGIS3h (FIG. 5B, SEQ ID NO:15). When cut, this produces diPETs of about 88 base pairs and 4 residues (there are 2 CG residues on either end). There are no AA tails. In contrast, the other variations of the present invention use pGIS4a2 (FIG. 5A; SEQ ID NO:14). This produces diPETs of a total size of 80 base pairs and 4 residues (there are 2 AA residues on either end).
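The 7-PET example in the list above (a 4-PET joined to a 3-PET, each built from diPETs and single PETs) can be generalized as a simple halving plan. The following is a minimal sketch only: the function names `assembly_plan` and `count_pets` and the rule of splitting n into its larger and smaller halves are illustrative assumptions, not part of the disclosed method.

```python
# Hypothetical planner: decompose an n-PET into pairwise joins of smaller
# concatemers, mirroring the 7 = 4 + 3 example (4 = 2 + 2, 3 = 2 + 1).

def assembly_plan(n):
    """Return a nested tuple describing which pieces are ligated together."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return "PET"  # a single, unconcatenated PET
    larger = (n + 1) // 2   # assumed rule: split into the two closest halves
    smaller = n // 2
    return (assembly_plan(larger), assembly_plan(smaller))


def count_pets(plan):
    """Count the single PETs contained in a plan."""
    if plan == "PET":
        return 1
    return sum(count_pets(part) for part in plan)


if __name__ == "__main__":
    plan = assembly_plan(7)
    print(plan)
    print("PETs in plan:", count_pets(plan))
```

Each tuple in the plan corresponds to one ligation between two pieces that each carry one ligatable and one non-ligatable end.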

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Physics & Mathematics (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)

Abstract

The present invention provides a method of manipulating nucleic acids. In particular, it provides a method of length-controlled concatenation of nucleotide fragments, the method comprising: (a) providing at least two nucleotide fragments, wherein each fragment has one ligatable end and one non-ligatable end; and (b) allowing the two fragments to ligate at the ligatable ends to form an oligonucleotide comprising at least two concatenated nucleotide fragments. The present invention also provides an isolated oligonucleotide comprising at least two nucleotide fragments, wherein each fragment has at least one ligatable end and one non-ligatable end, and the fragments are ligated at the ligatable ends to form the oligonucleotide.

Description

NUCLEIC ACID CONCATENATION
Field of the Invention
[001] The present invention generally relates to the field of nucleic acids. Specifically, the present invention relates to concatenation of nucleic acids.
Background of the Invention
[002] One of the most important goals of the human genome project is to provide complete lists of genes for the genomes of human and model organisms. Complete genome annotation of genes relies on comprehensive transcriptome analysis by experimental and computational approaches. Ab initio predictions of genes must be validated by experimental data. However, due to the complexity and immense volume of transcripts expressed in the various developmental stages of an organism's life cycle, complete sequencing analysis of all different transcriptomes still remains unrealistic.
[003] In the past decade this problem has been overcome with a cDNA tagging method in which partial sequences that represent full transcripts are obtained. This strategy has been widely used for determining genes and characterizing transcriptomes.
[004] In the expressed sequence tag (EST) approach, cDNA clones are sequenced from 5' and/or 3' nucleotides (Adams, M., et al., 1991, Science, 252, 1651-1656). Each EST sequence read would generate, on average, a 500 bp tag per transcript. The number of identical or overlapping ESTs would indicate the relative level of gene expression activity. Though this is an effective approach to identifying genes, it is prohibitively expensive to tag every transcript in a transcriptome. In practice, sequencing usually ceases after 10,000 or fewer ESTs are obtained from a cDNA library where millions of transcripts might be cloned.
[005] To increase the efficiency of sequencing and counting large numbers of transcripts, Serial Analysis of Gene Expression (SAGE) (Velculescu, V. E., et al., 1995, Science, 270, 484-487) and the recent Massively Parallel Signature Sequencing (MPSS) technique (Brenner S, et al., 2000, Nature Biotechnology, 18, 630-634) were developed based on the observation that a short signature sequence (14-20 bp) of a transcript can be sufficiently specific to represent that gene.
[006] Experimentally, one short tag per transcript can be extracted from cDNA. Such short tags can be sequenced efficiently either by a concatenation tactic (as for SAGE), or by a hybridization-based methodology for MPSS. For example, in SAGE, multiple tags are concatenated into long DNA fragments and cloned for sequencing. Each SAGE sequence readout can usually reveal 20-30 SAGE tags. A modest SAGE sequencing effort of less than 10,000 reads will have significant coverage of a transcriptome. Transcript abundance is measured by simply counting the numerical frequency of the SAGE tags.
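The tag-counting idea behind SAGE can be illustrated with a short sketch. It assumes the tags have already been extracted from the sequence reads; the function name `tag_abundance` and the example tag strings are hypothetical and are not taken from the cited work.

```python
from collections import Counter


def tag_abundance(tags):
    """Return the relative frequency of each tag in a list of extracted tags."""
    counts = Counter(tags)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical short tags, for illustration only.
    observed = ["ACGTACGTACGTAC", "ACGTACGTACGTAC", "TTGCAATGCCGATA", "GGCATTACGATCCA"]
    for tag, freq in sorted(tag_abundance(observed).items()):
        print(tag, freq)
```

A tag seen more often than others is taken to represent a more abundant transcript, which is why ambiguous tags that map to several genes distort the estimate.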
[007] In theory, short DNA tags of about 20 bp can be specifically mapped to a single location within a complex mammalian genome and uniquely represent a transcript in the context of the whole transcriptome. However, in reality, there still exist a large number of "ambiguous" SAGE tags (14-21 bp) and MPSS tags (17 bp) that have multiple locations in a genome, and may be shared by many genes.
[008] A recent sequencing technology is that of "pyrosequencing" or "454" technology (Margulies et al, 2005). In recently published findings, a "454" sequencing run can simultaneously read 300,000 templates and achieve a 100-fold efficiency increase and 10-fold cost reduction compared with current sequencing instruments. However, each "454" sequencing read can only read about 100 bp, seriously limiting its potential because it is difficult to sequence large contiguous stretches of DNA.
[009] Hence, the main problems in the current art of genomic sequencing are ambiguity when using short nucleotide tags to represent gene transcripts, and the limitation of current multiplex sequencing technologies to reading short stretches of nucleotides.
Summary of the Invention
[0010] The present invention solves the problems mentioned above by providing a new method of manipulating nucleic acids. More specifically, the present invention relates to manipulation of nucleic acids. In particular, the invention relates to methods for the preparation of nucleotide fragments by concatenation.
[0011] In one aspect, the present invention provides a new method of concatenation of nucleotide fragments. In particular, there is provided a length-controlled concatenation of nucleotide fragments such that concatemers having a desired number of nucleotide fragments, or having a particular length, may be prepared. The present invention also provides molecules and components prepared by the method. The nucleotide fragment according to the invention may comprise at least one ditag and/or at least one tag. Accordingly, a concatenation of at least two nucleotide fragments may comprise at least two concatenated ditags and/or tags.
[0012] Accordingly, there is provided a method of length-controlled concatenation of nucleotide fragments, the method comprising: (a) providing at least two nucleotide fragments, wherein each fragment has one ligatable end and one non-ligatable end; and (b) allowing the two fragments to ligate at the ligatable ends to form at least one oligonucleotide comprising at least two concatenated nucleotide fragments.
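Why one non-ligatable end per fragment limits the product to a dimer can be seen in a toy simulation. This is a sketch under stated assumptions rather than a model of real ligation kinetics: the `Molecule` class and the `join` and `ligate_pool` functions are hypothetical, every ligatable end is assumed compatible with every other ligatable end, and circularization is ignored.

```python
import random
from dataclasses import dataclass


@dataclass
class Molecule:
    left: bool    # True if the left end can still be ligated
    right: bool   # True if the right end can still be ligated
    units: int    # number of nucleotide fragments in the molecule


def join(a, b):
    """Ligate a ligatable end of a to a ligatable end of b."""
    if not a.right:                      # flip a so its ligatable end faces right
        a = Molecule(a.right, a.left, a.units)
    if not b.left:                       # flip b so its ligatable end faces left
        b = Molecule(b.right, b.left, b.units)
    return Molecule(a.left, b.right, a.units + b.units)


def ligate_pool(pool, rng):
    """Keep joining random pairs until fewer than two molecules have a free ligatable end."""
    pool = list(pool)
    while True:
        open_idx = [i for i, m in enumerate(pool) if m.left or m.right]
        if len(open_idx) < 2:
            return pool
        i, j = rng.sample(open_idx, 2)
        merged = join(pool[i], pool[j])
        pool = [m for k, m in enumerate(pool) if k not in (i, j)] + [merged]


if __name__ == "__main__":
    rng = random.Random(0)
    controlled = ligate_pool([Molecule(False, True, 1) for _ in range(10)], rng)
    control = ligate_pool([Molecule(True, True, 1) for _ in range(10)], rng)
    print("one ligatable end :", sorted(m.units for m in controlled))
    print("two ligatable ends:", sorted(m.units for m in control))
```

In this model, fragments with one blocked end always stop at two units, while fragments with two ligatable ends keep growing, which corresponds to the uncontrolled concatemers that otherwise require gel size selection.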
[0013] In one embodiment, the method may further comprise the steps of treating the at least one oligonucleotide to produce at least one oligonucleotide having one ligatable end and one non-ligatable end, and allowing the oligonucleotide to ligate with a further oligonucleotide or a nucleotide fragment to form an oligonucleotide comprising more than two concatenated nucleotides. The method may be repeated one or more times to make at least one oligonucleotide with an increasing number of nucleotide fragments. According to one aspect, the repetition of concatenation yields a doubling of the number of concatenated nucleotide fragments.
[0014] The method may further comprise treating the at least one oligonucleotide to produce at least one oligonucleotide with two ligatable ends and allowing the oligonucleotide to self-circularize at the ligatable ends. The method may further comprise selecting the at least one circularized oligonucleotide and/or amplifying the oligonucleotide. The method may further comprise treating the circularized and/or amplified oligonucleotide to produce at least one oligonucleotide having one ligatable end and one non-ligatable end, and allowing at least two oligonucleotides to ligate at the ligatable ends to form a concatemer comprising at least two concatenated oligonucleotides.
[0015] In another embodiment, the method may further comprise: (a) treating the at least one oligonucleotide to produce at least one oligonucleotide having two ligatable ends compatible with each other, and allowing the oligonucleotide to self-circularize at its ligatable ends; (b) selecting the at least one self-circularized oligonucleotide; (c) optionally amplifying the selected oligonucleotide; (d) treating the oligonucleotide (from step b or c) to produce at least one oligonucleotide with one ligatable end and one non-ligatable end; and (e) allowing two oligonucleotides to ligate at the ligatable ends to form a concatemer comprising at least two concatenated oligonucleotides.
[0016] In another embodiment, the method comprises the steps of (a) providing at least one oligonucleotide, wherein the oligonucleotide has ligatable ends; (b) allowing the at least one oligonucleotide to self-circularize at its ligatable ends; (c) selecting the at least one self-circularized oligonucleotide; (d) treating the selected circularized oligonucleotide with at least one restriction enzyme to obtain at least one oligonucleotide with one ligatable end and one non-ligatable end; and (e) concatenating at least two oligonucleotides at the ligatable ends to form a concatemer of at least two oligonucleotides.
[0017] For the embodiments in this aspect of the invention, the nucleotide fragments, oligonucleotide(s) and/or concatemer(s) may be amplified. The amplification may be by bacterial amplification, by rolling circle amplification, and/or by polymerase chain reaction. The method may comprise repeating the steps one or more times to obtain concatemers having desired lengths and/or number of oligonucleotides or nucleotide fragments. The repeating may result in a doubling of the number of oligonucleotides in the concatemers. The ligatable end of each fragment may be a palindromic cohesive end. The ligatable end and/or the non-ligatable end may be located in at least one adaptor. The adaptor may be part of a plasmid or vector. The nucleotide fragment may comprise at least one ditag, the ditag comprising at least one first tag comprising a 5' terminus and at least one second tag comprising a 3' terminus of a polynucleotide. Each nucleotide fragment of the concatemer may have an orientation opposite to the orientation of a nucleotide fragment positioned upstream and/or downstream. The method may further comprise sequencing the concatemer. The sequencing may be by any suitable method, for example, by pyrosequencing.
[0018] In another aspect, the present invention provides an isolated concatemer (one oligonucleotide or at least two concatenated oligonucleotides) comprising at least two nucleotide fragments, wherein each fragment has at least one ligatable end and one non-ligatable end, and the fragments are ligated at the ligatable ends to form the concatemer. The ligatable ends may be palindromic cohesive ends. The fragment may comprise at least one ditag, the ditag comprising at least one first tag comprising a 5' terminus and at least one second tag comprising a 3' terminus of a polynucleotide. Each nucleotide fragment of the concatemer may have an orientation opposite to the orientation of a nucleotide fragment positioned upstream and/or downstream. The concatemer may be inserted into a plasmid or vector. The polynucleotide may be DNA or RNA.
Brief Description of the Drawings
[001] FIG. 1 illustrates the overview of one embodiment of the method of the present invention for preparing concatenated nucleotide fragments (concatemers). Each nucleotide fragment comprises a ditag. The ditag comprises a first tag comprising a 5' terminus (gray in the figure) and a second tag comprising a 3' terminus (black in the figure) of a polynucleotide (for example, a full-length cDNA). The ditag may have one end sticky or ligatable and the other end non-sticky (FIG. 1A) or blunt-ended (may be either ligatable or non-ligatable) (FIG. 1B). The diPET as shown in FIG. 1A may be further ligated to nucleotide fragments with complementary ends and the diPET shown in FIG. 1B may be further ligated to nucleotide fragments with ligatable (blunt-ended, phosphorylated) ends. The two concatenated nucleotide fragments obtained comprise two concatenated ditags. In the figure, it is shown that the first and second ditag of each two concatenated nucleotide fragments have opposite orientation. The ditags are suitable for sequencing by large scale parallel sequencing methods, such as "454" sequencing. The ditags may be prepared by using single paired-end ditag (PET) plasmids or from insertion of other DNA sequences wherein the 5' and 3' termini are of interest. Mme1 sites are present, but not shown on the plasmid.
[0019] FIG. 2 illustrates the overview of another embodiment of the method of the present invention for preparing concatemers comprising n-PETs (where n = 4, 8, 16, 32, ...), allowing the scaling up of the number of paired-end ditags (PETs). The oligonucleotides in the concatemers may be added in a precise manner as the capacity of the sequencing technology used increases. Two different types of adaptors are ligated to the ends of the diPET. These adaptors will contain the different restriction sites necessary for subsequent restriction digestion. The adaptors are ligated, such that only those diPETs with different adaptors ligated on will be circularized, and thus selected by an optional exonuclease treatment. Rolling circle amplification is performed to amplify the DNA. The DNA is then cut with the appropriate restriction enzymes to generate a sticky end and an end that is not sticky, such that ligation may be used to form an n-PET. The cycle may then be repeated as desired to generate larger n-PETs. Amplification and selection by PCR is also possible.
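The doubling per cycle and its relation to sequencing read length can be worked through numerically. The sketch below is illustrative only: the function names are hypothetical, and the figure of about 40 bp per PET is an assumption derived from the statement that a diPET of two PETs is about 80 bp; residual adaptor or border sequences are ignored.

```python
def pets_after_cycles(cycles, start=2):
    """Number of PETs per concatemer after a number of doubling cycles, starting from a diPET."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return start * 2 ** cycles


def largest_npet(read_limit_bp, pet_bp=40, start=2):
    """Largest n (reached by doubling from `start`) whose approximate length fits in one read."""
    n = start
    if n * pet_bp > read_limit_bp:
        return 0  # even the starting concatemer exceeds the read length
    while 2 * n * pet_bp <= read_limit_bp:
        n *= 2
    return n


if __name__ == "__main__":
    for k in range(4):
        n = pets_after_cycles(k)
        print(f"cycle {k}: {n}-PET, about {n * 40} bp")
    print("fits a read of about 100 bp:", largest_npet(100))
```

Under these assumptions a read of about 100 bp accommodates a diPET but not a 4-PET, which is consistent with the n-PET cycle being intended for use as the capacity of the sequencing technology increases.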
[0020] FIG.3 illustrates electroeluted PETs in a PAGE gel. Lanes 1 and 9: Invitrogen 25 bp ladder. Lane 2: Invitrogen 100 bp ladder. Lane 3: BseR1 and BamH1 cut PETs from control library. Lane 4: BseR1 and BamH1 cut PETs from experimental library. Lane 5: 2 ul Invitrogen Low Mass ladder. Lane 6: 4 ul Invitrogen Low Mass ladder. Lane 7: Blunted BseR1 and BamH1 cut PETs from control library. Lane 8: Blunted BseR1 and BamH1 cut PETs from experimental library.
[0021] FIG. 4 illustrates electroeluted diPETs in a PAGE gel. Lane 1: Invitrogen 25 bp ladder. Lane 2: Invitrogen 100 bp ladder. Lane 3: Ligation product of control library; as expected, this library formed concatemers. Lane 4: Ligation product of experimental library - this library formed length-controlled diPETs, as can be seen by the single clear, sharp band. Lane 5: 2 ul of Invitrogen Low Mass ladder. Lane 6: 4 ul of Invitrogen Low Mass ladder.
[0022] FIG. 5 illustrates two examples of vectors used in the method of the present invention. FIG. 5A is the pGIS4a2 vector and FIG. 5B is the pGIS3h vector.
[0023] FIG. 6 illustrates concatenation of ditags to obtain a concatemer of a desired length for sequencing.
Sequence numbering of primers, adaptors and vectors
SEQ ID NO:1: GsuI-oligo dT primer: 5'-GAGCTAGTTCTGGAGTTTTTTTTTTTTTTTTVN-3'
SEQ ID NO:2: GIS-(N)6 adapter upper strand: 5'-CTAAACTCGAGGCGGCCGCGGATCCGACNNNNNN-3'
SEQ ID NO:3: GIS-(N)6 adapter lower strand: 5'-p-GTCGGATCCGCGGCCGCCTCGAGTTT-3'
SEQ ID NO:4: GIS-(N)5 adapter upper strand: 5'-CTAAACTCGAGGCGGCCGCGGATCCGACGNNNNN-3'
SEQ ID NO:5: GIS-(N)5 adapter lower strand: 5'-p-GTCGGATCCGCGGCCGCCTCGAGTTT-3'
SEQ ID NO:6: palindromic upper strand: 5'-GTCGGATCCGAC-3'
SEQ ID NO:7: palindromic lower strand: 5'-GTCGGATCCGAC-3'
SEQ ID NO:8: n-PET TT-tailed adaptor (PMR 011) - upper strand:
5' GCTTGTAAGCTACTCCTCGATGTGCTGCAAGGCGATTAAG 3' (40 nt)
SEQ ID NO:9: n-PET TT-tailed adaptor (PMR 011) - lower strand:
3' TTCGAACATTCGATGAGGAGCTACACGACGTTCCGCTAATTC 5' (42 nt)
SEQ ID NO:10: n-PET TT-tailed adaptor (PMR 012) - upper strand:
5' GCTTGTAAGCTACTCCTCAGCGGATAACAATTTCACACAGG 3' (41 nt)
SEQ ID NO:11: n-PET TT-tailed adaptor (PMR 012) - lower strand:
3' TTCGAACATTCGATGAGGAGTCGCCTATTGTTAAAGTGTGTCC 5' (43 nt)
SEQ ID NO:12: PMR011:
5' GATGTGCTGCAAGGCGATTAAG 3' (22 nt)
5' ends are phosphorylated
SEQ ID NO:13: PMR012:
5' AGCGGATAACAATTTCACACAGG 3' (23 nt)
5' ends are phosphorylated
SEQ ID NO:14: pGIS4a2 sense(FIG. 5A) (3531 bp) See sequence listing.
SEQ ID NO:15: pGIS4a2 antisense(FIG. 5A) (3531 bp) See sequence listing.
SEQ ID NO:16: pGIS3h sense (FIG. 5B) (2765 bp) See sequence listing.
SEQ ID NO:17: pGIS3h antisense (FIG. 5B) (2765 bp) See sequence listing.
SEQ ID NO:18: diPET from pGIS4a diPETting sense (82 bp)
SEQ ID NO:19: diPET from pGIS4a diPETting antisense (82 bp)
SEQ ID NO:20: diPET from pGIS3h diPETting sense (90 bp)
SEQ ID NO:21: diPET from pGIS3h diPETting antisense (90 bp)
Detailed Description of the Invention
Definitions.
Restriction enzyme - A restriction enzyme (or restriction endonuclease) is an enzyme that cuts double-stranded DNA. The enzyme makes two incisions, one through each of the phosphate backbones of the double helix without damaging the bases. The chemical bonds cleaved by the enzymes may be reformed by other enzymes known as ligases, enabling restriction fragments obtained from different chromosomes or genes to be joined or spliced together, provided their ends are complementary or compatible. Type II enzymes recognize specific nucleic acid sequences (recognition sites) and cut DNA at defined positions close to or within their recognition sites. They produce discrete restriction fragments and distinct gel banding patterns. Type IIs enzymes cleave outside of their recognition sequence to one side. MmeI, as well as most of the type IIs restriction enzymes, produces variable end lengths. Dunn et al (2002) showed that MmeI can cut 18/20 or 19/21 bases away in a rough proportion of 1:1. Type III enzymes are also large combination restriction-and-modification enzymes. They cleave outside of their recognition sequences and require two such sequences in opposite orientations within the same DNA molecule to accomplish cleavage. Homing endonucleases are rare double-stranded DNases that have large, asymmetric recognition sites (12-40 base pairs) and coding sequences that are usually embedded in either introns (DNA) or inteins (proteins). Restriction enzymes may make cuts that leave either non-sticky (blunt) ends or sticky (ligatable) ends with overhangs. A sticky-end fragment can be ligated not only to the fragment from which it was originally cleaved, but also to any other fragment with a complementary, compatible, cohesive or sticky end. As such, ends produced by different enzymes may also be compatible. Many type II restriction enzymes cut palindromic DNA sequences. If a restriction enzyme cuts a non-degenerate palindromic cleavage site, all the ends produced are compatible. A "palindromic" sequence is found where the sequence on one strand reads the same in the opposite direction on the complementary strand, so that nucleic acid sequences cleaved to give palindromic cohesive ends can self-circularize when the two ends on the same strand mate. The meaning of "palindromic" in this context is different from its linguistic usage. For example, the sequence GTAATG is not a palindromic DNA sequence, while the sequence GTATAC is. Examples of restriction enzymes leaving cohesive or sticky ends include BamH1, EcoR1 and HindIII. An example of a restriction enzyme leaving blunt, non-cohesive or non-sticky ends is AluI. Under the present invention, an end of a nucleic acid strand is said to be ligatable or capable of being ligated if it has a complementary, compatible, cohesive or sticky end or a phosphorylated blunt end. An end of a nucleic acid strand is said not to be ligatable or not capable of being ligated if it and the other strand of nucleic acid both have dephosphorylated ends, or if it does not have an end to which another strand of nucleic acid is complementary, compatible, cohesive or sticky. Also, a restriction enzyme name (such as EcoR1) can also refer to the nucleic acid sequence or recognition site recognized by the enzyme, as readily understood in the context in which the enzyme name or recognition site appears.
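The definition of a palindromic DNA sequence given above (a sequence equal to its own reverse complement) can be checked with a few lines of code. This is a minimal sketch; the function names `reverse_complement` and `is_palindromic` are illustrative and handle only the unambiguous bases A, C, G and T.

```python
# Map each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand read in its own 5' to 3' direction."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the sequence reads the same 5' to 3' on both strands."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    for s in ["GTAATG", "GTATAC", "GGATCC", "GTCGGATCCGAC"]:
        print(s, is_palindromic(s))
```

GGATCC is the recognition site of BamH1, and GTCGGATCCGAC is the palindromic strand listed as SEQ ID NO:6 and SEQ ID NO:7.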
[0024] Nucleotide - a phosphoric ester of a nucleoside; the basic structural unit of nucleic acids (DNA or RNA). Nucleotides form base pairs - one of the pairs of chemical bases joined by hydrogen bonds that connect the complementary strands of a DNA molecule or of an RNA molecule that has two strands; the base pairs are adenine with thymine and guanine with cytosine in DNA and adenine with uracil and guanine with cytosine in RNA. Nucleotides may be joined with or concatenated with other nucleotides. The term nucleotide may be used interchangeably with the term nucleic acid. Each nucleotide possesses a 5' end and a 3' end and accordingly, a strand of nucleic acids may also possess a 5' end and a 3' end. The end regions of a strand of nucleic acids may be referred to as the 5' terminus and the 3' terminus respectively. Nucleic acid sequences are conventionally read in the 5' to 3' direction, which gives the orientation of the nucleotides. Short strands of nucleotides are referred to as oligonucleotides while longer strands are referred to as polynucleotides. Under the present invention, an oligonucleotide can comprise at least one nucleotide fragment, tag or ditag. A fragment is a length of nucleic acids obtained, derived or prepared from a longer length of nucleic acids. As such, a fragment can comprise at least one tag or ditag and can represent a larger nucleic acid molecule. Under the present invention, a polynucleotide can refer to a gene, a messenger RNA transcript of a gene, parts of a gene or a cDNA sequence representing a gene. With reference to a first oligonucleotide, a second oligonucleotide may be referred to as being "upstream" from it if the second oligonucleotide is positioned nearer to the 5' end of the first oligonucleotide, or "downstream" if the second oligonucleotide is nearer to the 3' end of the first oligonucleotide.
[0025] Concatemer - A concatemer is composed of at least two nucleotide monomer sequences linked end to end, optionally separated by a linker or spacer. For the purpose of the present invention, a concatemer comprises at least two tags, two ditags, two nucleotide fragments or two oligonucleotides prepared according to the method of the invention. In the present invention, two oligonucleotides may be concatenated such that the 5' to 3' orientation of one nucleotide fragment in an oligonucleotide is opposite to the orientation of an adjacent nucleotide fragment positioned upstream or downstream of it.
[0026] Plasmid - The term vector or recombinant vector refers to a plasmid, virus or other vehicle known in the art that has been manipulated by insertion or incorporation of the ditag genetic sequences. Such vectors contain a promoter sequence that facilitates the efficient transcription of the inserted sequence. The vector typically contains an origin of replication, a promoter, as well as specific genes which allow phenotypic selection of the transformed cells. Vectors suitable for use in the present invention include for example, pBlueScript (Stratagene, La Jolla, CA); pBC, pZErO-1 (Invitrogen, Carlsbad, CA) and pGEM3z (Promega, Madison, WI) or modified vectors thereof, as well as other similar vectors known to those of skill in the art. The pGEM vectors have also been disclosed in US 4,766,072, herein incorporated by reference. In the present invention, the plasmid pGIS4a2 (clone B4-1) (FIG. 5A) was used.
[0027] Obtain, derive, prepare - to use molecular biology and genetic engineering and manipulation techniques on biological material such as nucleic acids and proteins to confer upon the material certain desired characteristics. The terms obtain, derive and prepare may be used interchangeably under the present invention.
[0028] Amplification - increasing the copy number of nucleic acids. One method commonly used is that of polymerase chain reaction (PCR). Other amplification methods known to a skilled person such as bacterial amplification or rolling circle amplification may also be used.
[0029] Tag - A tag or signature is an identifiable sequence of nucleic acids. It may refer to either the 5'- or 3'-most terminal nucleic acid sequence (terminus; of any length but usually 18-20 bp) derived from any contiguous DNA region. The terms tag and signature may be used interchangeably under the present invention. Under the present invention, a single tag signature (about 20bp) from each of two nucleotide fragments may be ligated to form a "tag1-linker-tag2" (also referred to as "first tag-linker-second tag") paired end ditag (PET) structure. Another possible arrangement is a linker-tag-tag-linker structure where a linker flanks a tag (that is, a linker is positioned upstream and/or downstream to at least one of the tags).
[0030] Linker - A linker is an artificial sequence of nucleic acids, usually containing one or more restriction enzyme recognition sites.
[0031] Ditag - A short (usually 12-60 bp) strand of nucleotides comprising at least one tag or signature derived from a longer strand of nucleotides. A ditag may be prepared according to US 20050255501 and/or US 20050059022, the contents of which are herein incorporated by reference. A ditag may comprise either or both the 5' end region (also indicated as 5' tag) and 3' end region (also indicated as 3' tag) of a nucleic acid molecule. Under the present invention, a single tag signature (about 20bp) from each of two nucleotide fragments may be ligated to form a "tag1-linker-tag2" (also referred to as "first tag-linker-second tag") paired end ditag (PET) structure. A structure in which two paired end ditags (PETs) each comprise a 5' tag and a 3' tag of a nucleotide fragment, for example "PET1-linker-PET2", is called a diPET.
[0032] Sequencing - The methods used to determine the order of constituents in a biopolymer, in this case, a nucleic acid. Sequencing techniques used include the Sanger method and modified variations thereof, as well as pyrosequencing or the "454 method" of sequencing.
[0033] In the following description, details are provided to describe the embodiments of the present invention. It shall be apparent to one skilled in the art, however, that the invention may be practiced without such details. Some of the details may not be described at length so as not to obscure the invention.
[0034] For the performance of the methods of the present invention for a particular embodiment, any description disclosed for the purpose of carrying out other embodiments of this invention may also be used and is herein incorporated by reference, in particular the technique(s), reagents, experimental conditions, restriction sites, enzymes, vectors, primers, and the like. It will be evident to any skilled person how to adapt techniques and material disclosed for the other embodiments to the present embodiment of the invention.
[0035] Bibliographic references mentioned in the present specification are for convenience listed in the form of a list of references and added at the end of the examples.
[0036] Standard molecular biology techniques known in the art and not specifically described were generally followed as described in standard molecular biology reference books such as Molecular Cloning: A Laboratory Manual by Sambrook and Russell, Third Edition, 2001, published by Cold Spring Harbor Laboratory Press.
Description
[0037] The present invention relates to a new method of manipulating nucleic acids. More specifically, the present invention relates to manipulation of nucleic acids by concatenating them. In particular, the invention relates to methods for the preparation of ditags and/or tags representing polynucleotides by concatenation.
[0038] In one aspect, the present invention provides a method for length-controlled concatenation of signature tags representing polynucleotides such that concatemers having a desired number of ditags and/or tags or having a particular length may be prepared. The present invention also provides molecules and components prepared by the method.
[0039] Accordingly, there is provided a method of length-controlled concatenation of nucleotide fragments, the method comprising: (a) providing at least two nucleotide fragments, wherein each fragment has one ligatable end and one non-ligatable end; and (b) allowing the two fragments to ligate at the ligatable ends to form at least one oligonucleotide comprising at least two concatenated nucleotide fragments (FIG. 1). This method may be repeated one or more times.
[0040] The method above may further comprise treating the at least one obtained oligonucleotide to produce at least one oligonucleotide with two ligatable ends and allowing the oligonucleotide to self-circularize at the ligatable ends. The method may further comprise selecting the at least one circularized oligonucleotide and/or amplifying the oligonucleotide. The method may further comprise treating the circularized and/or amplified oligonucleotide to produce at least one oligonucleotide having one ligatable end and one non-ligatable end, and allowing at least two oligonucleotides to ligate at the ligatable ends to form a concatemer comprising at least two oligonucleotides. This method may be repeated one or more times. In one aspect, the repetition of concatenating results in a doubling of the number of concatenated nucleotide fragments.
[0041] In another embodiment, there is provided a method (see FIG. 2) comprising the steps of (a) providing at least one oligonucleotide comprising at least one nucleotide fragment, preferably comprising at least two nucleotide fragments, wherein the oligonucleotide has ligatable ends; (b) allowing the at least one oligonucleotide to self-circularize at its ligatable ends; (c) selecting the at least one self-circularized oligonucleotide; (d) treating the selected circularized oligonucleotide with at least one restriction enzyme to obtain at least one oligonucleotide with one ligatable end and one non-ligatable end; and (e) concatenating at least two oligonucleotides at the ligatable ends to form a concatemer of at least two oligonucleotides or at least two nucleotide fragments. In particular, the provided oligonucleotide in step (a) may comprise at least two concatenated nucleotide fragments, and the obtained concatenated oligonucleotide in step (e) comprises at least two four-concatenated nucleotide fragments (as shown in FIG. 2). Each nucleotide fragment may comprise at least one ditag, the ditag comprising at least one first tag comprising a 5' terminus and at least one second tag comprising a 3' terminus of a polynucleotide. In one embodiment, the first ditag of a nucleotide fragment has the opposite orientation to the second ditag of the same nucleotide fragment. Accordingly, each nucleotide fragment of the oligonucleotide has the opposite orientation to a nucleotide fragment positioned upstream and/or downstream. This method may be repeated one or more times. In one aspect, each repetition of concatenating results in a doubling of the number of concatenated nucleotide fragments.
Multiplex Sequencing
[0042] The "454" sequencing or pyrosequencing technology has been developed for genomic DNA sequencing, but each read can cover only 100 bp. This restriction places a limit on the number of tags that can be used. When increasing the number of tags that can be accommodated, the length of each tag will necessarily have to be shortened, concomitantly decreasing specificity and increasing ambiguity. In contrast, if ditags are used in the method of the present invention to prepare length-controlled concatemers, each read will reveal hundreds or even thousands of base pairs of information as demarcated by the ditags. Furthermore, using diPETs as templates, the sequencing throughput and capacity of "454" sequencing are increased two-fold. Hence, the method of the present invention, when applied to preparing concatemers for sequencing and coupled with multiplex sequencing methods, is at least 500-fold more efficient than any of the currently existing methods for DNA sequencing analysis. This method of length-controlled concatenation can be extended for a number of cycles to prepare concatemers having multiple ditags and may be applied to generate desired lengths of concatemers of any kind of tag fragment for technologies such as SAGE or "454" multiplex sequencing. In addition, the method of the present invention may be applied to cDNA sequencing and ChIP DNA fragment sequencing as well as other sequencing technologies such as SAGE or MPSS.
[0043] To overcome the problems and limitations of the current art, the present invention provides, in one embodiment, a method for length-controlled concatenation of signature tags representing polynucleotides such that concatemers having desired number of tags or ditags and having a particular length may be prepared. Such concatemers may be used in various sequencing technologies.
Tags and ditags
[0044] The ditags of the present invention may be prepared or obtained from the paired end ditagging (PET) strategy (US 20050255501). For the purpose of the present application, a tag is a fragment obtained from a nucleic acid molecule and represents the polynucleotide from which the tag was obtained or derived. The polynucleotide which it is intended to shrink or represent may be RNA, mRNA, genomic DNA, full-length cDNA, or cDNA.
[0045] Under the present invention, two tags or fragments that are present in an oligonucleotide of the present invention may also be called a ditag. Like fragments or tags, a ditag is shorter than the original nucleic acid molecule from which it originates or which it represents. Preferably, the ditag must be much shorter than the original nucleic acid molecule. As a consequence of the "shrinking", the ditag may essentially comprise either or both the 5' end region (also indicated as 5' tag) and 3' end region (also indicated as 3' tag) of the original nucleic acid molecule. Hence, the portion of the original nucleic acid molecule that is between or inside the 5' tag and 3' tag is not included in the ditag. The ditag according to the invention retains the most informative features of the original nucleic acid molecule, namely the start and the end signatures of the nucleic acid.
[0046] The 5' tag and 3' tag forming the ditag may have the same or different size. Preferably, they have the same number of nucleotides. The ditag may be of any size, but needs to be meaningful and advantageous over the size of the parental sequence from which it is derived. The preferred size of a tag or ditag is determined by genome complexity. For a bacterial genome a tag from about 8 bp to about 16 bp may be sufficient whereas for a complex genome like the human genome, a 16-20 bp tag (which results in a 32-40bp ditag) may be considered. In general, the size of the ditag is from about 12-60 bp.
[0047] For the purpose of the present application, the terms 5'-terminus, 5'-end and 5'-tag are equivalent to each other and may be used interchangeably. In the same way, the terms 3'-terminus, 3'-end and 3'-tag are equivalent to each other and may be used interchangeably. In an original nucleic acid molecule or polynucleotide, or portion inside a nucleic acid molecule or polynucleotide that one intends to reduce or represent, each 5'-end and 3'-end represents a region or portion closest to the extremity and farthest from the middle region of the nucleic acid molecule or polynucleotide. With a 5' or 3' terminus of a polynucleotide, it is understood that any region, fragment or whole piece of a polynucleotide that comprises the actual 5' or 3' terminus of the polynucleotide is included.
[0048] Each ditag comprises sufficient information to characterize a specific polynucleotide. Hence, the ditag is representative of the structure and identity of the polynucleotide.
Concatenating oligonucleotides, nucleotide fragments, ditags or tags
[0049] While fragments or tags of sufficient length may be prepared, obtaining concatemers of a predetermined length involves technical difficulties. Current methods for tag concatenation randomly generate concatemers with a range of lengths: monomers, dimers, trimers and so forth. In the current art, such concatemers have to be run in an electrophoresis gel to separate the concatemers, and the concatemers of the desired size are excised from the gel - a laborious task. This technique is inefficient and requires large amounts of input DNA.
[0050] The present invention provides a method of length-controlled concatenation to generate oligonucleotides of a predetermined length. The present invention achieves this by preparing fragments, ditags or tags with a ligatable end and a non-ligatable end. Using this technique, a compatible, cohesive or sticky end on one fragment or tag will join or ligate to another sticky end on another fragment or tag. When this happens, the non-sticky ends will not permit further ligation and concatenation stops. This technique is further illustrated in the examples below. Should ligatable ends not be found readily in the fragments or tags, suitable adaptors possessing the appropriate restriction enzyme recognition sites may be ligated to the fragments or tags. The ligatable ends may be palindromic ends.
[0051] Alternatively, if two different enzymes cannot be used in order to yield one end that is palindromic and another end that will not self-ligate, the enzymes may be used sequentially, and after the first restriction digest, one end may be "blocked" by dephosphorylation or other means, such as attachment to a solid substrate.
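The stopping behaviour described in paragraph [0050] can be illustrated with a toy model. The Python sketch below is only an illustration under simplifying assumptions: fragments are reduced to a count of units and a count of ends that can still ligate, pairing is deterministic rather than random, and self-circularization is ignored. The class and function names are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    units: int           # number of tags or ditags carried
    ligatable_ends: int  # ends still able to ligate (0, 1 or 2)


def ligate(a: Fragment, b: Fragment) -> Fragment:
    """Join two fragments at one ligatable end each."""
    if a.ligatable_ends < 1 or b.ligatable_ends < 1:
        raise ValueError("both fragments need a free ligatable end")
    return Fragment(a.units + b.units, a.ligatable_ends + b.ligatable_ends - 2)


def concatenate(pool):
    """Pair fragments until fewer than two can still ligate (toy model)."""
    pool = list(pool)
    while True:
        free = [i for i, f in enumerate(pool) if f.ligatable_ends > 0]
        if len(free) < 2:
            return pool
        i, j = free[0], free[1]
        product = ligate(pool[i], pool[j])
        pool = [f for k, f in enumerate(pool) if k not in (i, j)] + [product]


# One ligatable end per fragment: concatenation stops at dimers.
controlled = concatenate([Fragment(1, 1)] * 8)
# Two ligatable ends per fragment: nothing limits the chain length.
uncontrolled = concatenate([Fragment(1, 2)] * 8)
print(sorted(f.units for f in controlled))
print(sorted(f.units for f in uncontrolled))
```

In this model the fragments with a single ligatable end all finish as dimers with no free end, whereas fragments with two ligatable ends keep growing, which is the contrast the paragraph draws with conventional random concatenation.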
The diPET embodiment
[0052] In this embodiment, tags with the single PET structures in the plasmids are flanked by the restriction enzyme recognition sites for a cohesive palindromic enzyme and an enzyme leaving a blunt end. For example, the two sites may be BamHI (B) at one side and BseRI (Bs) at the other side (FIG. 1A), such that the BamHI cut leaves palindromic cohesive ends compatible with each other, while the BseRI cut is designed to leave an AA residual or any non-palindromic sequence, which does not match itself. The PETs may be amplified, whether by bacterial amplification, rolling circle amplification, or other amplification methods (FIGS. 1A and 1B). The PETs are then first cut with one restriction enzyme, in this embodiment BseRI, followed by cutting with a different restriction enzyme, in this embodiment BamHI.
[0053] Released PETs may be purified by any suitable method such as gel purification. Upon exposure to another similarly-prepared PET, any two of the BamHI cohesive ends will find each other and mate, resulting in oligonucleotide concatemers having a dimer PET or diPET structure with a non-palindromic end on each side of the oligonucleotide (FIG. 1B). These non-palindromic ends prevent further ligation with other PETs, stopping concatenation.
[0054] This embodiment gives rise to a diPET oligonucleotide concatemer made of two PETs of about 80bp, which is below the maximum capacity of the current "454" sequencing system.
[0055] While the preferred embodiment uses PETs, any tag can also be turned into "diPETs" by this method. It is also preferable to use at least one type IIs restriction enzyme, such as BseRI, as this will minimize the length of the border sequences. As long as the cut sites of the type IIs restriction enzyme are different, just one type IIs restriction enzyme site may be used.
The n-PET embodiment
[0056] This method allows the creation of DNA sequences consisting of n numbers of PETs (where n = 4, 8, 16, 32....), allowing scaling up of the number of PETs added in a length-controlled manner to suit the capacity of the particular sequencing technology used.
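The scaling described in paragraph [0056] follows from each round of concatenation doubling the number of PETs. The Python sketch below works through that arithmetic as a rough planning aid. It assumes about 40 bp per PET (so that a diPET is about 80 bp, as stated for the diPET embodiment) and ignores adaptor and linker sequence; the function names and the 400 bp read length in the usage lines are hypothetical.

```python
def pets_after_rounds(rounds: int) -> int:
    """Number of PETs per concatemer after the given number of doubling rounds."""
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    return 2 ** rounds


def max_rounds(read_length_bp: int, pet_length_bp: int = 40) -> int:
    """Largest number of rounds whose concatemer still fits in one read.

    Assumes every PET is pet_length_bp long and that adaptors and linkers
    add no length, which understates the real concatemer size.
    """
    if pet_length_bp <= 0 or read_length_bp < pet_length_bp:
        raise ValueError("read must be at least one PET long")
    rounds = 0
    while pets_after_rounds(rounds + 1) * pet_length_bp <= read_length_bp:
        rounds += 1
    return rounds


print([pets_after_rounds(r) for r in range(1, 6)])  # diPET, 4-PET, 8-PET, ...
print(max_rounds(100))  # 100 bp read, as quoted for "454" sequencing
print(max_rounds(400))  # hypothetical longer read length
```

With a 100 bp read the calculation allows a single round (a diPET of about 80 bp), which matches the diPET embodiment; longer reads would accommodate further rounds.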
[0057] Referring to FIG. 2, two different types of adaptors (labeled as A and B) are ligated to the ends of the diPET. These adaptors contain the different restriction sites necessary for restriction digest later. The adaptors are ligated such that only those diPETs with different adaptors ligated will be circularized by self-circularization, and thus selected by an exonuclease treatment. Rolling circle amplification is performed to amplify the DNA. Alternatively, amplification and selection by PCR is also possible as the adaptor sequences are known. The DNA is then cut with the appropriate restriction enzymes to generate a palindromic end and a non-palindromic end, such that ligation may be used to form a 4-PET oligonucleotide. The cycle may then be repeated as desired to generate larger oligonucleotide concatemers comprising n-PETs.
[0058] Alternatively, two different types of adaptors may be ligated to the ends of the diPET, following which PCR is performed. Adaptors which are compatible will snap together, preventing PCR from taking place, allowing only adaptors which are different to be amplified. The DNA may then be cut with the appropriate restriction enzymes, and the cycle repeated if desired.
[0059] It may be seen, in the diPET and n-PET embodiments, that concatenation in these embodiments results in the 5' to 3' orientation of a nucleotide fragment, tag or ditag being opposite to another nucleotide fragment, tag or ditag adjacent (that is, positioned upstream or downstream) to it (FIGS. 1 and 2). Also seen in this embodiment is the doubling of the number of diPETs concatenated with each repeat of the ligation step.
[0060] While the embodiments of the method of the present invention above make use of fragments or tags that have ligatable ends, the method of the present invention can also make use of fragments that do not have ligatable ends by adding suitable adaptors to them. Alternatively, if two different enzymes cannot be used to give one end that is palindromic and another end that will not match itself, the enzymes may be used sequentially, and after the first restriction digest, one end may be "blocked" by dephosphorylation or other means, such as attachment to a solid substrate.
[0061] The embodiments of the method for fragments, ditags or tags may also be applied to oligonucleotides or oligonucleotide concatemers. For example, suitable adaptors may also be ligated to oligonucleotide concatemers to allow them to ligate to another nucleotide fragment, tag or oligonucleotide with a compatible cohesive end.
[0062] Accordingly, in another embodiment, the method may further comprise the steps of treating the at least one oligonucleotide to produce at least one oligonucleotide having one ligatable end and one non-ligatable end, and allowing the oligonucleotide to ligate with a further oligonucleotide or nucleotide fragment to form a concatemer comprising at least two oligonucleotides or at least one oligonucleotide and at least one nucleotide fragment.
[0063] In another embodiment, the method may further comprise: (a) treating the at least one oligonucleotide to produce at least one oligonucleotide having two ligatable ends compatible with each other, and allowing the oligonucleotide to self-circularize; (b) selecting the at least one self-circularized oligonucleotide; (c) optionally amplifying the selected oligonucleotide; (d) treating the oligonucleotide from the previous step to produce at least one oligonucleotide with one ligatable end and one non-ligatable end; and (e) allowing two oligonucleotides to ligate at the ligatable ends to form a concatemer comprising at least two oligonucleotides.
[0064] There is also provided an isolated oligonucleotide comprising at least two nucleotide fragments, wherein each fragment has at least one ligatable end and one non-ligatable end, and the fragments are ligated at the ligatable ends to form the oligonucleotide. Preferably, the ligatable ends are palindromic cohesive ends. The concatemer or concatenated oligonucleotide(s) according to the invention comprises at least one nucleotide fragment, the fragment comprising at least one ditag, the ditag comprising at least one first tag comprising a 5' terminus and at least one second tag comprising a 3' terminus of a polynucleotide. The polynucleotide may be a full-length cDNA or one or more exons. Accordingly, the ditag may be representative of the full-length cDNA. In one embodiment, the concatemer or oligonucleotide according to the invention has each nucleotide fragment (or ditag) in an orientation opposite to the orientation of a nucleotide fragment (or ditag) positioned upstream and/or downstream. The concatemer, oligonucleotide, nucleotide fragment, ditag or tag according to the invention may be inserted into a plasmid or vector. Accordingly, there is also provided a plasmid or vector comprising at least one concatemer, oligonucleotide, nucleotide fragment, ditag or tag according to the invention. The plasmid or vector may be inserted in a host cell.
[0065] There is also provided a kit for concatenating oligonucleotides, nucleotide fragments, ditags and/or tags according to any embodiment of the invention, comprising at least one of a restriction enzyme, at least one nucleotide fragment, ditag or tag, optionally a vector, and any reagents as herein disclosed (for instance as described in the examples) for the reaction of concatenation. The kit may further comprise illustration and/or information pertaining to the use of the kit.
[0066] There is also provided a library comprising at least a concatemer, concatenated oligonucleotides, concatenated nucleotide fragments, concatenated ditags and/or concatenated tags according to any embodiment of the invention.
Variations
[0067] For the embodiments in this aspect of the invention, many variations in the method are possible. For example, the tags or nucleotide fragments, and/or oligonucleotides or concatemers may be amplified. The amplification may be by bacterial amplification, by rolling circle amplification, and/or by polymerase chain reaction. The method may comprise repeating the steps one or more times to obtain concatemers of desired lengths or number of oligonucleotides. The repeating may result in a doubling of the number of oligonucleotides in the concatemers. The ligatable end of each fragment, tag or ditag may be a palindromic cohesive end. The ligatable end and/or the non-ligatable end may be located in at least one adaptor. The adaptor may be part of a plasmid or vector. The nucleotide fragment may comprise at least one ditag, the ditag comprising at least one first tag comprising a 5' terminus and at least one second tag comprising a 3' terminus of a polynucleotide. Each nucleotide fragment or tag of the concatemer may have an orientation opposite to the orientation of a nucleotide fragment positioned upstream and/or downstream. The method may further comprise sequencing the concatemer. The sequencing may be by any suitable method, for example, by pyrosequencing.
[0068] Having now generally described the invention, the same will be more readily understood through reference to the following examples that are provided by way of illustration and are not intended to be limiting of the present invention.
Examples
Example 1 - BseRI Linearization of Single PET plasmid
Start from the Maxiprep amplification of the vector performed after transforming single PET plasmid DNA (described in US 20050255501 and US 20050059022). Any suitable vector may be used to clone and amplify the sequence of interest, for example, pGIS4a2 (FIG. 5A; SEQ ID NO:14) or pGIS3h (FIG. 5B; SEQ ID NO:15). Usually, the yield is about 1.2 mg from 10 Q-trays, with a concentration of about 400 ng/ul.
1000ug single PET plasmid DNA 2500ul
10x NEBuffer 2 (New England Biolabs) 400ul
100x BSA (New England Biolabs) 40ul
4U/ul BseRI (New England Biolabs) 1000ul (4 fold excess)
Nuclease-free water to 4000ul
For more efficient enzyme digestion, it is advisable to aliquot the reaction mix in tubes of 100ul each.
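The volumes in the digestion mix above follow from the stock concentrations, and the arithmetic can be sketched in Python as a cross-check. The function and key names are hypothetical, and the sketch assumes that a "1 fold" enzyme amount means 1 unit per ug of DNA, which is how the 4 fold excess figure works out for 1000 ul of 4 U/ul BseRI on 1000 ug of plasmid.

```python
def digest_volumes(dna_ug, dna_ng_per_ul, enzyme_u_per_ul, fold_excess,
                   total_ul, buffer_x=10, bsa_x=100):
    """Return reagent volumes (ul) for a restriction digest.

    Assumption: fold_excess is expressed in enzyme units per ug of DNA.
    """
    dna_ul = dna_ug * 1000 / dna_ng_per_ul          # ug converted to ng
    buffer_ul = total_ul / buffer_x                 # 10x buffer to 1x
    bsa_ul = total_ul / bsa_x                       # 100x BSA to 1x
    enzyme_ul = dna_ug * fold_excess / enzyme_u_per_ul
    water_ul = total_ul - (dna_ul + buffer_ul + bsa_ul + enzyme_ul)
    if water_ul < 0:
        raise ValueError("components exceed the total reaction volume")
    return {
        "dna_ul": dna_ul,
        "buffer_ul": buffer_ul,
        "bsa_ul": bsa_ul,
        "enzyme_ul": enzyme_ul,
        "water_ul": water_ul,
    }


# Values from the BseRI digestion of 1000 ug plasmid at about 400 ng/ul.
mix = digest_volumes(dna_ug=1000, dna_ng_per_ul=400, enzyme_u_per_ul=4,
                     fold_excess=4, total_ul=4000)
print(mix)
print("100 ul aliquots:", 4000 // 100)
```

The same helper can be reused for other digests in the protocol by changing the stock concentrations and total volume.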
[0069] BseRI digest of single PET plasmid maxiprep - Incubate digestion at 37°C for 3 hours maximum. Perform phenol-chloroform extraction at pH 7.9 using Eppendorf 50ml phase-lock gel.
[0070] Take note of the minimum and maximum volumes required for the various sizes of Phase-Lock gels available. To adjust the volume to the minimum volume required, add up to 1000 μl of nuclease-free water.
50ml phase-lock gel: Minimum volume: 5ml Maximum volume: 20ml
[0071] Isopropanol precipitation was performed next due to the large volume involved:
BseRI digested single PET plasmid 500ul
3M NaOAc pH5.2 (Ambion) 50ul
Glycoblue (Ambion) 5ul
Isopropanol (Sigma) 500ul
Total 1055ul
Incubate at -20°C for an hour. Centrifuge at 13,000 RPM for 30mins at 4°C (using an Eppendorf 5415R centrifuge; the same centrifuge may be used through the rest of the protocol) and wash once with 75% ethanol (prepared from 100% ethanol, from Merck). Resuspend the pellet DNA in a final volume of 1500ul Qiagen Elution Buffer (EB).
[0072] For quality checking, run 400ng of BseRI cut and uncut single PET plasmid on a 1% agarose gel at 110V for 40mins using the medium gel. The presence of supercoiled single-PET plasmid would lead to the generation of PETs that have BamHI cohesive ends on both the 5' and 3' ends. These would then concatenate and the final product would be a mixture of BamHI PET concatemers and diPETs of interest. Repeat the BseRI digestion if the majority of the supercoiled single PET plasmid remains uncut (FIG. 3).
[0073] If the gel picture shows that digestion is incomplete, i.e. supercoiled single PET plasmid bands are still present, it is advisable to set up another BseRI digestion.
[0074] Dephosphorylation of single PET plasmid after BseRI digestion
Note: This step is only crucial for diPET generation via the pGIS3h vector. There is no harm in doing this for the pGIS4a vector but it is unnecessary. Save about 400ng of this sample, i.e. phosphorylated BseRI linearized single-PET plasmid, for subsequent electrophoresis to check the quality of dephosphorylation.
BseRI cut single PET plasmid 1500ul (~1000ug)
10x Antarctic Phosphatase buffer (New England Biolabs) 600ul
5U/ul Antarctic Phosphatase (New England Biolabs) 1000ul (5 fold excess)
Nuclease-free water 2900ul
Total 6000ul
[0075] Perform isopropanol precipitation:
Dephosphorylated BseRI linearized single-PET plasmid 500ul
3M NaOAc pH 5.2 (Ambion) 50ul
Glycoblue (Ambion) 5ul
Isopropanol (Sigma) 500ul
Total 1055ul
Incubate at -20°C for an hour. Centrifuge at 13,000 RPM for 30mins at 4°C. Wash 1x with 75% ethanol. Resuspend the pellet in a final volume of 500ul EB (Qiagen).
Take about 400ng and run on a 1% agarose gel together with the previously saved, phosphorylated BseRI linearized single PET plasmid.
[0076] BamHI digestion to release single PETs
Phosphorylated/Dephosphorylated BseRI cut single PET plasmid 500ul (~1000ug)
10x Unique BamHI buffer (New England Biolabs) 100ul
100x BSA (New England Biolabs) 10ul
20U/ul BamHI (New England Biolabs) 100ul
Nuclease free water 290ul
Total 1000ul
[0077] Aliquot 100ul per reaction tube. Incubate at 37°C overnight. Perform ethanol precipitation using glycoblue to reduce the volume for ease of gel loading.
Prepare 5 tubes of the following:
BamHI digested plasmid 200 ul
100% ethanol (Merck) 600 ul
3M NaOAc pH 5.2 (Ambion) 20 ul
1 M MgCl2 (0.0225x of the sample volume) (Ambion) 4.5 ul
Glycoblue (Ambion) 2 ul
Incubate at -80°C for at least 30 minutes, followed by a 30 min centrifugation at 4°C. Wash with 75% ethanol once. Resuspend the pellet in a final volume of 350ul EB. For gel loading, add 100ul of loading dye and load 45ul/well and not more than 100ug of DNA/well on a 2% agarose gel. Electrophorese at 80V for 1.5 hours.
[0078] Electroelution
Add 800ul of sterile milli-Q water to Fermentas Eluta Tubes to hydrate the membrane. These will be used subsequently for electroelution.
[0079] Excise the ~40bp PETs from the gel, visualizing at 365nm UV. Discard the 800ul sterile milli-Q water and place the cut gel slice into the Eluta Tube. Use only one gel slice per Eluta Tube. Fill the tube with 1x TAE buffer without Ethidium Bromide and electroelute at 90V for 30mins, followed by reversed polarity for 1 min, in the cold room. Ensure that the Eluta Tube is free of air bubbles after adding the buffer.
[0080] Collect the eluted PETs in multiple 1.5ml Eppendorf tubes and centrifuge at 13,000 RPM for 10mins at 4°C. This is to pellet pieces of agarose that might be present. Pipette off as much of the supernatant as possible and carry out isopropanol precipitation:
Eluted DNA 500ul
3M NaOAc (Ambion) 50ul
1M MgCl2 (0.0225x of the sample volume, Ambion) 11.25ul
Glycoblue (Ambion) 5ul
Isopropanol (Sigma) 500ul
Total 1066.25ul
Note: The 1M MgCl2 is to aid in the precipitation of small fragments of DNA. Incubate at -20°C for an hour. Centrifuge at 13,000 RPM for 30mins at 4°C. Wash 1x with 75% ethanol. Resuspend the PETs in a final volume of 12ul EB.
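The isopropanol precipitation table scales with the sample volume. As a minimal sketch, assuming the ratios implied by that table (0.1 volume of 3M NaOAc, 0.0225 volume of 1M MgCl2, 1 volume of isopropanol and a fixed 5ul of Glycoblue), the volumes for another sample size could be worked out as follows. The function name and the fixed Glycoblue default are illustrative assumptions and are not part of the protocol.

```python
def isopropanol_precipitation_mix(sample_ul, glycoblue_ul=5.0):
    """Return reagent volumes (in ul) for one isopropanol precipitation.

    Ratios are inferred from the 500ul example in the protocol table;
    they are an assumption for any other sample volume.
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    mix = {
        "sample": float(sample_ul),
        "3M NaOAc": 0.1 * sample_ul,       # 50ul per 500ul of sample
        "1M MgCl2": 0.0225 * sample_ul,    # stated as 0.0225x of the sample volume
        "Glycoblue": float(glycoblue_ul),  # fixed amount in the table
        "isopropanol": 1.0 * sample_ul,    # 500ul per 500ul of sample
    }
    mix["total"] = sum(mix.values())
    return mix


if __name__ == "__main__":
    for reagent, volume in isopropanol_precipitation_mix(500).items():
        print(f"{reagent}: {round(volume, 2)} ul")
```

For a 500ul sample this reproduces the volumes listed in the table, which is a quick check that the ratios were read correctly.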
[0081] 4. Quantification of PET DNA
PET DNA can be quantified on a mini 4-20% PAGE gel (Invitrogen). Run 0.2ul of PET DNA together with 25bp ladder and Low Mass DNA ladder (both from Invitrogen). The latter will help in the estimation of the PET DNA amount. Currently, 2ul and 4ul of Low Mass ladder are used in the quantification. The preparation of each ladder, per loading, is shown below:
[0082] 25bp ladder preparation
1ul of 1 ug/ul 25bp ladder + 9ul of EB + 2ul loading dye. Total 12ul
[0083] Low Mass ladder preparation
[0084] 2ul Low Mass Ladder Preparation
2ul of Low Mass ladder + 8ul of EB + 2ul loading dye. Total 12ul
[0085] 4ul Low Mass Ladder Preparation
4ul of Low Mass ladder + 6ul of EB + 2ul loading dye. Total 12ul
Run the PET DNA to be quantified together with these ladders on a 4-20% mini PAGE gel for 30mins at 200V in TBE buffer (Ambion).
Stain the PAGE gel with Sybr Green I (Molecular Probes) in TBE buffer (final concentration should be 1X, Ambion) for 10-15mins.
Note: If there is no difference in the intensity of the DNA bands for quantification, it is advisable to load less of the DNA. A rough estimation would be sufficient as the final DiPET would be run on Agilent Bioanalyzer 500/1000 kit for quantification and sizing for '454' pyrosequencing.
[0086] DiPETting Reaction
PET DNA (at least 5 ug) 7 ul
10x Spermidine buffer (prepared in-house; below) 1 ul
5U/ul T4 DNA ligase (Invitrogen) 1ul
Nuclease-free water to 10ul
The 10x ligation buffer with spermidine is made up of:
60mM Tris-HCl pH7.5 (Ambion)
60mM MgCl2 (Ambion)
50mM NaCl (Ambion)
1 mg/ml BSA (New England Biolabs)
70mM Beta-mercaptoethanol (Sigma)
1 mM ATP (Invitrogen)
20mM DTT (Invitrogen)
10mM spermidine (Sigma)
Incubate at 16°C overnight. Adjust volume to 200ul with nuclease-free water and perform phenol chloroform pH 7.9 extraction.
[0087] Perform ethanol precipitation as follows:
DiPETs 200ul
3M NaOAc (Ambion) 20ul
1M MgCl2 (Ambion) 4.5ul
Glycoblue (Ambion) 2.2ul
Abs EtOH (Merck) 800ul
Incubate at -20°C for an hour. Centrifuge at 13,000 RPM for 30mins at 4°C. Wash 1x with 75% ethanol. Resuspend diPETs in 20ul EB.
[0088] After the diPETting reaction, run 1ul of the DiPET DNA on Agilent Bioanalyzer DNA 500/1000 kit according to the manufacturer's protocol to estimate the concentration and quantity of the DiPETs available. Then, proceed to any preferred method of sequencing (such as '454' sequencing).
Example 2 - The n-PET embodiment as example
[0089] Ligation of two different adaptors. The adaptors may contain source-identifying tags if desired. The adaptors should preferably have one end that is complementary to each other, and ideally, on the other end, the adaptors have sticky ends complementary to sticky ends of the DNA such that ligation will be easy and the adaptors will not ligate to themselves. It is best if the DNA already contains sticky ends, like diPETs. If the DNA is blunt, however, the DNA may be A-tailed with DNA polymerase, and then T-tailed adaptors may be used.
[0090] A-tailing (not required if diPETs are used)
10 mM dATP 0.5 ul
ExTaq polymerase 0.5 ul
10x ExTaq buffer 2.5 ul
2 ug of sample 20 ul
Nuclease-free Water 1.5 ul
The total volume is 25 ul. Incubate in a PCR machine with the following program: 72°C for 30 minutes, followed by a hold at 4°C.
Perform phenol chloroform extraction and ethanol precipitation with glycoblue. Resuspend in 12 ul of Elution Buffer.
[0091] Ligation of adaptors
DNA (approx 200-1000 ng) say 5 ul
10x ligase buffer (with spermidine) 1 ul
T4 DNA ligase (5 U/ul) 1 ul
Nuclease-free Water to 10 ul
Incubate at 16°C for 1-2 hours. Ethanol precipitate DNA.
[0092] Phosphorylation of adaptors
DNA say 20 ul
10x T4 PNK buffer 5 ul
10 mM ATP solution 5 ul
T4 Polynucleotide kinase (3U/ul) 1 ul
Nuclease-free Water to 50 ul
Incubate at 37°C for 30 minutes, then heat-inactivate at 70°C for 5 minutes.
[0093] Circularization of adaptors
Approximately 200 ng DNA say 50 ul
5x Invitrogen ligation buffer (with PEG) 20 ul
T4 DNA ligase (5U/ul) 1 ul
Nuclease-free Water to 100 ul
The DNA should be diluted to a concentration of approximately 2 ng/ul in the final ligation solution; the dilute solution will favour intramolecular ligation. Incubate at 16°C for 16 hours.
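The dilution target can be turned into a small volume calculation. The sketch below assumes the 5x ligation buffer always makes up one fifth of the final volume and that 1 ul of ligase is used, as in the table; the function and parameter names are hypothetical and serve only to illustrate the arithmetic.

```python
def circularization_setup(dna_ng, dna_volume_ul, target_ng_per_ul=2.0, ligase_ul=1.0):
    """Return reaction volumes (in ul) that dilute the DNA to the target concentration."""
    if dna_ng <= 0 or dna_volume_ul <= 0 or target_ng_per_ul <= 0:
        raise ValueError("amounts and volumes must be positive")
    total_ul = dna_ng / target_ng_per_ul   # final volume that gives the target ng/ul
    buffer_ul = total_ul / 5.0             # 5x buffer diluted to 1x
    water_ul = total_ul - dna_volume_ul - buffer_ul - ligase_ul
    if water_ul < 0:
        raise ValueError("DNA stock is too dilute to reach the target concentration")
    return {
        "DNA": float(dna_volume_ul),
        "5x ligation buffer": buffer_ul,
        "T4 DNA ligase": float(ligase_ul),
        "water": water_ul,
        "total": total_ul,
    }


if __name__ == "__main__":
    # 200 ng of DNA supplied in 50 ul, as in the table
    print(circularization_setup(200, 50))
```

With 200 ng of DNA in 50 ul, the calculation gives a 100 ul reaction with 20 ul of buffer, matching the table.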
[0094] Exonuclease treatment
Adapter-Ligated DNA say 2 ul
Lambda Exonuclease 1 ul
Exonuclease I 1 ul
10x Lambda Exonuclease buffer top up to 5 ul if volume of ligated material is less than 5 ul.
Nuclease-free Water To 50 ul
Incubate at 37°C for 1 hour.
Perform phenol chloroform extraction and ethanol precipitation with glycoblue. Wash once with 75% ethanol. Resuspend in 12 ul of Elution Buffer. No gel purification is necessary but, if desired, it may be performed as follows.
[0095] Gel purification
Gel purify the circularized adaptor-ligated DNA by adding 0.2 volumes of bromophenol blue gel loading buffer. Load 60 ul per well of a medium sized 2% agarose gel. Electrophorese at 80V for 1.5 hours. Extract circularized mate-pairs with a sharp scalpel, with minimal UV light exposure.
Use Qiagen Gel extraction kit if DNA is larger than 150 bp. Use GeBA-flex Elutatubes if DNA is smaller than 100 bp; perform according to manufacturers' protocols.
[0096] Restriction enzyme digest (e.g., HindIII, according to the manufacturer's protocol)
[0097] Ligation reaction
BamHI/BseRI cut single PET 12 ul
10x ligase buffer with spermidine 1.5 ul
T4 DNA ligase (5U/ul) 1 ul
Nuclease-free Water 0.5 ul
The total volume is 15 ul. Incubate at 16°C for 16 hours. Adjust the volume to 200 ul with Nuclease-free Water. Perform phenol chloroform pH 7.9 extraction with phase lock gel. Ethanol precipitate with glycoblue.
[0098] Quality Control and Quantitation with a PAGE gel
Load approximately 100 ng of sample on a 4-20% gradient PAGE gel, together with Takara Wide Range DNA ladders and Invitrogen Low Mass DNA ladders or other ladders as required for quantitation. Load 1 ul and 2 ul of the Invitrogen Low Mass DNA ladders for more accurate quantitation (FIG. 4).
Repeat to obtain desired length of concatemer.
[0099] PCR modification
If PCR is used for amplification, no exonuclease treatment should be performed.
The BD PCR-Select Bacterial Genome Subtraction Kit may be used instead.
[00100] Example 3 - Variation of diPETs or n-PETs having cohesive and non-palindromic, or blunt, ends
This variation is possible by dividing the sample after amplification (Maxiprep or rolling circle amplification or other) into two lots. One lot may be treated with a combination of restriction enzymes, phosphatases and kinases that produce tags with one end that is ligatable, for example a blunt phosphorylated end or a phosphorylated end with an overhang. In this lot, the other end is not ligatable, for example, the other end is a blunt, dephosphorylated end or has either a phosphorylated or dephosphorylated end with an overhang that is not complementary to the first end.
[00101] The other lot may be treated with a combination of restriction enzymes, phosphatases and kinases to produce one end that may be ligated, for example a blunt phosphorylated end or a phosphorylated end with an overhang, and another end that cannot be ligated, for example a blunt dephosphorylated end or a phosphorylated or dephosphorylated end with an overhang that is not complementary to the first end, wherein the ligatable end may be ligated to the ligatable end of the first lot.
[00102] Thus, phosphorylated, blunt ends may be ligated to each other, and phosphorylated, cohesive ends complementary to each other may be ligated to each other. For example a digest may produce AA tails in one half, and another digest may produce TT tails in the other half, which can then be mixed in a 1:1 or other suitable ratio to result in dimerization to produce length-controlled diPETs.
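Whether two cohesive ends can pair comes down to reverse complementarity of their overhangs. The following is a minimal sketch, assuming both overhangs are written 5' to 3' and have the same polarity (both 5' overhangs or both 3' overhangs); phosphorylation state is not modelled, and the function names are illustrative only.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def can_anneal(overhang_a, overhang_b):
    """True if two single-stranded overhangs can base-pair with each other."""
    return overhang_a.upper() == reverse_complement(overhang_b)


def is_palindromic(overhang):
    """True if an overhang can pair with another copy of itself."""
    return can_anneal(overhang, overhang)


if __name__ == "__main__":
    print(can_anneal("AA", "TT"))   # AA tails against TT tails
    print(can_anneal("AA", "AA"))   # two fragments from the AA lot
    print(is_palindromic("GATC"))   # the overhang left by BamHI
```

Under this model the AA and TT tails pair with each other but not with themselves, which is why fragments from the same lot should not dimerize, whereas a palindromic overhang such as GATC is self-compatible.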
[00103] Example 4 - n-PETs with multiples that are not of 2, 4, 8, 16...
[00104] First, divide the sample after amplification into any number of aliquots. Then perform the n-PET procedure with these aliquots, repeating any number of times on these aliquots, to end with the production of a blunt, phosphorylated end or a phosphorylated end with an overhang, and another end that cannot be ligated in each aliquot. Ligate suitable aliquots to each other in successive rounds to generate n-PETs with the desired length.
[00105] For example, to generate an n-PET consisting of 7 PETs, first separate the sample into 3 aliquots. Then prepare diPETs for two aliquots, and leave one aliquot as it is. With one of the aliquots of diPETs, prepare a 4-PET according to the n-PET method. Finally, combine aliquots: add the remaining diPET aliquot to the single PET aliquot to form a 3-PET, and then combine the 3-PET with the 4-PET. These steps all require the use of the adapter ligation and selection steps as previously discussed in the n-PET protocol.
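The 7-PET example follows the binary decomposition 7 = 1 + 2 + 4: each aliquot is doubled up to a power of two and the aliquots are then joined from smallest to largest. A minimal sketch of that planning step is given below; the function name and the smallest-first joining order are assumptions chosen to match the example, and other joining orders would be equally valid.

```python
def plan_npet(n):
    """Plan an n-PET as power-of-two blocks plus the joins that combine them.

    Returns (blocks, joins): blocks lists the 2^k-PET aliquots to prepare,
    joins lists (left, right, product) sizes in the order they are ligated.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # One block per set bit of n, smallest first (7 -> 1, 2, 4).
    blocks = [1 << i for i in range(n.bit_length()) if (n >> i) & 1]
    joins = []
    current = blocks[0]
    for block in blocks[1:]:
        joins.append((current, block, current + block))
        current += block
    return blocks, joins


if __name__ == "__main__":
    blocks, joins = plan_npet(7)
    print("aliquots to prepare:", blocks)
    for left, right, product in joins:
        print(f"ligate {left}-PET with {right}-PET to form {product}-PET")
```

For n = 7 this yields a single PET, a diPET and a 4-PET, joined first into a 3-PET and then into the 7-PET, as in the paragraph above.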
[00106] Example 5 - diPETs that have 5' to 3' directionality in different orders
[00107] First, divide the sample after amplification into two lots. Then treat one lot with a combination of restriction enzymes, phosphatases and kinases such that the 5' end is a cohesive, non-palindromic end or a blunt end, which may be ligated, while the 3' end cannot be ligated. Treat the other lot with a combination of restriction enzymes, phosphatases, and kinases such that the 3' end is a cohesive, non-palindromic end or a blunt end, which may be ligated, while the 5' end cannot be ligated. Next, combine the two lots and treat with ligase. Fragments from the same lot should not ligate to each other, as their ends are not compatible with each other. However, fragments from different lots should ligate to each other, as their ends have been selected to be cohesive, and thus compatible, with each other.
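The length control described here can be illustrated with a toy model in which each fragment carries two ends, each either a ligatable overhang or a blocked end. This is only a sketch under simplifying assumptions: ends are represented by an overhang string or None, enzymatic details and blunt ends are ignored, and the class and function names are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def compatible(a, b):
    """True if both ends are ligatable and their overhangs are reverse complements."""
    return a is not None and b is not None and a == b.translate(_COMPLEMENT)[::-1]


class Fragment(NamedTuple):
    units: int                                  # number of PETs in the fragment
    ends: Tuple[Optional[str], Optional[str]]   # overhang, or None if not ligatable


def ligate(x, y):
    """Yield every product formed by joining one end of x to one end of y."""
    for i in (0, 1):
        for j in (0, 1):
            if compatible(x.ends[i], y.ends[j]):
                # The joined ends are consumed; the two outer ends remain.
                yield Fragment(x.units + y.units, (x.ends[1 - i], y.ends[1 - j]))


def products(pool, max_rounds=10):
    """Repeat ligation on the pool until no new species appear."""
    seen = set(pool)
    for _ in range(max_rounds):
        new = {p for x in seen for y in seen for p in ligate(x, y)} - seen
        if not new:
            break
        seen |= new
    return seen


lot_a = Fragment(1, ("AA", None))   # one ligatable end, one blocked end
lot_b = Fragment(1, ("TT", None))
result = products([lot_a, lot_b])
largest = max(fragment.units for fragment in result)
print(largest)
```

In this model the only new species is a two-unit fragment with both outer ends blocked, so the reaction cannot proceed beyond a dimer. Fragments with two ligatable ends would instead keep growing until the round limit is reached.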
[00108] Example 6 - Chromatin Interaction Precipitation diPETting (ChIP-PET).
[00109] ChIP-PET is a method to identify DNA regions that interact with proteins such as those found in chromatin structures. This variation requires a different vector, pGIS3h (FIG. 5B, SEQ ID NO:15). When cut, this produces diPETs of about 88 base pairs and 4 residues (there are 2 CG residues on either end). There are no AA tails. In contrast, the other variations of the present invention use pGIS4a2 (FIG. 5A; SEQ ID NO: 14). This produces diPETs of a total size of 80 base pairs and 4 residues (there are 2 AA residues on either end).
[00110] Although the present invention has been described in detail with reference to examples above, it is understood that various modifications can be made without departing from the spirit of the invention. Accordingly, the invention is limited only by the following claims. All cited patents, patent applications and publications referred to in this application are herein incorporated by reference in their entirety.
References
Adams, M., et al., 1991, Science, 252, 1651-1656
Brenner S, et al., 2000, Nature Biotechnology, 18, 630-634
Dunn et al (2002) Genome Research 12(11):1756-1765
Jongeneel et al., 2003, Proc. Natl. Acad. Sci. USA 100: 4702-4705
Li and Chandrasegaran, Proc. Nat. Acad. Sciences USA 90:2764-8, 1993
Margulies M. et al., Nature, 31 July 2005.
Strausberg, R.L., et al., 1999, Science, 286: 455-457
Velculescu, V. E., et al., 1995, Science, 270, 484-487
US 4,766,072
US 20050255501
US 20050059022

Claims

1. A method of length-controlled concatenating nucleotide fragments, the method comprising:
(a) providing at least two nucleotide fragments, wherein each fragment has one ligatable end and one non-ligatable end; and
(b) allowing the two fragments to ligate at the ligatable ends to form at least one oligonucleotide comprising at least two concatenated nucleotide fragments.
2. The method according to claim 1, wherein the ligatable end of each fragment is a palindromic cohesive end.
3. The method according to claim 1 or 2, wherein the ligatable end and/or the non-ligatable end is located in at least one adaptor.
4. The method according to claim 3, wherein the adaptor is part of a plasmid or vector.
5. The method according to any one of claims 1 to 4, wherein the nucleotide fragments are amplified.
6. The method according to claim 5, wherein the amplification is by bacterial amplification, by rolling circle amplification, and/or by polymerase chain reaction.
7. The method according to any one of claims 1 to 6, further comprising the steps of treating the at least one oligonucleotide to produce at least one oligonucleotide having one ligatable end and one non-ligatable end, and allowing the oligonucleotide to ligate with a further oligonucleotide or a nucleotide fragment to form an oligonucleotide comprising more than two concatenated nucleotide fragments.
8. The method according to claim 7, wherein the steps are repeated one or more times.
9. The method according to claim 7 or 8, wherein the ligatable end of each oligonucleotide or of the nucleotide fragment is a palindromic cohesive end.
10. The method according to any one of claims 1 to 6, further comprising treating the at least one oligonucleotide to produce at least one oligonucleotide with two ligatable ends and allowing the oligonucleotide to self-circularize at the ligatable ends.
11. The method according to claim 10, wherein the ligatable ends are palindromic cohesive ends.
12. The method according to claim 10, wherein the ligatable ends are obtained from two adaptors, each adaptor linked to each end of the oligonucleotide.
13. The method according to any one of claims 10 to 12, further comprising selecting the at least one circularized oligonucleotide and/or amplifying the oligonucleotide.
14. The method according to claim 13, further comprising treating the at least one circularized and/or amplified oligonucleotide to produce at least one oligonucleotide having one ligatable end and one non-ligatable end, and allowing the oligonucleotide to ligate with at least a further oligonucleotide at their ligatable ends to form at least two concatenated oligonucleotides.
15. The method according to claim 1, further comprising:
(a) treating the at least one oligonucleotide to produce at least one oligonucleotide having two ligatable ends, and allowing the oligonucleotide to self-circularize at its ligatable ends;
(b) selecting the at least one self-circularized oligonucleotide;
(c) optionally amplifying the selected oligonucleotide;
(d) treating the oligonucleotide to produce at least one oligonucleotide with one ligatable end and one non-ligatable end; and
(e) allowing the oligonucleotide to ligate with a further oligonucleotide at their ligatable ends to form at least two concatenated oligonucleotides.
16. The method according to claim 15, comprising repetition of the steps (a) to (e) one or more times.
17. The method according to any one of claims 1 to 16, wherein the at least one nucleotide fragment comprises at least one ditag, the ditag comprising at least one first tag comprising a 5' terminus and at least one second tag comprising a 3' terminus of a polynucleotide.
18. The method according to any one of claims 1 to 17, wherein each nucleotide fragment of the concatemer has an orientation opposite to the orientation of a nucleotide fragment positioned upstream and/or downstream.
19. The method according to any one of claims 1 to 18, further comprising sequencing the concatenated nucleotide fragments.
20. The method according to claim 19, wherein the sequencing is by pyrosequencing and/or by serial analysis of gene expression.
21. A method of length-controlled concatenating nucleotide fragments, the method comprising the steps of:
(a) providing at least one oligonucleotide comprising at least one nucleotide fragment, wherein the oligonucleotide has ligatable ends;
(b) allowing the at least one oligonucleotide to self-circularize at its ligatable ends;
(c) selecting the at least one self-circularized oligonucleotide;
(d) treating the selected circularized oligonucleotide with at least one restriction enzyme to obtain at least one oligonucleotide with one ligatable end and one non-ligatable end; and
(e) concatenating at least two oligonucleotides at the ligatable ends to form a concatenated oligonucleotide comprising at least two nucleotide fragments.
22. The method according to claim 21, wherein the provided oligonucleotide in step (a) comprises at least two concatenated nucleotide fragments, and the obtained concatenated oligonucleotide in step (e) comprises at least four concatenated nucleotide fragments.
23. The method according to claim 21 or 22, wherein the nucleotide fragment comprises at least one ditag, the ditag comprising at least one first tag comprising a 5' terminus and at least one second tag comprising a 3' terminus of a polynucleotide.
24. The method according to any one of claims 21 to 23, wherein the ligatable ends are palindromic cohesive ends.
25. The method according to any one of claims 21 to 24, wherein step (e) concatenation is repeated one or more times.
26. The method according to claim 25, wherein each repeat of concatenation results in a doubling of the number of concatenated nucleotide fragments.
27. The method according to any one of claims 21 to 26, wherein the ligatable ends are obtained from adaptors.
28. The method according to any one of claims 21 to 27, wherein step (c) further comprises amplification of the oligonucleotide.
29. The method according to claim 28, wherein the amplification is by rolling circle amplification, and/or by polymerase chain reaction.
30. The method according to any one of claims 21 to 29, the method further comprising sequencing the concatenated oligonucleotide.
31. The method according to any one of claims 21 to 30, wherein each nucleotide fragment of the oligonucleotide has orientation opposite to the orientation of a nucleotide fragment positioned upstream and/or downstream.
32. An isolated oligonucleotide comprising at least two nucleotide fragments, wherein each fragment has at least one ligatable end and one non-ligatable end, and the fragments are ligated at the ligatable ends to form the oligonucleotide.
33. The oligonucleotide according to claim 32, wherein the ligatable ends are palindromic cohesive ends.
34. The oligonucleotide according to claim 32 or 33, wherein the fragment comprises at least one ditag, the ditag comprising at least one first tag comprising a 5' terminus and at least one second tag comprising a 3' terminus of a polynucleotide.
35. The oligonucleotide according to claim 34, wherein each nucleotide fragment of the oligonucleotide has orientation opposite to the orientation of a nucleotide fragment positioned upstream and/or downstream.
36. The oligonucleotide according to claim 34 or 35, wherein the oligonucleotide is inserted into a plasmid or vector.
PCT/SG2007/000159 2006-06-09 2007-06-04 Nucleic acid concatenation WO2007142608A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP07748704.9A EP2032721B1 (en) 2006-06-09 2007-06-04 Nucleic acid concatenation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11/449,872 US20080124707A1 (en) 2006-06-09 2006-06-09 Nucleic acid concatenation
US11/449,872 2006-06-09

Publications (1)

Publication Number Publication Date
WO2007142608A1 true WO2007142608A1 (en) 2007-12-13

Family

ID=38801745

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SG2007/000159 WO2007142608A1 (en) 2006-06-09 2007-06-04 Nucleic acid concatenation

Country Status (5)

Country Link
US (1) US20080124707A1 (en)
EP (1) EP2032721B1 (en)
SG (2) SG10201500691UA (en)
TW (1) TW200815605A (en)
WO (1) WO2007142608A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090156431A1 (en) * 2007-12-12 2009-06-18 Si Lok Methods for Nucleic Acid Mapping and Identification of Fine Structural Variations in Nucleic Acids
US8263367B2 (en) * 2008-01-25 2012-09-11 Agency For Science, Technology And Research Nucleic acid interaction analysis
US8829172B2 (en) * 2011-03-11 2014-09-09 Academia Sinica Multiplex barcoded paired-end diTag (mbPED) sequencing approach and ITS application in fusion gene identification
KR102310441B1 (en) 2013-09-05 2021-10-07 더 잭슨 래보라토리 Compositions for rna-chromatin interaction analysis and uses thereof
US11001868B2 (en) * 2017-08-11 2021-05-11 Global Life Sciences Solutions Operations UK Ltd Cell-free protein expression using double-stranded concatameric DNA

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5695937A (en) * 1995-09-12 1997-12-09 The Johns Hopkins University School Of Medicine Method for serial analysis of gene expression
US5866330A (en) * 1995-09-12 1999-02-02 The Johns Hopkins University School Of Medicine Method for serial analysis of gene expression
JP3441899B2 (en) * 1996-11-01 2003-09-02 理化学研究所 How to make a full-length cDNA library
US6136537A (en) * 1998-02-23 2000-10-24 Macevicz; Stephen C. Gene expression analysis
US6054276A (en) * 1998-02-23 2000-04-25 Macevicz; Stephen C. DNA restriction site mapping
DE19822287C2 (en) * 1998-05-18 2003-04-24 Switch Biotech Ag Cloning vector, its production and use for the analysis of mRNA expression patterns
DE60045796D1 (en) * 1999-09-01 2011-05-12 Whitehead Biomedical Inst TOTAL CHROMOSOME ANALYSIS OF PROTEIN-DNS INTERACTIONS
JP2003516150A (en) * 1999-12-08 2003-05-13 ジェンセット Full-length human cDNA encoding a cryptic secretory protein
US6613520B2 (en) * 2000-04-10 2003-09-02 Matthew Ashby Methods for the survey and genetic analysis of populations
US20020025561A1 (en) * 2000-04-17 2002-02-28 Hodgson Clague Pitman Vectors for gene-self-assembly
AU2001280875A1 (en) * 2000-07-28 2002-02-13 The Johns Hopkins University Serial analysis of transcript expression using long tags
JP2004097158A (en) * 2002-09-12 2004-04-02 Kureha Chem Ind Co Ltd METHOD FOR PRODUCING cDNA TAG FOR IDENTIFICATION OF EXPRESSION GENE AND METHOD FOR ANALYZING GENE EXPRESSION BY USING THE cDNA TAG
EP2159285B1 (en) * 2003-01-29 2012-09-26 454 Life Sciences Corporation Methods of amplifying and sequencing nucleic acids
US20060024681A1 (en) * 2003-10-31 2006-02-02 Agencourt Bioscience Corporation Methods for producing a paired tag from a nucleic acid sequence and methods of use thereof

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4766072A (en) 1985-07-17 1988-08-23 Promega Corporation Vectors for in vitro production of RNA copies of either strand of a cloned DNA sequence
US20040146866A1 (en) * 2003-01-21 2004-07-29 Guoliang Fu Quantitative multiplex detection of nucleic acids
US20050059022A1 (en) 2003-09-17 2005-03-17 Agency For Science, Technology And Research Method for gene identification signature (GIS) analysis
US20050255501A1 (en) 2003-09-17 2005-11-17 Agency For Science, Technology And Research Method for gene identification signature (GIS) analysis
WO2006003721A1 (en) * 2004-07-02 2006-01-12 Kabushiki Kaisha Dnaform Method for preparing sequence tags

Non-Patent Citations (11)

* Cited by examiner, † Cited by third party
Title
ADAMS, M. ET AL., SCIENCE, vol. 252, 1991, pages 1651 - 1656
BRENNER S ET AL., NATURE BIOTECHNOLOGY, vol. 18, 2000, pages 630 - 634
CHUM W.W.Y. ET AL.: "Modification of LongSAGE for obtaining and cloning long concatemers", BIOTECHNIQUES, vol. 39, no. 5, 2005, pages 637 - 640, XP008090548 *
DUNN ET AL., GENOME RESEARCH, vol. 12, no. 11, 2002, pages 1756 - 1765
JONGENEEL ET AL., PROC. NATL. ACAD. SCI. USA, vol. 100, 2003, pages 4702 - 4705
LI; CHANDRASEGARAN, PROC. NAT. ACAD. SCIENCES USA, vol. 90, 1993, pages 2764 - 8
MARGULIES M. ET AL., NATURE, 31 July 2005 (2005-07-31)
SAMBROOK; RUSSELL: "Molecular Cloning: A Laboratory Manual", 2001, COLD SPRING HARBOR LABORATORY PRESS
STRAUSBERG, R.L. ET AL., SCIENCE, vol. 286, 1999, pages 455 - 457
VELCULESCU, V. E. ET AL., SCIENCE, vol. 270, 1995, pages 484 - 487
ZHANG Z. ET AL.: "Mapping of transcription start sites in Saccharomyces cerevisiae using 5' SAGE", NUCLEIC ACIDS RESEARCH, vol. 33, no. 9, 2005, pages 2838 - 2851, XP008090942 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11629377B2 (en) 2017-09-29 2023-04-18 Evonetix Ltd Error detection during hybridisation of target double-stranded nucleic acid
WO2020249563A1 (en) * 2019-06-13 2020-12-17 Global Life Sciences Solutions Usa Llc Expression of products from nucleic acid concatemers
CN113950530A (en) * 2019-06-13 2022-01-18 环球生命科技咨询美国有限责任公司 Expression of nucleic acid concatemer products

Also Published As

Publication number Publication date
EP2032721A4 (en) 2010-06-02
US20080124707A1 (en) 2008-05-29
SG172673A1 (en) 2011-07-28
EP2032721B1 (en) 2017-03-29
EP2032721A1 (en) 2009-03-11
SG10201500691UA (en) 2015-04-29
TW200815605A (en) 2008-04-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 07748704; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
REEP Request for entry into the european phase (Ref document number: 2007748704; Country of ref document: EP)
WWE Wipo information: entry into national phase (Ref document number: 2007748704; Country of ref document: EP)