WO2009148560A2 - Methods and compositions for nucleic acid sequencing - Google Patents

Methods and compositions for nucleic acid sequencing

Publication number
WO2009148560A2
Authority
WO
WIPO (PCT)
Prior art keywords
primer
adapter
double stranded
cdna
cap
Prior art date
Application number
PCT/US2009/003331
Other languages
French (fr)
Other versions
WO2009148560A3 (en)
WO2009148560A8 (en)
Inventor
Mikhail V Matz
Elisha Meyer
Galina Aglyamova
Original Assignee
Board Of Regents, The University Of Texas System
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Board Of Regents, The University Of Texas System

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844Nucleic acid amplification reactions
    • C12Q1/6853Nucleic acid amplification reactions using modified primers or templates
    • C12Q1/6855Ligating adaptors
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/10Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1096Processes for the isolation, preparation or purification of DNA or RNA cDNA Synthesis; Subtracted cDNA library construction, e.g. RT, RT-PCR
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869Methods for sequencing

Definitions

  • the present invention relates in general to the field of nucleic acid sequencing.
  • nucleic acids encode the genome
  • many diseases are associated with particular DNA sequences.
  • Tremendous amounts of resources have been allocated to identify and correlate DNA sequence polymorphisms with a diseased state.
  • sequence polymorphisms include insertions, deletions, or substitutions of nucleotides in one sequence relative to a second sequence.
  • genome sequencing has become an increasingly critical tool for diagnosis, therapy and prevention of illnesses and, eventually, the targeted modification of the human genome.
  • Development of rapid and sensitive nucleic acid sequencing methods utilizing automated DNA sequencers has revolutionized modern molecular biology. Analysis of entire genomes of plants, fungi, animals, bacteria, and viruses is now possible with a concerted effort by a series of machines and a team of technicians.
  • Base sequencing of deoxyribonucleic acid and ribonucleic acid is one of the most important analytical techniques in biotechnology, the pharmaceutical industry, food industry, medical diagnostics and other fields of application.
  • a DNA sequence polymorphism analysis is performed by isolating DNA from an individual, manipulating the isolated DNA by digesting the DNA with restriction enzymes and/or amplifying a subset of sequences in the isolated DNA and examining the manipulated DNA.
  • Commonly used procedures for analyzing DNA include electrophoretic-based separation analyses such as agarose or polyacrylamide gel electrophoresis. DNA sequences are typically inserted, or loaded on gels and subjected to an electric field. Because DNA has a uniform negative charge, DNA will migrate through the gel based on properties including sequence length and relative sizes.
  • nucleic acid sequencing systems and methods have become available.
  • United States Patent Number 5,972,693 provides methods by which biologically derived DNA sequences in a mixed sample or in an arrayed single sequence clone can be determined and classified without sequencing. The methods are based on the presence of carefully chosen target subsequences, typically 4 to 8 bases in length, in a sample DNA sequence together with DNA sequence databases containing lists of sequences likely to be present in the sample to determine a sample sequence.
  • the method uses restriction endonucleases to recognize target subsequences to cut the sample sequence. Then, chosen recognition moieties are ligated to the cut fragments, the fragments are amplified, and the experimental observation made.
  • PCR: polymerase chain reaction
  • the patent discloses a methodology that provides positive confirmation that nucleic acids, possessing putatively identified sequence predicted to generate observed GeneCalling™ signals, are actually present within the sample from which the signal was originally derived.
  • the putatively identified nucleic acid fragment within the sample possesses 3'- and 5'-ends with known terminal subsequences.
  • the method in the '868 patent includes: contacting nucleic acid fragments in a sample in amplifying conditions with (i) a nucleic acid polymerase; (ii) "regular" primer oligonucleotides having sequences comprising hybridizable portions of the known terminal subsequences; and (iii) a "poisoning" oligonucleotide primer, the poisoning primer having a sequence comprising a first subsequence that is a portion of the sequence of one of the known terminal subsequences and a second subsequence that is a hybridizable portion of the putatively unidentified sequence which is adjacent to the one known terminal subsequence, where the nucleic acids amplified with the poisoning primer are distinguishable upon detection from nucleic acids amplified only with the regular primers; separating the products of the contacting step; and detecting a sequence if the nucleic acids amplified
  • United States Patent Number 7,244,567 teaches methods of sequencing both the sense and antisense strands of DNA with blocked and unblocked sequencing primers. These methods include the steps of annealing an unblocked primer to a first strand of nucleic acid; annealing a second blocked primer to a second strand of nucleic acid; elongating the nucleic acid along the first strand with a polymerase; terminating the first sequencing primer; deblocking the second primer; and elongating the nucleic acid along the second strand.
  • Rothberg disclosed methods and apparatuses for sequencing a nucleic acid that permit a very large number of independent sequencing reactions to be arrayed in parallel, permitting simultaneous sequencing of a very large number (>10,000) of different oligonucleotides.
  • the present invention uses novel compositions to improve nucleic acid sequencing.
  • the present invention dramatically reduces the fraction of unusable sequences corresponding to the adaptors in the total sequencing output, and eliminates artifacts due to improper adapter ligation and/or primer annealing.
  • the present invention also provides even coverage of the length of individual transcripts due to strategic placement of the sequencing primer and improves the sequencing efficiency by pre-selecting the fragments with correct adapter combination.
  • the present invention improves reliability and efficiency of the whole procedure due to reliance on PCR suppression rather than physical separation procedures.
  • the present invention allows simultaneous sequencing of several samples.
  • the present invention provides methods and compositions for preparing a cDNA sample for sequencing.
  • the steps include creating a double stranded cDNA by annealing a
  • RNA resulting in a full length double stranded cDNA; fragmenting (e.g., using sonication and/or nebulization) the full length double stranded cDNA to generate a plurality of fragmented double stranded cDNA; repairing the ends of the fragments using DNA polymerase ("end-polishing"); ligating a mixture of a partially-double stranded A+ adapter and a partially-double stranded B+ adapter to fragmented double stranded DNA using a ligase; and amplifying the ligated double stranded cDNA using an amplification mixture comprising a primer A, a primer B, an A+-cap primer and a DNA polymerase.
  • the Cap-Trsa-CV oligonucleotide can include a cap primer sequence at the 5' end and a broken poly T stretch region at the 3' end.
  • the broken poly T stretch region typically has two or more poly T regions about 6-base long separated by at least one base residue selected from dA, dC, and dG.
  • This composition prevents pyrosequencing artifacts by eliminating the need to sequence through the long oligo dT stretch.
  • the 3'-most base of the primer is a mixture of dA, dC, and dG, to ensure that the primer initiates reverse transcription at the distal-most region of the polyA tail of the mRNA, rather than in the middle of it.
  • the Cap-Trsa-CV oligonucleotide has the sequence listed in SEQ ID NO: 1; the cap primer can have the sequence listed in SEQ ID NO: 2; and the A+-cap primer can have the sequence listed in SEQ ID NO: 3.
  • the A+ adapter includes an A+ long oligonucleotide having a first suppression tag at the 3' end and a A+ primer sequence at the 5' end; and an A+ short oligonucleotide complementary to the first suppression tag.
  • the suppression tag prevents amplification of fragments flanked by the same A+ adapter at both ends later in the procedure.
  • the suppression tag of the A+ long oligonucleotide can be of different sequence and function as a barcode to identify the particular cDNA source post-sequencing.
  • the B+ adapter includes a B+ long oligonucleotide having a second suppression tag at the 3' end and a B+ primer region at the 5' end; and a B+ short oligonucleotide complementary to the second suppression tag.
  • the suppression tag prevents amplification of fragments flanked by the same B+ adapter at both ends later in the procedure.
  • the suppression tag of the B+ long oligonucleotide can be of different sequence and function as a barcode to identify the particular cDNA source post-sequencing.
  • the step of ligation uses a molar ratio between about 0.9 to about 1.1 for the A+ adapter to B+ adapter
  • the step of amplification uses a molar ratio of between about 0.9-1.1 to about 0.05-0.1 for the primer A: primer B to A+-cap primer.
  • the present invention also includes A+ adapter and B+ adapter oligonucleotides for amplification. Both adapters can further include a bar-coding tag (e.g., a biotin tag).
  • the A+ adapter and the B+ adapter each include a long strand and a short strand and are capable of ligating to a first end or a second end of a fragmented double stranded cDNA.
  • the long strand of the A+ adapter contains an A primer region at the 5' end and a first suppression/barcode tag region at the 3' end
  • the long strand of the B+ adapter contains a B primer region at the 5' end and a second suppression/barcode tag region at the 3' end.
  • Each of the first and second suppression tag regions prevents PCR amplification of the double stranded cDNA flanked with the same A+ adapter or the same B+ adapter at both ends. Only the double stranded cDNA fragments with both A+ and B+ adapters are capable of being amplified.
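The selection rule stated above can be written out as a small truth table. The following is a minimal sketch of the rule only, not a model of PCR kinetics; the function name `is_amplifiable` and the adapter labels are hypothetical names introduced for illustration.

```python
from itertools import product

ADAPTERS = ("A+", "B+")


def is_amplifiable(left_adapter: str, right_adapter: str) -> bool:
    """Return True when the two fragment ends carry different adapters.

    Encodes only the stated rule: fragments flanked by the same adapter
    at both ends are suppressed, mixed A+/B+ fragments are amplified.
    """
    return left_adapter != right_adapter


# Enumerate every adapter combination a doubly ligated fragment can carry.
outcomes = {
    (left, right): is_amplifiable(left, right)
    for left, right in product(ADAPTERS, repeat=2)
}

for (left, right), amplified in outcomes.items():
    print(f"{left} / {right}: {'amplified' if amplified else 'suppressed'}")
```

Under the further assumption that the two adapters ligate randomly and with equal probability, two of the four combinations are amplifiable.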
  • the primer cocktail includes an A primer, B primer, and A+-cap primer.
  • the long strand A+ adapter can have the sequence listed in SEQ ID NO: 4; the short strand A+ adapter can have the sequence listed in SEQ ID NO: 5; the long strand B+ adapter can have the sequence listed in SEQ ID NO: 6; and the short strand B+ adapter can have the sequence listed in SEQ ID NO: 7.
  • the molar ratios of the A+ adapter: B+ adapter during the ligation step comprises about 0.9 to 1.1 : about 0.9 to 1.1, and the molar ratio of A primer, B primer, and A+-cap primer comprises about 0.9 to 1.1 : about 0.9 to 1.1 : about 0.04 to 0.11.
  • Figure 1 is a schematic diagram of the present invention.
  • Figure 2 shows the preparation of a cDNA sample for 454 sequencing.
  • Example 1 In one embodiment, the present invention describes a method to prepare cDNA samples for sequencing, for example, 454 sequencing™ known by the skilled artisan.
  • 454 sequencing™ is a parallel pyrosequencing system capable of sequencing about 100 megabases of raw DNA per run. The system relies on fixing nebulized and adapter-ligated DNA fragments to small DNA-capture beads in a water-in-oil emulsion. The DNA fixed to these beads is then amplified by polymerase chain reactions (PCR). Finally, each DNA-bound bead is placed into an approximately 44 µm well on a PicoTiterPlate fiber optic chip for sequencing.
  • nucleotides are typically washed in series over the PicoTiterPlate. During the nucleotide flow, each of the beads with millions of copies of DNA is sequenced in parallel. If a nucleotide complementary to the template strand is flowed into a well, the polymerase extends the existing DNA strand by adding nucleotides. Addition of one or more nucleotides results in a reaction that generates a light signal that is recorded by the CCD camera in the instrument. This technique is also called pyrosequencing. The signal strength is proportional to the number of nucleotides.
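The relationship between nucleotide flows and signal strength described above can be illustrated with an idealized flowgram. This is a sketch under stated assumptions: the function name `flowgram`, the cyclic flow order "TACG", and the noise-free integer signals are illustrative choices and are not taken from this document; `read` is the strand being synthesized, not the template.

```python
from itertools import cycle


def flowgram(read: str, flow_order: str = "TACG", max_flows: int = 400):
    """Return (base, signal) pairs for an idealized pyrosequencing run.

    Each flow offers one nucleotide. The signal equals the number of
    consecutive positions in `read` that incorporate that nucleotide,
    so a homopolymer of length n yields a signal of n in a single flow.
    """
    signals = []
    position = 0
    for flow_index, base in enumerate(cycle(flow_order)):
        # Stop once the read is complete, or after max_flows as a guard
        # against characters that never appear in the flow order.
        if position >= len(read) or flow_index >= max_flows:
            break
        run_length = 0
        while position < len(read) and read[position] == base:
            run_length += 1
            position += 1
        signals.append((base, run_length))
    return signals


print(flowgram("TTAGGG"))
```

For the read TTAGGG the four flows T, A, C, G give signals 2, 1, 0, 3: one flow per homopolymer, with the signal proportional to its length.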
  • genomic DNA is typically broken down into smaller fragments of 300-500 base pairs, which are subsequently "polished" (blunted).
  • Short adaptors are ligated onto the ends of the fragments. These adaptors provide priming sequences for both amplification and sequencing of the sample-library fragments.
  • one adaptor can contain a 5'-biotin tag that enables immobilization of the library onto streptavidin coated beads. After nick repair, the non-biotinylated strand is released and used as a single-stranded template DNA (sstDNA) library.
  • the sstDNA library is assessed for its quality and the optimal amount (DNA copies per bead) needed for emPCR is determined by titration.
  • the sstDNA library is immobilized onto beads.
  • the beads containing a library fragment carry a single sstDNA molecule.
  • the bead-bound library is emulsified with the amplification reagents in a water-in-oil mixture. Each bead is captured within its own microreactor where PCR amplification occurs. This results in bead-immobilized, clonally amplified DNA fragments.
  • the present invention illustrates methods and compositions for preparation of cDNA samples for sequencing.
  • the present invention enables the sequencing process to avoid repeated unproductive sequencing of adaptor regions, generates significantly less artifactual sequences stemming from improper adapter ligation and/or primer annealing; it provides even coverage of the length of individual transcripts due to strategic placement of the sequencing primer; it improves the sequencing efficiency by pre-selecting the fragments with correct adapter combination; it improves reliability and efficiency of the whole procedure due to reliance on PCR suppression rather than physical separation procedures; and it allows simultaneous sequencing of several samples.
  • the present invention can be used for new-generation sequencing using pyrosequencing.
  • An example can be found in the publication by Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, et al. (2005) Genome sequencing in microfabricated high-density picolitre reactors. Nature 437: 376-380.
  • the initial cDNA is produced using SMART cDNA amplification kit described in Zhu et al., but with a different cDNA synthesis primer: Cap-Trsa-CV (first strand cDNA synthesis primer): 5'-AAGCAGTGGTATCAACGCAGAGT CGCAGTCGGTACTTTTTTCTTTTTTV-3' (SEQ. ID NO: 1)
  • the 5' end includes a "cap" primer sequence 5'-AAGCAGTGGTATCAACGCAGAGT-3' (SEQ. ID NO: 2), and the 3' end includes a "broken chain" poly T region.
  • the portion of the Cap-Trsa-CV primer in between the cap primer sequence and polyT stretch can be variable or absent.
  • the cDNA can be amplified by any other method or non-amplified double-stranded cDNA can be used, as long as its synthesis incorporates the Cap-Trsa-CV primer.
  • the purpose of the "broken chain" T-primer is to reduce read artifacts during pyrosequencing, which may be thrown out of calibration by a too strong signal produced from a long mononucleotide stretch (such as polyT or polyA).
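The effect of breaking the poly T stretch can be checked by measuring homopolymer run lengths. The sketch below is illustrative only: the helper `longest_run` is a hypothetical name, the Cap-Trsa-CV string is SEQ ID NO: 1 read with every base of the two poly T segments as T and with the degenerate 3' base V kept as a literal character, and the unbroken comparison primer is invented here for contrast.

```python
from itertools import groupby


def longest_run(seq: str, base: str) -> int:
    """Length of the longest uninterrupted run of `base` in `seq`."""
    runs = [len(list(group)) for b, group in groupby(seq) if b == base]
    return max(runs, default=0)


CAP = "AAGCAGTGGTATCAACGCAGAGT"   # cap primer sequence (SEQ ID NO: 2)
LINKER = "CGCAGTCGGTAC"           # portion between the cap and the poly T

# Broken poly T: two 6-base T stretches separated by a single C.
CAP_TRSA_CV = CAP + LINKER + "TTTTTT" + "C" + "TTTTTT" + "V"

# Hypothetical unbroken primer with the same overall length.
UNBROKEN = CAP + LINKER + "T" * 13 + "V"

print(longest_run(CAP_TRSA_CV, "T"))  # longest T run in the broken primer
print(longest_run(UNBROKEN, "T"))     # longest T run in the unbroken primer
```

The broken primer never presents more than 6 identical bases in a row, whereas the unbroken comparison presents 13.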
  • the cDNA may be optionally normalized using Trimmer kit and re-amplified using "cap" primer; nebulized and/or sonicated to the average fragment size of 350-400 base pairs; end-polished (by incubation with a DNA polymerase and dNTPs in appropriate buffer) and ligated to the mixture of "A+" and "B+" adapters.
  • Each of these adapters is an equal molar mixture of two oligos (typically, 1 µM each in the working concentration), a long one that actually gets ligated by its 3' end, and a short one that is complementary to the 3' end of the longer one to mimic the double-stranded blunt end for the ligase.
  • the short oligo is not ligated since it does not have a 5'-phosphate.
  • the A+ adapter also includes a long and a short strand oligo.
  • the long strand oligo 5'-GCCTCCCTCGCGCCATCAG CCGCGCAGGT-3' (SEQ. ID NO: 4) has an A primer sequence at the 5' end and a suppression/barcoding tag at the 3' end.
  • the Short oligo has the sequence 5'-ACCTGCGCGG-3' (SEQ. ID NO: 5), complementary to the suppression/barcoding tag of the long oligo.
  • the B+ adapter includes a long and a short strand oligo.
  • the long strand oligo 5'-GCCTTGCCAGCCCGCTCAG ACGAGCGGCCA-3' (SEQ. ID NO: 6) has a B primer sequence at the 5' end and another suppression/barcoding tag at the 3' end.
  • the short oligo has the sequence 5'-TGGCCGCTCGT-3' (SEQ. ID NO: 7), complementary to the suppression/barcoding tag of the long oligo. It is important to note that the adapters typically only get ligated to the "new" 5' ends formed as a result of fragmentation/polishing, since the original 5' termini correspond to the incorporated "cap" primer used for amplification and don't bear the 5' phosphates.
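The complementarity between each long-strand tag and its short oligo can be verified directly from the quoted sequences. This is a minimal sketch: the helper names are hypothetical, and the B+ tag is written as ACGAGCGGCCA on the assumption that the space before its final A in the listing is an extraction artifact.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Long strand split into primer region (5') and suppression/barcoding tag (3'),
# together with the short oligo, as quoted in SEQ ID NOs 4-7.
ADAPTERS = {
    "A+": {"primer": "GCCTCCCTCGCGCCATCAG", "tag": "CCGCGCAGGT",
           "short": "ACCTGCGCGG"},
    "B+": {"primer": "GCCTTGCCAGCCCGCTCAG", "tag": "ACGAGCGGCCA",
           "short": "TGGCCGCTCGT"},
}


def short_oligo_matches(adapter: dict) -> bool:
    """True if the short oligo is the reverse complement of the tag."""
    return reverse_complement(adapter["tag"]) == adapter["short"]


for name, adapter in ADAPTERS.items():
    print(name, short_oligo_matches(adapter))
```

Both adapters pass, which is consistent with the short oligo annealing to the 3' end of the long oligo to form the blunt double-stranded end presented to the ligase.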
  • the product of ligation is then amplified using a mixture of three primers: A and B primers in 0.1 µM concentration (their sequence was incorporated into the ligated adaptors) and a long "step-out" primer ("A+-cap", in the typical concentration of about 0.005-0.01 µM) that allows the A+ sequence to get attached to the original cDNA termini.
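The typical concentrations quoted above can be compared against the molar ratio ranges given for the primer cocktail (about 0.9 to 1.1 : about 0.9 to 1.1 : about 0.04 to 0.11). The sketch below is arithmetic only; normalizing to the A primer concentration and the names `normalised_ratio` and `within_ranges` are assumptions made for illustration.

```python
# Ratio ranges for A primer : B primer : A+-cap primer, as stated in the text.
RANGES = {"A": (0.9, 1.1), "B": (0.9, 1.1), "A+-cap": (0.04, 0.11)}


def normalised_ratio(conc_um: dict) -> dict:
    """Express each primer concentration relative to the A primer."""
    reference = conc_um["A"]
    return {name: value / reference for name, value in conc_um.items()}


def within_ranges(conc_um: dict) -> bool:
    """True if every normalized ratio falls inside its stated range."""
    ratios = normalised_ratio(conc_um)
    return all(low <= ratios[name] <= high
               for name, (low, high) in RANGES.items())


low_cap = {"A": 0.1, "B": 0.1, "A+-cap": 0.005}   # micromolar
high_cap = {"A": 0.1, "B": 0.1, "A+-cap": 0.01}   # micromolar

print(within_ranges(low_cap), within_ranges(high_cap))
```

Both ends of the 0.005-0.01 µM range for the A+-cap primer correspond to ratios of roughly 1 : 1 : 0.05 and 1 : 1 : 0.1, inside the stated limits.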
  • the A+-cap primer has the sequence:
  • suppression tags invoke PCR suppression effect for the fragments that end up flanked by the same kind of adapter, which results in exclusive amplification of the fragments flanked by both A and B primers.
  • B primer is found only on the "inside” of the original cDNA sequence (i.e., fragmentation points introduced during sonication and/or nebulization) while A primer can be either inside (by virtue of adaptor ligation) or "outside", i.e. flanking the original cDNA termini (by virtue of step-out amplification).
  • a biotinylated A primer can be used to bind the fragments to beads and B primer can be a sequencing one.
  • the suppression/barcoding tag of B primer can be variable and can be used to discriminate samples that are sequenced simultaneously in the same plate.
  • the barcode sequence can also be incorporated into A+ adapter and/or A+-cap primer.
  • cDNA fragments made with the methods and compositions disclosed herein bear the sequencing primer only on the ends corresponding to the fragmentation sites of the original mRNAs rather than 5' or 3' termini, thus ensuring even coverage of the mRNA and efficient assembly and dramatically reducing the ballast fraction of total sequence output corresponding to 3' and 5'- adaptor regions.
  • cDNA samples can be "barcoded" by different adaptors and processed together in the same sequencing run.
  • applications include transcriptome sequencing de novo or transcriptome re-sequencing.
  • Other applications include genetic marker discovery and profiling, gene expression analysis, molecular identification of unknown samples, environmental genomics.
  • the present inventors demonstrated the surprising and unexpected results obtained using the procedure of the present invention by constructing two normalized cDNA libraries: from larvae of coral Acropora millepora and from adult amphipod crustaceans Hyallela sp., followed by sequencing using Roche 454 FLX system.
  • the cDNA preparation procedure of the present invention consistently results in a number of reads exceeding the published transcriptome-sequencing studies by a factor of two or more, and shows a remarkable improvement in the fraction of usable reads (i.e., sufficiently long high-quality pyrosequencing readouts with no polyA runs) (Table 1).
  • Table 1 shows the comparison of the gross outputs of de novo transcriptome sequencing.
  • Example 2 Preparation of cDNA samples for de novo transcriptome sequencing with 454 technology.
  • the preparation of appropriately modified cDNA is a critical step ensuring the overall success of transcriptome diversity characterization using next-generation sequencing methods.
  • Example 2 is a method that has been adapted for use with 454 technology, with the primary focus on protein-coding transcriptome data assembly and annotation de novo (i.e., in the absence of the reference genome data). This method generates pools of fragmented cDNAs flanked by two standard 454 amplification/sequencing primers, ready for amplification of individual sequences on microbeads and sequencing.
  • the method requires as little as 50 ng total RNA at the start, and solves the three most important problems inherent in comparable protocols: artifacts due to long A/T homopolymer regions, a large proportion of unusable (adaptor) sequences in the 454 output, and coverage bias towards 3'-termini of transcripts.
  • the developed method uses the PCR-suppression effect to eliminate problems associated with improper adapter ligation, primer annealing, and adaptor concatenation. Modification of the cDNA synthesis procedure avoids incorporation of long A/T-stretches originating from the polyA tails of the mRNA, which would create problems during the pyrosequencing stage. cDNA fragments in samples produced by this method bear the sequencing primer only on the ends corresponding to the fragmentation sites of the original mRNAs rather than 5' or 3' termini, facilitating even coverage and further lowering the proportion of unusable adaptor sequences in the output. To further reduce the 3'-end bias, the method uses two approaches.
  • the desired distribution of lengths within the originally produced cDNA can be achieved by varying the conditions of the amplification reaction (there is no physical separation procedure involved).
  • the final product is generated as three separate samples, specific to 3'-terminal, 5'-terminal, and middle cDNA fragments, which can be then mixed in a desired proportion or sequenced independently.
  • the method uses its own cDNA barcodes incorporated into adaptor sequences.
  • the present invention includes the following advantages: (1) requires a small amount of total RNA as a starting material; (2) high output of useful sequence due to elimination of adaptor-related artifacts (2-5 fold more new sequence data per run than in analogous published applications); (3) provides even coverage of the length of individual transcripts due to strategic placement of the sequencing primer and production of separate samples for 5', 3', and middle cDNA fragments; (4) eliminates the need for a strand-selection step prior to emulsion PCR due to the inherent control over adaptor configurations; and (5) allows simultaneous sequencing of several samples through adaptor barcoding.
  • the initial cDNA is produced using SMART cDNA amplification kit (Clontech) (Zhu et al, 2001) but with different cDNA synthesis primer: Cap-Trsa-CV (first strand cDNA synthesis primer):
  • the purpose of the "broken chain" T-primer is to reduce read artifacts during 454 pyrosequencing, which may get thrown out of calibration by a too strong signal produced from a long mononucleotide stretch (such as polyT or polyA).
  • the cDNA is then: [optionally] normalized using Trimmer kit (Evrogen) and re-amplified using cap primer; nebulized or sonicated to the average fragment size of 500-1000 bp; end-polished (by incubation with a DNA polymerase and dNTPs in appropriate buffer); and ligated to the mixture of "Atitn+" and "Btitn+" adapters.
  • Each of these adapters is an equimolar mixture of two oligos (typically, 1 µM each in the working concentration), a long one that actually gets ligated by its 3' end and a short one that is complementary to the 3' end of the longer one to mimic the double-stranded blunt end for the ligase.
  • the short oligo is not ligated since it does not have a 5'-phosphate.
  • AGC TCCCTGCGTGTCTCCGACTCAG CCGCGAGCGT ACGCTCGCGG (SEQ. ID NO: 12)
  • ACG TCCCTGCGTGTCTCCGACTCAG CCGCGACGGT ACCGTCGCGG (SEQ. ID NO: 14)
  • GCA TCCCTGCGTGTCTCCGACTCAG CCGCGGCAGT ACTGCCGCGG (SEQ. ID NO: 15)
  • CTG TCCCTGCGTGTCTCCGACTCAG CCGCGCTGGT ACCAGCGCGG (SEQ. ID NO: 16)
  • CGT TCCCTGCGTGTCTCCGACTCAG CCGCGCGTGT ACACGCGCGG (SEQ. ID NO: 17)
  • GTC TCCCTGCGTGTCTCCGACTCAG CCGCGGTCGT ACGACCGCGG (SEQ. ID NO: 18)
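The barcoded adapter rows listed above can be checked for internal consistency. The sketch below rests on an interpretation of the flattened listing that the text does not state explicitly: each row is read as a 3-base barcode, a common primer segment, a 10-base suppression tag that embeds the barcode between CCGCG and GT, and the short oligo. All helper names are hypothetical.

```python
from itertools import combinations

COMPLEMENT = str.maketrans("ACGT", "TGCA")
PREFIX, SUFFIX = "CCGCG", "GT"   # constant bases flanking the barcode (as read from the rows)

# (barcode, suppression tag of the long oligo, short oligo)
ROWS = [
    ("AGC", "CCGCGAGCGT", "ACGCTCGCGG"),
    ("ACG", "CCGCGACGGT", "ACCGTCGCGG"),
    ("GCA", "CCGCGGCAGT", "ACTGCCGCGG"),
    ("CTG", "CCGCGCTGGT", "ACCAGCGCGG"),
    ("CGT", "CCGCGCGTGT", "ACACGCGCGG"),
    ("GTC", "CCGCGGTCGT", "ACGACCGCGG"),
]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


# Every tag should embed its barcode, and every short oligo should be the
# reverse complement of its tag.
consistent = all(
    tag == PREFIX + barcode + SUFFIX and reverse_complement(tag) == short
    for barcode, tag, short in ROWS
)

# Smallest pairwise distance between barcodes: a rough indicator of how
# many sequencing errors are needed to confuse two samples.
barcodes = [barcode for barcode, _, _ in ROWS]
min_distance = min(hamming(a, b) for a, b in combinations(barcodes, 2))

print(consistent, min_distance)
```

Under this reading all six rows are self-consistent, and any two barcodes differ in at least two of their three positions.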
  • Short oligo 5'- TGGCCGCTCGT -3' (SEQ. ID NO: 21)
  • the protocol allows for independent amplification of fragment pools corresponding to 5'-ends, internal fragments and 3'-ends of the original cDNAs. These pools may be then either sequenced separately or mixed in a desired proportion to ensure even coverage.
  • 5'-end samples are enriched with coding sequences and are especially useful for obtaining pilot gene hunting or phylogenetics data.
  • Three different primer combinations are used to amplify different cDNA ends. 3'-ends are amplified with Atitn and Btitn+TrsaC primers, internal fragments with Atitn and Btitn, and 5'-ends with Atitn and Btitn+halfswitch (see below for primer sequences). All primers are typically used in 0.1 µM concentration.
  • Atitn primer is found only on the "inside" of the original cDNA sequence (i.e., fragmentation points introduced during sonication or nebulization) while Btitn primer can be either inside (by virtue of adaptor ligation) or "outside", i.e. flanking the original cDNA termini (by virtue of step-out amplification).
  • RNA template preparation. These steps are recommended but may not be necessary, depending on your protocol of choice for isolating total RNA. Begin with about 0.5-1 µg RNA from the organism of your choice (note: the latest version of Clontech's SMART kit claims the amount can be as low as 50 ng). Precipitate RNA by adding 1 volume 13.3 M LiCl, incubating 30 minutes at -20°C, and centrifuging 20 minutes at 16g at room temperature. Rinse RNA pellets briefly with 80% ethanol (do not centrifuge), air dry at room temperature, and dissolve pellets in EB (10 mM Tris, pH 8.0).
  • RNA for a total of 1000 ng RNA
  • Cap-TRSA-CV primer 1. Incubate 3 minutes at 65°C, then chill on ice.
  • cDNA amplification. For each first-strand-cDNA sample, set up 12 PCR reactions (30 µl each): 3 µl diluted FS-cDNA (from step 2e); 21 µl H2O; 3 µl 10X PCR buffer; 0.75 µl 10 mM dNTP; 1.4 µl 10 µM cDNA amplification primer (from Clontech's SMART cDNA amplification kit); 0.6 µl Advantage2 polymerase (Clontech).
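The per-reaction recipe above can be scaled to the 12 reactions it calls for. The sketch below is simple arithmetic; the function `master_mix` and its `overage` parameter (extra volume to cover pipetting loss, defaulting to none) are hypothetical and not part of the protocol.

```python
# Per-reaction volumes in microliters, as listed in the protocol step.
PER_REACTION_UL = {
    "diluted FS-cDNA": 3.0,
    "water": 21.0,
    "10X PCR buffer": 3.0,
    "10 mM dNTP": 0.75,
    "10 uM cDNA amplification primer": 1.4,
    "Advantage2 polymerase": 0.6,
}


def master_mix(per_reaction: dict, n_reactions: int, overage: float = 0.0) -> dict:
    """Scale each component to n_reactions, optionally adding fractional overage."""
    factor = n_reactions * (1.0 + overage)
    return {name: round(volume * factor, 2) for name, volume in per_reaction.items()}


per_reaction_total = sum(PER_REACTION_UL.values())
mix = master_mix(PER_REACTION_UL, n_reactions=12)

print(f"per reaction: {per_reaction_total:.2f} ul")
for name, volume in mix.items():
    print(f"{name}: {volume} ul")
```

The listed components sum to 29.75 µl per reaction, consistent with the nominal 30 µl reaction volume.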
  • Lu4sCap primer 5'-
  • PCR product After PCR, hold product at room temperature. 6. Evaluate PCR product by loading 3 µl on a gel and visualizing with ethidium bromide. There should be a faintly visible smear with some bands, with the majority of product falling between 500 and 3000 bp in length. Add 3 more cycles if there is nothing visible on the gel, then evaluate again. If the product is not amplified in 20 cycles (25 for Lu4sCap), something is wrong - start over from the cDNA synthesis step. NOTE: the total amount of cDNA product per tube should not exceed 200 ng, which means that the smear on the agarose gel (20 ng per lane) should be really faint. Make sure you don't over-amplify cDNA beyond that.
  • EtOH precipitate the product to concentrate (i.e. if the resulting concentration is less than 2 µg in 12 microliters) and dissolve it in an appropriate volume of milliQ water (but don't use water to elute DNA from the column on previous step!): 5 µl (1/10 volume) of 3M NaAcetate pH 4.8-5.2; 125 µl (2.5 volume) 96-100% EtOH; hold 20-30 minutes at -20°C; spin 20 minutes at maximum speed at 4°C, rinse the pellet with 70% EtOH, air dry, dissolve in appropriate volume of milliQ water to achieve a concentration of 2 µg in 12 microliters.
  • the Trimmer kit from Evrogen is used essentially according to the manufacturer's instructions; here we are just replicating their protocol.
  • step 4d Using a thermal cycler, incubate at 98°C for 2 minutes, then at 68°C for 5 hours, then proceed immediately to the next step. 5. Near the end of the hybridization period (step 4d), warm the DSN master buffer (Trimmer kit) to 68°C.
  • step 5d twice more, producing aliquots from this tube that correspond to 5, 7, 9 and 11 cycles.
  • step 5g the optimum cycle number
  • step 5i the optimum enzyme treatment
  • the reactions 3, 4, and 5 specifically amplify internal fragments, 5'-ends, and 3'-ends of the original cDNAs, respectively.
  • Each PCR reaction is assembled as follows: 3 µl 10X PCR buffer; 0.75 µl 10 mM dNTP; 1 µl ligation product (from Polishing and Ligation step 9); 0.6 µl Advantage2 polymerase;
  • N is the optimum cycle number determined (8a-g).
  • PCR purify using a column e.g., Qiagen Qiaquick
  • the words “comprising” (and any form of comprising, such as “comprise” and “comprises”), “having” (and any form of having, such as “have” and “has”), "including” (and any form of including, such as “includes” and “include”) or “containing” (and any form of containing, such as “contains” and “contain”) are inclusive or open-ended and do not exclude additional, unrecited elements or method steps.
  • A, B, C, or combinations thereof refers to all permutations and combinations of the listed items preceding the term.
  • A, B, C, or combinations thereof is intended to include at least one of: A, B, C, AB, AC, BC, or ABC, and if order is important in a particular context, also BA, CA, CB, CBA, BCA, ACB, BAC, or CAB.
  • expressly included are combinations that contain repeats of one or more item or term, such as BB, AAA, AB, BBC, AAABCCCC, CBBAAA, CABABB, and so forth.
  • the skilled artisan will understand that typically there is no limit on the number of items or terms in any combination, unless otherwise apparent from the context.
  • compositions and/or methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the compositions and/or methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.

Abstract

The present invention includes novel compositions and methods for nucleic acid sequencing. The methods and compositions permit a very large number of independent sequencing reactions to be arrayed in parallel, permitting simultaneous sequencing of a very large number of different oligonucleotides with superior output and quality.

Description

METHODS AND COMPOSITIONS FOR NUCLEIC ACID SEQUENCING
TECHNICAL FIELD OF THE INVENTION
The present invention relates in general to the field of nucleic acid sequencing.
BACKGROUND OF THE INVENTION
Without limiting the scope of the invention, its background is described in connection with nucleic acid sequencing, and more particularly, improved methods and compositions for amplifying and determining nucleic acid sequences.
Since the discovery that nucleic acids encode the genome, it has been found that many diseases are associated with particular DNA sequences. Tremendous amounts of resources have been allocated to identify and correlate DNA sequence polymorphisms with a diseased state. These sequence polymorphisms include insertions, deletions, or substitutions of nucleotides in one sequence relative to a second sequence. As such, genome sequencing has become an increasingly critical tool for diagnosis, therapy and prevention of illnesses and, eventually, the targeted modification of the human genome. Development of rapid and sensitive nucleic acid sequencing methods utilizing automated DNA sequencers has revolutionized modern molecular biology. Analysis of entire genomes of plants, fungi, animals, bacteria, and viruses is now possible with a concerted effort by a series of machines and a team of technicians. Base sequencing of deoxyribonucleic acid and ribonucleic acid is one of the most important analytical techniques in biotechnology, the pharmaceutical industry, food industry, medical diagnostics and other fields of application.
Typically, a DNA sequence polymorphism analysis is performed by isolating DNA from an individual, manipulating the isolated DNA by digesting the DNA with restriction enzymes and/or amplifying a subset of sequences in the isolated DNA and examining the manipulated DNA. Commonly used procedures for analyzing DNA include electrophoretic-based separation analyses such as agarose or polyacrylamide gel electrophoresis. DNA sequences are typically inserted, or loaded on gels and subjected to an electric field. Because DNA has a uniform negative charge, DNA will migrate through the gel based on properties including sequence length and relative sizes.
A variety of nucleic acid sequencing systems and methods have become available. For example, United States Patent Number 5,972,693 provides methods by which biologically derived DNA sequences in a mixed sample or in an arrayed single sequence clone can be determined and classified without sequencing. The methods are based on the presence of carefully chosen target subsequences, typically 4 to 8 bases in length, in a sample DNA sequence together with DNA sequence databases containing lists of sequences likely to be present in the sample to determine a sample sequence. The method uses restriction endonucleases to recognize target subsequences to cut the sample sequence. Then, chosen recognition moieties are ligated to the cut fragments, the fragments are amplified, and the experimental observation made. Polymerase chain reaction (PCR) is the method of amplification. Several alternative embodiments were described which are capable of increased discrimination and which use Type IIS restriction endonucleases, various capture moieties, or samples of specially synthesized cDNA. The '693 patent also uses information on the presence or absence of carefully chosen target subsequences in a single sequence clone together with DNA sequence databases to determine the clone sequence. Computer implemented methods are provided to analyze the experimental results and to determine the sample sequences in question and to carefully choose target subsequences to yield a maximum amount of information. Another example can be found in the United States Patent Number 6,190,868. Briefly, the patent discloses a methodology that provides positive confirmation that nucleic acids, possessing putatively identified sequence predicted to generate observed GeneCalling™ signals, are actually present within the sample from which the signal was originally derived. The putatively identified nucleic acid fragment within the sample possesses 3'- and 5'-ends with known terminal subsequences.
The method in the '868 patent includes: contacting nucleic acid fragments in a sample in amplifying conditions with (i) a nucleic acid polymerase; (ii) "regular" primer oligonucleotides having sequences comprising hybridizable portions of the known terminal subsequences; and (iii) a "poisoning" oligonucleotide primer, the poisoning primer having a sequence comprising a first subsequence that is a portion of the sequence of one of the known terminal subsequences and a second subsequence that is a hybridizable portion of the putatively unidentified sequence which is adjacent to the one known terminal subsequence, where the nucleic acids amplified with the poisoning primer are distinguishable upon detection from nucleic acids amplified only with the regular primers; separating the products of the contacting step; and detecting a sequence if the nucleic acids amplified with the poisoning primer are detected.
Yet another example can be found in the United States Patent Numbers 6,274,380 and 7,211,390. Briefly, these patents disclose methods and apparatuses for sequencing a nucleic acid. The method includes annealing a population of circular nucleic acid molecules to a plurality of anchor primers linked to a solid support, and amplifying those members of the population of circular nucleic acid molecules which anneal to the target nucleic acid, and then sequencing the amplified molecules by detecting the presence of a sequence by-product.
United States Patent Number 7,244,567 teaches methods of sequencing both the sense and antisense strands of DNA with blocked and unblocked sequencing primers. These methods include the steps of annealing an unblocked primer to a first strand of nucleic acid; annealing a second blocked primer to a second strand of nucleic acid; elongating the nucleic acid along the first strand with a polymerase; terminating the first sequencing primer; deblocking the second primer; and elongating the nucleic acid along the second strand. Yet another example is shown in United States Patent Number 7,335,762 to Rothberg et al. Briefly, Rothberg disclosed methods and apparatuses for sequencing a nucleic acid that permit a very large number of independent sequencing reactions to be arrayed in parallel, permitting simultaneous sequencing of a very large number (>10,000) of different oligonucleotides.
However, none of the above methods are adapted for high-throughput massively parallel sequencing. In particular, during sequencing of the protein-coding transcriptome, prior methods suffer from artifacts stemming from improper adapter ligation and/or primer annealing; do not provide even coverage of the length of individual transcripts; do not sequence efficiently; and some cannot simultaneously sequence more than one sample.
SUMMARY OF THE INVENTION
The present invention uses novel compositions to improve nucleic acid sequencing. The present invention dramatically reduces the fraction of unusable sequences corresponding to the adaptors in the total sequencing output, and eliminates artifacts due to improper adapter ligation and/or primer annealing. The present invention also provides even coverage of the length of individual transcripts due to strategic placement of the sequencing primer and improves the sequencing efficiency by pre-selecting the fragments with the correct adapter combination. In addition, the present invention improves reliability and efficiency of the whole procedure due to reliance on PCR suppression rather than physical separation procedures. Furthermore, the present invention allows simultaneous sequencing of several samples.
In one aspect, the present invention provides methods and compositions for preparing a cDNA sample for sequencing. The steps include creating a double stranded cDNA by annealing an RNA having a poly A tail with a Cap-Trsa-CV oligonucleotide and reverse transcribing the RNA, resulting in a full length double stranded cDNA; fragmenting (e.g., using sonication and/or nebulization) the full length double stranded cDNA to generate a plurality of fragmented double stranded cDNA; repairing the ends of the fragments using DNA polymerase ("end-polishing"); ligating a mixture of a partially-double stranded A+ adapter and a partially-double stranded B+ adapter to the fragmented double stranded DNA using a ligase; and amplifying the ligated double stranded cDNA using an amplification mixture comprising a primer A, a primer B, an A+-cap primer and a DNA polymerase.
[0001] In one aspect, the Cap-Trsa-CV oligonucleotide can include a cap primer sequence at the 5' end and a broken poly T stretch region at the 3' end. The broken poly T stretch region typically has two or more poly T regions about 6 bases long separated by at least one base residue selected from dA, dC, and dG. This composition prevents pyrosequencing artifacts by eliminating the need to sequence through the long oligo dT stretch. The 3'-most base of the primer is a mixture of dA, dC, and dG, to ensure that the primer initiates reverse transcription at the distal-most region of the polyA tail of the mRNA, rather than in the middle of it. In one aspect, the Cap-Trsa-CV oligonucleotide has the sequence listed in SEQ ID NO: 1; the cap primer can have the sequence listed in SEQ ID NO: 2; and the A+-cap primer can have the sequence listed in SEQ ID NO: 3. In one aspect, the A+ adapter includes an A+ long oligonucleotide having a first suppression tag at the 3' end and an A+ primer sequence at the 5' end; and an A+ short oligonucleotide complementary to the first suppression tag. The suppression tag prevents amplification of fragments flanked by the same A+ adapter at both ends later in the procedure. The suppression tag of the A+ long oligonucleotide can be of different sequence and function as a barcode to identify the particular cDNA source post-sequencing.
[0002] In another aspect, the B+ adapter includes a B+ long oligonucleotide having a second suppression tag at the 3' end and a B+ primer region at the 5' end; and a B+ short oligonucleotide complementary to the second suppression tag. The suppression tag prevents amplification of fragments flanked by the same B+ adapter at both ends later in the procedure. The suppression tag of the B+ long oligonucleotide can be of different sequence and function as a barcode to identify the particular cDNA source post-sequencing. In one aspect, the step of ligation uses a molar ratio between about 0.9 to about 1.1 for the A+ adapter to B+ adapter, and the step of amplification uses a molar ratio of between about 0.9-1.1 to about 0.05-0.1 for the primer A: primer B to A+-cap primer.
[0003] The present invention also includes A+ adapter and B+ adapter oligonucleotides for amplification. Both adapters can further include a bar-coding tag (e.g., a biotin tag). The A+ adapter and the B+ adapter each include a long strand and a short strand, and each is capable of ligating to a first end or a second end of a fragmented double stranded cDNA. Typically, the long strand of the A+ adapter contains an A primer region at the 5' end and a first suppression/barcode tag region at the 3' end, and the long strand of the B+ adapter contains a B primer region at the 5' end and a second suppression/barcode tag region at the 3' end. Each of the first and second suppression tag regions prevents PCR amplification of the double stranded cDNA flanked with the same A+ adapter or the same B+ adapter at both ends. Only the double stranded cDNA fragments with both A+ and B+ adapters are capable of being amplified.
[0004] In some aspects, the primer cocktail includes an A primer, a B primer, and an A+-cap primer. In another aspect, the long strand A+ adapter can have the sequence listed in SEQ ID NO: 4; the short strand A+ adapter can have the sequence listed in SEQ ID NO: 5; the long strand B+ adapter can have the sequence listed in SEQ ID NO: 6; and the short strand B+ adapter can have the sequence listed in SEQ ID NO: 7.
[0005] In another aspect, the molar ratio of the A+ adapter : B+ adapter during the ligation step is about 0.9 to 1.1 : about 0.9 to 1.1, and the molar ratio of A primer, B primer, and A+-cap primer is about 0.9 to 1.1 : about 0.9 to 1.1 : about 0.04 to 0.11.
BRIEF DESCRIPTION OF THE DRAWINGS
For a more complete understanding of the features and advantages of the present invention, reference is now made to the detailed description of the invention along with the accompanying figures and in which:
Figure 1 is a schematic diagram of the present invention.
Figure 2 shows the preparation of a cDNA sample for 454 sequencing.
DETAILED DESCRIPTION OF THE INVENTION
While the making and using of various embodiments of the present invention are discussed in detail below, it should be appreciated that the present invention provides many applicable inventive concepts that can be embodied in a wide variety of specific contexts. The specific embodiments discussed herein are merely illustrative of specific ways to make and use the invention and do not delimit the scope of the invention. To facilitate the understanding of this invention, a number of terms are defined below. Terms defined herein have meanings as commonly understood by a person of ordinary skill in the areas relevant to the present invention. Terms such as "a", "an" and "the" are not intended to refer to only a singular entity, but include the general class of which a specific example may be used for illustration. The terminology herein is used to describe specific embodiments of the invention, but their usage does not delimit the invention, except as outlined in the claims.
Example 1: In one embodiment, the present invention describes a method to prepare cDNA samples for sequencing, for example, 454 sequencing™, which is known to the skilled artisan. 454 sequencing™ is a parallel pyrosequencing system capable of sequencing about 100 megabases of raw DNA per run. The system relies on fixing nebulized and adapter-ligated DNA fragments to small DNA-capture beads in a water-in-oil emulsion. The DNA fixed to these beads is then amplified by polymerase chain reaction (PCR). Finally, each DNA-bound bead is placed into an approximately 44 μm well on a PicoTiterPlate fiber optic chip for sequencing.
In the 454 sequencing™ protocol, four nucleotides are typically washed in series over the PicoTiterPlate. During the nucleotide flow, each of the beads with millions of copies of DNA is sequenced in parallel. If a nucleotide complementary to the template strand is flowed into a well, the polymerase extends the existing DNA strand by adding nucleotides. Addition of one or more nucleotides results in a reaction that generates a light signal that is recorded by the CCD camera in the instrument. This technique is also called pyrosequencing. The signal strength is proportional to the number of nucleotides.
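By way of a non-limiting illustration, the relationship between nucleotide flows and signal strength described above can be sketched as follows. This is a simplified model, not the instrument's actual signal processing: the function name `flowgram`, the cyclic flow order "TACG", and the idealized noise-free signal (one unit per incorporated base) are all assumptions made for the sketch.

```python
from itertools import cycle

def flowgram(read, flow_order="TACG", max_flows=200):
    """Idealized pyrosequencing flowgram for the strand being synthesized.

    `read` is the newly synthesized strand (5'->3'). Each flow offers one
    nucleotide; the signal equals the number of consecutive bases
    incorporated in that flow (0 if the next base does not match).
    `flow_order` and `max_flows` are illustrative assumptions.
    """
    signals = []
    pos = 0
    for base in cycle(flow_order):
        if pos >= len(read) or len(signals) >= max_flows:
            break
        run = 0
        # A homopolymer stretch is incorporated within a single flow.
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
    return signals

if __name__ == "__main__":
    for base, signal in flowgram("TTAGGGC"):
        print(base, signal)
```

In this model a long mononucleotide stretch produces a single, proportionally large signal in one flow, which is why long polyA or polyT runs are problematic for pyrosequencing.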
In 454 sequencing™, genomic DNA is typically broken down into smaller fragments of 300-500 base pairs, which are subsequently "polished" (blunted). Short adaptors are ligated onto the ends of the fragments. These adaptors provide priming sequences for both amplification and sequencing of the sample-library fragments. Typically, one adaptor can contain a 5'-biotin tag that enables immobilization of the library onto streptavidin coated beads. After nick repair, the non-biotinylated strand is released and used as a single-stranded template DNA (sstDNA) library. The sstDNA library is assessed for its quality and the optimal amount (DNA copies per bead) needed for emPCR is determined by titration.
The sstDNA library is immobilized onto beads. The beads containing a library fragment carry a single sstDNA molecule. The bead-bound library is emulsified with the amplification reagents in a water-in-oil mixture. Each bead is captured within its own microreactor where PCR amplification occurs. This results in bead-immobilized, clonally amplified DNA fragments.
The present invention illustrates methods and compositions for preparation of cDNA samples for sequencing. The present invention enables the sequencing process to avoid repeated unproductive sequencing of adaptor regions; it generates significantly fewer artifactual sequences stemming from improper adapter ligation and/or primer annealing; it provides even coverage of the length of individual transcripts due to strategic placement of the sequencing primer; it improves the sequencing efficiency by pre-selecting the fragments with the correct adapter combination; it improves reliability and efficiency of the whole procedure due to reliance on PCR suppression rather than physical separation procedures; and it allows simultaneous sequencing of several samples.
In certain embodiments, the present invention can be used for new-generation sequencing using pyrosequencing. An example can be found in the publication by Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, et al. (2005) Genome sequencing in microfabricated high-density picolitre reactors. Nature 437: 376-380.
The initial cDNA is produced using the SMART cDNA amplification kit described in Zhu et al., but with a different cDNA synthesis primer: Cap-Trsa-CV (first strand cDNA synthesis primer): 5'-AAGCAGTGGTATCAACGCAGAGT CGCAGTCGGTACTTTTTTCTTTTTTV-3' (SEQ. ID NO: 1). The 5' end includes a "cap" primer sequence 5'-AAGCAGTGGTATCAACGCAGAGT-3' (SEQ. ID NO: 2), and the 3' end includes a "broken chain" poly T region. The portion of the Cap-Trsa-CV primer in between the cap primer sequence and the polyT stretch can be variable or absent. Alternatively, the cDNA can be amplified by any other method, or non-amplified double-stranded cDNA can be used, as long as its synthesis incorporates the Cap-Trsa-CV primer.
The purpose of the "broken chain" T-primer is to reduce read artifacts during pyrosequencing, which may be thrown out of calibration by a too strong signal produced from a long mononucleotide stretch (such as polyT or polyA).
The cDNA may be optionally normalized using the Trimmer kit and re-amplified using the "cap" primer; nebulized and/or sonicated to an average fragment size of 350-400 base pairs; end-polished (by incubation with a DNA polymerase and dNTPs in an appropriate buffer); and ligated to the mixture of "A+" and "B+" adapters.
Each of these adapters is an equimolar mixture of two oligos (typically, 1 μM each in the working concentration): a long one that actually gets ligated by its 3' end, and a short one that is complementary to the 3' end of the longer one to mimic the double-stranded blunt end for the ligase. The short oligo is not ligated since it does not have a 5'-phosphate.
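As a non-limiting illustration of why the "broken chain" design limits homopolymer signals, the longest mononucleotide run in the Cap-Trsa-CV primer can be compared with that of a conventional uninterrupted oligo(dT) primer. The helper name `longest_run` and the 30-base oligo(dT) comparison primer are hypothetical and are used only for this sketch; the poly T segments are written as two runs of six T residues separated by a C, following the description of the primer given herein.

```python
from itertools import groupby

# Cap-Trsa-CV: cap primer sequence + spacer + "broken chain" poly T + V.
CAP_PRIMER = "AAGCAGTGGTATCAACGCAGAGT"
SPACER = "CGCAGTCGGTAC"
BROKEN_POLY_T = "T" * 6 + "C" + "T" * 6 + "V"  # V = mixture of A, C, G
CAP_TRSA_CV = CAP_PRIMER + SPACER + BROKEN_POLY_T

# Hypothetical uninterrupted oligo(dT) primer, for comparison only.
PLAIN_OLIGO_DT = "T" * 30

def longest_run(seq):
    """Return (base, length) of the longest mononucleotide run in seq."""
    runs = ((base, len(list(group))) for base, group in groupby(seq))
    return max(runs, key=lambda item: item[1])

if __name__ == "__main__":
    print("Cap-Trsa-CV:", longest_run(CAP_TRSA_CV))
    print("oligo(dT)30:", longest_run(PLAIN_OLIGO_DT))
```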
The A+ adapter includes a long and a short strand oligo. The long strand oligo, 5'-GCCTCCCTCGCGCCATCAG CCGCGCAGGT-3' (SEQ. ID NO: 4), has an A primer sequence at the 5' end and a suppression/barcoding tag at the 3' end. The short oligo has the sequence 5'-ACCTGCGCGG-3' (SEQ. ID NO: 5), complementary to the suppression/barcoding tag of the long oligo. The B+ adapter likewise includes a long and a short strand oligo. The long strand oligo, 5'-GCCTTGCCAGCCCGCTCAG ACGAGCGGCCA-3' (SEQ. ID NO: 6), has a B primer sequence at the 5' end and another suppression/barcoding tag at the 3' end. The short oligo has the sequence 5'-TGGCCGCTCGT-3' (SEQ. ID NO: 7), complementary to the suppression/barcoding tag of the long oligo. It is important to note that the adapters typically only get ligated to the "new" 5' ends formed as a result of fragmentation/polishing, since the original 5' termini correspond to the incorporated "cap" primer used for amplification and do not bear 5' phosphates.
The product of ligation is then amplified using a mixture of three primers: A and B primers at 0.1 μM concentration (their sequences were incorporated into the ligated adaptors) and a long "step-out" primer ("A+-cap", at a typical concentration of about 0.005-0.01 μM) that allows the A+ sequence to get attached to the original cDNA termini.
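The complementarity between the short oligo and the 3' tag of the long oligo can be checked with a few lines of code. The following is a minimal sketch using the A+ and B+ adapter sequences given above; the function names `revcomp` and `short_matches_long` are illustrative and are not part of the disclosed method.

```python
# Map each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

# Long oligos are written as primer region + suppression/barcoding tag.
A_LONG = "GCCTCCCTCGCGCCATCAG" + "CCGCGCAGGT"
A_SHORT = "ACCTGCGCGG"
B_LONG = "GCCTTGCCAGCCCGCTCAG" + "ACGAGCGGCCA"
B_SHORT = "TGGCCGCTCGT"

def short_matches_long(long_oligo, short_oligo):
    """True if the short oligo anneals to the 3' end of the long oligo,
    i.e. the pair forms a blunt double-stranded end for the ligase."""
    return long_oligo.endswith(revcomp(short_oligo))

if __name__ == "__main__":
    print("A+ adapter blunt end:", short_matches_long(A_LONG, A_SHORT))
    print("B+ adapter blunt end:", short_matches_long(B_LONG, B_SHORT))
```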
The A+-cap primer has the sequence:
5'-GCCTCCCTCGCGCCATCAG CCGCGCAGGTAAGCAGTGGTATCAACGCAGAGT-3' (SEQ. ID NO: 3) with an A primer sequence at the 5' end, a suppression tag in the middle, and a cap primer sequence at the 3' end.
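The three-part structure of the A+-cap primer can be illustrated by splitting it into its consecutive segments. This is a sketch only; the helper `annotate` is a hypothetical name introduced here, and the segment sequences are taken from the A+ long oligo and cap primer sequences given above.

```python
A_PRIMER = "GCCTCCCTCGCGCCATCAG"
SUPPRESSION_TAG = "CCGCGCAGGT"
CAP_PRIMER = "AAGCAGTGGTATCAACGCAGAGT"
A_CAP = "GCCTCCCTCGCGCCATCAGCCGCGCAGGTAAGCAGTGGTATCAACGCAGAGT"

def annotate(primer, named_parts):
    """Return (name, start, end) for parts that tile the primer 5'->3'.

    Raises ValueError if the parts do not match the primer exactly.
    """
    pos = 0
    layout = []
    for name, part in named_parts:
        if not primer.startswith(part, pos):
            raise ValueError(f"{name} not found at position {pos}")
        layout.append((name, pos, pos + len(part)))
        pos += len(part)
    if pos != len(primer):
        raise ValueError("parts do not cover the whole primer")
    return layout

if __name__ == "__main__":
    parts = [("A primer", A_PRIMER),
             ("suppression tag", SUPPRESSION_TAG),
             ("cap primer", CAP_PRIMER)]
    for name, start, end in annotate(A_CAP, parts):
        print(f"{name}: {start}-{end} {A_CAP[start:end]}")
```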
During this amplification, "suppression tags" invoke the PCR suppression effect for the fragments that end up flanked by the same kind of adapter, which results in exclusive amplification of the fragments flanked by both A and B primers. In these fragments, the B primer is found only on the "inside" of the original cDNA sequence (i.e., at fragmentation points introduced during sonication and/or nebulization) while the A primer can be either inside (by virtue of adaptor ligation) or "outside", i.e. flanking the original cDNA termini (by virtue of step-out amplification).
The entire step is summarized in an example schematic diagram shown in Figure 1. In Figure 1, a biotinylated A primer can be used to bind the fragments to beads and the B primer can serve as the sequencing primer. The suppression/barcoding tag of the B primer can be variable and can be used to discriminate samples that are sequenced simultaneously in the same plate. The barcode sequence can also be incorporated into the A+ adapter and/or the A+-cap primer. The method disclosed herein does not suffer from problems associated with improper adapter ligation or primer annealing and improves sequencing efficiency by eliminating the fragments with incorrect adapters (the same kind of adapter on both ends). Modification of the cDNA synthesis procedure avoids incorporation of long dT-stretches originating from the polyA tails of the mRNA, which otherwise would create problems during the pyrosequencing stage. cDNA fragments made with the methods and compositions disclosed herein bear the sequencing primer only on the ends corresponding to the fragmentation sites of the original mRNAs rather than the 5' or 3' termini, thus ensuring even coverage of the mRNA and efficient assembly and dramatically reducing the ballast fraction of total sequence output corresponding to 3'- and 5'-adaptor regions. cDNA samples can be "barcoded" by different adaptors and processed together in the same sequencing run.
Applications of the present invention include transcriptome sequencing de novo or transcriptome re-sequencing. Other applications include genetic marker discovery and profiling, gene expression analysis, molecular identification of unknown samples, and environmental genomics.
The present inventors demonstrated the surprising and unexpected results obtained using the procedure of the present invention by constructing two normalized cDNA libraries, one from larvae of the coral Acropora millepora and one from adult amphipod crustaceans Hyalella sp., followed by sequencing using the Roche 454 FLX system. The cDNA preparation procedure of the present invention consistently results in a number of reads exceeding that of published transcriptome-sequencing studies by a factor of two or more, and shows a remarkable improvement in the fraction of usable reads (i.e., sufficiently long high-quality pyrosequencing readouts with no polyA runs) (Table 1). Table 1 shows the comparison of the gross outputs of de novo transcriptome sequencing.
Table 1
[Table 1 appears as an image (imgf000011_0001) in the original publication.]
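The selection imposed by the suppression tags can be illustrated with a simplified model. The sketch below assumes, purely for illustration, that each fragment end independently receives an A+ or B+ adapter with a probability equal to that adapter's molar fraction, and it ignores the original cDNA termini that lack 5' phosphates; the function names are hypothetical.

```python
from itertools import product

def amplifiable(left, right):
    """Fragments flanked by the same adapter type are suppressed."""
    return left != right

def ligation_outcomes(frac_a=0.5):
    """Probability of each (left, right) adapter configuration,
    assuming independent ligation at the two ends (an assumption)."""
    probs = {"A": frac_a, "B": 1.0 - frac_a}
    return {(l, r): probs[l] * probs[r] for l, r in product("AB", repeat=2)}

def amplifiable_fraction(frac_a=0.5):
    """Fraction of doubly ligated fragments that escape suppression."""
    return sum(p for (l, r), p in ligation_outcomes(frac_a).items()
               if amplifiable(l, r))

if __name__ == "__main__":
    for frac in (0.3, 0.5, 0.7):
        print(f"A+ fraction {frac:.1f}: "
              f"amplifiable {amplifiable_fraction(frac):.2f}")
```

Under these assumptions the amplifiable fraction is greatest when the two adapters are equimolar, which is consistent with the approximately 1:1 adapter ratio used in the ligation step.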
Example 2. Preparation of cDNA samples for de novo transcriptome sequencing with 454 technology.
The preparation of appropriately modified cDNA is a critical step ensuring the overall success of transcriptome diversity characterization using next-generation sequencing methods. Example 2 is a method that has been adapted for use with 454 technology, with the primary focus on protein-coding transcriptome data assembly and annotation de novo (i.e., in the absence of reference genome data). This method generates pools of fragmented cDNAs flanked by two standard 454 amplification/sequencing primers, ready for amplification of individual sequences on microbeads and sequencing. The method requires as little as 50 ng total RNA at the start, and solves the three most important problems inherent in comparable protocols: artifacts due to long A/T homopolymer regions, a large proportion of unusable (adaptor) sequences in the 454 output, and coverage bias towards the 3'-termini of transcripts.
The developed method uses the PCR-suppression effect to eliminate problems associated with improper adapter ligation, primer annealing, and adaptor concatenation. Modification of the cDNA synthesis procedure avoids incorporation of long A/T-stretches originating from the polyA tails of the mRNA, which would create problems during the pyrosequencing stage. cDNA fragments in samples produced by this method bear the sequencing primer only on the ends corresponding to the fragmentation sites of the original mRNAs rather than the 5' or 3' termini, facilitating even coverage and further lowering the proportion of unusable adaptor sequences in the output. To further reduce the 3'-end bias, the method uses two approaches. First, the desired distribution of lengths within the originally produced cDNA can be achieved by varying the conditions of the amplification reaction (there is no physical separation procedure involved). Second, the final product is generated as three separate samples, specific to 3'-terminal, 5'-terminal, and middle cDNA fragments, which can then be mixed in a desired proportion or sequenced independently. To enable simultaneous sequencing of several samples, the method uses its own cDNA barcodes incorporated into adaptor sequences.
The present invention includes the following advantages: (1) requires a small amount of total RNA as a starting material; (2) high output of useful sequence due to elimination of adaptor-related artifacts (2-5 fold more new sequence data per run than in analogous published applications); (3) provides even coverage of the length of individual transcripts due to strategic placement of the sequencing primer and production of separate samples for 5', 3', and middle cDNA fragments; (4) eliminates the need for a strand-selection step prior to emulsion PCR due to the inherent control over adaptor configurations; and (5) allows simultaneous sequencing of several samples through adaptor barcoding. The initial cDNA is produced using the SMART cDNA amplification kit (Clontech) (Zhu et al., 2001) but with a different cDNA synthesis primer: Cap-Trsa-CV (first strand cDNA synthesis primer):
5'- AAGCAGTGGTATCAACGCAGAGT CGCAGTCGGTACTTTTTTCTTTTTTV - 3' ("cap" primer sequence) ("broken chain" polyT)
(SEQ. ID NO: 8)
The purpose of the "broken chain" T-primer is to reduce read artifacts during 454 pyrosequencing, which may get thrown out of calibration by a too strong signal produced from a long mononucleotide stretch (such as polyT or polyA). The cDNA is then: [optionally] normalized using the Trimmer kit (Evrogen) and re-amplified using the cap primer; nebulized or sonicated to an average fragment size of 500-1000 bp; end-polished (by incubation with a DNA polymerase and dNTPs in an appropriate buffer); and ligated to the mixture of "Atitn+" and "Btitn+" adapters.
Each of these adapters is an equimolar mixture of two oligos (typically, 1 μM each in the working concentration): a long one that actually gets ligated by its 3' end and a short one that is complementary to the 3' end of the longer one to mimic the double-stranded blunt end for the ligase. The short oligo is not ligated since it does not have a 5'-phosphate.
Atitn+ adapter:
Long oligo: 5'-TCCCTGCGTGTCTCCGACTCAG CCGCGCAGGT-3' (Atitn primer sequence; suppression tag + barcode) (SEQ. ID NO: 9)
Short oligo: 5'-ACCTGCGCGG-3' (SEQ. ID NO: 10)
This one has a CAG barcode. Here are some other possible variants of barcoded Atitn+ adaptors (pairs of long and short oligos):
GAC: TCCCTGCGTGTCTCCGACTCAG CCGCGGACGT ACGTCCGCGG (SEQ. ID NO: 11)
AGC: TCCCTGCGTGTCTCCGACTCAG CCGCGAGCGT ACGCTCGCGG (SEQ. ID NO: 12)
CGA: TCCCTGCGTGTCTCCGACTCAG CCGCGCGAGT ACTCGCGCGG (SEQ. ID NO: 13)
ACG: TCCCTGCGTGTCTCCGACTCAG CCGCGACGGT ACCGTCGCGG (SEQ. ID NO: 14)
GCA: TCCCTGCGTGTCTCCGACTCAG CCGCGGCAGT ACTGCCGCGG (SEQ. ID NO: 15)
CTG: TCCCTGCGTGTCTCCGACTCAG CCGCGCTGGT ACCAGCGCGG (SEQ. ID NO: 16)
CGT: TCCCTGCGTGTCTCCGACTCAG CCGCGCGTGT ACACGCGCGG (SEQ. ID NO: 17)
GTC: TCCCTGCGTGTCTCCGACTCAG CCGCGGTCGT ACGACCGCGG (SEQ. ID NO: 18)
GCT: TCCCTGCGTGTCTCCGACTCAG CCGCGGCTGT ACAGCCGCGG (SEQ. ID NO: 19)
Btitn+ adapter:
Long oligo: 5'-TGTGTGCCTTGGCAGTCTCAG ACGAGCGGCCA-3' (Btitn primer sequence; suppression tag) (SEQ. ID NO: 20)
Short oligo: 5'-TGGCCGCTCGT-3' (SEQ. ID NO: 21)
It is important to note that the adapters only get ligated to the "new" 5' ends formed as a result of fragmentation/polishing, since the original 5' termini correspond to the incorporated "cap" primer used for amplification and don't bear the 5' phosphates.
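The barcoded Atitn+ adaptor variants listed above appear to follow a regular pattern: the suppression tag reads CCGCG, then the three-base barcode, then GT, and the short oligo is the reverse complement of that tag. The following sketch generates an adapter pair from a barcode under that inferred pattern; the pattern itself is an observation drawn from the listed sequences rather than a stated rule, and the function name `barcoded_atitn_adapter` is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

ATITN_PRIMER = "TCCCTGCGTGTCTCCGACTCAG"

def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def barcoded_atitn_adapter(barcode):
    """Return (long_oligo, short_oligo) for a 3-base barcode.

    Assumes the tag layout CCGCG + barcode + GT inferred from the
    listed variants; this layout is an assumption of the sketch.
    """
    if len(barcode) != 3 or set(barcode) - set("ACGT"):
        raise ValueError("barcode must be three bases from A, C, G, T")
    tag = "CCGCG" + barcode + "GT"
    return ATITN_PRIMER + tag, revcomp(tag)

if __name__ == "__main__":
    for code in ("CAG", "GAC", "AGC"):
        long_oligo, short_oligo = barcoded_atitn_adapter(code)
        print(code, long_oligo, short_oligo)
```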
The protocol allows for independent amplification of fragment pools corresponding to 5'-ends, internal fragments and 3'-ends of the original cDNAs. These pools may then be either sequenced separately or mixed in a desired proportion to ensure even coverage. In particular, 5'-end samples are enriched with coding sequences and are especially useful for obtaining pilot gene hunting or phylogenetics data.
Three different primer combinations are used to amplify different cDNA ends. 3'-ends are amplified with Atitn and Btitn+TrsaC primers, internal fragments with Atitn and Btitn, and 5'-ends with Atitn and Btitn+halfswitch (see below for primer sequences). All primers are typically used at 0.1 μM concentration.
Atitn primer:
5'-TCCCTGCGTGTCTCCGACTCAG-3' (SEQ. ID NO: 22)
Btitn primer:
5'-TGTGTGCCTTGGCAGTCTCAG-3' (SEQ. ID NO: 23)
Btitn+halfswitch primer:
5'-TGTGTGCCTTGGCAGTCTCAG ACGAGCGGCCA GTATCAACGCAGAGTACATGG-3' (Btitn primer sequence; suppression tag; sequence of the 3'-portion of the template-switch oligo) (SEQ. ID NO: 24)
Btitn+TrsaC:
5'-TGTGTGCCTTGGCAGTCTCAG ACGAGCGGCCA CGCAGTCGGTACTTTTTTCTTTTTT-3' (Btitn primer sequence; suppression tag; sequence of the 3'-portion of the "broken chain" cDNA synthesis primer) (SEQ. ID NO: 25)
During this amplification, "suppression tags" invoke the PCR suppression effect for the fragments that end up flanked by the same kind of adapter, which results in exclusive amplification of the fragments flanked by both Atitn and Btitn primers. In these fragments the Atitn primer is found only on the "inside" of the original cDNA sequence (i.e., at fragmentation points introduced during sonication or nebulization) while the Btitn primer can be either inside (by virtue of adaptor ligation) or "outside", i.e. flanking the original cDNA termini (by virtue of step-out amplification). Such strategic positioning of the sequencing primer (Atitn) in the final sample eliminates the need for a strand-selection step prior to emulsion PCR and further improves the evenness of coverage. As the last stage of the protocol, the products of amplification corresponding to the size range 500-1000 bp are purified from the agarose gel.
The following detailed protocol describes the basic steps of the present invention as outlined in Figure 2.
RNA template preparation. These steps are recommended but may not be necessary, depending on your protocol of choice for isolating total RNA. Begin with about 0.5-1 μg RNA from the organism of your choice (note: the latest version of Clontech's SMART kit claims the amount can be as low as 50 ng). Precipitate RNA by adding 1 volume of 13.3 M LiCl, incubating 30 minutes at -20°C, and centrifuging 20 minutes at 16g at room temperature. Rinse RNA pellets briefly with 80% ethanol (don't centrifuge), air dry at room temperature, and dissolve pellets in EB (10 mM Tris, pH 8.0).
Analyze RNA on a gel to evaluate integrity.
First-strand cDNA synthesis (at this and the next stage, follow Clontech's SMART cDNA amplification protocol, but replace the cDNA synthesis primer with Cap-TRSA-CV).
1. Combine 4 μl RNA (for a total of 1000 ng RNA) with 1 μl 10 μM Cap-TRSA-CV primer. Incubate 3 minutes at 65°C, then chill on ice.
2. To the above tube, add a premixed solution containing the following: 2 μl 5X first-strand synthesis buffer; 0.5 μl 10 mM dNTP; 1 μl 0.1 M DTT; 1 μl 10 μM template-switch primer (provided with the Clontech kit); 1 μl Superscript II reverse transcriptase (Invitrogen).
3. Incubate at 42°C for 1 hour.
4. Terminate the reaction by incubating at 65°C for 15 minutes, then return tube to ice.
5. Dilute 5-fold in water to minimize carryover of primers into subsequent reactions.
cDNA amplification. For each first-strand cDNA sample, set up 12 PCR reactions (30 μl each): 3 μl diluted FS-cDNA (from step 2e); 21 μl H2O; 3 μl 10X PCR buffer; 0.75 μl 10 mM dNTP; 1.4 μl 10 μM cDNA amplification primer (from Clontech's SMART cDNA amplification kit); 0.6 μl Advantage2 polymerase (Clontech).
Optional: use 1.5 μl of 10 μM Lu4sCap primer for amplification instead of the primer supplied in the Clontech kit to obtain a higher molecular weight product (>1.5-2 kb), due to a mild PCR-suppression effect. Lu4sCap primer:
5'-AGTGGACTATCCATGAACGCAAAGCAGTGGTATCAACGCAGAGT-3' (SEQ. ID NO: 26)
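Since the same 30 μl recipe is set up 12 times per sample, it can help to scale it into a master mix and to check the final primer concentration. The sketch below uses the per-reaction volumes listed for the cDNA amplification; the function names and the optional `overage` parameter are illustrative assumptions, not part of the protocol.

```python
# Per-reaction volumes in microliters, copied from the cDNA amplification recipe.
PER_REACTION_UL = {
    "diluted FS-cDNA": 3.0,
    "H2O": 21.0,
    "10X PCR buffer": 3.0,
    "10 mM dNTP": 0.75,
    "10 uM cDNA amplification primer": 1.4,
    "Advantage2 polymerase": 0.6,
}


def master_mix(per_reaction: dict, n_reactions: int, overage: float = 0.0) -> dict:
    """Scale every component; `overage` (e.g. 0.1 for 10% extra) is an assumption."""
    factor = n_reactions * (1.0 + overage)
    return {name: round(volume * factor, 2) for name, volume in per_reaction.items()}


def final_concentration(stock: float, added_ul: float, total_ul: float) -> float:
    """C1 * V1 = C2 * V2 solved for C2 (result is in the units of `stock`)."""
    return stock * added_ul / total_ul


mix = master_mix(PER_REACTION_UL, n_reactions=12)
for name, volume in mix.items():
    print(f"{name}: {volume} ul")

# Listed components add up to slightly under the nominal 30 ul.
print(sum(PER_REACTION_UL.values()))
# Final primer concentration in a nominal 30 ul reaction, in uM.
print(round(final_concentration(10.0, 1.4, 30.0), 3))
```

The listed volumes sum to 29.75 μl, so the "30 μl" reaction volume is nominal.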
1. Amplify using the following profile: 94°C for 5 minutes; (94°C for 40 seconds, 65°C for 1 minute, 72°C for 6 minutes) x (15-19) cycles, depending on the sample (Lu4sCap primer may require more cycles, up to 25).
2. After PCR, hold product at room temperature.
3. Evaluate PCR product by loading 3 μl on a gel and visualizing with ethidium bromide. There should be a faintly visible smear with some bands, with the majority of product falling between 500 and 3000 bp in length. Add 3 more cycles if there is nothing visible on the gel, then evaluate again. If the product is not amplified in 20 cycles (25 for Lu4sCap), something is wrong; start over from the cDNA synthesis step. NOTE: the total amount of cDNA product per tube should not exceed 200 ng, which means that the smear on the agarose gel (20 ng per lane) should be really faint. Make sure you don't over-amplify cDNA beyond that.
4. To maximize the amount of PCR product that is double-stranded, "chase" the reactions by adding the original amount of primer again (1.4 μl of 10 μM cDNA amplification primer) and cycling with the following profile: 78°C for 1 minute, 65°C for 1 minute, 72°C for 7 minutes.
5. Combine the 12 separate reactions prepared from each first-strand cDNA sample, and purify this PCR product on a column (we use the Qiagen Qiaquick PCR Purification kit). Elute the final sample in 50-100 μl of EB (10 mM Tris-HCl, pH 8.0). Measure the concentration of DNA using a Nanodrop spectrofluorometer or any other appropriate method; there should be at least 2 μg of DNA in total. Then go directly to Sonication (step 6) or do the optional Normalization step.
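The yield figures quoted in this section are mutually consistent, which a short calculation makes explicit. The sketch below is illustrative only (constant and function names are hypothetical) and ignores losses on the purification column.

```python
MAX_NG_PER_TUBE = 200     # upper limit per 30 ul reaction, from the NOTE above
REACTION_UL = 30
GEL_LOAD_UL = 3
N_REACTIONS = 12
REQUIRED_TOTAL_NG = 2000  # "at least 2 ug of DNA in total"

# DNA mass in a gel lane when a tube at the 200 ng limit is loaded.
ng_per_lane = MAX_NG_PER_TUBE * GEL_LOAD_UL / REACTION_UL

# Most DNA that 12 pooled tubes can hold without any tube exceeding the limit.
max_pooled_ng = MAX_NG_PER_TUBE * N_REACTIONS


def max_cycles(lu4scap: bool) -> int:
    """Cycle ceiling after which the protocol says to restart from cDNA synthesis."""
    return 25 if lu4scap else 20


print(ng_per_lane)                          # ng per lane at the limit
print(max_pooled_ng >= REQUIRED_TOTAL_NG)   # is the 2 ug target reachable?
print(max_cycles(lu4scap=False), max_cycles(lu4scap=True))
```

Loading 3 μl from a tube at the 200 ng limit gives the 20 ng per lane mentioned in the note, and 12 such tubes hold at most 2.4 μg, so the 2 μg target leaves little margin.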
Normalization (optional)
1. EtOH precipitate the product to concentrate it (i.e., if the resulting concentration is less than 2 μg in 12 microliters) and dissolve it in an appropriate volume of milliQ water (but don't use water to elute DNA from the column in the previous step!). Add 5 μl (1/10 volume) of 3 M NaAcetate, pH 4.8-5.2, and 125 μl (2.5 volumes) of 96-100% EtOH; hold 20-30 minutes at -20°C; spin 20 minutes at maximum speed at 4°C; rinse the pellet with 70% EtOH; air dry; and dissolve in an appropriate volume of milliQ water to achieve a concentration of 2 μg in 12 microliters.
2. The Trimmer kit from Evrogen is used essentially according to the manufacturer's instructions; here we are just replicating their protocol. Prepare a hybridization master mix by combining: 2 μg cDNA from step 3f in < 12 μl volume; 4 μl 4X hybridization buffer; H2O to a total volume of 16 μl. (Note that final cDNA concentration = 125 ng/μl.)
3. Aliquot this out into 4 individual PCR tubes (4 μl each) and overlay each with a drop of sterile mineral oil; centrifuge briefly to collect liquid and separate phases.
4. Using a thermal cycler, incubate at 98°C for 2 minutes, then at 68°C for 5 hours, then proceed immediately to the next step.
5. Near the end of the hybridization period (step 4d), warm the DSN master buffer (Trimmer kit) to 68°C.
6. Prepare 1/2 and 1/4 strength dilutions of the double-strand specific nuclease (DSN) using DSN storage buffer as the diluent; store on ice until ready to use.
7. At the end of the hybridization period, add 5 μl preheated master buffer to each tube. Spin briefly in a bench-top centrifuge and return immediately to the thermal cycler. It is important to maintain the temperature at 68°C during this period, so minimize time spent out of the thermal cycler (no more than a few seconds).
8. To the four tubes from step 4c, add the following, while maintaining temperature:
Tube Add
A 1 μl undiluted DSN enzyme
B 1 μl 1/2 dilution DSN enzyme
C 1 μl 1/4 dilution DSN enzyme
D 1 μl DSN storage buffer (diluent)
9. Incubate at 68°C for 25 minutes.
10. Add 10 μl of DSN stop solution (Trimmer kit) to each tube, mix well, and spin briefly to collect contents.
11. Incubate at 68°C an additional 5 minutes.
12. Add 20 μl H2O to each tube, then store at -20°C or proceed with next steps.
Amplification of normalized cDNA
1. Set up 4 separate PCR reactions, one per DSN treatment, each containing: 1 μl diluted normalized cDNA (from step 4l); 23 μl H2O; 3 μl 10X PCR buffer; 0.75 μl 10 mM dNTP; 1.4 μl 10 μM cDNA amplification primer (from Clontech's SMART cDNA amplification kit); 0.6 μl Advantage2 polymerase (Clontech).
2. Amplify using the following profile: 94°C for 5 minutes; (94°C for 40 seconds, 65°C for 1 minute, 72°C for 6 minutes) x 5 cycles.
3. Remove all tubes from thermal cycler. Remove a 5-μl aliquot from the control tube (corresponding to template tube D, in step 4h) and set this aside.
4. Amplify the control tube for an additional 2 cycles (total = 7). Remove another 5-μl aliquot and set aside.
5. Repeat step 5d twice more, producing aliquots from this tube that correspond to 5, 7, 9 and 11 cycles.
6. Load all aliquots from step 5e on a gel to evaluate the optimum cycle number X as described in the manufacturer's instructions (for our experiments, X = 6).
7. Return DSN-containing reactions to the thermal cycler and amplify for an additional N cycles, where N = X + 9 - 5 (for our experiments, X + 9 - 5 = 10, for 15 cycles total in experimental tubes).
8. "Chase" all reactions as described in step 3d.
9. Load 5 μl on a gel to determine which enzyme dilution treatment (1, 1/2, or 1/4) gave the best results, as described in Trimmer kit instructions.
10. Once both the optimum cycle number (step 5g) and the optimum enzyme treatment (step 5i) have been established, prepare 16 individual 30-μl reactions according to those treatments and repeat steps 5a-i. Again, avoid over-amplifying the cDNA (see note at step 3c).
11. Pool the products, purify on a column (e.g., Qiagen Qiaquick), elute in EB, and quantify. Normalized cDNA can be stored at -20°C.
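The cycle bookkeeping in the steps above (5 initial cycles, control aliquots every 2 cycles, then N = X + 9 - 5 further cycles) can be written out as a small sketch. The names below are hypothetical; the constants are the ones given in the steps.

```python
INITIAL_CYCLES = 5  # cycles already run on every tube before the control test
OFFSET = 9          # the "+ 9" term in N = X + 9 - 5


def control_aliquot_cycles(start: int = 5, step: int = 2, n_aliquots: int = 4) -> list:
    """Cycle numbers at which aliquots are taken from the control tube."""
    return [start + step * i for i in range(n_aliquots)]


def additional_cycles(x_opt: int) -> int:
    """N = X + 9 - 5: cycles still to run on the DSN-treated tubes."""
    return x_opt + OFFSET - INITIAL_CYCLES


x_opt = 6  # optimum cycle number reported for the authors' experiments
n_more = additional_cycles(x_opt)
total = INITIAL_CYCLES + n_more

print(control_aliquot_cycles())
print(n_more, total)
```

With X = 6 the formula gives 10 additional cycles, which together with the 5 initial cycles yields the 15 cycles total stated for the experimental tubes.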
Fragmentation (sonication). In certain circumstances sonication can be used instead of nebulization to fragment the cDNA, since it makes it easier to process multiple samples at once and poses less threat of DNA contamination. Sonication was conducted with a "cup horn" attachment: a water-filled cup with a sonicating bottom in which 1.5 mL tubes may be submerged. Our model is the "ultrasonic liquid processor Sonicator 3000" by Misonix, with cup horn part number 431C.
1. Prepare a tube of normalized (optional), amplified, purified cDNA (from step 3e or 5k) containing ~ 1 - 5 μg cDNA in 100 μl. Dilute with EB if required to achieve this concentration (~ 50 ng/μl).
2. Set aside an aliquot of intact cDNA at this time for later gel analysis.
3. Set up a sonicator with an ice water bath so that a 1.5-ml centrifuge tube can be partially submerged in the water, with the bottom of the tube resting ~ 1 cm above the cup horn bottom, and the portion of that tube containing liquid fully submerged in the water.
4. Set the sonicator power at 1.0-1.5, corresponding to 18-30 W.
5. Sonication should be done in 30-second "on" bursts, with 30-second "off" rests in between. Note that sonication times are reported here as the sum of all "on" periods during the process.
6. Sonicate the cDNA for a series of increasing durations, and remove an aliquot at each interval. In our experiments, we choose 1, 3, 5, 7 and up to 10 minutes.
7. After all sonication is complete, load 2-3 μl of each sample (including the original intact cDNA) on a gel to evaluate the molecular weight. Select the treatment that produced a smear ranging from about 500 to about 2000 bp. In our experiments, this is commonly the 7-9 minute treatment.
8. Precipitate the fragmented cDNA with ethanol to remove very short oligonucleotides, and dissolve in 10-20 μl of a suitable buffer (EB or 1X NEB2).
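Because sonication times are reported as the sum of the "on" periods, the elapsed bench time is roughly double. The sketch below converts a reported "on" time into a number of bursts and an elapsed time; the function name is hypothetical, and it assumes that no rest is needed after the final burst.

```python
BURST_S = 30  # length of one "on" burst, seconds
REST_S = 30   # length of one "off" rest between bursts, seconds


def sonication_plan(on_minutes: float) -> tuple:
    """Return (number of bursts, elapsed seconds) for a total "on" time.

    Assumes the last burst is not followed by a rest.
    """
    bursts = int(round(on_minutes * 60 / BURST_S))
    elapsed = bursts * BURST_S + max(bursts - 1, 0) * REST_S
    return bursts, elapsed


# Time points used in the authors' experiments (total "on" minutes).
for minutes in (1, 3, 5, 7, 10):
    bursts, elapsed = sonication_plan(minutes)
    print(f"{minutes} min on: {bursts} bursts, {elapsed / 60:.1f} min elapsed")
```

For example, a 7-minute treatment corresponds to 14 bursts and 13.5 minutes of elapsed time under this assumption.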
Polishing and ligation with adaptors.
1. Polish the fragmented cDNA to ensure that all ends are blunted, by combining the following in a tube at room temperature: 25 ng fragmented cDNA (from step 6h); 1.25 μl 10X NEB2 buffer; 1.25 μl 10X BSA; 0.6 μl 10 mM dNTP; 0.6 μl T4 DNA polymerase; 0.6 μl Klenow fragment of DNA polymerase I (New England Biolabs or equivalent); H2O to final volume = 12.5 μl.
2. Incubate at room temperature for 1½ hours.
3. Terminate the polishing reaction by incubating at 70°C for 15 minutes, then cool to room temperature.
4. Prepare adaptor Atitn+ by combining the Atitn+ bar-coded primer and the anti-Atitn+ bar-coded primer at a final concentration of 10 μM each. Prepare adaptor Btitn+ in the same way from Btitn+ and anti-Btitn+, at a final concentration of 10 μM each.
5. Prepare ligation master mixes at room temperature by combining: 5 μl H2O; 1.25 μl 10X T4 DNA ligase buffer; 2.5 μl 10 μM adaptor Atitn+ (bar-coded); 2.5 μl 10 μM adaptor Btitn+; 1.25 μl T4 DNA ligase.
6. Combine 12.5 μl master mix with 12.5 μl polished cDNA (from step 7c) for a final volume of 25 μl.
7. Incubate at 12°C overnight.
8. The following day incubate ligation mix 10 minutes at 65°C, then cool to room temperature - do not store on ice.
9. Purify on a column (e.g., Qiagen Qiaquick) according to the manufacturer's instructions, and elute in 30 μl EB.
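As a quick consistency check on the ligation recipe above, the master mix components should add up to the 12.5 μl that is combined with the polished cDNA, and the two adaptors should end up equimolar. The sketch below does that arithmetic; the dictionary keys and variable names are illustrative only.

```python
# Ligation master mix volumes in microliters, copied from step 5 above.
LIGATION_MASTER_MIX_UL = {
    "H2O": 5.0,
    "10X T4 DNA ligase buffer": 1.25,
    "10 uM adaptor Atitn+ (bar-coded)": 2.5,
    "10 uM adaptor Btitn+": 2.5,
    "T4 DNA ligase": 1.25,
}
POLISHED_CDNA_UL = 12.5
ADAPTOR_STOCK_UM = 10.0

master_mix_ul = sum(LIGATION_MASTER_MIX_UL.values())
final_volume_ul = master_mix_ul + POLISHED_CDNA_UL

# Final concentration of each adaptor in the ligation (C1 * V1 / V2), in uM.
atitn_final_um = ADAPTOR_STOCK_UM * LIGATION_MASTER_MIX_UL["10 uM adaptor Atitn+ (bar-coded)"] / final_volume_ul
btitn_final_um = ADAPTOR_STOCK_UM * LIGATION_MASTER_MIX_UL["10 uM adaptor Btitn+"] / final_volume_ul

print(master_mix_ul, final_volume_ul)
print(atitn_final_um, btitn_final_um, atitn_final_um / btitn_final_um)
```

Each adaptor is present at 1 μM in the 25 μl ligation, a 1:1 molar ratio.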
PCR testing the ligation.
1. For each 454 cDNA library produced, prepare 5 different PCR reactions, each with a different combination of primers. Including water controls (no-template controls) is recommended. The primer combinations are as follows (final primer concentrations are shown):
Tube Primers
1 0.2 μM Atitn
2 0.2 μM Btitn
3 0.1 μM Atitn, 0.1 μM Btitn
4 0.1 μM Atitn, 0.1 μM Btitn+halfswitch
5 0.1 μM Atitn, 0.1 μM Btitn+TrsaC
The reactions 3, 4, and 5 specifically amplify internal fragments, 5'-ends, and 3'-ends of the original cDNAs, respectively.
2. Each PCR reaction is assembled as follows: 3 μl 10X PCR buffer; 0.75 μl 10 mM dNTP; 1 μl ligation product (from Polishing and Ligation step 9); 0.6 μl Advantage2 polymerase; primers (from step 1 of PCR testing the ligation); H2O to final volume of 30 μl.
3. Amplify these reactions using the following profile: 94°C for 5 minutes; (94°C for 40 seconds, 65°C for 1 minute, 72°C for 1 minute) x 17 cycles. Because the targeted product is typically 500-1000 bp in length, 1 minute of elongation time is enough.
4. Load 3 μl of these products on a gel; hold the remainder at room temperature while the gel runs in case additional cycles are required.
5. A smear ranging from 300-2000 bp should be visible in reactions #3, #4 and #5. Neither of the other two reactions should produce any product.
6. If nothing is visible in any lanes, amplify for an additional 2 cycles and repeat the gel analysis.
7. Repeat previous step until visible smears are produced to allow determination of optimum cycle number. If more than 17 cycles were required to produce visible smears, try adding more template and using fewer cycles. In our experiments, 1 μl of purified ligation product as PCR template and 17 cycles produced visible smears based on loading 3 μl on a gel.
Amplification of samples for gel extraction
1. Set up "bulk" amplifications based on the optimum cycle numbers and template volumes determined from the PCR tests above (steps 8a-g). For our experiments, we set up 8 reactions.
2. Each reaction is assembled as follows: 3 μl 10X PCR buffer; 0.75 μl 10 mM dNTP; X μl ligation product (determined in PCR testing the ligation, steps 1-6); 0.5 μl 6 μM primer Atitn; 0.5 μl 6 μM primer Btitn OR Btitn+halfswitch OR Btitn+TrsaC; 0.6 μl Advantage2 polymerase; H2O to final volume of 30 μl.
3. Amplify the reactions using the following profile: 94°C for 5 minutes; (94°C for 40 seconds, 65°C for 1 minute, 72°C for 1 minute) x N cycles, where N is the optimum cycle number determined (8a-g).
4. Load an aliquot on a gel to verify that the reaction amplified as expected.
5. Chase using the following profile: 78°C for 1 minute, 65°C for 1 minute, 68°C for 1 minute.
6. PCR purify using a column (e.g., Qiagen Qiaquick), elute in 30 μl EB, and quantify.
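The primer volumes in the bulk reactions follow from the usual dilution relation C1·V1 = C2·V2. The sketch below (hypothetical helper names) confirms that 0.5 μl of a 6 μM stock in 30 μl gives the 0.1 μM final concentration used in the test PCRs, and shows the reverse calculation for other stock concentrations.

```python
def final_concentration_um(stock_um: float, added_ul: float, total_ul: float) -> float:
    """Final concentration after diluting `added_ul` of stock into `total_ul`."""
    return stock_um * added_ul / total_ul


def volume_to_add_ul(stock_um: float, target_um: float, total_ul: float) -> float:
    """Volume of stock needed to reach `target_um` in a `total_ul` reaction."""
    return target_um * total_ul / stock_um


# Values from the bulk amplification recipe: 0.5 ul of 6 uM primer in 30 ul.
print(round(final_concentration_um(6.0, 0.5, 30.0), 3))

# Reverse calculation; the 10 uM stock here is a hypothetical alternative.
print(round(volume_to_add_ul(6.0, 0.1, 30.0), 3))
print(round(volume_to_add_ul(10.0, 0.1, 30.0), 3))
```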
Gel purification of final samples. For the Titanium 454 procedure it is extremely important to have DNA fragments within the 500-1000 bp length range. The following protocol is a modified version of standard agarose electrophoresis, which improves separation due to the buffer concentration gradient forming in the gel.
1. Make a 1% agarose gel using SeaKem GTG Agarose (Lonza 50071)
2. Put the gel in the apparatus; pour 1x TBE buffer into the "lower" (cathode) chamber and 0.5x TBE buffer (1x TBE diluted two-fold with water) into the "upper" (anode) chamber. Take care not to mix the buffers; the buffer should not cover the top of the gel. Wash the wells with 1x TBE. Pre-run the gel for 10 minutes at 100 V.
3. Load all DNA from step 9f combined with 6x loading dye. There will be 3 kinds of samples: 5' ends, 3' ends and the "middles". For each of them you might need more than one gel load, as the usual amount of DNA extracted from the gel is around 200 ng per 1-cm-wide lane, and you want to get 1 μg of material in total in the end.
4. Run at 100 V for 1 hour 15 minutes or the optimum time.
5. Cut out the pieces of gel containing the smear between 500 bp and 1000 bp, avoiding the edges of the lane. Note: ethidium bromide may be used in the buffer and the gel, with the gel viewed on a UV transilluminator, but any appropriate staining/visualizing method can be used. If you are using UV, minimize exposure of the gel to UV during cutting to avoid damaging your samples.
6. Extract the DNA from the gel. We used the QIAEX II Gel Extraction Kit (Qiagen, 20021). At the last step, elute in a smaller volume of 10 mM Tris or EB buffer. We used 15 μl + 5 μl for a total volume of 20 μl. Then spin one more time to clear the eluate of any residual DNA-binding beads. This gives a higher concentration and a more precise reading on the Nanodrop spectrophotometer.
7. Quantify it and mix in desirable proportions (or keep separate).
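Two small calculations follow from the steps above: how many gel lanes are needed to recover 1 μg at about 200 ng per lane, and what volume of each purified pool to combine for a chosen mixing proportion. The sketch below is illustrative; the pool concentrations and the 1:1:1 proportion are made-up example values, not data from the source.

```python
import math

NG_PER_LANE = 200  # typical recovery per 1-cm-wide lane, from step 3
TARGET_NG = 1000   # 1 ug of final material

lanes_needed = math.ceil(TARGET_NG / NG_PER_LANE)


def pooling_volumes(conc_ng_per_ul: dict, proportions: dict, total_ng: float) -> dict:
    """Volume (ul) of each pool so that pools contribute `proportions` of `total_ng` by mass."""
    weight_sum = sum(proportions.values())
    return {
        name: round(total_ng * proportions[name] / weight_sum / conc_ng_per_ul[name], 2)
        for name in proportions
    }


# Hypothetical measured concentrations (ng/ul) and an equal-mass mix.
example_conc = {"5' ends": 50.0, "middles": 40.0, "3' ends": 25.0}
example_mix = {"5' ends": 1, "middles": 1, "3' ends": 1}
volumes = pooling_volumes(example_conc, example_mix, total_ng=300)

print(lanes_needed)
print(volumes)
```

Proportions are expressed by mass here; mixing by molar amount would additionally require the mean fragment length of each pool.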
Now the sample is ready for 454. NOTE: For use in the 454 process, it is best if the final cDNA sample is tested to confirm that it is free from artifacts. Ligate an aliquot of it into any PCR-cloning vector (such as pGEM-T, Promega) and sequence 10-20 randomly picked clones using the standard Sanger technique.
It is contemplated that any embodiment discussed in this specification can be implemented with respect to any method, kit, reagent, or composition of the invention, and vice versa. Furthermore, compositions of the invention can be used to achieve methods of the invention. It will be understood that particular embodiments described herein are shown by way of illustration and not as limitations of the invention. The principal features of this invention can be employed in various embodiments without departing from the scope of the invention. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, numerous equivalents to the specific procedures described herein. Such equivalents are considered to be within the scope of this invention and are covered by the claims.
All publications and patent applications mentioned in the specification are indicative of the level of skill of those skilled in the art to which this invention pertains. All publications and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
The use of the word "a" or "an" when used in conjunction with the term "comprising" in the claims and/or the specification may mean "one," but it is also consistent with the meaning of "one or more," "at least one," and "one or more than one." The use of the term "or" in the claims is used to mean "and/or" unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and "and/or." Throughout this application, the term "about" is used to indicate that a value includes the inherent variation of error for the device, the method being employed to determine the value, or the variation that exists among the study subjects.
As used in this specification and claim(s), the words "comprising" (and any form of comprising, such as "comprise" and "comprises"), "having" (and any form of having, such as "have" and "has"), "including" (and any form of including, such as "includes" and "include") or "containing" (and any form of containing, such as "contains" and "contain") are inclusive or open-ended and do not exclude additional, unrecited elements or method steps.
The term "or combinations thereof" as used herein refers to all permutations and combinations of the listed items preceding the term. For example, "A, B, C, or combinations thereof" is intended to include at least one of: A, B, C, AB, AC, BC, or ABC, and if order is important in a particular context, also BA, CA, CB, CBA, BCA, ACB, BAC, or CAB. Continuing with this example, expressly included are combinations that contain repeats of one or more item or term, such as BB, AAA, AAB, BBC, AAABCCCC, CBBAAA, CABABB, and so forth. The skilled artisan will understand that typically there is no limit on the number of items or terms in any combination, unless otherwise apparent from the context.
All of the compositions and/or methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the compositions and/or methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.
REFERENCES
Vera JC, Wheat CW, Fescemyer HW, Frilander MJ, Crawford DL, et al. (2008) Rapid transcriptome characterization for a nonmodel organism using 454 pyrosequencing. Molecular Ecology 17: 1636-1647.
Weber APM, Weber KL, Carr K, Wilkerson C, Ohlrogge JB (2007) Sampling the arabidopsis transcriptome with massively parallel pyrosequencing. Plant Physiology 144: 32-42.
Zhu YY, Machleder EM, Chenchik A, Li R, Siebert PD (2001) Reverse transcriptase template switching: A SMART (TM) approach for full-length cDNA library construction. Biotechniques 30: 892-897.

Claims

What is claimed is:
1. A method for preparing a cDNA sample for sequencing comprising the steps of: creating a double stranded cDNA by annealing a RNA having a poly A tail with a Cap-Trsa-CV oligonucleotide and reverse transcribing the RNA resulting in a full length double stranded cDNA; fragmenting the full length double stranded cDNA to generate a plurality of fragmented double stranded cDNA; ligating a double stranded A+ adapter or a double stranded B+ adapter to a first end of each fragmented double stranded cDNA and a double stranded A+ adapter or a double stranded B+ adapter to a second end of each fragmented double stranded DNA using a ligase; and amplifying the ligated double stranded cDNA using an amplification mixture comprising a primer A, a primer B, a A+-cap primer and a DNA polymerase.
2. The method of claim 1, wherein the Cap-Trsa-CV oligonucleotide comprises a cap primer sequence at the 5' end and a broken poly T stretch region at the 3' end, wherein the broken poly T stretch region comprises two poly T regions separated by at least one base residue selected from dA, dC, and dG.
3. The method of claim 1, wherein the Cap-Trsa-CV oligonucleotide comprises SEQ. ID. NO: 1.
4. The method of claim 2, wherein the cap primer sequence comprises SEQ ID NO. 2.
5. The method of claim 1, wherein the A+-cap primer comprises SEQ. ID. NO: 3.
6. The method of claim 1, wherein the A+ adapter comprises: an A+ long oligonucleotide having a first suppression tag at the 3' end and an A+ primer sequence at the 5' end; and an A+ short oligonucleotide comprises oligonucleotide complementary to the first suppression tag.
7. The method of claim 1, wherein the B+ adapter comprises: a B+ long oligonucleotide having a second suppression tag at the 3' end and a B+ primer region at the 5' end; and a B+ short oligonucleotide comprises oligonucleotide complementary to the second suppression tag.
8. The method of claim 1, wherein the B+ long oligonucleotide further comprises a bar-coding tag.
9. The method of claim 8, wherein the bar-coding tag comprises biotin.
10. The method of claim 1, wherein the step of fragmentation uses sonication.
11. The method of claim 1, wherein the step of ligation uses a molar ratio between about 0.9 to about 1.1 for the A+ adapter to B+ adapter.
12. The method of claim 1, wherein the step of amplification uses a molar ratio of between about 0.9 - 1.1 to about 1 for the primer A: primer B to A+-cap primer.
13. The method of claim 1, further comprising the step of amplifying the 5' end separately with an Atitn primer, a Btitn primer and a halfswitch.
14. The method of claim 1, further comprising the step of amplifying the 3' end separately with an Atitn primer, a Btitn primer and a TrsaC.
15. The method of claim 1, further comprising the step of amplifying the internal fragments with an Atitn primer and a Btitn primer.
16. A pair of adapter oligonucleotides for amplification comprising: an A+ adapter and a B+ adapter, each comprising a long strand and a short strand and wherein each is capable of ligating to a first end or a second end of a fragmented double stranded cDNA, wherein: the long strand of the A+ adapter comprises an A primer region at the 5' end and a first suppression tag region at the 3' end, and the long strand of the B+ adapter comprises a B primer region at the 5' end and a second suppression tag region at the 3' end, and each of the first and second suppression tag regions cause a PCR suppression effect of the double stranded cDNA with A+ adapter and the double stranded cDNA with B+ adapter; and the combination of the A+ adapter, the B+ adapter and a fragmented double stranded cDNA in the presence of a primer cocktail results in that only the double stranded cDNA with both A+ and B+ adapter are capable of being amplified.
17. The oligonucleotides of claim 16, wherein the primer cocktail comprises A primer, B primer, and A+-cap primer.
18. The oligonucleotides of claim 16, wherein the first and second suppression tag regions comprise the same sequence.
19. The oligonucleotides of claim 16, wherein either the long strand of the A+ or the B+ adapter further comprises a bar-coding tag.
20. The oligonucleotides of claim 16, wherein either the long strand of the A+ or the B+ adapter further comprises a biotin tag.
21. The oligonucleotides of claim 16, wherein the long strand A+ adapter is selected from SEQ ID NO: 4 or NO: 5.
22. The oligonucleotides of claim 16, wherein the long strand B+ adapter is selected from SEQ ID NO: 6 or 7.
23. The oligonucleotides of claim 17, wherein the A+-cap primer comprises SEQ. ID NO: 3.
24. The pair of adapter double strand oligonucleotide of claim 16, wherein the molar ratios of the A+ adapter: B+ adapter during the ligation step comprises about 0.9 to 1.1 : about 0.9 to 1.1.
25. The pair of adapter double strand oligonucleotide of claim 17, wherein the molar ratio of A primer, B primer, and A+-cap primer comprises about 0.9 to 1.1 : about 0.9 to 1.1 : about 0.09 to 0.11.
26. A method for preparing a cDNA sample for sequencing comprising the steps of: creating a double stranded cDNA by annealing a RNA having a poly A tail with a Cap-Trsa-CV oligonucleotide and reverse transcribing the RNA resulting in a full length double stranded cDNA; fragmenting the full length double stranded cDNA to generate a plurality of fragmented double stranded cDNA; ligating a double stranded A+ adapter or a double stranded B+ adapter to a first end of each fragmented double stranded cDNA and a double stranded A+ adapter or a double stranded B+ adapter to a second end of each fragmented double stranded DNA using a ligase; and amplifying the ligated double stranded cDNA using an amplification mixture comprising a primer A, a primer B, a A+-cap primer, a Btitn-halfswitch primer and a Btitn-TrsaC primer and a DNA polymerase.
PCT/US2009/003331 2008-05-30 2009-06-01 Methods and compositions for nucleic acid sequencing WO2009148560A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US5760708P 2008-05-30 2008-05-30
US61/057,607 2008-05-30

Publications (3)

Publication Number Publication Date
WO2009148560A2 WO2009148560A2 (en) 2009-12-10
WO2009148560A3 WO2009148560A3 (en) 2010-03-11
WO2009148560A8 WO2009148560A8 (en) 2010-07-22

Family

ID=41398719

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/003331 WO2009148560A2 (en) 2008-05-30 2009-06-01 Methods and compositions for nucleic acid sequencing

Country Status (2)

Country Link
US (1) US20100120097A1 (en)
WO (1) WO2009148560A2 (en)

Cited By (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102311948A (en) * 2010-07-07 2012-01-11 霍夫曼-拉罗奇有限公司 Clone in the emulsion increases in advance
WO2013068528A1 (en) * 2011-11-10 2013-05-16 Harry Cuppens Methods for determining nucleotide sequence repeats
WO2014144822A3 (en) * 2013-03-15 2015-04-23 Lineage Biosciences, Inc. Methods and compositions for tagging and analyzing samples
WO2015121236A1 (en) * 2014-02-11 2015-08-20 F. Hoffmann-La Roche Ag Targeted sequencing and uid filtering
US9567646B2 (en) 2013-08-28 2017-02-14 Cellular Research, Inc. Massively parallel single cell analysis
US9708659B2 (en) 2009-12-15 2017-07-18 Cellular Research, Inc. Digital counting of individual molecules by stochastic attachment of diverse labels
US9727810B2 (en) 2015-02-27 2017-08-08 Cellular Research, Inc. Spatially addressable molecular barcoding
US9905005B2 (en) 2013-10-07 2018-02-27 Cellular Research, Inc. Methods and systems for digitally counting features on arrays
CN108410856A (en) * 2018-03-29 2018-08-17 武汉光谷创赢生物技术开发有限公司 A kind of structure of full-length cDNA synthetic method and its sequencing library
US10202641B2 (en) 2016-05-31 2019-02-12 Cellular Research, Inc. Error correction in amplification of samples
US10301677B2 (en) 2016-05-25 2019-05-28 Cellular Research, Inc. Normalization of nucleic acid libraries
US10338066B2 (en) 2016-09-26 2019-07-02 Cellular Research, Inc. Measurement of protein expression using reagents with barcoded oligonucleotide sequences
US10619186B2 (en) 2015-09-11 2020-04-14 Cellular Research, Inc. Methods and compositions for library normalization
US10640763B2 (en) 2016-05-31 2020-05-05 Cellular Research, Inc. Molecular indexing of internal sequences
US10669570B2 (en) 2017-06-05 2020-06-02 Becton, Dickinson And Company Sample indexing for single cells
US10697010B2 (en) 2015-02-19 2020-06-30 Becton, Dickinson And Company High-throughput single-cell analysis combining proteomic and genomic information
US10722880B2 (en) 2017-01-13 2020-07-28 Cellular Research, Inc. Hydrophilic coating of fluidic channels
US10822643B2 (en) 2016-05-02 2020-11-03 Cellular Research, Inc. Accurate molecular barcoding
US10941396B2 (en) 2012-02-27 2021-03-09 Becton, Dickinson And Company Compositions and kits for molecular counting
US11098360B2 (en) 2016-06-01 2021-08-24 Roche Sequencing Solutions, Inc. Immuno-PETE
US11124823B2 (en) 2015-06-01 2021-09-21 Becton, Dickinson And Company Methods for RNA quantification
US11164659B2 (en) 2016-11-08 2021-11-02 Becton, Dickinson And Company Methods for expression profile classification
US11319583B2 (en) 2017-02-01 2022-05-03 Becton, Dickinson And Company Selective amplification using blocking oligonucleotides
US11365409B2 (en) 2018-05-03 2022-06-21 Becton, Dickinson And Company Molecular barcoding on opposite transcript ends
US11371076B2 (en) 2019-01-16 2022-06-28 Becton, Dickinson And Company Polymerase chain reaction normalization through primer titration
US11390914B2 (en) 2015-04-23 2022-07-19 Becton, Dickinson And Company Methods and compositions for whole transcriptome amplification
US11397882B2 (en) 2016-05-26 2022-07-26 Becton, Dickinson And Company Molecular label counting adjustment methods
US11492660B2 (en) 2018-12-13 2022-11-08 Becton, Dickinson And Company Selective extension in single cell whole transcriptome analysis
US11535882B2 (en) 2015-03-30 2022-12-27 Becton, Dickinson And Company Methods and compositions for combinatorial barcoding
US11608497B2 (en) 2016-11-08 2023-03-21 Becton, Dickinson And Company Methods for cell label classification
US11639517B2 (en) 2018-10-01 2023-05-02 Becton, Dickinson And Company Determining 5′ transcript sequences
US11649497B2 (en) 2020-01-13 2023-05-16 Becton, Dickinson And Company Methods and compositions for quantitation of proteins and RNA
US11661631B2 (en) 2019-01-23 2023-05-30 Becton, Dickinson And Company Oligonucleotides associated with antibodies
US11661625B2 (en) 2020-05-14 2023-05-30 Becton, Dickinson And Company Primers for immune repertoire profiling
US11739443B2 (en) 2020-11-20 2023-08-29 Becton, Dickinson And Company Profiling of highly expressed and lowly expressed proteins
US11773441B2 (en) 2018-05-03 2023-10-03 Becton, Dickinson And Company High throughput multiomics sample analysis
US11773436B2 (en) 2019-11-08 2023-10-03 Becton, Dickinson And Company Using random priming to obtain full-length V(D)J information for immune repertoire sequencing
US11932901B2 (en) 2020-07-13 2024-03-19 Becton, Dickinson And Company Target enrichment using nucleic acid probes for scRNAseq
US11932849B2 (en) 2018-11-08 2024-03-19 Becton, Dickinson And Company Whole transcriptome analysis of single cells using random priming
US11939622B2 (en) 2019-07-22 2024-03-26 Becton, Dickinson And Company Single cell chromatin immunoprecipitation sequencing assay
US11946095B2 (en) 2017-12-19 2024-04-02 Becton, Dickinson And Company Particles associated with oligonucleotides
US11965208B2 (en) 2019-04-19 2024-04-23 Becton, Dickinson And Company Methods of associating phenotypical data and single cell sequencing data
US11970737B2 (en) 2019-08-26 2024-04-30 Becton, Dickinson And Company Digital counting of individual molecules by stochastic attachment of diverse labels

Families Citing this family (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DK2556171T3 (en) 2010-04-05 2015-12-14 Prognosys Biosciences Inc Spatially encoded biological assays
US20190300945A1 (en) 2010-04-05 2019-10-03 Prognosys Biosciences, Inc. Spatially Encoded Biological Assays
US10787701B2 (en) 2010-04-05 2020-09-29 Prognosys Biosciences, Inc. Spatially encoded biological assays
GB201106254D0 (en) 2011-04-13 2011-05-25 Frisen Jonas Method and product
LT3305918T (en) 2012-03-05 2020-09-25 President And Fellows Of Harvard College Methods for epigenetic sequencing
CN111662960B (en) 2013-06-25 2024-04-12 普罗格诺西斯生物科学公司 Spatially encoded bioanalytical analysis using microfluidic devices
US20150298091A1 (en) 2014-04-21 2015-10-22 President And Fellows Of Harvard College Systems and methods for barcoding nucleic acids
CN107250382A (en) 2015-02-17 2017-10-13 生物辐射实验室股份有限公司 Small nucleic acid quantification using split cycle amplification
DK3901281T3 (en) 2015-04-10 2023-01-23 Spatial Transcriptomics Ab SPATIALLY SEPARATE, MULTIPLEX NUCLEIC ACID ANALYSIS OF BIOLOGICAL SAMPLES
EP3283629A4 (en) 2015-04-17 2018-08-29 President and Fellows of Harvard College Barcoding systems and methods for gene sequencing and other applications
US11519033B2 (en) 2018-08-28 2022-12-06 10X Genomics, Inc. Method for transposase-mediated spatial tagging and analyzing genomic DNA in a biological sample
WO2020123320A2 (en) 2018-12-10 2020-06-18 10X Genomics, Inc. Imaging system hardware
US11926867B2 (en) 2019-01-06 2024-03-12 10X Genomics, Inc. Generating capture probes for spatial analysis
US11649485B2 (en) 2019-01-06 2023-05-16 10X Genomics, Inc. Generating capture probes for spatial analysis
WO2020243579A1 (en) 2019-05-30 2020-12-03 10X Genomics, Inc. Methods of detecting spatial heterogeneity of a biological sample
EP4025711A2 (en) 2019-11-08 2022-07-13 10X Genomics, Inc. Enhancing specificity of analyte binding
WO2021091611A1 (en) 2019-11-08 2021-05-14 10X Genomics, Inc. Spatially-tagged analyte capture agents for analyte multiplexing
ES2946357T3 (en) 2019-12-23 2023-07-17 10X Genomics Inc Methods for spatial analysis using RNA template ligation
US11702693B2 (en) 2020-01-21 2023-07-18 10X Genomics, Inc. Methods for printing cells and generating arrays of barcoded cells
US11732299B2 (en) 2020-01-21 2023-08-22 10X Genomics, Inc. Spatial assays with perturbed cells
US11821035B1 (en) 2020-01-29 2023-11-21 10X Genomics, Inc. Compositions and methods of making gene expression libraries
US11898205B2 (en) 2020-02-03 2024-02-13 10X Genomics, Inc. Increasing capture efficiency of spatial assays
US11732300B2 (en) 2020-02-05 2023-08-22 10X Genomics, Inc. Increasing efficiency of spatial analysis in a biological sample
US11835462B2 (en) 2020-02-11 2023-12-05 10X Genomics, Inc. Methods and compositions for partitioning a biological sample
US11891654B2 (en) 2020-02-24 2024-02-06 10X Genomics, Inc. Methods of making gene expression libraries
US11926863B1 (en) 2020-02-27 2024-03-12 10X Genomics, Inc. Solid state single cell method for analyzing fixed biological cells
US11768175B1 (en) 2020-03-04 2023-09-26 10X Genomics, Inc. Electrophoretic methods for spatial analysis
EP4242325A3 (en) 2020-04-22 2023-10-04 10X Genomics, Inc. Methods for spatial analysis using targeted rna depletion
WO2021236929A1 (en) 2020-05-22 2021-11-25 10X Genomics, Inc. Simultaneous spatio-temporal measurement of gene expression and cellular activity
WO2021237087A1 (en) 2020-05-22 2021-11-25 10X Genomics, Inc. Spatial analysis to detect sequence variants
WO2021242834A1 (en) 2020-05-26 2021-12-02 10X Genomics, Inc. Method for resetting an array
EP4025692A2 (en) 2020-06-02 2022-07-13 10X Genomics, Inc. Nucleic acid library methods
EP4158054A1 (en) 2020-06-02 2023-04-05 10X Genomics, Inc. Spatial transcriptomics for antigen-receptors
WO2021252499A1 (en) 2020-06-08 2021-12-16 10X Genomics, Inc. Methods of determining a surgical margin and methods of use thereof
EP4165207A1 (en) 2020-06-10 2023-04-19 10X Genomics, Inc. Methods for determining a location of an analyte in a biological sample
WO2021263111A1 (en) 2020-06-25 2021-12-30 10X Genomics, Inc. Spatial analysis of dna methylation
US11761038B1 (en) 2020-07-06 2023-09-19 10X Genomics, Inc. Methods for identifying a location of an RNA in a biological sample
US11926822B1 (en) 2020-09-23 2024-03-12 10X Genomics, Inc. Three-dimensional spatial analysis
US11827935B1 (en) 2020-11-19 2023-11-28 10X Genomics, Inc. Methods for spatial analysis using rolling circle amplification and detection probes
EP4121555A1 (en) 2020-12-21 2023-01-25 10X Genomics, Inc. Methods, compositions, and systems for capturing probes and/or barcodes
AU2022238446A1 (en) 2021-03-18 2023-09-07 10X Genomics, Inc. Multiplex capture of gene and protein expression from a biological sample
WO2023034489A1 (en) 2021-09-01 2023-03-09 10X Genomics, Inc. Methods, compositions, and kits for blocking a capture probe on a spatial array

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6114149A (en) * 1988-07-26 2000-09-05 Genelabs Technologies, Inc. Amplification of mixed sequence nucleic acid fragments
US20020150919A1 (en) * 2000-10-27 2002-10-17 Sherman Weismann Methods for identifying genes associated with diseases or specific phenotypes
US20030219770A1 (en) * 2001-11-08 2003-11-27 Eshleman James R. Methods and systems of nucleic acid sequencing
US20040209298A1 (en) * 2003-03-07 2004-10-21 Emmanuel Kamberov Amplification and analysis of whole genome and whole transcriptome libraries generated by a DNA polymerization process

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2264197A (en) * 1996-02-09 1997-08-28 Government Of The United States Of America, As Represented By The Secretary Of The Department Of Health And Human Services, The Restriction display (rd-pcr) of differentially expressed mrnas
EP1606417A2 (en) * 2003-03-07 2005-12-21 Rubicon Genomics Inc. In vitro dna immortalization and whole genome amplification using libraries generated from randomly fragmented dna

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MATZ, MV.: 'Amplification of representative cDNA samples from microscopic amounts of invertebrate tissue to search for new genes.' METHODS MOL. BIOL. vol. 183, 2002, pages 3 - 18 *
SCHMIDT, WM. ET AL.: 'Capselect: A highly sensitive method for 5' CAP-dependent enrichment of full-length cDNA in PCR-mediated analysis of mRNAs.' NUCLEIC ACIDS RESEARCH vol. 27, no. 21, 1999, page E31 *

Cited By (84)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9708659B2 (en) 2009-12-15 2017-07-18 Cellular Research, Inc. Digital counting of individual molecules by stochastic attachment of diverse labels
US10059991B2 (en) 2009-12-15 2018-08-28 Cellular Research, Inc. Digital counting of individual molecules by stochastic attachment of diverse labels
US10047394B2 (en) 2009-12-15 2018-08-14 Cellular Research, Inc. Digital counting of individual molecules by stochastic attachment of diverse labels
US10202646B2 (en) 2009-12-15 2019-02-12 Becton, Dickinson And Company Digital counting of individual molecules by stochastic attachment of diverse labels
US10392661B2 (en) 2009-12-15 2019-08-27 Becton, Dickinson And Company Digital counting of individual molecules by stochastic attachment of diverse labels
US9845502B2 (en) 2009-12-15 2017-12-19 Cellular Research, Inc. Digital counting of individual molecules by stochastic attachment of diverse labels
US9816137B2 (en) 2009-12-15 2017-11-14 Cellular Research, Inc. Digital counting of individual molecules by stochastic attachment of diverse labels
US10619203B2 (en) 2009-12-15 2020-04-14 Becton, Dickinson And Company Digital counting of individual molecules by stochastic attachment of diverse labels
EP2405020A1 (en) * 2010-07-07 2012-01-11 Roche Diagnostics GmbH Clonal pre-amplification in emulsion
CN102311948A (en) * 2010-07-07 2012-01-11 霍夫曼-拉罗奇有限公司 Clonal pre-amplification in emulsion
US9650629B2 (en) 2010-07-07 2017-05-16 Roche Molecular Systems, Inc. Clonal pre-amplification in emulsion
US9695466B2 (en) 2011-11-10 2017-07-04 Dname-It Methods to reduce repeats of identical nucleotides in copies of a target DNA molecule including such repeats
WO2013068528A1 (en) * 2011-11-10 2013-05-16 Harry Cuppens Methods for determining nucleotide sequence repeats
US10941396B2 (en) 2012-02-27 2021-03-09 Becton, Dickinson And Company Compositions and kits for molecular counting
US11634708B2 (en) 2012-02-27 2023-04-25 Becton, Dickinson And Company Compositions and kits for molecular counting
US10058839B2 (en) 2013-03-15 2018-08-28 Lineage Biosciences, Inc. Methods and compositions for tagging and analyzing samples
US10722858B2 (en) 2013-03-15 2020-07-28 Lineage Biosciences, Inc. Methods and compositions for tagging and analyzing samples
US11161087B2 (en) 2013-03-15 2021-11-02 Lineage Biosciences, Inc. Methods and compositions for tagging and analyzing samples
WO2014144822A3 (en) * 2013-03-15 2015-04-23 Lineage Biosciences, Inc. Methods and compositions for tagging and analyzing samples
EP3415626A1 (en) * 2013-03-15 2018-12-19 Lineage Biosciences, Inc. Methods and compositions for tagging and analyzing samples
US10927419B2 (en) 2013-08-28 2021-02-23 Becton, Dickinson And Company Massively parallel single cell analysis
US10954570B2 (en) 2013-08-28 2021-03-23 Becton, Dickinson And Company Massively parallel single cell analysis
US10151003B2 (en) 2013-08-28 2018-12-11 Cellular Research, Inc. Massively parallel single cell analysis
US11618929B2 (en) 2013-08-28 2023-04-04 Becton, Dickinson And Company Massively parallel single cell analysis
US11702706B2 (en) 2013-08-28 2023-07-18 Becton, Dickinson And Company Massively parallel single cell analysis
US9567646B2 (en) 2013-08-28 2017-02-14 Cellular Research, Inc. Massively parallel single cell analysis
US10208356B1 (en) 2013-08-28 2019-02-19 Becton, Dickinson And Company Massively parallel single cell analysis
US10253375B1 (en) 2013-08-28 2019-04-09 Becton, Dickinson And Company Massively parallel single cell analysis
US9637799B2 (en) 2013-08-28 2017-05-02 Cellular Research, Inc. Massively parallel single cell analysis
US10131958B1 (en) 2013-08-28 2018-11-20 Cellular Research, Inc. Massively parallel single cell analysis
US9598736B2 (en) 2013-08-28 2017-03-21 Cellular Research, Inc. Massively parallel single cell analysis
US9567645B2 (en) 2013-08-28 2017-02-14 Cellular Research, Inc. Massively parallel single cell analysis
US9905005B2 (en) 2013-10-07 2018-02-27 Cellular Research, Inc. Methods and systems for digitally counting features on arrays
WO2015121236A1 (en) * 2014-02-11 2015-08-20 F. Hoffmann-La Roche Ag Targeted sequencing and uid filtering
US10421999B2 (en) 2014-02-11 2019-09-24 Roche Molecular Systems, Inc. Targeted sequencing and UID filtering
US10697010B2 (en) 2015-02-19 2020-06-30 Becton, Dickinson And Company High-throughput single-cell analysis combining proteomic and genomic information
US11098358B2 (en) 2015-02-19 2021-08-24 Becton, Dickinson And Company High-throughput single-cell analysis combining proteomic and genomic information
US9727810B2 (en) 2015-02-27 2017-08-08 Cellular Research, Inc. Spatially addressable molecular barcoding
US10002316B2 (en) 2015-02-27 2018-06-19 Cellular Research, Inc. Spatially addressable molecular barcoding
USRE48913E1 (en) 2015-02-27 2022-02-01 Becton, Dickinson And Company Spatially addressable molecular barcoding
US11535882B2 (en) 2015-03-30 2022-12-27 Becton, Dickinson And Company Methods and compositions for combinatorial barcoding
US11390914B2 (en) 2015-04-23 2022-07-19 Becton, Dickinson And Company Methods and compositions for whole transcriptome amplification
US11124823B2 (en) 2015-06-01 2021-09-21 Becton, Dickinson And Company Methods for RNA quantification
US11332776B2 (en) 2015-09-11 2022-05-17 Becton, Dickinson And Company Methods and compositions for library normalization
US10619186B2 (en) 2015-09-11 2020-04-14 Cellular Research, Inc. Methods and compositions for library normalization
US10822643B2 (en) 2016-05-02 2020-11-03 Cellular Research, Inc. Accurate molecular barcoding
US10301677B2 (en) 2016-05-25 2019-05-28 Cellular Research, Inc. Normalization of nucleic acid libraries
US11845986B2 (en) 2016-05-25 2023-12-19 Becton, Dickinson And Company Normalization of nucleic acid libraries
US11397882B2 (en) 2016-05-26 2022-07-26 Becton, Dickinson And Company Molecular label counting adjustment methods
US11220685B2 (en) 2016-05-31 2022-01-11 Becton, Dickinson And Company Molecular indexing of internal sequences
US11525157B2 (en) 2016-05-31 2022-12-13 Becton, Dickinson And Company Error correction in amplification of samples
US10640763B2 (en) 2016-05-31 2020-05-05 Cellular Research, Inc. Molecular indexing of internal sequences
US10202641B2 (en) 2016-05-31 2019-02-12 Cellular Research, Inc. Error correction in amplification of samples
US11306356B2 (en) 2016-06-01 2022-04-19 Roche Sequencing Solutions, Inc. Immuno-PETE
US11773511B2 (en) 2016-06-01 2023-10-03 Roche Sequencing Solutions, Inc. Immune profiling by primer extension target enrichment
US11725307B2 (en) 2016-06-01 2023-08-15 Roche Sequencing Solutions, Inc. Immuno-PETE
US11098360B2 (en) 2016-06-01 2021-08-24 Roche Sequencing Solutions, Inc. Immuno-PETE
US10338066B2 (en) 2016-09-26 2019-07-02 Cellular Research, Inc. Measurement of protein expression using reagents with barcoded oligonucleotide sequences
US11782059B2 (en) 2016-09-26 2023-10-10 Becton, Dickinson And Company Measurement of protein expression using reagents with barcoded oligonucleotide sequences
US11460468B2 (en) 2016-09-26 2022-10-04 Becton, Dickinson And Company Measurement of protein expression using reagents with barcoded oligonucleotide sequences
US11467157B2 (en) 2016-09-26 2022-10-11 Becton, Dickinson And Company Measurement of protein expression using reagents with barcoded oligonucleotide sequences
US11164659B2 (en) 2016-11-08 2021-11-02 Becton, Dickinson And Company Methods for expression profile classification
US11608497B2 (en) 2016-11-08 2023-03-21 Becton, Dickinson And Company Methods for cell label classification
US10722880B2 (en) 2017-01-13 2020-07-28 Cellular Research, Inc. Hydrophilic coating of fluidic channels
US11319583B2 (en) 2017-02-01 2022-05-03 Becton, Dickinson And Company Selective amplification using blocking oligonucleotides
US10669570B2 (en) 2017-06-05 2020-06-02 Becton, Dickinson And Company Sample indexing for single cells
US10676779B2 (en) 2017-06-05 2020-06-09 Becton, Dickinson And Company Sample indexing for single cells
US11946095B2 (en) 2017-12-19 2024-04-02 Becton, Dickinson And Company Particles associated with oligonucleotides
CN108410856A (en) * 2018-03-29 2018-08-17 武汉光谷创赢生物技术开发有限公司 Full-length cDNA synthesis method and construction of its sequencing library
US11773441B2 (en) 2018-05-03 2023-10-03 Becton, Dickinson And Company High throughput multiomics sample analysis
US11365409B2 (en) 2018-05-03 2022-06-21 Becton, Dickinson And Company Molecular barcoding on opposite transcript ends
US11639517B2 (en) 2018-10-01 2023-05-02 Becton, Dickinson And Company Determining 5′ transcript sequences
US11932849B2 (en) 2018-11-08 2024-03-19 Becton, Dickinson And Company Whole transcriptome analysis of single cells using random priming
US11492660B2 (en) 2018-12-13 2022-11-08 Becton, Dickinson And Company Selective extension in single cell whole transcriptome analysis
US11371076B2 (en) 2019-01-16 2022-06-28 Becton, Dickinson And Company Polymerase chain reaction normalization through primer titration
US11661631B2 (en) 2019-01-23 2023-05-30 Becton, Dickinson And Company Oligonucleotides associated with antibodies
US11965208B2 (en) 2019-04-19 2024-04-23 Becton, Dickinson And Company Methods of associating phenotypical data and single cell sequencing data
US11939622B2 (en) 2019-07-22 2024-03-26 Becton, Dickinson And Company Single cell chromatin immunoprecipitation sequencing assay
US11970737B2 (en) 2019-08-26 2024-04-30 Becton, Dickinson And Company Digital counting of individual molecules by stochastic attachment of diverse labels
US11773436B2 (en) 2019-11-08 2023-10-03 Becton, Dickinson And Company Using random priming to obtain full-length V(D)J information for immune repertoire sequencing
US11649497B2 (en) 2020-01-13 2023-05-16 Becton, Dickinson And Company Methods and compositions for quantitation of proteins and RNA
US11661625B2 (en) 2020-05-14 2023-05-30 Becton, Dickinson And Company Primers for immune repertoire profiling
US11932901B2 (en) 2020-07-13 2024-03-19 Becton, Dickinson And Company Target enrichment using nucleic acid probes for scRNAseq
US11739443B2 (en) 2020-11-20 2023-08-29 Becton, Dickinson And Company Profiling of highly expressed and lowly expressed proteins

Also Published As

Publication number Publication date
WO2009148560A3 (en) 2010-03-11
US20100120097A1 (en) 2010-05-13
WO2009148560A8 (en) 2010-07-22

Similar Documents

Publication Publication Date Title
WO2009148560A2 (en) Methods and compositions for nucleic acid sequencing
CN110191961B (en) Method for preparing asymmetrically tagged sequencing library
US20170298345A1 (en) Compositions and methods for targeted nucleic acid sequence enrichment and high efficiency library generation
Takahashi et al. CAGE (cap analysis of gene expression): a protocol for the detection of promoter and transcriptional networks
US20180142290A1 (en) Blocking oligonucleotides
US20100035249A1 (en) Rna sequencing and analysis using solid support
US20120028814A1 (en) Oligonucleotide ligation, barcoding and methods and compositions for improving data quality and throughput using massively parallel sequencing
US20110039732A1 (en) cDNA Synthesis Using Non-Random Primers
AU2016268089A1 (en) Methods for next generation genome walking and related compositions and kits
EP4119679A1 (en) Polynucleotide adapter design for reduced bias
WO2011074960A1 (en) Restriction enzyme based whole genome sequencing
WO2010030683A1 (en) Methods of generating gene specific libraries
WO2013192292A1 (en) Massively-parallel multiplex locus-specific nucleic acid sequence analysis
KR20170138566A (en) Compositions and methods for constructing strand-specific cDNA libraries
JP7248228B2 (en) Methods and kits for construction of RNA libraries
US20140336058A1 (en) Method and kit for characterizing rna in a composition
US20200190565A1 (en) Methods and kits for reducing adapter-dimer formation
EP1854882A1 (en) Method for obtaining subtraction polynucleotide
EP2774998A1 (en) Improved sequence tags
WO2002103054A1 (en) Genome walking by selective amplification of nick-translate dna library and amplification from complex mixtures of templates
CN114341353A (en) Method for amplifying mRNA and preparing full-length mRNA library
CN110612355B (en) Composition for quantitative PCR amplification and application thereof
WO2021028682A1 (en) Methods for generating a population of polynucleotide molecules
JP4755973B2 (en) Method for producing DNA fragment and application thereof
CN110546275A (en) Method and kit for removing unwanted nucleic acids

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 09758724; Country of ref document: EP; Kind code of ref document: A2
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 09758724; Country of ref document: EP; Kind code of ref document: A2