WO2015199939A1 - Compositions and methods for amplifying and determining nucleic acid sequences - Google Patents

Publication number
WO2015199939A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
primers
target
nucleotide sequence
sequence
Prior art date
Application number
PCT/US2015/034239
Other languages
French (fr)
Inventor
Robert Shoemaker
Anthony P. Shuber
Original Assignee
Ignyta, Inc.
Application filed by Ignyta, Inc.
Publication of WO2015199939A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing

Definitions

  • Nucleic acid sequencing is a method for determining the exact order of nucleotides present in a given DNA or RNA molecule.
  • The use of nucleic acid sequencing has increased greatly as the ability to sequence has become accessible to research and clinical laboratories all over the world. Since completion of the first human genome sequence, demand for cheaper and faster sequencing methods has increased exponentially. This demand has driven the development of high-throughput sequencing methods, also termed next-generation sequencing (NGS). In the past decade, several NGS platforms have been developed to provide a cheaper and higher-throughput alternative to traditional Sanger sequencing methods.
  • NGS platforms are designed to perform massively parallel sequencing, in which millions of DNA fragments from a single sample are sequenced simultaneously. Such massively parallel sequencing technology facilitates high-throughput sequencing, which can currently allow an entire genome to be sequenced in less than one day.
  • Another important application of NGS is RNA-sequencing that can provide information on the entire transcriptome of a sample in a single analysis without requiring previous knowledge of the genetic sequence of an organism. This technique offers a strong alternative to the use of microarrays in gene expression studies.
  • NGS platforms such as the Life Technologies Ion Torrent Personal Genome Machine (PGM™), the Applied Biosystems SOLiD™ Sequencer, the Illumina MiSeq®, and the Roche/454 FLX Pyrosequencer have made high-throughput sequencing accessible to more laboratories, rapidly increasing the amount of research and clinical diagnostics being performed with nucleic acid sequencing.
  • Next-generation sequencing technologies and systems, although much less costly in time and money than first-generation sequencing, are still too expensive for many laboratories.
  • NGS can cost more than 100,000 USD in start-up costs, and an individual sequencing reaction can cost upward of 1,000 USD per genome.
  • A large part of this cost has often been attributed to the template preparation step, which consists of building a library of nucleic acids (DNA or complementary DNA, i.e. cDNA), attaching platform-specific adaptor sequences to the library, and amplifying that library.
  • NGS platforms use slightly different technologies for sequencing, such as pyrosequencing, sequencing by synthesis or sequencing by ligation.
  • most platforms adhere to a common library preparation procedure, with minor modifications, to generate adapter-ligated fragment libraries before a 'run' on a selected sequencing instrument.
  • This procedure typically includes fragmenting the DNA (sonication, nebulization or shearing), followed by DNA repair and end polishing (blunt end or A-overhang) and, finally, platform-specific adaptor ligation.
  • Most NGS sample preparation workflows involve a ligation step in which adaptor sequences, typically synthetic oligonucleotides of known sequence, are added to the ends of the input nucleic acids prior to amplification. Because ligations are time-consuming and inefficient, this process typically results in considerable sample loss with limited throughput.
  • Alternatives to ligation include transposon-based methods for preparing fragmented and tagged DNA libraries (Illumina/Epicentre Nextera™), and amplicon sequencing, in which primers contain 5' end sequence tails that incorporate adaptor sequences in subsequent polymerase chain reaction (PCR) steps.
  • Transposase-based methods have been reported to often create more biased libraries due to non-random transpositional insertions, and have not been tested on degraded nucleic acid populations, such as nucleic acid samples derived from formalin-fixed paraffin embedded (FFPE) biomaterials such as human tissues and cells.
  • Amplicon sequencing methods generate only targeted (i.e. non-random) libraries, which are generally not suitable for many sequencing applications such as whole-genome, transcriptome, or epigenome sequencing.
  • the present application discloses compositions and methods useful for the construction of non-targeted libraries without involving a ligation step.
  • The ligation-free methods of amplifying and sequencing nucleic acids according to this disclosure allow one to avoid ligation without resorting to transposase- or amplicon-based NGS workflows.
  • the methods can include (a) providing a single reaction mixture including (i) a nucleic acid template including a plurality of nucleic acid molecules; (ii) a first set of primers; and (iii) a second set of primers; (b) subjecting the single reaction mixture to a first nucleic acid amplification such that the primers of the first set of primers anneal to at least one nucleic acid molecule of the nucleic template to produce a first amplified nucleic acid product; and (c) subjecting the first amplified nucleic acid product of step (b) to a second nucleic acid amplification such that the primers of the second set of primers anneal to at least one nucleic acid molecule of the first amplified nucleic acid product to produce a second amplified nucleic acid product.
  • the first set of primers can include individual primers each including a tag nucleotide sequence at its 5' portion and a random nucleotide sequence at its 3' portion
  • the second set of primers can include individual primers each including a tail-adaptor sequence at its 5' portion and a nucleotide sequence at its 3' portion that is substantially complementary to the 5' tag nucleotide sequence of the first set of primers.
  • methods for amplifying nucleic acid sequences can include (a) providing a single reaction mixture including (i) a nucleic acid template including a plurality of nucleic acid molecules; (ii) a first set of primers wherein the first set of primers comprises individual primers each comprising a tail-adaptor sequence at its 5' portion, a tag nucleotide sequence located at its central portion, and a random nucleotide sequence at its 3' portion; (b) subjecting the single reaction mixture to a first nucleic acid amplification such that the primers of the first set of primers anneal to at least one nucleic acid molecule of the nucleic template to produce a first amplified nucleic acid product.
  • the methods can further include additional steps of nucleic acid amplification in the presence of one or more target-specific primers, wherein the target-specific primers can be used for single and/or nested PCR amplifications.
  • The nucleic acid template of the methods disclosed herein can include at least one RNA molecule, at least one single-stranded DNA (ssDNA) molecule, at least one double-stranded DNA (dsDNA) molecule, or a combination of any thereof.
  • the nucleic acid template can include at least one RNA molecule.
  • the at least one RNA molecule is subjected to a reverse transcription regimen to produce a cDNA product prior to being subjected to step (a).
  • At least one of the nucleic acid amplification steps of the methods disclosed herein can be repeated one or more times.
  • each of the nucleic acid amplifications includes the steps of (i) denaturing the nucleic acid molecules, (ii) annealing the primers with the denatured nucleic acids to allow the formation of primer-nucleic acid hybrids, and (iii) incubating the primer-nucleic acid hybrids to allow a heat-stable nucleic acid replicating enzyme to synthesize the corresponding nucleic acid product.
  • the nucleic acid molecules comprised in the nucleic template are subjected to a targeted sequence enrichment procedure prior to being subject to step (a).
  • A barcode portion is further attached to (a) the tail-adaptor sequence, (b) the tag nucleotide sequence, (c) the first target specific primer, (d) the second target specific primer, or (e) a combination of any of the foregoing (a)-(d).
  • the methods disclosed herein can further include sequencing the amplified portion of the amplified nucleic acid products.
  • the sequencing is performed by a next generation sequencing (NGS) procedure.
  • NGS procedure can be selected from the group consisting of pyrosequencing, sequencing by synthesis, and sequencing by ligation.
  • the methods according to any one of the preceding aspects and embodiments exclude the ligation of a tail-adaptor sequence to any one of the primers.
  • FIGURE 1 is a flow diagram illustrating a non-limiting example of a method of amplifying a nucleic acid template by using two sets of primers. This method can optionally include additional rounds of target-specific amplifications, which can be single or nested PCRs.
  • FIGURE 2 is a flow diagram illustrating another non-limiting example of a method of amplifying a nucleic acid template by using a single set of primers.
  • Each of the individual primers can include a tail-adaptor sequence at its 5' portion, a tag nucleotide sequence located at its central portion, and a random nucleotide sequence at its 3' portion.
  • This method can optionally include additional rounds of target-specific amplifications, which can be single or nested PCRs.
  • FIGURE 3 is a flow diagram that illustrates a non-limiting example of a ligation-free method of preparing nucleic acid samples for next generation sequencing (NGS), via amplification of input nucleic acid molecules, followed by construction of sequencing libraries, high-throughput sequencing, and sequence data analysis.
  • FIGURE 4 is a flow diagram illustrating another non-limiting example of a ligation-free method for amplifying input nucleic acid molecules derived from formalin-fixed paraffin embedded (FFPE) biomaterials such as tissues and cells.
  • the amplified nucleic acid product is further subjected to additional rounds of targeted PCR amplification by using target-specific primers, which can be single or nested PCR amplifications.
  • FIGURE 5 is a flow diagram illustrating another non-limiting example of a method for amplifying input nucleic acid molecules where the input nucleic acids are subjected to a targeted sequence enrichment procedure prior to being subjected to the first nucleic acid amplification.
  • the present disclosure generally describes compositions and methods for amplifying and/or determining nucleic acid sequences, and particularly to methods for amplifying any stretch of nucleic acid sequences in a sequence-independent manner.
  • the disclosure further relates to ligation-free methods for determining nucleic acid sequences, e.g. by enriching target nucleic acid sequences via random-priming nucleic acid synthesis, optionally followed by gene-specific nested PCR amplifications, prior to sequencing the sequences.
  • FIGURE 1 is a flow diagram illustrating one non-limiting example of a method for amplifying nucleic acid sequences in accordance with at least some examples of the present disclosure. As illustrated in FIGURE 1, method 100 can include one or more functions, operations, or actions as illustrated by one or more of operations 110-130.
  • Method 100 can begin at operation 110, "Providing a single reaction mixture including (i) a nucleic acid template including a plurality of nucleic acid molecules; (ii) a first set of primers; and (iii) a second set of primers.” Operation 110 can be followed by operation 120, "Subjecting the single reaction mixture to a first nucleic acid amplification such that the primers of the first set of primers anneal to at least one nucleic acid molecule of the nucleic template to produce a first amplified nucleic acid product.” Operation 120 can be followed by operation 130, "Subjecting the first amplified nucleic acid product to a second nucleic acid amplification such that the primers of the second set of primers anneal to at least one nucleic acid molecule of the first amplified nucleic acid product to produce a second amplified nucleic acid product.”
  • operations 110-130 are illustrated as being performed sequentially with operation 110 first and operation 130 last. It will be appreciated however that these operations can be reordered, combined, and/or divided into additional or different operations as appropriate to suit particular embodiments. For example, as described in further detail below, additional operations can be added before, during or after one or more of operations 110-130. In some embodiments, one or more of the operations can be performed at about the same time.
  • the first set of primers can include individual primers each including a tag nucleotide sequence at its 5' portion and a random nucleotide sequence at its 3' portion.
  • the second set of primers can include individual primers each including a tail-adaptor sequence at its 5' portion and a nucleotide sequence at its 3' portion that is substantially complementary to the 5' tag nucleotide sequence of the first set of primers.
  • the tag nucleotide sequence can further comprise a barcode portion.
  • The tail-adaptor sequence of the primers of the second set of primers can further comprise a barcode portion.
  • A "portion" of a nucleic acid molecule refers to a contiguous set of nucleotides comprised by that molecule. A portion can comprise all or only a subset of the nucleotides comprised by the molecule. A portion can be double-stranded or single-stranded.
  • the 5' tag nucleotide sequence portion comprises a nucleic acid sequence of high GC content which is not substantially complementary to or substantially identical to any other portion of any of the primers.
  • each of the primers can include a unique tag nucleotide sequence portion.
  • The tail-adaptor sequence of the primers of the second set of primers can be a universal tail-adaptor sequence.
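The primer layout described for method 100 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tag and tail-adaptor sequences are hypothetical placeholders, the function names are invented for this sketch, and the 3' portion of the second-set primer is modeled as the exact reverse complement of the tag, which is a literal reading of "substantially complementary" rather than a design taken from the disclosure.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C (useful for checking a high-GC tag)."""
    return sum(base in "GC" for base in seq) / len(seq)


def first_set_primer(tag: str, random_len: int, rng: random.Random) -> str:
    """Tag at the 5' portion, random bases at the 3' portion (written 5'->3')."""
    random_part = "".join(rng.choice("ACGT") for _ in range(random_len))
    return tag + random_part


def second_set_primer(tail_adaptor: str, tag: str) -> str:
    """Tail-adaptor at the 5' portion, then a 3' portion complementary to the tag."""
    return tail_adaptor + reverse_complement(tag)


# Hypothetical placeholder sequences, not taken from the source.
TAG = "GCCGGACGCGCC"
TAIL_ADAPTOR = "ACTGACTGACTGACTG"

rng = random.Random(0)  # seeded so the sketch is reproducible
primer_1 = first_set_primer(TAG, random_len=6, rng=rng)
primer_2 = second_set_primer(TAIL_ADAPTOR, TAG)
print(primer_1, primer_2, f"tag GC fraction = {gc_fraction(TAG):.2f}")
```

A barcode portion, where used, could be modeled the same way by concatenating one more string into either primer.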
  • the nucleic acid template at operation 110 of the method disclosed herein can include at least one RNA molecule, at least one single-stranded DNA (ssDNA) molecule, at least one double-stranded DNA (dsDNA) molecule, or a combination of any thereof.
  • the nucleic acid template at operation 110 can include at least one RNA molecule.
  • the at least one RNA molecule is subjected to a reverse transcription regimen to produce a cDNA product prior to being subjected to operation 110.
  • the input nucleic acid molecules comprised in the nucleic template at operation 110 are subjected to a targeted sequence enrichment procedure prior to being subject to operation 110.
  • The input nucleic acid molecules comprised in the nucleic template are further enriched by using the TruSeq™ Exome Enrichment Kit (Illumina) prior to being subject to operation 110.
  • the method 100 can further include a step of subjecting the second amplified nucleic acid product, i.e. after operation 130, to a third nucleic acid amplification in the presence of a first target-specific primer to produce a third amplified nucleic acid product, wherein the first target-specific primer includes a nucleotide sequence that can specifically anneal to a known nucleotide sequence of a target nucleic acid molecule.
  • the method 100 can further include a step of subjecting the third amplified nucleic acid product described above to a fourth nucleic acid amplification in the presence of a second target-specific primer to produce a fourth amplified nucleic acid product, wherein the second target-specific primer includes a nucleotide sequence that can specifically anneal to a portion of the known target nucleic acid sequence which is nested with respect to the first target-specific primer.
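The relationship between the first and second target-specific primers can be illustrated with a toy check. The sketch below assumes a deliberately simplified model that is not part of the disclosure: both primers are written as exact substrings of one strand of the known target, and "nested" is taken to mean that the second primer anneals downstream of the first primer's start, inside the product the first primer would generate. The sequences and function names are hypothetical.

```python
def primer_site(target: str, primer: str) -> int:
    """Index of the first exact match of primer in target, or -1 if absent."""
    return target.find(primer)


def is_nested(target: str, outer: str, inner: str) -> bool:
    """True if inner anneals inside the region primed by outer (toy model)."""
    outer_pos = primer_site(target, outer)
    inner_pos = primer_site(target, inner)
    if outer_pos < 0 or inner_pos < 0:
        return False  # a primer with no site on the target cannot be nested
    # In this one-sided model the outer product runs from outer_pos to the
    # end of the target, so the inner primer must start after outer_pos.
    return inner_pos > outer_pos


# Hypothetical target and primers, for illustration only.
target = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
outer_primer = "GCCATTGTAATG"
inner_primer = "GGGCCGCTGAAA"

print(is_nested(target, outer_primer, inner_primer))
```

Real primer design would also account for mismatches, melting temperature, and strand orientation, none of which this sketch attempts.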
  • FIGURE 2 is a flow diagram illustrating one non-limiting example of a method for amplifying nucleic acid sequences in accordance with at least some examples of the present disclosure. As illustrated in FIGURE 2, method 200 can include one or more functions, operations, or actions as illustrated by one or more of operations 210-220.
  • Method 200 can begin at operation 210, "Providing a single reaction mixture including (i) a nucleic acid template including a plurality of nucleic acid molecules; and (ii) a first set of primers.” Operation 210 can be followed by operation 220, "Subjecting the single reaction mixture to a first nucleic acid amplification such that the primers of the first set of primers anneal to at least one nucleic acid molecule of the nucleic template to produce a first amplified nucleic acid product.”
  • operations 210-220 are illustrated as being performed sequentially with operation 210 first and operation 220 last. It will be appreciated however that these operations can be combined, and/or divided into additional or different operations as appropriate to suit particular embodiments. For example, as described in further detail below, additional operations can be added before, during or after one or more of operations 210-220. In some embodiments, one or more of the operations can be performed at about the same time.
  • the first set of primers can include individual primers each including a tail-adaptor sequence at its 5' portion, a tag nucleotide sequence located at its central portion, and a random nucleotide sequence at its 3' portion.
  • the tail-adaptor sequence of the primers of the first set of primers can further comprise a barcode portion.
  • the tag nucleotide sequence portion comprises a nucleic acid sequence of high GC content which is not substantially complementary to or substantially identical to any other portion of any of the primers.
  • each of the primers can include a unique tag nucleotide sequence portion.
  • The tail-adaptor sequence of the primers of the first set of primers can be a universal tail-adaptor sequence.
  • the nucleic acid template at operation 210 of the method disclosed herein can include at least one RNA molecule, at least one single-stranded DNA (ssDNA) molecule, at least one double-stranded DNA (dsDNA) molecule, or a combination of any thereof.
  • the nucleic acid template at operation 210 can include at least one RNA molecule.
  • the at least one RNA molecule is subjected to a reverse transcription regimen to produce a cDNA product prior to being subjected to operation 210.
  • the nucleic acid molecules comprised in the nucleic template at operation 210 are subjected to a targeted sequence enrichment procedure prior to being subject to operation 210.
  • The nucleic acid molecules comprised in the nucleic template are further enriched by using the TruSeq™ Exome Enrichment Kit (Illumina) prior to being subject to operation 210.
  • At least one of the first, second, third, and fourth nucleic acid amplification steps of the methods 100 and 200 disclosed herein can be repeated at least one time.
  • each of the nucleic acid amplifications of methods 100 and 200 disclosed herein can include the steps of (i) denaturing the nucleic acid molecules, (ii) annealing the primers with the denatured nucleic acids to allow the formation of primer-nucleic acid hybrids, and (iii) incubating the primer-nucleic acid hybrids to allow a heat-stable nucleic acid replicating enzyme to synthesize the corresponding nucleic acid product.
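Repeating the denature, anneal, and extend steps multiplies the product. As a rough guide, the textbook model for the exponential phase of a conventional two-primer PCR gives N = N0 × (1 + E)^n copies after n cycles at per-cycle efficiency E. The sketch below implements that general model only; it is not a yield claim from the disclosure, the function name is invented here, and a random-primed first round need not follow it.

```python
def amplified_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of cycles.

    efficiency is the fraction of templates copied per cycle:
    1.0 means perfect doubling, 0.0 means no amplification.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Compare perfect doubling with a lower, assumed efficiency of 0.9.
for n in (0, 5, 10):
    print(n, amplified_copies(100, n), amplified_copies(100, n, efficiency=0.9))
```

Late cycles plateau in practice as primers and nucleotides are consumed, so the model overestimates yield for large n.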
  • nucleic acid replicating enzyme refers to an enzyme that catalyzes the template-dependent polymerization of nucleoside triphosphates to form primer extension products that are complementary to the template nucleic acid sequence.
  • a nucleic acid replicating enzyme typically initiates synthesis at the 3' end of an annealed primer and proceeds in the direction toward the 5' end of the template.
  • Numerous nucleic acid replicating enzymes are known in the art and commercially available.
  • One group of preferred nucleic acid replicating enzymes are thermostable, i.e., they retain function after being subjected to temperatures sufficient to denature annealed strands of complementary nucleic acids, e.g. 72°C, 75°C, 80°C, 85°C, 90°C, or 94°C, or sometimes higher.
  • The heat-stable nucleic acid replicating enzyme is selected from the group consisting of Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase, Avian Myeloblastosis Virus (AMV) reverse transcriptase, Superscript® III reverse transcriptase, Omniscript® reverse transcriptase, M-MuLV reverse transcriptase, SMARTScribe™ reverse transcriptase, HiFi polymerase, Phusion® polymerase, Taq polymerase, TaqPlus DNA polymerase, Tub DNA polymerase, Pfu polymerase, and Vent® DNA polymerase.
  • the first target specific primer and/or the second target specific primer can further comprise a barcode portion.
  • each of the primers can include a unique barcode portion.
  • Each of the resulting amplified nucleic acid products will therefore comprise a barcode identifying which sample comprised the original template nucleic acid from which the amplification product is derived. The use of barcode portions in next-generation sequencing applications is well known in the art.
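How a barcode identifies the sample of origin can be shown with a minimal demultiplexing sketch. It assumes, purely for illustration, that every read begins with its barcode, that all barcodes have the same length, and that only exact matches are assigned; the barcode sequences and sample names are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group reads by sample using an exact-match barcode at the read start."""
    barcode_len = len(next(iter(barcode_to_sample)))
    by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        # Reads whose prefix is not a known barcode are kept aside, not dropped.
        sample = barcode_to_sample.get(barcode, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical barcodes and reads.
barcodes = {"ACGT": "sample_A", "TGCA": "sample_B"}
reads = ["ACGTGGGTTTAAA", "TGCACCCGGGTTT", "ACGTTTTTCCCCC", "GGGGAAAACCCCT"]

for sample, inserts in demultiplex(reads, barcodes).items():
    print(sample, inserts)
```

Production pipelines usually tolerate one or two barcode mismatches, which this sketch omits for brevity.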
  • the methods 100 and 200 disclosed herein can further include sequencing the amplified portion of the first, second, third, and/or fourth amplified nucleic acid product using a first and a second sequencing primer.
  • the sequencing is performed by a next generation sequencing (NGS) procedure.
  • next-generation sequencing refers to oligonucleotide sequencing technologies that have the capacity to sequence oligonucleotides at speeds above those possible with conventional sequencing methods (e.g. Sanger sequencing), due to performing and reading out thousands to millions of sequencing reactions in parallel.
  • Non-limiting examples of next-generation sequencing methods/platforms include Massively Parallel Signature Sequencing (Lynx Therapeutics); solid-phase, reversible dye-terminator sequencing (Solexa/Illumina); DNA nanoball sequencing (Complete Genomics); SOLiD technology (Applied Biosystems); 454 pyrosequencing (454 Life Sciences/Roche Diagnostics); ion semiconductor sequencing (ION Torrent); and technologies available from Pacific Biosciences, Intelligen Bio-systems, Oxford Nanopore Technologies, and Helicos Biosciences.
  • the NGS procedure used in the methods disclosed herein can comprise pyrosequencing, sequencing by synthesis, sequencing by ligation, or a combination of any thereof.
  • The NGS procedure is performed by an NGS platform selected from the group consisting of Illumina, Ion Torrent, Qiagen, Invitrogen, Applied Biosystems, Helicos, Oxford Nanopore, Pacific Biosciences, and Complete Genomics.
  • the input nucleic acids can be sheared, e.g. mechanically or enzymatically sheared, to generate fragments of any desired size prior to the first nucleic acid amplification step.
  • Non-limiting examples of mechanical shearing processes include sonication, nebulization, and AFA™ shearing technology available from Covaris (Woburn, Mass.).
  • input nucleic acids can be mechanically sheared by sonication.
  • When the input nucleic acids include RNA molecules, the sample can be subjected to a reverse transcriptase regimen to generate a DNA template, and the DNA template can then be sheared.
  • input RNA molecules can be sheared before performing the reverse transcriptase regimen.
  • The methods used to extract nucleic acids from specimens and biomaterials, e.g. a tissue sample, can include one or more of the following features: total nucleic acid extraction; genomic DNA removal for cDNA sequencing; ribosomal RNA depletion for cDNA sequencing; mechanical or enzymatic shearing in any of the steps; double-stranded cDNA synthesis using random hexamers; nucleic acid end-repair, phosphorylation, and adenylation, etc.
  • the methods according to any one of the preceding aspects and embodiments exclude ligating a tail-adaptor sequence to any one of the primers, particularly to the primers of said first and/or second set of primers.
  • This Example illustrates a non-limiting example of a ligation-free method of preparing nucleic acid samples for next generation sequencing (NGS), via amplification of input nucleic acid molecules in accordance with some embodiments of the methods disclosed herein.
  • Input nucleic acids are subjected to the first round of nucleic acid amplification to produce a first amplified nucleic acid product by using a first set of primers (Integrated DNA Technologies).
  • Each of the individual primers includes a tag nucleotide sequence at its 5' portion and a random nucleotide sequence at its 3' portion.
  • this first round of nucleic acid amplification is performed in one or multiple cycles.
  • The first amplified nucleic acid product is then subjected to a second round of nucleic acid amplification to produce a second amplified nucleic acid product by using a second set of custom primers, which are manufactured by Integrated DNA Technologies with custom indexes.
  • Each individual primer includes a tail-adaptor sequence at its 5' portion and a nucleotide sequence at its 3' portion that is substantially complementary to the 5' tag nucleotide sequence of the first set of primers.
  • the second amplified nucleic acid product is then used for the construction of sequencing libraries, sequencing, and sequence data analysis.
  • This Example illustrates a non-limiting example of a ligation-free method for amplifying input nucleic acid molecules that are derived from formalin-fixed paraffin embedded (FFPE) biomaterials such as tissues and cells.
  • The amplified nucleic acid product is further subjected to additional rounds of targeted PCR amplification by using target-specific primers, which can be single or nested PCR amplifications.
  • the input nucleic acids are comprised in one or more samples obtained from a subject.
  • the samples are diagnostic samples obtained from human subjects, and therefore can include proteins, cells, fluids, biological fluids, preservatives, and/or other substances.
  • the samples are cheek swab, cerebrospinal fluid, urine, tears, alveolar isolates, blood, serum, plasma, pericardial fluid, cyst fluid, tumor tissue, sputum, pleural fluid, saliva, and/or aspirate.
  • the samples contained a population of tumor cells (FIGURE 4).
  • The input nucleic acids are extracted from various solid or liquid tumor samples by using Qiagen™ or Formapure™ nucleic acid extraction kits, according to the manufacturer's instructions.
  • Input nucleic acids are subsequently subjected to the first round of nucleic acid amplification to produce a first amplified nucleic acid product by using a first set of primers (Integrated DNA Technologies).
  • each of the individual primers includes a tag nucleotide sequence at its 5' portion and a random nucleotide sequence at its 3' portion.
  • The first round of nucleic acid amplification is performed in one or multiple cycles.
  • the first amplified nucleic acid product is then exposed to target specific primers (Integrated DNA Technologies) in a second round of nucleic acid amplification, such that template nucleic acid molecules that hybridize to the target specific primers are amplified to produce a second amplified nucleic acid product.
  • The second round of nucleic acid amplification is performed in one or multiple cycles. In some experiments, single PCR reactions are performed. Alternatively or in addition, a nested PCR procedure is used in some other experiments.
  • the second amplified nucleic acid product is further amplified by using a set of custom primers, which are manufactured by Integrated DNA Technologies with custom indexes. Each individual primer includes a tail-adaptor sequence at its 5' portion and a nucleotide sequence at its 3' portion that is substantially complementary to the 5' tag nucleotide sequence of the first set of primers.
  • The second amplified nucleic acid product is then used for the construction of sequencing libraries and NGS sequencing. Sequence data from NGS sequencing is then assembled, annotated, and used for further data mining to identify fusions, single-nucleotide polymorphisms (SNPs), and splice variants, as well as for various types of gene expression analyses (e.g. differential expression or tissue-specific expression, etc.).
  • Targeted sequence enrichment combined with ligation-free NGS sample preparation. This Example illustrates a non-limiting exemplary workflow in which a targeted sequence enrichment procedure is deployed in combination with a ligation-free method for amplifying nucleic acid molecules disclosed herein.
  • The input nucleic acids are first subjected to a targeted sequence enrichment process by using a TruSeq™ Exome Enrichment Kit (Illumina) according to the manufacturer's instructions. Briefly, input nucleic acids are denatured into single-stranded DNA and then hybridized to biotin-labeled probes specific to one or more targeted regions. The nucleic acid sample is then enriched for the desired regions by adding streptavidin beads that will bind to the biotinylated probes. Biotinylated DNA fragments bound to the streptavidin beads are magnetically pulled down from the solution. The enriched DNA fragments are then eluted from the beads and hybridized for a second enrichment reaction. After amplification, a targeted enriched nucleic acid sample is ready for further processing.
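The hybridize-and-pull-down logic of this enrichment step can be mimicked with a toy filter. The sketch below is an assumption-laden illustration, not the kit's protocol: hybridization is reduced to an exact reverse-complement match between a probe and a single-stranded fragment, and the probe and fragment sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def enrich(fragments, probes):
    """Keep fragments containing a site complementary to at least one probe.

    Models one capture round: probe-bound fragments are retained,
    everything else is washed away.
    """
    sites = [reverse_complement(probe) for probe in probes]
    return [frag for frag in fragments if any(site in frag for site in sites)]


# Hypothetical probes and single-stranded fragments.
probes = ["ACGTTG"]
fragments = ["TTCAACGTAA", "GGGGGGGGGG", "CAACGTCAACGT"]

captured = enrich(fragments, probes)
second_round = enrich(captured, probes)  # the text describes a second enrichment
print(captured, second_round)
```

With exact matching the second round returns the same fragments; in the laboratory a second hybridization mainly reduces off-target carry-over, which this model does not represent.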
  • Enriched nucleic acid samples prepared as described above are subsequently subjected to the first round of nucleic acid amplification to produce a first amplified nucleic acid product by using a first set of primers (Integrated DNA Technologies).
  • Each of the individual primers includes a tag nucleotide sequence at its 5' portion and a random nucleotide sequence at its 3' portion.
  • the first round of nucleic acid amplification is performed in one or multiple cycles.
  • the first amplified nucleic acid product is then subjected to a second round of nucleic acid amplification to produce a second amplified nucleic acid product by using a second set of custom primers, which are manufactured by Integrated DNA Technologies with custom indexes.
  • Each individual primer includes a tail-adaptor sequence at its 5' portion and a nucleotide sequence at its 3' portion that is substantially complementary to the 5' tag nucleotide sequence of the first set of primers.
  • the second amplified nucleic acid product is then used for the construction of sequencing libraries, sequencing, and sequence data analysis.

Abstract

The present disclosure generally relates to compositions and methods for amplifying and determining nucleic acid sequences, and particularly to a method for amplifying any stretch of nucleic acid sequences in a sequence-independent manner. The disclosure further relates to ligation-free methods for determining nucleic acid sequences, e.g. by enriching target nucleic acid sequences via random-priming nucleic acid synthesis, optionally followed by gene-specific nested PCR amplifications, prior to sequencing the sequences.

Description

COMPOSITIONS AND METHODS FOR AMPLIFYING AND
DETERMINING NUCLEIC ACID SEQUENCES
RELATED APPLICATIONS
[0001] The present application claims priority to U.S. Provisional Patent Application Serial No. 62/017,734, filed on June 26, 2014, the content of which is hereby expressly incorporated by reference in its entirety.
BACKGROUND
[0002] Unless otherwise indicated herein, the materials described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
[0003] Nucleic acid sequencing is a method for determining the exact order of nucleotides present in a given DNA or RNA molecule. In the past decade, the use of nucleic acid sequencing has increased greatly as the ability to sequence has become accessible to research and clinical laboratories all over the world. Since completion of the first human genome sequence, demand for cheaper and faster sequencing methods has increased exponentially. This demand has driven the development of high-throughput sequencing methods, which are also termed next-generation sequencing (NGS). Additionally, in the past decade, several NGS platforms have been developed to provide a cheaper and higher-throughput alternative to traditional Sanger sequencing methods.
[0004] Generally, NGS platforms are designed to perform massively parallel sequencing, during which millions of fragments of DNA from a single sample are sequenced simultaneously. Such massively parallel sequencing technology facilitates high-throughput sequencing, which can currently allow an entire genome to be sequenced in less than one day. Another important application of NGS is RNA sequencing, which can provide information on the entire transcriptome of a sample in a single analysis without requiring previous knowledge of the genetic sequence of an organism. This technique offers a strong alternative to the use of microarrays in gene expression studies. Furthermore, the creation of commercial NGS platforms, such as Life Technologies Ion Torrent Personal Genome Machine (PGM™), Applied Biosystems SOLiD™ Sequencer, the Illumina MiSeq®, and the Roche/454 FLX Pyrosequencer, has made high-throughput sequencing accessible to more laboratories, rapidly increasing the amount of research and clinical diagnostics being performed with nucleic acid sequencing.
[0005] At present, next generation sequencing technologies and systems, although much less costly in time and money than first-generation sequencing, are still too expensive for many laboratories. In fact, NGS can cost more than 100,000 USD in start-up costs, and an individual sequencing reaction can cost upward of 1,000 USD per genome. A large part of this cost has often been attributed to the template preparation step, which consists of building a library of nucleic acids (DNA or complementary DNA, i.e. cDNA), attaching platform-specific adaptor sequences to the library, and amplifying that library.
[0006] At present, NGS platforms use slightly different technologies for sequencing, such as pyrosequencing, sequencing by synthesis or sequencing by ligation. However, most platforms adhere to a common library preparation procedure, with minor modifications, to generate adapter-ligated fragment libraries before a 'run' on a selected sequencing instrument. This procedure typically includes fragmenting the DNA (sonication, nebulization or shearing), followed by DNA repair and end polishing (blunt end or A-overhang) and, finally, platform-specific adaptor ligation. As such, most NGS sample preparation procedures involve a ligation step in which adaptor sequences, which are typically synthetic oligonucleotides of a known sequence, are added to the ends of the input nucleic acids prior to amplification. Because ligations are time-consuming and inefficient, this process typically results in considerable sample loss with limited throughput.
[0007] To streamline the NGS workflow, increase throughput and reduce sample loss, a number of alternative procedures have been developed, including transposon-based methods for preparing fragmented and tagged DNA libraries (Illumina/Epicentre Nextera™), or amplicon sequencing where primers contain 5' end sequence tails that incorporate adaptor sequences in subsequent polymerase chain reaction (PCR) steps. However, transposase-based methods have been reported to often create more biased libraries due to non-random transpositional insertions, and have not been tested on degraded nucleic acid populations, such as nucleic acid samples derived from formalin-fixed paraffin embedded (FFPE) biomaterials such as human tissues and cells. Similarly, amplicon sequencing methods generate only targeted (i.e. non-random) libraries, which are generally not suitable for many sequencing applications such as whole-genome, transcriptome, or epigenome sequencing.
[0008] Thus, there is a need for alternative methods for preparing NGS nucleic acid samples that allow one to exclude these slow and ineffective ligation steps.
[0009] In one aspect, the present application discloses compositions and methods useful for the construction of non-targeted libraries without involving a ligation step. The ligation-free methods of amplifying and sequencing nucleic acids according to this disclosure allow one to avoid ligation without resorting to transposase- or amplicon-based NGS workflows.
SUMMARY
[0010] In one aspect, disclosed herein are methods for amplifying nucleic acid sequences. The methods can include (a) providing a single reaction mixture including (i) a nucleic acid template including a plurality of nucleic acid molecules; (ii) a first set of primers; and (iii) a second set of primers; (b) subjecting the single reaction mixture to a first nucleic acid amplification such that the primers of the first set of primers anneal to at least one nucleic acid molecule of the nucleic template to produce a first amplified nucleic acid product; and (c) subjecting the first amplified nucleic acid product of step (b) to a second nucleic acid amplification such that the primers of the second set of primers anneal to at least one nucleic acid molecule of the first amplified nucleic acid product to produce a second amplified nucleic acid product. In such methods, the first set of primers can include individual primers each including a tag nucleotide sequence at its 5' portion and a random nucleotide sequence at its 3' portion, and the second set of primers can include individual primers each including a tail-adaptor sequence at its 5' portion and a nucleotide sequence at its 3' portion that is substantially complementary to the 5' tag nucleotide sequence of the first set of primers. 
[0011] In another aspect, disclosed herein are methods for amplifying nucleic acid sequences that can include (a) providing a single reaction mixture including (i) a nucleic acid template including a plurality of nucleic acid molecules; (ii) a first set of primers wherein the first set of primers comprises individual primers each comprising a tail-adaptor sequence at its 5' portion, a tag nucleotide sequence located at its central portion, and a random nucleotide sequence at its 3' portion; (b) subjecting the single reaction mixture to a first nucleic acid amplification such that the primers of the first set of primers anneal to at least one nucleic acid molecule of the nucleic template to produce a first amplified nucleic acid product.
[0012] In some embodiments of the above and other aspects of the present disclosure, the methods can further include additional steps of nucleic acid amplification in the presence of one or more target-specific primers, wherein the target-specific primers can be used for single and/or nested PCR amplifications.
[0013] In some embodiments, the nucleic acid template of the methods disclosed herein can include at least one RNA molecule, at least one single-stranded DNA (ssDNA) molecule, at least one double-stranded DNA (dsDNA) molecule, or a combination of any thereof. In some embodiments, the nucleic acid template can include at least one RNA molecule. In some embodiments, the at least one RNA molecule is subjected to a reverse transcription regimen to produce a cDNA product prior to being subjected to step (a).
[0014] In some embodiments, at least one of the nucleic acid amplification steps of the methods disclosed herein can be repeated at least one or more times.
[0015] In some embodiments, each of the nucleic acid amplifications includes the steps of (i) denaturing the nucleic acid molecules, (ii) annealing the primers with the denatured nucleic acids to allow the formation of primer-nucleic acid hybrids, and (iii) incubating the primer-nucleic acid hybrids to allow a heat-stable nucleic acid replicating enzyme to synthesize the corresponding nucleic acid product.
[0016] In some embodiments, the nucleic acid molecules comprised in the nucleic template are subjected to a targeted sequence enrichment procedure prior to being subject to step (a).
[0017] In some embodiments, a barcode portion is further attached to (a) the tail-adaptor sequence, (b) the tag nucleotide sequence, (c) the first target specific primer, (d) the second target specific primer, or (e) a combination of any of the foregoing (a)-(d).
[0018] In some embodiments, the methods disclosed herein can further include sequencing the amplified portion of the amplified nucleic acid products. In some embodiments, the sequencing is performed by a next generation sequencing (NGS) procedure. In some embodiments, the NGS procedure can be selected from the group consisting of pyrosequencing, sequencing by synthesis, and sequencing by ligation.
[0019] In some embodiments, the methods according to any one of the preceding aspects and embodiments exclude the ligation of a tail-adaptor sequence to any one of the primers.
[0020] The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the drawings and the following detailed description.
BRIEF DESCRIPTION OF THE DRAWINGS
[0021] The foregoing and other features of the present disclosure will become more fully apparent from the following description and appended claims, taken in conjunction with the accompanying drawings. Understanding that these drawings depict only several embodiments in accordance with the disclosure and are not to be considered limiting of its scope, the disclosure will be described with additional specificity and detail through use of the accompanying drawings.
[0022] FIGURE 1 is a flow diagram illustrating a non-limiting example of a method of amplifying a nucleic acid template by using two sets of primers. This method can optionally include additional rounds of target-specific amplifications, which can be single or nested PCRs.
[0023] FIGURE 2 is a flow diagram illustrating another non-limiting example of a method of amplifying a nucleic acid template by using a single set of primers. Each of the individual primers can include a tail-adaptor sequence at its 5' portion, a tag nucleotide sequence located at its central portion, and a random nucleotide sequence at its 3' portion. This method can optionally include additional rounds of target-specific amplifications, which can be single or nested PCRs.
[0024] FIGURE 3 is a flow diagram that illustrates a non-limiting example of a ligation-free method of preparing nucleic acids samples for next generation sequencing (NGS), via amplification of input nucleic acid molecules, followed by construction of sequencing libraries, high-throughput sequencing, and sequence data analysis.
[0025] FIGURE 4 is a flow diagram illustrating another non-limiting example of a ligation-free method for amplifying input nucleic acid molecules derived from formalin-fixed paraffin embedded (FFPE) biomaterials such as tissues and cells. The amplified nucleic acid product is further subjected to additional rounds of targeted PCR amplification by using target-specific primers, which can be single or nested PCR amplifications.
[0026] FIGURE 5 is a flow diagram illustrating another non-limiting example of a method for amplifying input nucleic acid molecules where the input nucleic acids are subjected to a targeted sequence enrichment procedure prior to being subjected to the first nucleic acid amplification.
DETAILED DESCRIPTION
[0027] In the following detailed description, reference is made to the accompanying drawings, which form a part hereof. In the drawings, similar symbols typically identify similar components, unless context dictates otherwise. The illustrative embodiments described in the detailed description, drawings, and claims are not meant to be limiting. Other embodiments may be used, and other changes may be made, without departing from the spirit or scope of the subject matter presented here. It will be readily understood that the aspects of the present disclosure, as generally described herein, and illustrated in the Figures, can be arranged, substituted, combined, and designed in a wide variety of different configurations, all of which are explicitly contemplated and make part of this disclosure.
[0028] Unless otherwise defined, all terms of art, notations and other scientific terms or terminology used herein are intended to have the meanings commonly understood by those of skill in the art to which this invention pertains. In some cases, terms with commonly understood meanings are defined herein for clarity and/or for ready reference, and the inclusion of such definitions herein should not necessarily be construed to represent a substantial difference over what is generally understood in the art. Many of the techniques and procedures described or referenced herein are well understood and commonly employed using conventional methodology by those skilled in the art.
[0029] The singular form "a", "an", and "the" include plural references unless the context clearly dictates otherwise. For example, the term "a molecule" includes one or more molecules, including mixtures thereof. "A and/or B" is used herein to include all of the following alternatives: "A", "B", and "A and B".
[0030] The present disclosure generally describes compositions and methods for amplifying and/or determining nucleic acid sequences, and particularly to methods for amplifying any stretch of nucleic acid sequences in a sequence-independent manner. The disclosure further relates to ligation-free methods for determining nucleic acid sequences, e.g. by enriching target nucleic acid sequences via random-priming nucleic acid synthesis, optionally followed by gene-specific nested PCR amplifications, prior to sequencing the sequences.
[0031] FIGURE 1 is a flow diagram illustrating one non-limiting example of a method for amplifying a nucleic acid sequence in accordance with at least some examples of the present disclosure. As illustrated in FIGURE 1, method 100 can include one or more functions, operations, or actions as illustrated by one or more of operations 110-130.
[0032] Method 100 can begin at operation 110, "Providing a single reaction mixture including (i) a nucleic acid template including a plurality of nucleic acid molecules; (ii) a first set of primers; and (iii) a second set of primers." Operation 110 can be followed by operation 120, "Subjecting the single reaction mixture to a first nucleic acid amplification such that the primers of the first set of primers anneal to at least one nucleic acid molecule of the nucleic template to produce a first amplified nucleic acid product." Operation 120 can be followed by operation 130, "Subjecting the first amplified nucleic acid product to a second nucleic acid amplification such that the primers of the second set of primers anneal to at least one nucleic acid molecule of the first amplified nucleic acid product to produce a second amplified nucleic acid product."
[0033] In FIGURE 1, operations 110-130 are illustrated as being performed sequentially with operation 110 first and operation 130 last. It will be appreciated however that these operations can be reordered, combined, and/or divided into additional or different operations as appropriate to suit particular embodiments. For example, as described in further detail below, additional operations can be added before, during or after one or more of operations 110-130. In some embodiments, one or more of the operations can be performed at about the same time.
[0034] At operation 110, "Providing a single reaction mixture including (i) a nucleic acid template including a plurality of nucleic acid molecules; (ii) a first set of primers; and (iii) a second set of primers," in accordance with some embodiments, the first set of primers can include individual primers each including a tag nucleotide sequence at its 5' portion and a random nucleotide sequence at its 3' portion. In some embodiments, the second set of primers can include individual primers each including a tail-adaptor sequence at its 5' portion and a nucleotide sequence at its 3' portion that is substantially complementary to the 5' tag nucleotide sequence of the first set of primers. In some embodiments, the tag nucleotide sequence can further comprise a barcode portion. In some embodiments, the tail-adaptor sequence of the primers of the second set of primers can further comprise a barcode portion.
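The primer layout described for operation 110 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the sequences `TAG` and `TAIL_ADAPTOR`, the random-portion length of 6, and all function names are hypothetical. The 3' portion of the second-set primer is modeled as the exact reverse complement of the tag, which is a literal reading of "substantially complementary"; the disclosure itself requires only substantial complementarity.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def first_set_primer(tag: str, random_len: int, rng: random.Random) -> str:
    """5' tag followed by a random 3' portion of random_len bases."""
    random_part = "".join(rng.choice("ACGT") for _ in range(random_len))
    return tag + random_part


def second_set_primer(tail_adaptor: str, tag: str) -> str:
    """5' tail-adaptor followed by a 3' portion complementary to the tag."""
    return tail_adaptor + reverse_complement(tag)


# Hypothetical placeholder sequences, chosen only for illustration.
TAG = "GCCGGCGC"
TAIL_ADAPTOR = "AATGATACGG"

rng = random.Random(0)  # fixed seed so the sketch is reproducible
p1 = first_set_primer(TAG, 6, rng)
p2 = second_set_primer(TAIL_ADAPTOR, TAG)
print(p1, p2)
```

A set of first-set primers would be obtained by calling `first_set_primer` repeatedly, so that the random 3' portions differ between individual primers while the tag stays the same.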
[0035] As used herein, a "portion" of a nucleic acid molecule refers to a contiguous set of nucleotides comprised by that molecule. A portion can comprise all or only a subset of the nucleotides comprised by the molecule. A portion can be double-stranded or single-stranded.
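As a small illustration of this definition, a single strand can be modeled as a Python string, in which case a portion corresponds to a non-empty contiguous substring. The function name and the example sequence are hypothetical, and the sketch does not model double-stranded portions.

```python
def is_portion(candidate: str, molecule: str) -> bool:
    """True if candidate is a non-empty contiguous run of bases within molecule."""
    return bool(candidate) and candidate in molecule


molecule = "ACGTTGCA"  # hypothetical single-stranded molecule, written 5'->3'
print(is_portion("GTTG", molecule))    # contiguous subset of the molecule
print(is_portion(molecule, molecule))  # a portion can be the whole molecule
print(is_portion("ACTT", molecule))    # bases are present but not contiguous
```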
[0036] In some embodiments, the 5' tag nucleotide sequence portion comprises a nucleic acid sequence of high GC content which is not substantially complementary to or substantially identical to any other portion of any of the primers. In some embodiments, each of the primers can include a unique tag nucleotide sequence portion. In some embodiments, the tail-adaptor sequence of the primers of the second set of primers can be a universal tail-adaptor sequence. In some embodiments, the nucleic acid template at operation 110 of the method disclosed herein can include at least one RNA molecule, at least one single-stranded DNA (ssDNA) molecule, at least one double-stranded DNA (dsDNA) molecule, or a combination of any thereof. In some embodiments, the nucleic acid template at operation 110 can include at least one RNA molecule. In some embodiments, the at least one RNA molecule is subjected to a reverse transcription regimen to produce a cDNA product prior to being subjected to operation 110. In some embodiments, the input nucleic acid molecules comprised in the nucleic template at operation 110 are subjected to a targeted sequence enrichment procedure prior to being subject to operation 110. In a preferred embodiment, the input nucleic acid molecules comprised in the nucleic template are further enriched by using the TruSeq™ Exome Enrichment Kit (Illumina) prior to being subject to operation 110.
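One way to screen candidate tag sequences for GC content is sketched below. The disclosure does not state a numeric cutoff for "high GC content", so the 0.6 threshold, the function names, and the example sequences are assumptions made only for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(1 for base in seq if base in "GC") / len(seq)


def is_high_gc(seq: str, threshold: float = 0.6) -> bool:
    """True if the GC fraction meets the (assumed) threshold."""
    return gc_fraction(seq) >= threshold


# Hypothetical candidate tags.
for tag in ("GCGCGCAT", "ATATGC"):
    print(tag, round(gc_fraction(tag), 2), is_high_gc(tag))
```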
[0037] In some embodiments of this aspect, the method 100 can further include a step of subjecting the second amplified nucleic acid product, i.e. after operation 130, to a third nucleic acid amplification in the presence of a first target-specific primer to produce a third amplified nucleic acid product, wherein the first target-specific primer includes a nucleotide sequence that can specifically anneal to a known nucleotide sequence of a target nucleic acid molecule.
[0038] In some embodiments, the method 100 can further include a step of subjecting the third amplified nucleic acid product described above to a fourth nucleic acid amplification in the presence of a second target-specific primer to produce a fourth amplified nucleic acid product, wherein the second target-specific primer includes a nucleotide sequence that can specifically anneal to a portion of the known target nucleic acid sequence which is nested with respect to the first target-specific primer.
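The nesting relationship between the first and second target-specific primers can be sketched as a position check on a known target sequence. The sketch makes simplifying assumptions that are not part of the disclosure: both primers are written as exact matches to the same strand of the target, and "nested" is taken to mean that the second primer's site begins downstream (3') of where the first primer's site begins. The target and primer sequences and the function names are hypothetical.

```python
def primer_site(target: str, primer: str) -> int:
    """Return the 0-based start of the primer's exact match in the target."""
    pos = target.find(primer)
    if pos < 0:
        raise ValueError(f"primer {primer!r} does not match the target")
    return pos


def is_nested(target: str, outer: str, inner: str) -> bool:
    """True if the inner primer's site starts downstream of the outer primer's."""
    return primer_site(target, inner) > primer_site(target, outer)


target = "AAACCCGGGTTTACGTACGT"  # hypothetical known target, written 5'->3'
first_primer = "AAACCC"          # first target-specific primer (outer)
second_primer = "GGGTTT"         # second target-specific primer (inner)
print(is_nested(target, first_primer, second_primer))
```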
[0039] FIGURE 2 is a flow diagram illustrating one non-limiting example of a method for amplifying a nucleic acid sequence in accordance with at least some examples of the present disclosure. As illustrated in FIGURE 2, method 200 can include one or more functions, operations, or actions as illustrated by one or more of operations 210-220.
[0040] Method 200 can begin at operation 210, "Providing a single reaction mixture including (i) a nucleic acid template including a plurality of nucleic acid molecules; and (ii) a first set of primers." Operation 210 can be followed by operation 220, "Subjecting the single reaction mixture to a first nucleic acid amplification such that the primers of the first set of primers anneal to at least one nucleic acid molecule of the nucleic template to produce a first amplified nucleic acid product."
[0041] In FIGURE 2, operations 210-220 are illustrated as being performed sequentially with operation 210 first and operation 220 last. It will be appreciated however that these operations can be combined, and/or divided into additional or different operations as appropriate to suit particular embodiments. For example, as described in further detail below, additional operations can be added before, during or after one or more of operations 210-220. In some embodiments, one or more of the operations can be performed at about the same time.
[0042] At operation 210, "Providing a single reaction mixture including (i) a nucleic acid template including a plurality of nucleic acid molecules; and (ii) a first set of primers" the first set of primers, in accordance with some embodiments, can include individual primers each including a tail-adaptor sequence at its 5' portion, a tag nucleotide sequence located at its central portion, and a random nucleotide sequence at its 3' portion. In some embodiments, the tail-adaptor sequence of the primers of the first set of primers can further comprise a barcode portion. In some embodiments, the tag nucleotide sequence portion comprises a nucleic acid sequence of high GC content which is not substantially complementary to or substantially identical to any other portion of any of the primers. In some embodiments, each of the primers can include a unique tag nucleotide sequence portion. In some embodiments, the tail-adaptor sequence of the primers of the first set of primers can be a universal tail-adaptor sequence. In some embodiments, the nucleic acid template at operation 210 of the method disclosed herein can include at least one RNA molecule, at least one single-stranded DNA (ssDNA) molecule, at least one double-stranded DNA (dsDNA) molecule, or a combination of any thereof. In some embodiments, the nucleic acid template at operation 210 can include at least one RNA molecule. In some embodiments, the at least one RNA molecule is subjected to a reverse transcription regimen to produce a cDNA product prior to being subjected to operation 210. In some embodiments, the nucleic acid molecules comprised in the nucleic template at operation 210 are subjected to a targeted sequence enrichment procedure prior to being subject to operation 210. In a preferred embodiment, the nucleic acid molecules comprised in the nucleic template are further enriched by using the TruSeq™ Exome Enrichment Kit (Illumina) prior to being subject to operation 210.
[0043] In some embodiments, at least one of the first, second, third, and fourth nucleic acid amplification steps of the methods 100 and 200 disclosed herein can be repeated at least one time.
[0044] In some embodiments, each of the nucleic acid amplifications of methods 100 and 200 disclosed herein can include the steps of (i) denaturing the nucleic acid molecules, (ii) annealing the primers with the denatured nucleic acids to allow the formation of primer-nucleic acid hybrids, and (iii) incubating the primer-nucleic acid hybrids to allow a heat-stable nucleic acid replicating enzyme to synthesize the corresponding nucleic acid product.
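The three-step amplification cycle described above (denature, anneal, incubate for extension), repeated for one or multiple cycles, can be represented as a simple program of steps. The temperatures, durations, cycle count, and names in this sketch are illustrative assumptions only; the disclosure does not prescribe them.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Illustrative values only; not specified by the disclosure.
CYCLE = (
    Step("denature", 94.0, 30),  # (i) separate the nucleic acid strands
    Step("anneal", 55.0, 30),    # (ii) let primers form primer-nucleic acid hybrids
    Step("extend", 72.0, 60),    # (iii) let the heat-stable enzyme synthesize product
)


def build_program(cycles: int, cycle: Sequence[Step] = CYCLE) -> List[Step]:
    """Repeat the three-step cycle the requested number of times."""
    if cycles < 1:
        raise ValueError("at least one cycle is required")
    return [step for _ in range(cycles) for step in cycle]


program = build_program(3)
total_seconds = sum(step.seconds for step in program)
print(len(program), "steps,", total_seconds, "seconds")
```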
[0045] As used herein, the phrase "nucleic acid replicating enzyme" refers to an enzyme that catalyzes the template-dependent polymerization of nucleoside triphosphates to form primer extension products that are complementary to the template nucleic acid sequence. A nucleic acid replicating enzyme typically initiates synthesis at the 3' end of an annealed primer and proceeds in the direction toward the 5' end of the template. Numerous nucleic acid replicating enzymes are known in the art and commercially available. One group of preferred nucleic acid replicating enzymes are thermostable, i.e., they retain function after being subjected to temperatures sufficient to denature annealed strands of complementary nucleic acids, e.g. 72°C, 75°C, 80°C, 85°C, 90°C, or 94°C, or sometimes higher.
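The directionality described above can be sketched with a toy primer-extension function: synthesis starts at the 3' end of the annealed primer and copies the template toward the template's 5' end. The sketch assumes exact base pairing and a single annealing site; the template sequence and function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def extend_primer(template: str, primer: str) -> str:
    """Extend an annealed primer to the 5' end of the template.

    Both sequences are written 5'->3'. The primer pairs with the template
    region that is its reverse complement; the bases added to the primer's
    3' end are complementary to the template bases lying 5' of that region.
    """
    site = reverse_complement(primer)
    start = template.find(site)
    if start < 0:
        raise ValueError("primer does not anneal to the template")
    upstream = template[:start]  # template bases 5' of the annealing site
    return primer + reverse_complement(upstream)


template = "ATGCCGTAAGCTTGGA"                # hypothetical template, 5'->3'
primer = reverse_complement(template[8:])   # anneals to the 3' half of the template
product = extend_primer(template, primer)
print(product)
```

Because the primer in this example anneals at the very 3' end of the template, the extension product is the full reverse complement of the template.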
[0046] In some embodiments, the heat-stable nucleic acid replicating enzyme is selected from the group consisting of Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase, Avian Myeloblastosis Virus (AMV) reverse transcriptase, Superscript® III reverse transcriptase, Omniscript® reverse transcriptase, M-MuLV reverse transcriptase, SMARTScribe™ reverse transcriptase, HiFi polymerase, Phusion® polymerase, Taq polymerase, TaqPlus DNA polymerase, Tub DNA polymerase, Pfu polymerase, and Vent® DNA polymerase.
[0047] In some embodiments, the first target specific primer and/or the second target specific primer can further comprise a barcode portion. In some embodiments, each of the primers can include a unique barcode portion. Each of the resulting amplified nucleic acid products will therefore comprise a barcode identifying which sample comprised the original template nucleic acid from which the amplification product is derived. The use of barcode portions in next-generation sequencing applications is well known in the art.
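The role of a barcode in identifying the sample of origin can be sketched as a simple demultiplexing step. The sketch assumes that the barcode sits at the 5' start of each read, that all barcodes have the same length, and that matching is exact; these choices, along with the barcodes, sample names, and function name, are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def demultiplex(reads: Iterable[str],
                barcode_to_sample: Dict[str, str]) -> Dict[str, List[str]]:
    """Group reads by the sample whose barcode starts the read.

    The barcode is trimmed from each assigned read; reads whose prefix
    matches no known barcode are collected under "unassigned".
    """
    lengths = {len(barcode) for barcode in barcode_to_sample}
    if len(lengths) != 1:
        raise ValueError("all barcodes must have the same length")
    (n,) = lengths
    bins: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        sample = barcode_to_sample.get(read[:n], "unassigned")
        bins[sample].append(read[n:])
    return dict(bins)


barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}  # hypothetical barcodes
reads = ["ACGTGGGG", "TGCAAAAA", "CCCCTTTT"]
result = demultiplex(reads, barcodes)
print(result)
```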
[0048] In some embodiments, the methods 100 and 200 disclosed herein can further include sequencing the amplified portion of the first, second, third, and/or fourth amplified nucleic acid product using a first and a second sequencing primer. In some embodiments, the sequencing is performed by a next generation sequencing (NGS) procedure. As used herein, "next-generation sequencing" refers to oligonucleotide sequencing technologies that have the capacity to sequence oligonucleotides at speeds above those possible with conventional sequencing methods (e.g. Sanger sequencing), due to performing and reading out thousands to millions of sequencing reactions in parallel. Non-limiting examples of next-generation sequencing methods/platforms include Massively Parallel Signature Sequencing (Lynx Therapeutics); solid-phase, reversible dye-terminator sequencing (Solexa/Illumina); DNA nanoball sequencing (Complete Genomics); SOLiD technology (Applied Biosystems); 454 pyrosequencing (454 Life Sciences/Roche Diagnostics); ion semiconductor sequencing (Ion Torrent); and technologies available from Pacific Biosciences, Intelligent Bio-Systems, Oxford Nanopore Technologies, and Helicos Biosciences.
[0049] Accordingly, in some embodiments, the NGS procedure used in the methods disclosed herein can comprise pyrosequencing, sequencing by synthesis, sequencing by ligation, or a combination of any thereof. In some preferred embodiments, the NGS procedure is performed by an NGS platform selected from the group consisting of Illumina, Ion Torrent, Qiagen, Invitrogen, Applied Biosystems, Helicos, Oxford Nanopore, Pacific Biosciences, and Complete Genomics.
[0050] In some embodiments, the input nucleic acids can be sheared, e.g. mechanically or enzymatically sheared, to generate fragments of any desired size prior to the first nucleic acid amplification step. Non-limiting examples of mechanical shearing processes include sonication, nebulization, and AFA™ shearing technology available from Covaris (Woburn, Mass.). In some embodiments, input nucleic acids can be mechanically sheared by sonication. In some embodiments, when the input nucleic acids include RNA molecules, the sample can be subjected to a reverse transcriptase regimen to generate DNA template and the DNA template can then be sheared. In some embodiments, input RNA molecules can be sheared before performing the reverse transcriptase regimen. In some embodiments, the methods used to extract nucleic acids from specimens and biomaterials, e.g. a tissue sample, can include one or more of the following features: total nucleic acid extraction; genomic DNA removal for cDNA sequencing; ribosomal RNA depletion for cDNA sequencing; mechanical or enzymatic shearing in any of the steps; double-stranded cDNA synthesis using random hexamers; nucleic acid end-repair, phosphorylation, and adenylation, etc. In some embodiments, the methods according to any one of the preceding aspects and embodiments exclude ligating a tail-adaptor sequence to any one of the primers, particularly to the primers of said first and/or second set of primers.
[0051] The discussion of the general methods given herein is intended for illustrative purposes only. Other alternative methods and embodiments will be apparent to those of skill in the art upon review of this disclosure, and are to be included within the spirit and purview of this application.
EXAMPLES
[0052] Additional embodiments are disclosed in further detail in the following examples, which are not in any way intended to limit the scope of the claims.
EXAMPLE 1
A ligation-free method of preparing NGS nucleic acid samples
[0053] This Example illustrates a non-limiting example of a ligation-free method of preparing nucleic acid samples for next generation sequencing (NGS), via amplification of input nucleic acid molecules in accordance with some embodiments of the methods disclosed herein.
[0054] As presented in FIGURE 3, input nucleic acids are subjected to the first round of nucleic acid amplification to produce a first amplified nucleic acid product by using a first set of primers (Integrated DNA Technologies). Each of the individual primers includes a tag nucleotide sequence at its 5' portion and a random nucleotide sequence at its 3' portion. In various experiments, this first round of nucleic acid amplification is performed in one or multiple cycles.
[0055] The first amplified nucleic acid product is then subjected to a second round of nucleic acid amplification to produce a second amplified nucleic acid product by using a second set of custom primers, which are manufactured by Integrated DNA Technologies with custom indexes. Each individual primer includes a tail-adaptor sequence at its 5' portion and a nucleotide sequence at its 3' portion that is substantially complementary to the 5' tag nucleotide sequence of the first set of primers. The second amplified nucleic acid product is then used for the construction of sequencing libraries, sequencing, and sequence data analysis.
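The two primer designs described in this Example can be sketched in Python as plain string construction. This is only an illustrative sketch: the tag sequence, the tail-adaptor sequence, and the random-portion length of 6 are hypothetical placeholders that do not come from this disclosure, and "substantially complementary" is modeled here, as a simplifying assumption, as an exact reverse complement.

```python
import random

# Watson-Crick pairing table for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def make_first_round_primer(tag: str, random_len: int, rng: random.Random) -> str:
    """First-set primer: tag at the 5' portion, random bases at the 3' portion."""
    random_part = "".join(rng.choice("ACGT") for _ in range(random_len))
    return tag + random_part


def make_second_round_primer(tail_adaptor: str, tag: str) -> str:
    """Second-set primer: tail-adaptor at the 5' portion, followed by a 3' portion
    modeled (assumption) as the exact reverse complement of the tag."""
    return tail_adaptor + reverse_complement(tag)


# Hypothetical placeholder sequences, chosen only for illustration.
HYPOTHETICAL_TAG = "GACTTCAGGCTA"
HYPOTHETICAL_TAIL_ADAPTOR = "ACACGTCTGAACTCC"

rng = random.Random(0)  # fixed seed so the sketch is reproducible
first_set = [make_first_round_primer(HYPOTHETICAL_TAG, 6, rng) for _ in range(4)]
second_primer = make_second_round_primer(HYPOTHETICAL_TAIL_ADAPTOR, HYPOTHETICAL_TAG)

for primer in first_set:
    print("first-set primer :", primer)
print("second-set primer:", second_primer)
```

Every first-set primer shares the same 5' tag and differs only in its random 3' portion, so a single second-set primer design can address all first-round products.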
EXAMPLE 2
Targeted sequencing of nucleic acids derived from FFPE biomaterials
[0056] This Example illustrates a non-limiting example of a ligation-free method for amplifying input nucleic acid molecules that are derived from formalin-fixed paraffin embedded (FFPE) biomaterials such as tissues and cells. The amplified nucleic acid product is further subjected to additional rounds of targeted PCR amplification by using target-specific primers; these rounds can be single or nested PCR amplifications.
[0057] In some experiments, the input nucleic acids are comprised in one or more samples obtained from a subject. In some experiments, the samples are diagnostic samples obtained from human subjects, and therefore can include proteins, cells, fluids, biological fluids, preservatives, and/or other substances. In some experiments, the samples are cheek swab, cerebrospinal fluid, urine, tears, alveolar isolates, blood, serum, plasma, pericardial fluid, cyst fluid, tumor tissue, sputum, pleural fluid, saliva, and/or aspirate. In some other experiments, the samples contain a population of tumor cells (FIGURE 4).
[0058] As presented in FIGURE 4, the input nucleic acids are extracted from various solid or liquid tumor samples using a Qiagen™ or Formapure™ nucleic acid extraction kit, according to the manufacturer's instructions. Input nucleic acids are subsequently subjected to the first round of nucleic acid amplification to produce a first amplified nucleic acid product by using a first set of primers (Integrated DNA Technologies). In this experiment, each of the individual primers includes a tag nucleotide sequence at its 5' portion and a random nucleotide sequence at its 3' portion. In various experiments, the first round of nucleic acid amplification is performed in one or multiple cycles.
[0059] The first amplified nucleic acid product is then exposed to target-specific primers (Integrated DNA Technologies) in a second round of nucleic acid amplification, such that template nucleic acid molecules that hybridize to the target-specific primers are amplified to produce a second amplified nucleic acid product. The second round of nucleic acid amplification is performed in one or multiple cycles. In some experiments, single PCR reactions are performed. Alternatively or in addition, a nested PCR procedure is used in some other experiments.
[0060] Alternatively or in addition, the second amplified nucleic acid product is further amplified by using a set of custom primers, which are manufactured by Integrated DNA Technologies with custom indexes. Each individual primer includes a tail-adaptor sequence at its 5' portion and a nucleotide sequence at its 3' portion that is substantially complementary to the 5' tag nucleotide sequence of the first set of primers. The resulting amplified nucleic acid product is then used for the construction of sequencing libraries and NGS sequencing. Sequence data from NGS sequencing is then assembled, annotated, and used for further data mining to identify fusions, single nucleotide polymorphisms (SNPs), and splice variants, as well as for various types of gene expression analyses (e.g. differential expression or tissue-specific expression, etc.).
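The selection effect of single versus nested target-specific amplification in this Example can be illustrated with a toy in-silico model. The sketch below makes strong simplifying assumptions: templates and primers are written on the same strand, a primer "anneals" only on an exact substring match, and extension runs to the end of the template. All sequences and function names are hypothetical and are not taken from this disclosure.

```python
from typing import List


def amplify_with_primer(templates: List[str], primer: str) -> List[str]:
    """Toy model of one target-specific round: keep only templates that contain
    the primer site, and return the product starting at that site."""
    products = []
    for template in templates:
        position = template.find(primer)  # -1 means the primer does not anneal
        if position != -1:
            products.append(template[position:])
    return products


# Hypothetical first-round products; only two of them carry the target region.
templates = [
    "TTTTGGATCCAAACCCGGGTTTACGT",
    "CCCCCCCCCCCCCCCC",
    "AAGGATCCTTTCCCGGGAAA",
]

OUTER_PRIMER = "GGATCC"  # hypothetical first target-specific primer
INNER_PRIMER = "CCCGGG"  # hypothetical nested primer, inside the outer amplicon

# Single PCR: one round with the outer primer.
outer_products = amplify_with_primer(templates, OUTER_PRIMER)

# Nested PCR: the outer products are re-amplified with the inner primer.
nested_products = amplify_with_primer(outer_products, INNER_PRIMER)

print("outer products :", outer_products)
print("nested products:", nested_products)
```

In this model the nested round can only shorten or discard products of the outer round, which mirrors why nesting is used to increase specificity for the known target sequence.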
EXAMPLE 3
Targeted sequence enrichment combined with ligation-free NGS sample preparation
[0061] This Example illustrates a non-limiting exemplary workflow in which a targeted sequence enrichment procedure is deployed in combination with a ligation-free method for amplifying nucleic acid molecules disclosed herein.
[0062] As presented in FIGURE 5, the input nucleic acids are first subjected to a targeted sequence enrichment process by using a TruSeq™ Exome Enrichment Kit (Illumina) according to manufacturer's instructions. Briefly, input nucleic acids are denatured into single-stranded DNA and then hybridized to biotin-labeled probes specific to one or more targeted regions. The nucleic acid sample is then enriched for the desired regions by adding streptavidin beads that will bind to the biotinylated probes. Biotinylated DNA fragments bound to the streptavidin beads are magnetically pulled down from the solution. The enriched DNA fragments are then eluted from the beads and hybridized for a second enrichment reaction. After amplification, a targeted enriched nucleic acid sample is ready for further processing.
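The probe-based pull-down described above can be pictured with a small Python sketch. It is a conceptual model only, not the kit's protocol: hybridization is reduced, as an assumption, to an exact match between a fragment and the reverse complement of a probe, and the probe and fragment sequences are hypothetical.

```python
from typing import List

# Watson-Crick pairing table for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def capture(fragments: List[str], probes: List[str]) -> List[str]:
    """Toy pull-down: keep single-stranded fragments that contain a region
    exactly complementary to at least one probe."""
    binding_sites = [reverse_complement(probe) for probe in probes]
    return [
        fragment
        for fragment in fragments
        if any(site in fragment for site in binding_sites)
    ]


# Hypothetical biotin-labeled probe and denatured input fragments.
probes = ["TTTGGGCCC"]
fragments = [
    "ACGTGGGCCCAAATT",  # contains the probe's binding site
    "ACGTACGTACGT",     # off-target fragment
    "GGGCCCAAACCCC",    # contains the probe's binding site
]

first_enrichment = capture(fragments, probes)
# The eluted fragments are hybridized again for a second enrichment round.
second_enrichment = capture(first_enrichment, probes)

print("after first enrichment :", first_enrichment)
print("after second enrichment:", second_enrichment)
```

Because this model has no nonspecific binding, the second round returns the same fragments as the first; in practice, the second round serves to remove off-target fragments carried over from the first.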
[0063] Enriched nucleic acid samples prepared as described above are subsequently subjected to the first round of nucleic acid amplification to produce a first amplified nucleic acid product by using a first set of primers (Integrated DNA Technologies). Each of the individual primers includes a tag nucleotide sequence at its 5' portion and a random nucleotide sequence at its 3' portion. In various experiments, the first round of nucleic acid amplification is performed in one or multiple cycles.
[0064] Similar to the workflow described in Example 1 above, the first amplified nucleic acid product is then subjected to a second round of nucleic acid amplification to produce a second amplified nucleic acid product by using a second set of custom primers, which are manufactured by Integrated DNA Technologies with custom indexes. Each individual primer includes a tail-adaptor sequence at its 5' portion and a nucleotide sequence at its 3' portion that is substantially complementary to the 5' tag nucleotide sequence of the first set of primers. The second amplified nucleic acid product is then used for the construction of sequencing libraries, sequencing, and sequence data analysis.

Claims

CLAIMS WHAT IS CLAIMED IS:
1. A method for amplifying a nucleic acid sequence, said method comprising:
a) providing a single reaction mixture comprising:
i. a nucleic acid template comprising a plurality of nucleic acid molecules;
ii. a first set of primers;
iii. a second set of primers;
b) subjecting said single reaction mixture to a first nucleic acid amplification wherein the primers of said first set of primers anneal to at least one nucleic acid molecule of said nucleic acid template to produce a first amplified nucleic acid product; and
c) subjecting the first amplified nucleic acid product of step (b) to a second nucleic acid amplification wherein the primers of said second set of primers anneal to at least one nucleic acid molecule of said first amplified nucleic acid product to produce a second amplified nucleic acid product,
wherein said first set of primers comprises individual primers each comprising a tag nucleotide sequence at its 5' portion and a random nucleotide sequence at its 3' portion; and further wherein said second set of primers comprises individual primers each comprising a tail-adaptor sequence at its 5' portion and a nucleotide sequence at its 3' portion that is substantially complementary to the 5' tag nucleotide sequence of the first set of primers.
2. The method of claim 1, further comprising a step (d) of subjecting the second amplified nucleic acid product of step (c) to a third nucleic acid amplification in the presence of a first target-specific primer to produce a third amplified nucleic acid product, wherein said first target-specific primer comprises a nucleotide sequence that can specifically anneal to a known nucleotide sequence of a target nucleic acid molecule.
3. The method of claim 2, further comprising a step (e) of subjecting the third amplified nucleic acid product of step (d) to a fourth nucleic acid amplification in the presence of a second target-specific primer to produce a fourth amplified nucleic acid product, wherein said second target-specific primer comprises a nucleotide sequence that can specifically anneal to a portion of the known target nucleic acid sequence which is nested with respect to the first target-specific primer.
4. A method for amplifying a nucleic acid sequence, said method comprising:
a) providing a single reaction mixture comprising:
i. a nucleic acid template comprising a plurality of nucleic acid molecules; and
ii. a first set of primers, wherein said first set of primers comprises individual primers each comprising a tail-adaptor sequence at its 5' portion, a tag nucleotide sequence located at its central portion, and a random nucleotide sequence at its 3' portion; and
b) subjecting said single reaction mixture to a first nucleic acid amplification wherein the primers of said first set of primers anneal to at least one nucleic acid molecule of said nucleic acid template to produce a first amplified nucleic acid product.
5. The method of claim 4, further comprising a step (c) of subjecting the first amplified nucleic acid product of step (b) to a second nucleic acid amplification in the presence of a first target-specific primer to produce a second amplified nucleic acid product, wherein said first target-specific primer comprises a nucleotide sequence that can specifically anneal to a known nucleotide sequence of a target nucleic acid molecule.
6. The method of claim 5, further comprising a step (d) of subjecting the second amplified nucleic acid product of step (c) to a third nucleic acid amplification in the presence of a second target-specific primer to produce a third amplified nucleic acid product, wherein said second target-specific primer comprises a nucleotide sequence that can specifically anneal to a portion of the known target nucleic acid sequence which is nested with respect to the first target-specific primer.
7. The method of any one of claims 1-6, wherein said nucleic acid template comprises at least one RNA molecule, at least one single-stranded DNA (ssDNA) molecule, at least one double-stranded DNA (dsDNA) molecule, or a combination of any thereof.
8. The method of claim 7, wherein said nucleic acid template comprises at least one RNA molecule.
9. The method of claim 8, wherein said at least one RNA molecule is subjected to a reverse transcription regimen to produce a cDNA product prior to being subjected to step (a).
10. The method of any one of claims 1-9, wherein at least one of said first, second, third, and fourth nucleic acid amplification step is repeated at least one time.
11. The method of any one of claims 1-10, wherein each of said nucleic acid amplifications comprises the steps of (i) denaturing said nucleic acid molecules, (ii) annealing said primers with said denatured nucleic acids to allow the formation of primer-nucleic acid hybrids, and (iii) incubating said primer-nucleic acid hybrids to allow a heat-stable nucleic acid replicating enzyme to synthesize said amplified nucleic acid product.
12. The method of any one of claims 1-11, wherein said heat-stable nucleic acid replicating enzyme is selected from the group consisting of Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase, Avian Myeloblastosis Virus (AMV) reverse transcriptase, Superscript® III reverse transcriptase, Omniscript® reverse transcriptase, M-MuLV reverse transcriptase, SMARTScribe™ reverse transcriptase, HiFi polymerase, Phusion® polymerase, Taq polymerase, TaqPlus DNA polymerase, Tub DNA polymerase, Pfu polymerase, and Vent® DNA polymerase.
13. The method of any one of claims 1-12, wherein said nucleic acid molecules are subjected to a targeted sequence enrichment procedure prior to being subjected to step (a).
14. The method of any one of claims 1-13, wherein a barcode portion is further attached to (a) said tail-adaptor sequence, (b) said tag nucleotide sequence, (c) said first target specific primer, (d) said second target specific primer, or (e) a combination of any of the foregoing (a)-(d).
15. The method of any one of claims 1-14, further comprising sequencing the amplified portion of the first, second, third, and/or fourth amplified nucleic acid product using a first and a second sequencing primer.
16. The method of claim 15, wherein said sequencing is performed by a next generation sequencing (NGS) procedure.
17. The method of claim 16, wherein said NGS procedure comprises pyrosequencing, sequencing by synthesis, or sequencing by ligation.
18. The method of claim 16 or 17, wherein said NGS procedure is performed by an NGS platform selected from the group consisting of Illumina, Ion Torrent, Qiagen, Invitrogen, Applied Biosystem, Helicos, Oxford Nanopore, Pacific Biosciences, and Complete Genomics.
19. The method of any one of the preceding claims, wherein said method excludes ligating a tail-adaptor sequence to any one of the primers.
PCT/US2015/034239 2014-06-26 2015-06-04 Compositions and methods for amplifying and determining nucleic acid sequences WO2015199939A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201462017734P 2014-06-26 2014-06-26
US62/017,734 2014-06-26

Publications (1)

Publication Number Publication Date
WO2015199939A1 (en) 2015-12-30

Family

ID=54938659

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2015/034239 WO2015199939A1 (en) 2014-06-26 2015-06-04 Compositions and methods for amplifying and determining nucleic acid sequences

Country Status (1)

Country Link
WO (1) WO2015199939A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114196743A (en) * 2021-12-27 2022-03-18 武汉明德生物科技股份有限公司 Rapid detection method for pathogenic microorganisms and kit thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070128654A1 (en) * 2002-05-16 2007-06-07 Lao Kai Q Universal-tagged oligonucleotide primers and methods of use
US20130085083A1 (en) * 2003-03-07 2013-04-04 Rubicon Genomics Substantially non-self complementary primers



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15812851

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15812851

Country of ref document: EP

Kind code of ref document: A1