WO2001032932A2 - Method of identifying a nucleic acid sequence - Google Patents

Method of identifying a nucleic acid sequence

Info

Publication number
WO2001032932A2
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
primer
population
selection
dna
Prior art date
Application number
PCT/US2000/041851
Other languages
French (fr)
Other versions
WO2001032932A3 (en
Inventor
Steven D. Coleman
Michael Mckenna
Michael Paul Popp
Original Assignee
Curagen Corporation
Priority date
Filing date
Publication date
Application filed by Curagen Corporation
Priority to AU46093/01A
Publication of WO2001032932A2
Publication of WO2001032932A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions

Definitions

  • the present invention relates to methods for identifying and cloning nucleic acids.
  • the invention relates to high throughput methods for identifying the flanking sequences and full length clones for any known partial nucleic acid sequence.
  • the invention provides a method for isolating a nucleic acid sequence flanking a known nucleic acid sequence.
  • a first selection primer and a second selection primer are provided, wherein the first selection primer comprises at least a portion of a known nucleic acid sequence and the second selection primer is complementary to at least a portion of the first selection primer.
  • a first population of circular nucleic acid molecules such as cDNA or genomic libraries, wherein the first population of circular nucleic acid molecules comprises parental strands that will anneal to the first or second selection primers.
  • the selection primers are combined with the first population of circular nucleic acid molecules.
  • Parental strands of the first population of circular nucleic acids, which contain sequences that are complementary to the first and second selection primers, are annealed to the provided selection primers, thus forming a population of circular nucleic acid molecules complexed with the primers.
  • the annealed first and second selection primers are extended with a polymerase to produce a second population of nucleic acid molecules.
  • Said second population comprises the first population of parental circular nucleic acid molecules, a first synthesized DNA strand comprising the first selection primer and a second synthesized DNA strand comprising the second selection primer.
  • the first synthesized DNA strand and the second synthesized DNA strand are distinguishable from the parental strands and the non-annealed first population of circular nucleic acids.
  • the first population of circular nucleic acids are selectively removed from the second population of nucleic acid molecules.
  • kits are provided that contain the compositions utilized according to the invention.
  • FIG. 1 illustrates a set of primers that can be used to amplify nucleic acids according to the invention.
  • Two flanking primers (solid arrow heads) and two complementary selection primers (open arrow heads) are synthesized from a known sequence (black rectangle) in order to clone the unknown flanking sequence (open rectangle) contained within a given vector (solid line).
  • FIG. 2 illustrates primer annealing and extension followed by DpnI degradation, transformation and plating, as follows: (a) two complementary gene-specific oligonucleotides are annealed; (b) the primers are extended by a polymerase (PfuTurbo™, Stratagene); (c) the parental strands and all other clones are degraded with DpnI; and (d) the remaining newly synthesized daughter molecule is transformed into E. coli and plated on selective media.
  • FIG. 3 is a graphic overview of the Genome Priming System (GPS-1) sequencing assay utilizing transposon technology.
  • FIG. 4 is a graphic representation of the data output of forward and reverse sequencing reads from the transposon based sequencing assay. The final sequence of the full length clone is determined after the data is assembled into a contig.
  • FIG. 5 illustrates enrichment for the amyloid precursor protein (APP) gene.
  • FIG. 6 is a graphic representation of various gene fragments used in the invention to clone and identify upstream and downstream flanking sequences.
  • the invention provides methods for identifying a nucleic acid of interest in a population of nucleic acid molecules.
  • the method can be used to isolate nucleic acids flanking a known sequence.
  • the methods can be used, e.g., when a partial sequence of a gene is known and additional sequence for the gene is desired.
  • from 4% to >70% of sequences can contain a sequence of interest.
  • Nucleic acid sequences of interest can be identified and recovered by annealing a pair of primers, named a first selection primer and a second selection primer, to a parental population of circular nucleic acid molecules. Included in the population of circular nucleic acid molecules are one or more nucleic acids that are known to contain, or are suspected of containing a known nucleic acid sequence. The population is designated as a parental population to indicate that they can be distinguished from nucleic acids synthesized following annealing of the selection primers to the first population.
  • the first selection primer includes at least a portion of the known nucleic acid sequence, and the second selection primer is complementary to at least a portion of the first selection primer.
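The relationship between the two selection primers can be illustrated with a minimal sketch, assuming the simplest case in which the second primer is the full reverse complement of the first. The function names and the example fragment are hypothetical and are not taken from the disclosure.

```python
# Sketch: derive a fully complementary second selection primer from the first.
# Sequences are written 5'->3'; only unambiguous A/C/G/T bases are handled.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def selection_primer_pair(known_fragment: str, start: int, length: int) -> tuple[str, str]:
    """Take `length` bases of the known sequence as the first selection primer
    and pair it with its reverse complement as the second selection primer."""
    first = known_fragment[start:start + length].upper()
    if len(first) != length:
        raise ValueError("requested primer runs past the end of the known fragment")
    return first, reverse_complement(first)


# Hypothetical known fragment, used only to exercise the functions.
first, second = selection_primer_pair("ATGGCCTTAGCGGATCCGTAC", 3, 12)
print(first, second)
```

Because the second primer is built as the reverse complement, the two primers anneal to the same position on opposite strands of the circular template.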
  • the first selection primer and second selection primer are annealed to complementary sequences in the first parental nucleic acid in the population of circular nucleic acid molecules to form an annealed first selection primer complex and second primer complexes, respectively.
  • the annealed primers are then extended with a polymerase, nucleotide triphosphates, and additional cofactors as warranted, to produce a second population of nucleic acid molecules that include, in addition to the first population of parental circular nucleic acid molecules, a first synthesized DNA strand comprising the first selection primer and a second synthesized DNA strand comprising the second selection primer.
  • the first synthesized DNA strand and the second synthesized DNA strand are distinguishable from the parental strands. For example, almost all strains of E. coli methylate DNA and, therefore, all clones in the library have methylated nucleotides. However, while the parental DNA strands are methylated, the first and second synthesized strands can be synthesized to lack methylated nucleotides. Thus de novo polynucleotide synthesis results in unmethylated copies of clones containing genes of interest in a background of methylated clones.
  • An example of a methylation sensitive enzyme is Dpn I, which recognizes and cleaves the sequence 5'-G(m6A)/TC-3', where m6A is N6-methyladenine and the slash marks the cleavage site.
  • Dpn I cleaves N6-methylated DNA.
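The methylation dependence of Dpn I can be sketched as a simple site scan. This is a simplified model, assuming a linear string (a site spanning the origin of a circular molecule is not detected) and a single methylation flag for the whole molecule; the function name is hypothetical.

```python
import re


def dpni_cut_positions(sequence: str, adenine_methylated: bool) -> list[int]:
    """Return cut positions for every GATC site in `sequence`.

    A returned index i means the cut falls between sequence[i - 1] (the A)
    and sequence[i] (the T). Unmethylated DNA is modeled as uncut.
    """
    if not adenine_methylated:
        return []
    return [match.start() + 2 for match in re.finditer("GATC", sequence.upper())]


# Same sequence, modeled once as a methylated parental strand and once as an
# unmethylated, newly synthesized strand.
parental_cuts = dpni_cut_positions("AAGATCTTGATCAA", adenine_methylated=True)
daughter_cuts = dpni_cut_positions("AAGATCTTGATCAA", adenine_methylated=False)
print(parental_cuts, daughter_cuts)
```

Under this model only the methylated parental molecule is cut, which is the basis for removing it while the unmethylated synthesized strands remain intact.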
  • the first and second strands can be synthesized using biotinylated nucleotides, provided that the parental strands lack biotinylated nucleotides.
  • the differences between the parental strands and the first and second strands are then used as the basis for selectively removing the first population of nucleic acid molecules from the second population of nucleic acid molecules.
  • the parental strands can include a nucleotide, e.g., a modified nucleotide, not present in the first synthesized DNA strand or second synthesized DNA strand.
  • a modified nucleotide is a methylated nucleotide (such as a methylated adenine or cytosine nucleotide).
  • the parental nucleic acids can be removed by digestion with a restriction enzyme that cuts methylated DNA but does not cut non-methylated DNA.
  • Alternatively, the first synthesized DNA strand or second synthesized DNA strand can include a modified nucleotide not present in the parental strands.
  • the first synthesized strand and second synthesized strand can include a biotinylated nucleotide, and these strands can be isolated using a purification scheme based on the binding of streptavidin to the biotinylated first and second synthesized strands.
  • the modified nucleotide can be introduced in vitro or in vivo.
  • the first synthesized DNA strand or the second synthesized DNA strand, or both, can then be recovered from the second population of nucleic acid molecules.
  • any desired source of nucleic acid molecules can be used to produce the first population of circular nucleic acid molecules.
  • the first population of nucleic acid molecules can be generated from genomic DNA molecules or complementary DNA (cDNA) molecules.
  • the nucleic acid molecules can be provided as open circular or closed circular nucleic acid molecules.
  • a "closed circle" is a covalently closed circular nucleic acid molecule, e.g., a circular DNA or RNA molecule.
  • An "open circle" is a linear single-stranded nucleic acid molecule having a 5' phosphate group and a 3' hydroxyl group. In some embodiments, the open circle is formed in situ from a linear double-stranded nucleic acid molecule. The ends of a given open circle nucleic acid molecule can be ligated by DNA ligase.
  • Sequences at the 5' and 3' ends of the open circle molecule are complementary to two regions of adjacent nucleotides in a second nucleic acid molecule, e.g., an adapter region of an anchor primer, or to two regions that are nearly adjoining in a second DNA molecule.
  • the ends of the open-circle molecule can be ligated using DNA ligase, or extended by DNA polymerase in a gap-filling reaction.
  • Open circles are described in detail in Lizardi, U.S. Pat. No. 5,854,033.
  • An open circle can be converted to a closed circle in the presence of a DNA ligase (for DNA) or an RNA ligase (for RNA).
  • the first population of nucleic acid molecules can be constructed from any source of nucleic acid, e.g., any cell, tissue, or organism, and can be generated by any art-recognized method. Suitable methods include, e.g., sonication of genomic DNA and digestion with one or more restriction endonucleases (RE) to generate fragments of a desired range of lengths from an initial population of nucleic acid molecules.
  • one or more of the restriction enzymes have distinct four-base recognition sequences. Examples of such enzymes include, e.g., Sau3AI, MspI, and TaqI.
  • the restriction enzyme is used with a type IIS restriction enzyme.
  • the first population of nucleic acid molecules can be made by generating a complementary DNA (cDNA) library from RNA, e.g., messenger RNA (mRNA).
  • the cDNA library can, if desired, be further processed with restriction endonucleases to obtain a 3' end characteristic of a specific RNA, internal fragments, or fragments including the 3' end of the isolated RNA.
  • Adapter regions in the anchor primer may be complementary to a sequence of interest that is thought to occur in the template library, e.g., a known or suspected sequence polymorphism within a fragment generated by endonuclease digestion.
  • While the first population of nucleic acids may generally be derived from a single source, e.g., mRNA of a particular cell type, for some applications it may be desirable to combine, or pool, two or more distinct libraries. Pooled libraries may be particularly suitable as starting materials in high throughput assays.
  • both the first and second selection primers anneal to the same sequence on opposite strands of the plasmid (i.e. are complementary)
  • one or both primers may in addition include additional sequences.
  • one selection primer may be complementary over its entire length to the second selection primer.
  • the two primers may be homologous over a portion of their lengths and may have non-homologous sequences over the remainder of their lengths.
  • the sequences of the first and second selection primers are completely complementary to each other.
  • the primers should have a GC content of greater than 40% and terminate in one or more C or G bases. In various embodiments, are between, e.g.
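The GC-content and terminal-base guidelines can be checked with a short sketch. It assumes that "terminate" refers to the 3' end of a primer written 5'->3', which the text does not state explicitly; the function names are hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    if not primer:
        raise ValueError("primer must not be empty")
    return (primer.count("G") + primer.count("C")) / len(primer)


def meets_primer_guidelines(primer: str, min_gc: float = 0.40) -> bool:
    """True if GC content exceeds `min_gc` and the last (assumed 3') base is C or G."""
    primer = primer.upper()
    return gc_fraction(primer) > min_gc and primer[-1] in ("G", "C")


# Hypothetical primers, used only to exercise the check.
for candidate in ("ATCCGCTAAGGC", "GCCTTAGCGGAT", "ATATATATGC"):
    print(candidate, round(gc_fraction(candidate), 2), meets_primer_guidelines(candidate))
```

A primer can pass the GC threshold and still fail the check if it ends in A or T, so both conditions are tested together.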
  • the primers are purified, e.g., by PAGE-purification. If desired, at least one of the selection primers is phosphorylated at its 5' terminus.
  • both primers are phosphorylated at their 5' termini.
  • the first and second selection primer sequences are derived from partial nucleic acid fragments identified in art-recognized screening assays. These assays include, e.g., differential display assays, gene markers, expressed sequence tag (EST) analysis, single nucleotide polymorphism analysis, partial cDNA clones, consensus sequences for at least one gene family, and consensus sequences for at least one domain type.
  • first primer or second primer can include one or more DNA analogs that protect the DNA from degradation.
  • a suitable DNA analog is a phosphorothioate.
  • the first population of nucleic acids can be prescreened in order to identify at least one insert comprising the partial nucleic acid fragment.
  • the pre-screen can use a first flanking primer and a second flanking primer in a polymerase chain reaction to identify at least one insert within the library that comprises the partial nucleic acid fragment.
  • the first flanking primer includes a sequence from one end of the partial nucleic acid fragment and the second flanking primer comprises a sequence from the opposite strand at the opposite end of the partial nucleic acid fragment.
  • a pair of flanking primers is shown in FIG. 1 as a pair of arrows with closed arrowheads that flank the selection primers (open arrowheads).
  • flanking primers are used in an initial PCR screen against a panel of nucleic acid libraries derived from cDNAs from distinct tissue sources or from genomes from distinct species. This step identifies the specific library or libraries that contain the gene or fragment of interest. The library with the most robust amplicon is selected for the clone selection step. This selection step includes a primer extension utilizing the selection primers that dramatically increases the representation of the clones (cDNAs, genes or genomic sequences) of interest, while removing non-specific molecules. Following transformation and plating, a second screen is conducted on DNA derived from individual colonies to verify which contain the clone (gene) of interest. The template is prepared from these colonies and submitted for DNA sequencing. The newly extended sequence data is analyzed to identify inserts that contain flanking sequences for the sequence of interest.
  • the specific tissues are selected so as to encompass the broadest coverage of expressed genes. Knowledge gleaned from gene expression profiling efforts may be used in determining the appropriateness of source for the library used in the invention. Libraries from different sources may be pooled for high throughput analysis.
  • Two flanking primers and two selection primers are synthesized for each partial fragment to be characterized.
  • the flanking primers are standard 20 nt PCR oligomers.
  • the flanking primers are at least 10 nt, 15 nt, 20 nt, 25 nt, 30 nt, 35 nt, 40 nt, 45 nt, 50 nt, 60 nt or more in length.
  • the primers are designed such that the flanking PCR primers are derived from sequences located on either side of the selection primers (see FIG. 1).
  • flanking primers for every gene may be used in individual PCR reactions using multiple libraries as template. Sources of nucleic acid libraries are described above. In an exemplary high throughput experiment, 20 tissue-specific libraries are used for each gene to be screened. Controls in a typical experiment may include human, rat and mouse genomic DNAs and cDNA from a rat fetal brain library. For efficiency, 16 genes can be tested on a single 384 well plate.
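The plate capacity quoted above can be reproduced with a short calculation. It assumes that each gene is run against the 20 libraries plus the four listed controls (human, rat and mouse genomic DNA, and rat fetal brain cDNA), one reaction per well; the function name and this per-gene layout are assumptions, not details stated in the text.

```python
def genes_per_plate(wells: int = 384, libraries: int = 20, controls: int = 4) -> int:
    """Number of genes that fit on one plate when every gene needs one well
    per library and one well per control template."""
    reactions_per_gene = libraries + controls
    return wells // reactions_per_gene


print(genes_per_plate())  # 24 reactions per gene on a 384 well plate
```

With 24 reactions per gene the 384 wells are filled exactly, which is consistent with the stated figure of 16 genes per plate.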
  • One flanking primer PCR profile optimized for high throughput assays utilizes the following "touchdown" conditions: an initial 5 min denaturation step at 95°C, two cycles of a 95°C denaturation step, a 67°C annealing step and a 68°C extension step. Each step is timed for 30s.
  • PCR conditions may be individually optimized for each gene, or a single set of conditions can be used for concurrent high throughput analyses of many genes.
  • PCR products may be separated, e.g., by electrophoresis on 2% agarose gels. PCR products are scored for the presence of the most robust amplicon of the expected size. The specific libraries with the most robust amplicons are paired with the selection primers that correspond to the gene flanking primers used in the screen. These libraries are subsequently re-arrayed onto new 96 well plates for a primer extension step.
  • any nucleic acid polymerase can be used, so long as it can extend the annealed primer.
  • a preferred polymerase is a DNA polymerase, e.g., a thermostable polymerase such as Pfu DNA polymerase.
  • Suitable polymerases include, e.g., DNA-dependent DNA polymerases, RNA-dependent DNA polymerases (reverse transcriptases), DNA-dependent RNA polymerases, and RNA-dependent RNA polymerases.
  • DNA-dependent DNA polymerases include, e.g., the DNA polymerase from Bacillus stearothermophilus (Bst), the E. coli DNA polymerase I Klenow fragment, the bacteriophage T4 and T7 DNA polymerases, and those from Thermus aquaticus (Taq), and Thermococcus litoralis (Vent).
  • the Bst DNA polymerase has been shown to efficiently incorporate 3'-O-(2-nitrobenzyl)-dATP into a growing DNA chain, is highly processive, very stable, and lacks 3'-5' exonuclease activity.
  • the coding sequence of this enzyme has been determined. See U.S. Patent Nos. 5,830,714 and 5,814,506, incorporated herein by reference.
  • Examples of reverse transcriptases include reverse transcriptase from Avian Myeloblastosis Virus (AMV), Moloney Murine Leukemia Virus, and Human Immunodeficiency Virus-1 (HIV-1). HIV-1 reverse transcriptase is particularly preferred because it is well characterized both structurally and biochemically. See, e.g., Huang, et al., Science 282: 1669-1675 (1998).
  • the first and second synthesized DNA strands can be further amplified, if desired.
  • amplification occurs prior to selectively removing the first population of nucleic acid molecules from the second population of nucleic acid molecules.
  • Amplification can be performed using methods known in the art. Suitable methods include, e.g., polymerase chain reaction (PCR), rapid amplification of DNA ends (RACE-PCR), ligation chain reaction (LCR), strand displacement amplification (SDA), self-sustained sequence replication (SSR) and Q beta, or any combination thereof.
  • In addition, at least one host cell, e.g., an E. coli cell, can be transformed with the first synthesized DNA strand or second synthesized DNA strand, or both, and at least one transformed host cell containing a nucleic acid homologous to the known nucleic acid sequence can then be identified.
  • the first or second synthesized strands can conveniently be identified in the transformed cells using probes that recognize the initial target sequence.
  • the first synthesized DNA strand can optionally be annealed to the second synthesized DNA strand to form a double-stranded DNA prior to transforming the host cell.
  • the double-stranded DNA is treated with ligase prior to being introduced into the host cell.
  • Additional characterization can include sequencing of sequences that contain the first and second synthesized DNA sequences. Also provided by the invention is a kit for utilizing sequence from a partial nucleic acid fragment to identify flanking sequences contiguous to the partial nucleic acid fragment.
  • the kit can include a nucleic acid polymerase (such as a DNA polymerase), a methylation sensitive restriction endonuclease (such as Dpn I), a control methylated circular double-stranded DNA molecule, and a dilution buffer.
  • the kit may in addition include control first and second selection primers, wherein the first selection primer comprises a region that is complementary to the second selection primer, and wherein said first and second selection primers anneal specifically to the methylated circular double-stranded DNA molecule.
  • Transposons are DNA sequences which, in the presence of an enzyme known as a transposase, will insert themselves into clones in a generally random fashion without any sequence relationship to the target locus. This allows sequencing of inserts in a single step rather than depending on the iterative approach of primer "walking".
  • the transposon technology based sequencing system of the present invention alleviates the requirement of successive sequence determinations in order to design the next primer. It is possible to determine the entire sequence of a large insert from a small number (6-20) of clones containing the transposon using only two primers that are specific to the transposon (so the same primers can be used across all sequencing reactions). Furthermore, these clones are both picked and sequenced at the same time. This substantially increases the speed at which these inserts can be sequenced.
  • The basic strategy for this transposon technology based sequencing system is to randomly integrate a transposon into the isolated clone and to use primers specific to this transposon (5' end and 3' end) to perform the sequencing chemistry so the reads go from the transposon and into the insert (See Figures 3, 4 and 5).
  • This transposon integration is mediated by the enzyme transposase.
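How a handful of transposon insertions can cover a whole insert is sketched below. The model assumes each insertion yields one read in each direction from the insertion site and that every read has the same usable length; the read length, insert length, and insertion sites are hypothetical values chosen for illustration.

```python
def covered_fraction(insert_len: int, insertion_sites: list[int], read_len: int) -> float:
    """Fraction of insert bases covered by bidirectional reads that start at
    each transposon insertion site and extend `read_len` bases outward."""
    covered = [False] * insert_len
    for site in insertion_sites:
        if not 0 <= site <= insert_len:
            raise ValueError(f"insertion site {site} lies outside the insert")
        # One read runs leftward and one rightward from the insertion site.
        for pos in range(max(0, site - read_len), min(insert_len, site + read_len)):
            covered[pos] = True
    return sum(covered) / insert_len


# Hypothetical 3 kb insert with 500-base reads.
print(covered_fraction(3000, [500, 1500, 2500], 500))
print(covered_fraction(3000, [500], 500))
```

Since real insertions are random rather than evenly spaced, more clones than this idealized minimum would be needed to close every gap.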
  • Example 1 Selective amplification of a target nucleic acid present in various dilutions in a starting population of nucleic acid molecules
  • selection primers homologous to sequences in a plasmid named pWhitescript were annealed to a fetal brain cDNA library containing various dilutions of the plasmid.
  • the pWhitescript plasmid is the same as the pBlueScript plasmid with the exception that it has been mutagenized to inactivate the LacZ gene. This mutation precludes alpha-complementation; bacteria containing these plasmids will appear as white colonies in the presence of IPTG and Xgal.
  • the primers used in this experiment span the mutation and contain the normal LacZ sequence. Therefore, any plasmids generated de novo will revert to "wild-type". Colonies containing these plasmids will appear blue on the IPTG/Xgal background.
  • the pWhitescript plasmid was initially "doped" into a human fetal brain cDNA library in serial dilution from 1:100 to 1:10,000 (w/w). A subsequent experiment continued this dilution series to 1:100,000.
  • the primer extension profile was 95°C for 30s, followed by either 16 or 24 cycles of 95°C for 30s, 55°C for 60s and 68°C.
  • Table 1 Results from the pWhitescript "doping" experiment.
  • the experiments demonstrate that it is possible to selectively amplify a gene of interest using a relatively small amount of sequence data.
  • the representation of the APP gene in the human fetal brain library is approximately 1:80,000 molecules.
  • the above experiments document conditions that result in significant enrichment of cDNA clones containing at least portions of this target gene.
  • Table fragment (cells separated by semicolons): Test APP and 3 genes in FLC list which had been positive in primary OriGene screen; use genes more likely to be in OriGene library; 20 min; 24 cycles; apparent recovery of low percentage of APP and gene 14401193 by PCR (turns out 14401193 was the only gene which was positive in the secondary OriGene screen).
  • Table fragment, continued: include 7 genes against fetal brain, heart and PBL libraries; CuraSelect successful selection on 2 of 7 genes (one of which is the APP control); clones successfully selected came from HFB and heart libraries. 10: Test 5 genes plus APP control, all of which have worked in standard OriGene library screens; test efficacy; 20 min; 24 cycles; results from APP in HFB library only; need to check concentration of pooled libraries.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention describes a method of utilizing sequence from a partial nucleic acid fragment to identify flanking sequences that are contiguous to that partial nucleic acid fragment. The invention also includes optimized parameters for a high throughput assay to isolate and characterize clones containing upstream and downstream flanking sequences from any piece of DNA to be used as a starting point. A kit containing components used in the method of the invention is provided.

Description

METHOD OF IDENTIFYING A NUCLEIC ACID SEQUENCE
FIELD OF THE INVENTION
The present invention relates to methods for identifying and cloning nucleic acids. In particular, the invention relates to high throughput methods for identifying the flanking sequences and full length clones for any known partial nucleic acid sequence.
BACKGROUND OF THE INVENTION
Many genomics facilities have developed extensive DNA sequence databases from both expressed genes and genomic sources. One utility of these databases is to use them to identify novel genes that have potential therapeutic value. An essential step in this process is to extend the sequence fragments in these databases to incorporate the entire coding sequence of the genes. This is important from a scientific standpoint in order to make biologically relevant assessments of sequence fragments from these data sets. Establishing the full length gene is also valuable from a business perspective as it is an important means of obtaining the complete open reading frame (ORF) for potentially useful proteins, whether they be targets or therapeutics.
Traditional strategies to search for full length genes using known partial gene sequences have relied on methods such as RACE (rapid amplification of cDNA ends) and hybridization. Unfortunately, these methods are limited by: (1) requiring multiple ligation and cloning steps; (2) being prone to PCR error and a high rate of false positive results; and/or (3) being relatively slow and not amenable to high-throughput use. Thus, a need remains in the art for a faster means of cloning specific genes of interest and for extending partial sequences in a high-throughput fashion.
SUMMARY OF THE INVENTION
The invention provides a method for isolating a nucleic acid sequence flanking a known nucleic acid sequence. In one embodiment, a first selection primer and a second selection primer are provided, wherein the first selection primer comprises at least a portion of a known nucleic acid sequence and the second selection primer is complementary to at least a portion of the first selection primer. Also provided is a first population of circular nucleic acid molecules, such as cDNA or genomic libraries, wherein the first population of circular nucleic acid molecules comprises parental strands that will anneal to the first or second selection primers.
In a preferred embodiment, the selection primers are combined with the first population of circular nucleic acid molecules. Parental strands of the first population of circular nucleic acids, which contain sequences that are complementary to the first and second selection primers, are annealed to the provided selection primers, thus forming a population of circular nucleic acid molecules complexed with the primers.
In another embodiment, the annealed first and second selection primers are extended with a polymerase to produce a second population of nucleic acid molecules. Said second population comprises the first population of parental circular nucleic acid molecules, a first synthesized DNA strand comprising the first selection primer and a second synthesized DNA strand comprising the second selection primer. In a third preferred embodiment, the first synthesized DNA strand and the second synthesized DNA strand are distinguishable from the parental strands and the non-annealed first population of circular nucleic acids. In a fourth preferred embodiment, the first population of circular nucleic acids are selectively removed from the second population of nucleic acid molecules. The de novo first synthesized DNA strand, or said second synthesized DNA strand, or both, are recovered from the second population of nucleic acid molecules, thereby isolating a nucleic acid sequence flanking a known nucleic acid sequence. In a fifth preferred embodiment, kits are provided that contain the compositions utilized according to the invention.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In the case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
Other features and advantages of the invention will be apparent from the following detailed description and claims.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates a set of primers that can be used to amplify nucleic acids according to the invention. Two flanking primers (solid arrow heads) and two complementary selection primers (open arrow heads) are synthesized from a known sequence (black rectangle) in order to clone the unknown flanking sequence (open rectangle) contained within a given vector (solid line).
FIG. 2 illustrates primer annealing and extension followed by DpnI degradation, transformation and plating, as follows: (a) two complementary gene-specific oligonucleotides are annealed; (b) the primers are extended by a polymerase (PfuTurbo™, Stratagene); (c) the parental strands and all other clones are degraded with DpnI; and (d) the remaining newly synthesized daughter molecule is transformed into E. coli and plated on selective media.
FIG. 3 is a graphic overview of the Genome Priming System (GPS-1) sequencing assay utilizing transposon technology.
FIG. 4 is a graphic representation of the data output of forward and reverse sequencing reads from the transposon based sequencing assay. The final sequence of the full length clone is determined after the data is assembled into a contig.
FIG. 5 illustrates enrichment for the amyloid precursor protein (APP) gene. Various parameters were tested to optimize the invention. All tests utilized 125 ng of the APP selection primers. Samples A1-A12 used 50 ng human fetal brain cDNA template and a 16 min extension time; samples B1-D12 used 50 ng template and a 20 min extension time; samples E1-E9 used 25 ng template and 16 min extension time; and samples E10-H12 used 25 ng template and a 20 min extension time.
FIG. 6 is a graphic representation of various gene fragments used in the invention to clone and identify upstream and downstream flanking sequences.
DETAILED DESCRIPTION OF THE INVENTION
The invention provides methods for identifying a nucleic acid of interest in a population of nucleic acid molecules. The method can be used to isolate nucleic acids flanking a known sequence. Thus, the methods can be used, e.g., when a partial sequence of a gene is known and additional sequence for the gene is desired. Using the methods disclosed herein, from 4% to >70% of sequences can contain a sequence of interest.
Nucleic acid sequences of interest can be identified and recovered by annealing a pair of primers, named a first selection primer and a second selection primer, to a parental population of circular nucleic acid molecules. Included in the population of circular nucleic acid molecules are one or more nucleic acids that are known to contain, or are suspected of containing a known nucleic acid sequence. The population is designated as a parental population to indicate that they can be distinguished from nucleic acids synthesized following annealing of the selection primers to the first population.
The first selection primer includes at least a portion of the known nucleic acid sequence, and the second selection primer is complementary to at least a portion of the first selection primer.
The first selection primer and second selection primer are annealed to complementary sequences in the parental nucleic acids in the population of circular nucleic acid molecules to form an annealed first selection primer complex and an annealed second selection primer complex, respectively.
The annealed primers are then extended with a polymerase, nucleotide triphosphates, and additional cofactors as warranted, to produce a second population of nucleic acid molecules that includes, in addition to the first population of parental circular nucleic acid molecules, a first synthesized DNA strand comprising the first selection primer and a second synthesized DNA strand comprising the second selection primer.
The first synthesized DNA strand and the second synthesized DNA strand are distinguishable from the parental strands. For example, almost all strains of E. coli methylate DNA and, therefore, all clones in the library have methylated nucleotides. However, while the parental DNA strands are methylated, the first and second synthesized strands can be synthesized to lack methylated nucleotides. Thus, de novo polynucleotide synthesis results in unmethylated copies of clones containing genes of interest in a background of methylated clones. An example of a methylation sensitive enzyme is Dpn I, which recognizes and cleaves the sequence:
5' GAm/TC 3'
3' CT/AmG 5'
Dpn I cleaves DNA in which the adenine of this site (Am) is N6-methylated.
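The selection logic described above can be sketched in a few lines of Python. This is a minimal illustration rather than a model of the enzyme: it assumes a molecule is cut whenever it carries adenine methylation and contains at least one GATC site, it scans the sequence as a linear string (a site spanning the arbitrary start point of a circular sequence would be missed), and the function names and the toy sequence are hypothetical.

```python
def gatc_sites(seq):
    """Return the 0-based start positions of every GATC site in seq."""
    seq = seq.upper()
    sites = []
    pos = seq.find("GATC")
    while pos != -1:
        sites.append(pos)
        pos = seq.find("GATC", pos + 1)
    return sites


def survives_dpn1(seq, adenine_methylated):
    """Simplified rule: only methylated molecules with a GATC site are cut."""
    return not (adenine_methylated and len(gatc_sites(seq)) > 0)


# Hypothetical toy sequence with two GATC sites.
toy = "AAGATCTTGGATCC"
parental_survives = survives_dpn1(toy, adenine_methylated=True)
synthesized_survives = survives_dpn1(toy, adenine_methylated=False)
```

Under this simplified rule the methylated parental molecule is removed, while an unmethylated copy of the same sequence is retained.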
Alternatively, the first and second strands can be synthesized using biotinylated nucleotides, provided that the parental strands lack biotinylated nucleotides. The differences between the parental strands and the first and second strands are then used as the basis for selectively removing the first population of nucleic acid molecules from the second population of nucleic acid molecules. For example, the parental strands can include a nucleotide, e.g., a modified nucleotide, not present in the first synthesized DNA strand or second synthesized DNA strand. An example of a modified nucleotide is a methylated nucleotide (such as a methylated adenine or cytosine nucleotide). Thus, if the parental DNA contains methylated nucleotides, and the first and second synthesized strands lack methylated nucleotides, the parental nucleic acids can be removed by digestion with a restriction enzyme that cuts methylated DNA but does not cut non-methylated DNA.
In alternative embodiments, the first synthesized DNA strand and second synthesized DNA strand include a modified nucleotide not present in the parental strands. For example, the first synthesized strand and second synthesized strand can include a biotinylated nucleotide, and these strands can be isolated using a purification scheme based on the binding of streptavidin to the biotinylated first and second synthesized strands. Depending on the nature of the modification desired, the modified nucleotide can be introduced in vitro or in vivo.
If desired, the first synthesized DNA strand or said second synthesized DNA strand, or both, can then be recovered from the second population of nucleic acid molecules.
In general, any desired source of nucleic acid molecules can be used to produce the first population of circular nucleic acid molecules. For example, the first population of nucleic acid molecules can be generated from genomic DNA molecules or complementary DNA (cDNA) molecules.
The nucleic acid molecules can be provided as open circular or closed circular nucleic acid molecules. A "closed circle" is a covalently closed circular nucleic acid molecule, e.g., a circular DNA or RNA molecule. An "open circle" is a linear single-stranded nucleic acid molecule having a 5' phosphate group and a 3' hydroxyl group. In some embodiments, the open circle is formed in situ from a linear double-stranded nucleic acid molecule. The ends of a given open circle nucleic acid molecule can be ligated by DNA ligase. Sequences at the 5' and 3' ends of the open circle molecule are complementary to two regions of adjacent nucleotides in a second nucleic acid molecule, e.g., an adapter region of an anchor primer, or to two regions that are nearly adjoining in a second DNA molecule. Thus, the ends of the open-circle molecule can be ligated using DNA ligase, or extended by DNA polymerase in a gap-filling reaction. Open circles are described in detail in Lizardi, U.S. Pat. No. 5,854,033. An open circle can be converted to a closed circle in the presence of a DNA ligase (for DNA) or RNA ligase.
The first population of nucleic acid molecules can be constructed from any source of nucleic acid, e.g., any cell, tissue, or organism, and can be generated by any art-recognized method. Suitable methods include, e.g., sonication of genomic DNA and digestion with one or more restriction endonucleases (RE) to generate fragments of a desired range of lengths from an initial population of nucleic acid molecules. Preferably, one or more of the restriction enzymes have distinct four-base recognition sequences. Examples of such enzymes include, e.g., Sau3AI, MspI, and TaqI. In other embodiments, the restriction enzyme is used with a type IIS restriction enzyme. Alternatively, the first population of nucleic acid molecules can be made by generating a complementary DNA (cDNA) library from RNA, e.g., messenger RNA (mRNA). The cDNA library can, if desired, be further processed with restriction endonucleases to obtain a 3' end characteristic of a specific RNA, internal fragments, or fragments including the 3' end of the isolated RNA. Adapter regions in the anchor primer may be complementary to a sequence of interest that is thought to occur in the template library, e.g., a known or suspected sequence polymorphism within a fragment generated by endonuclease digestion.
While the first population of nucleic acids may generally be derived from a single source, e.g., the mRNA of a particular cell type, for some applications it may be desirable to combine, or pool, two or more distinct libraries. Pooled libraries may be particularly suitable as starting materials in high throughput assays.
Selection primers
While both the first and second selection primers anneal to the same sequence on opposite strands of the plasmid (i.e., are complementary), one or both primers may include additional sequences. Thus, one selection primer may be complementary over its entire length to the second selection primer. Alternatively, the two primers may be homologous over a portion of their lengths and may have non-homologous sequences over the remainder of their lengths. In other embodiments, the sequences of the first and second selection primers are completely complementary to each other. Preferably, the primers should have a GC content of greater than 40% and terminate in one or more C or G bases. In various embodiments, the primers are between, e.g., 15 and 90 nucleotides in length, 20 and 50 nucleotides in length, or 24 and 45 nucleotides in length. A preferred length is 35 nucleotides. A preferred Tm of the primers is > 78°C. In preferred embodiments, the primers are purified, e.g., by PAGE-purification. If desired, at least one of the selection primers is phosphorylated at its 5' terminus. More preferably, both primers are phosphorylated at their 5' termini.
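The sequence-based preferences listed above (length range, GC content above 40%, a terminal C or G) lend themselves to a simple automated check. The following Python sketch is illustrative only: the function names and the 35-nucleotide example sequence are hypothetical, "terminate" is assumed to refer to the 3' end of the primer, and the Tm preference is left out because the specification does not state how Tm is to be calculated.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_selection_primer(seq, min_len=15, max_len=90, min_gc=0.40):
    """Check a primer against the stated length, GC and terminal-base preferences."""
    seq = seq.upper()
    return {
        "length_ok": min_len <= len(seq) <= max_len,
        "gc_ok": gc_fraction(seq) > min_gc,
        # Assumption: the C/G terminus is checked at the 3' end (last base).
        "gc_terminus_ok": seq[-1] in "GC",
    }


# Hypothetical 35-nt first selection primer; the second selection primer is
# taken here to be its full-length reverse complement.
first_primer = "GCTAGCATCGGATCCAGTTGCACGTACGATCGTAC"
second_primer = reverse_complement(first_primer)
report = check_selection_primer(first_primer)
```

Taking the second primer as the exact reverse complement corresponds to the embodiment in which the two selection primers are completely complementary; primers with additional non-complementary sequence would need a separate overlap check.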
In some embodiments, the first and second selection primer sequences are derived from partial nucleic acid fragments identified in art-recognized screening assays. These assays include, e.g., differential display assays, gene markers, expressed sequence tag (EST) analysis, single nucleotide polymorphism (SNP) analysis, partial cDNA clones, consensus sequences for at least one gene family, and consensus sequences for at least one domain type.
One of ordinary skill in the art will also recognize that the first primer or second primer, or both, can include one or more DNA analogs that protect the DNA from degradation. A suitable DNA analog is a phosphorothioate.
Flanking primers
To enhance the likelihood of identifying flanking nucleic acids, the first population of nucleic acids can be prescreened in order to identify at least one insert comprising the partial nucleic acid fragment. The pre-screen can use a first flanking primer and a second flanking primer in a polymerase chain reaction to identify at least one insert within the library that comprises the partial nucleic acid fragment. The first flanking primer includes a sequence from one end of the partial nucleic acid fragment and the second flanking primer comprises a sequence from the opposite strand at the opposite end of the partial nucleic acid fragment. A pair of flanking primers is shown in FIG. 1 as a pair of arrows with closed arrowheads that flank the selection primers (open arrowheads). In one embodiment, flanking primers are used in an initial PCR screen against a panel of nucleic acid libraries derived from cDNAs from distinct tissue sources or from genomes from distinct species. This step identifies the specific library or libraries that contain the gene or fragment of interest. The library with the most robust amplicon is selected for the clone selection step. This selection step includes a primer extension utilizing the selection primers that dramatically increases the representation of the clones (cDNAs, genes or genomic sequences) of interest, while removing non-specific molecules. Following transformation and plating, a second screen is conducted on DNA derived from individual colonies to verify which contain the clone (gene) of interest. The template is prepared from these colonies and submitted for DNA sequencing. The newly extended sequence data is analyzed to identify inserts that contain flanking sequences for sequence of interest.
The specific tissues are selected so as to encompass the broadest coverage of expressed genes. Knowledge gleaned from gene expression profiling efforts may be used in determining the appropriateness of source for the library used in the invention. Libraries from different sources may be pooled for high throughput analysis.
As illustrated in FIG. 1, two flanking primers and two selection primers are synthesized for each partial fragment to be characterized. In one embodiment, the flanking primers are standard 20 nt PCR oligomers. In another embodiment the flanking primers are at least 10 nt, 15 nt, 20 nt, 25 nt, 30 nt, 35 nt, 40 nt, 45 nt, 50 nt, 60 nt or more in length. The primers are designed such that the flanking PCR primers are derived from sequences located on either side of the selection primers (see FIG. 1).
The flanking primers for every gene may be used in individual PCR reactions using multiple libraries as template. Sources of nucleic acid libraries are described above. In an exemplary high throughput experiment, 20 tissue-specific libraries are used for each gene to be screened. Controls in a typical experiment may include human, rat and mouse genomic DNAs and cDNA from a rat fetal brain library. For efficiency, 16 genes can be tested on a single 384 well plate. One flanking primer PCR profile optimized for high throughput assays utilizes the following "touchdown" conditions: an initial 5 min denaturation step at 95°C, followed by two cycles of a 95°C denaturation step, a 67°C annealing step and a 68°C extension step. Each step is timed for 30s. The profile continues with two cycles at 95°C for 30s, 65°C for 30s and 68°C for 30s, followed by 21 cycles at 95°C for 30s, 63°C for 30s and 68°C for 30s. Other PCR conditions are contemplated. PCR conditions may be individually optimized for each gene, or a single set of conditions can be used for concurrent high throughput analyses of many genes.
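The touchdown profile and the plate arithmetic above can be written down as data and checked. The Python sketch below is a worked illustration, not an instrument protocol: the class and function names are hypothetical, the time total counts programmed hold times only (ramping between temperatures is ignored), and the four controls per gene are an assumed reading of the controls listed in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: tuple  # each step is (temperature in deg C, hold time in seconds)


INITIAL_DENATURATION = (95, 5 * 60)

TOUCHDOWN_STAGES = (
    Stage(2, ((95, 30), (67, 30), (68, 30))),
    Stage(2, ((95, 30), (65, 30), (68, 30))),
    Stage(21, ((95, 30), (63, 30), (68, 30))),
)


def total_cycles(stages):
    """Total number of thermal cycles across all stages."""
    return sum(stage.cycles for stage in stages)


def hold_time_seconds(initial, stages):
    """Sum of programmed hold times; ramp times are not included."""
    cycling = sum(
        stage.cycles * sum(seconds for _, seconds in stage.steps)
        for stage in stages
    )
    return initial[1] + cycling


def wells_needed(genes, libraries_per_gene, controls_per_gene):
    """Wells required when every gene is run against every library and control."""
    return genes * (libraries_per_gene + controls_per_gene)


cycles = total_cycles(TOUCHDOWN_STAGES)
minutes = hold_time_seconds(INITIAL_DENATURATION, TOUCHDOWN_STAGES) / 60
# Assumption: 4 controls per gene (human, rat and mouse genomic DNA plus one
# rat fetal brain cDNA library), alongside the 20 tissue-specific libraries.
wells = wells_needed(genes=16, libraries_per_gene=20, controls_per_gene=4)
```

Under these assumptions the profile comprises 25 cycles and 42.5 minutes of programmed hold time, and 16 genes at 24 reactions each fill a 384 well plate exactly.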
PCR products may be separated, e.g., by electrophoresis on 2% agarose gels. PCR products are scored for the presence of the most robust amplicon of the expected size. The specific libraries with the most robust amplicons are paired with the selection primers that correspond to the gene flanking primers used in the screen. These libraries are subsequently re-arrayed onto new 96 well plates for a primer extension step.
Polymerases
In general, any nucleic acid polymerase can be used, so long as it can extend the annealed primer. A preferred polymerase is a DNA polymerase, e.g., a thermostable polymerase such as Pfu DNA polymerase.
Suitable polymerases include, e.g., DNA-dependent DNA polymerases, RNA-dependent DNA polymerases (reverse transcriptases), DNA-dependent RNA polymerases, and RNA-dependent RNA polymerases. Other examples of DNA-dependent DNA polymerases include, e.g., the DNA polymerase from Bacillus stearothermophilus (Bst), the E. coli DNA polymerase I Klenow fragment, the bacteriophage T4 and T7 DNA polymerases, and those from Thermus aquaticus (Taq), and Thermococcus litoralis (Vent). The Bst DNA polymerase has been shown to efficiently incorporate 3'-O-(2-nitrobenzyl)-dATP into a growing DNA chain, is highly processive, very stable, and lacks 3'-5' exonuclease activity. The coding sequence of this enzyme has been determined. See U.S. Patent Nos. 5,830,714 and 5,814,506, incorporated herein by reference.
Examples of reverse transcriptases include, e.g., reverse transcriptase from Avian Myeloblastosis Virus (AMV), Moloney Murine Leukemia Virus, and Human Immunodeficiency Virus-1 (HIV-1). HIV-1 reverse transcriptase is particularly preferred because it is well characterized both structurally and biochemically. See, e.g., Huang, et al., Science 282:1669-1675 (1998).
Additional amplification
The first and second synthesized DNA strands can be further amplified, if desired. In some embodiments, amplification occurs prior to selectively removing the first population of nucleic acid molecules from the second population of nucleic acid molecules. In other embodiments, amplification occurs after selectively removing the first population of nucleic acid molecules from the second population of nucleic acid molecules. Amplification can be performed using methods known in the art. Suitable methods include, e.g., polymerase chain reaction (PCR), rapid amplification of DNA ends (RACE-PCR), ligation chain reaction (LCR), strand displacement amplification (SDA), self-sustained sequence replication (SSR) and Q beta, or any combination thereof.
In addition, at least one host cell, e.g., an E. coli cell, can be transformed with the first synthesized DNA strand or second synthesized nucleic acid strand, or both, followed by identifying at least one transformed host cell containing a nucleic acid homologous to the known nucleic acid sequence. The first or second synthesized strands can conveniently be identified in the transformed cells using probes that recognize the initial target sequence. The first synthesized DNA strand can optionally be annealed to the second synthesized DNA strand to form a double-stranded DNA prior to transforming the host cell. Preferably, the double-stranded DNA is treated with ligase prior to being introduced into the host cell. Additional characterization can include sequencing of nucleic acids that contain the first and second synthesized DNA sequences.
Also provided by the invention is a kit for utilizing sequence from a partial nucleic acid fragment to identify flanking sequences contiguous to the partial nucleic acid fragment. The kit can include a nucleic acid polymerase (such as a DNA polymerase), a methylation sensitive restriction endonuclease (such as Dpn I), a control methylated circular double-stranded DNA molecule, and a dilution buffer. The kit may in addition include control first and second selection primers, wherein the first selection primer comprises a region that is complementary to the second selection primer, and wherein said first and second selection primers anneal specifically to the methylated circular double-stranded DNA molecule.
Transposon integration
The selection primers described herein can also be designed to be used in conjunction with transposon mutagenesis. Transposons are DNA sequences which, in the presence of an enzyme known as a transposase, will insert themselves into clones in a generally random fashion without any sequence relationship to the target locus. This allows sequencing of inserts in a single step rather than depending on the iterative approach of primer "walking".
The transposon technology based sequencing system of the present invention alleviates the requirement of successive sequence determinations in order to design the next primer. It is possible to determine the entire sequence of a large insert from a small number (6-20) of clones containing the transposon using only two primers that are specific to the transposon (so the same primers can be used across all sequencing reactions). Furthermore, these clones are both picked and sequenced at the same time. This substantially increases the speed at which these inserts can be sequenced.
The basic strategy for this transposon technology based sequencing system is to randomly integrate a transposon into the isolated clone and to use primers specific to this transposon (5' end and 3' end) to perform the sequencing chemistry so the reads go from the transposon and into the insert (See Figures 3, 4 and 5). This transposon integration is mediated by the enzyme transposase.
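How a handful of transposon insertions can cover a whole insert may be illustrated with a short calculation. The Python sketch below is hypothetical throughout: the function name, the insert length, the insertion positions and the 500-base read length are assumptions chosen for illustration and do not come from the specification. Each insertion is taken to yield two reads, one extending from each end of the transposon into the insert.

```python
def covered_fraction(insert_len, insertion_sites, read_len):
    """Fraction of an insert covered by reads running outward from each
    transposon insertion site (one read in each direction)."""
    intervals = sorted(
        interval
        for site in insertion_sites
        for interval in (
            (max(0, site - read_len), site),           # read off one transposon end
            (site, min(insert_len, site + read_len)),  # read off the other end
        )
    )
    covered = 0
    reach = 0  # right-most position already counted
    for start, end in intervals:
        start = max(start, reach)  # skip bases counted by an earlier read
        if end > start:
            covered += end - start
            reach = end
    return covered / insert_len


# Hypothetical 3,000 bp insert with three insertions and 500-base reads.
full = covered_fraction(3000, [500, 1500, 2500], read_len=500)
partial = covered_fraction(3000, [500], read_len=500)
```

In this toy case three well-spaced insertions cover the entire insert, whereas a single insertion covers one third of it. Real insertions are random, so more clones are typically needed than the idealized minimum.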
EXAMPLES
Example 1 : Selective amplification of a target nucleic acid present in various dilutions in a starting population of nucleic acid molecules
To demonstrate that a pair of complementary primers could selectively amplify a target nucleic acid, selection primers homologous to sequences in a plasmid named pWhitescript were annealed to a fetal brain cDNA library containing various dilutions of the plasmid.
The pWhitescript plasmid is the same as the pBlueScript plasmid with the exception that it has been mutagenized to inactivate the LacZ gene. This mutation precludes alpha-complementation; bacteria containing these plasmids will appear as white colonies in the presence of IPTG and Xgal. The primers used in this experiment span the mutation and contain the normal LacZ sequence. Therefore, any plasmids generated de novo will revert to "wild-type". Colonies containing these plasmids will appear blue on the IPTG/Xgal background.
To simulate clones at various levels of representation within a cDNA library, the pWhitescript plasmid was initially "doped" into a human fetal brain cDNA library in serial dilution from 1:100 to 1:10,000 (w/w). A subsequent experiment continued this dilution series to 1:100,000. The primer extension profile was 95°C for 30s, followed by either 16 or 24 cycles of 95°C for 30s, 55°C for 60s and 68°C.
Results compiled from these experiments are shown in Table 1 below.
Table 1: Results from the pWhitescript "doping" experiment.
[Table 1 appears as an image (imgf000012_0001) in the original publication.]
This experiment proved that, even at relatively low representation, the primers selectively amplify the pWhitescript plasmid, and that the plasmid provides an adequate template. Furthermore, these experiments indicate that 24 cycles is the optimum number of cycles in the primer extension profile.
Example 2: Selective amplification of amyloid precursor protein gene (APP) sequences
These experiments utilized primers designed to enrich for the amyloid precursor protein gene (APP).
Studies were conducted to test the effects of varying: (1) the primer to template ratio; and (2) the extension time in the primer extension profile. These tests determined that increasing the amount of cDNA template and utilizing a longer extension time increased the efficiency of enrichment for the APP gene (FIG. 5).
The experiments demonstrate that it is possible to selectively amplify a gene of interest using a relatively small amount of sequence data. The representation of the APP gene in the human fetal brain library is approximately 1:80,000 molecules. The above experiments document conditions that result in significant enrichment of cDNA clones containing at least portions of this target gene.
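The enrichment figures follow directly from the starting representation. As a worked check in Python (the function name is hypothetical, and fold enrichment is assumed to mean the fraction of positive colonies after selection divided by the starting representation), the 1:80,000 representation stated above and the 50% positive colonies reported for experiment 4 in Table 2 give the 40,000-fold enrichment quoted there:

```python
from fractions import Fraction


def fold_enrichment(positive_fraction, starting_representation):
    """Ratio of the post-selection positive fraction to the starting representation."""
    return positive_fraction / starting_representation


starting = Fraction(1, 80_000)        # APP representation in the HFB library
half_positive = Fraction(50, 100)     # 50% of recovered colonies positive
five_percent = Fraction(5, 100)       # 5% positive, as reported for experiment 6

fold_at_50_percent = fold_enrichment(half_positive, starting)
fold_at_5_percent = fold_enrichment(five_percent, starting)
```

By the same arithmetic, a 5% positive rate would correspond to roughly 4,000-fold enrichment over a 1:80,000 starting representation, assuming that representation also applies to the library in question.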
The results of the various experiments performed using the APP gene are shown in Table 2, below. As shown in FIG. 6, sequence extensions can be performed from known gene fragments of many sizes and positions within a full length clone.
Table 2

| Experiment | Design | Rationale | Extension | Cycles | Results |
|---|---|---|---|---|---|
| 1 | Spike BRL HFB library with control plasmid 1:100-1:10,000 | Proof of principle | 12 min | 18 cycles | Control successful out to 1:10,000 dilution. |
| 2 | Attempt to recover APP and 4 genes in FLC queue | Apply kit to genes in the library | 20 min | 30 cycles | No recovery (enrichment). |
| 3 | Spike BRL HFB library with control plasmid 1:10,000-1:100,000 | Determine limits of recovery | 20 min | 24-30 cycles | Control successful out to 1:100,000 dilution at 24 cycles. No recovery at 30 cycles. |
| 4 | Test effect of altering template concentration and altering extension time (APP enrichment) | Fine tune experimental parameters | 16-20 min | 24 cycles | 50% of recovered colonies positive (40,000 fold enrichment). 50 ng template (vs 25 ng) and 20 minute extension work better. |
| 5 | Attempt to recover APP and 4 genes in FLC queue | Test efficacy | 20 min | 24 cycles | APP positive in 50% of colonies picked. All others negative. |
| 6 | Attempt to recover APP and same 4 genes in FLC queue; OriGene and BRL libraries | Test whether large insert size of genes may negatively affect success of experiment | 20 min | 24 cycles | No recovery (enrichment) in BRL library. Only 5% positive APP in OriGene library. |
| 7 | Test APP and 3 genes in FLC list which had been positive in primary OriGene screen | Use genes more likely to be in OriGene library | 20 min | 24 cycles | Apparent recovery of low percentage of APP and gene 14401193 by PCR (turns out 14401193 was the only gene which was positive in the secondary OriGene screen). |
| 8 | Alter primer:template ratios for APP and 14401193 | Fine tune experimental parameters | 20 min | 24 cycles | Try combinations of 125 ng and 250 ng primer conc. w/ 50 ug and 100 ug cDNA template. Best results from 125 ng primer coupled with 50 ug cDNA template. |
| 9 | Expand queue to include 7 genes against fetal brain, heart and PBL libraries | Test efficacy of selection | 20 min | 24 cycles | CuraSelect successful on 2 of 7 genes (one of which is the APP control). Clones successfully selected came from HFB and heart libraries. |
| 10 | Test 5 genes plus APP control, all of which have worked in standard OriGene library screens | Test efficacy | 20 min | 24 cycles | Results from APP in HFB library only. Need to check concentration of pooled libraries. |
| 11 | Test 5 genes plus APP control, all of which have worked in standard OriGene library screens | Test efficacy after adjusting concentration of libraries (only 15-50% as concentrated as OriGene claims) | 20 min | 24 cycles | Analysis underway. It appears that 2 of the 5 plus APP worked. |
EQUIVALENTS
Although particular embodiments have been disclosed herein in detail, this has been done by way of example for purposes of illustration only, and is not intended to be limiting with respect to the scope of the appended claims which follow. In particular, it is contemplated by the inventors that various substitutions, alterations, and modifications may be made to the invention without departing from the spirit and scope of the invention as defined by the claims. The choice of nucleic acid starting material, clone of interest, or library type is believed to be a matter of routine for a person of ordinary skill in the art with knowledge of the embodiments described herein. Other aspects, advantages, and modifications are considered to be within the scope of the following claims.

Claims

CLAIMS
We claim:
1. A method for isolating a nucleic acid sequence flanking a known nucleic acid sequence, the method comprising: providing a first selection primer and a second selection primer, wherein the first selection primer comprises at least a portion of a first strand of a known nucleic acid sequence and the second selection primer is complementary to at least a portion of the first selection primer; providing a first population of circular nucleic acid molecules, wherein the first population of circular nucleic acid molecules comprises parental strands; annealing the first selection primer to a first parental nucleic acid in the population of circular nucleic acid molecules to form an annealed first selection primer complex; annealing the second selection primer to a second parental nucleic acid in the population of circular nucleic acid molecules to form an annealed second selection primer complex; extending the annealed first and second selection primers with a polymerase to produce a second population of nucleic acid molecules comprising the first population of parental circular nucleic acid molecules, a first synthesized DNA strand comprising the first selection primer and a second synthesized DNA strand comprising the second selection primer, wherein the first synthesized DNA strand and the second synthesized DNA strand are distinguishable from the parental strands; selectively removing the first population from the second population of nucleic acid molecules; and recovering said first synthesized DNA strand or said second synthesized DNA strand, or both, from the second population of nucleic acid molecules, thereby isolating a nucleic acid sequence flanking a known nucleic acid sequence.
2. The method of claim 1, wherein the first population of circular nucleic acid molecules are complementary DNA (cDNA) or genomic DNA molecules.
3. The method of claim 1, wherein the circular nucleic acid molecule is double stranded DNA.
4. The method of claim 1, wherein the circular nucleic acid molecule is single-stranded DNA.
5. The method of claim 1, wherein sequences of the first and second selection primers are completely complementary to each other.
6. The method of claim 1, wherein the first selection primer comprises a sequence that is non-complementary to the second selection primer.
7. The method of claim 6, wherein the second selection primer comprises a sequence that is non-complementary to the first selection primer.
8. The method of claim 1, wherein the first and second selection primers are phosphorylated at their 5' termini.
9. The method of claim 1, wherein the first primer or second primer includes at least one DNA analog.
10. The method of claim 9, wherein the DNA analog comprises a phosphorothioate.
11. The method of claim 1, wherein the first selection primer is between 15 nucleotides and 90 nucleotides in length.
12. The method of claim 1, wherein the first selection primer is between 20 nucleotides and 50 nucleotides in length.
13. The method of claim 11, wherein the second selection primer is between 15 nucleotides and 90 nucleotides in length.
14. The method of claim 1, wherein the second selection primer is between 20 nucleotides and 50 nucleotides in length.
15. The method of claim 1, wherein said polymerase is a DNA-directed DNA polymerase or an RNA-directed DNA polymerase.
16. The method of claim 18, wherein the DNA polymerase is thermostable.
17. The method of claim 22, wherein the DNA polymerase is Pfu DNA polymerase.
18. The method of claim 1, wherein the method further comprises amplifying the synthesized DNA strands.
19. The method of claim 1, wherein the amplification occurs prior to selectively removing the first population of nucleic acid molecules from the second population of nucleic acid molecules.
20. The method of claim 19, wherein the amplification occurs prior to selectively removing the first population of nucleic acid molecules from the second population of nucleic acid molecules.
21. The method of claim 1, wherein the parental strands comprise a modified nucleotide not present in the first synthesized DNA strand or second synthesized DNA strand.
22. The method of claim 21, wherein said modified nucleotide is a methylated nucleotide.
23. The method of claim 1, wherein the first synthesized DNA strand and second synthesized DNA strand comprise a modified nucleotide not present in the parental strands.
24. The method of claim 23, wherein said modified nucleotide is a biotinylated nucleotide.
25. The method of claim 24, wherein the modified nucleotide is introduced in vitro.
26. The method of claim 25, wherein the chemical modification is introduced in vivo.
27. The method of claim 1, wherein removal of the first population of nucleic acid molecules is by digestion with a restriction endonuclease.
28. The method of claim 34, wherein the restriction endonuclease is DpnI.
29. The method of claim 1, the method further comprising the steps of: transforming at least one host cell with the first synthesized DNA strand or second synthesized nucleic acid strand, or both; and identifying at least one transformed host cell containing a nucleic acid homologous to the known nucleic acid sequence.
30. The method of claim 29, wherein the first synthesized DNA strand is annealed to the second synthesized DNA strand to form a double-stranded DNA prior to transforming the host cell.
31. The method of claim 30, wherein the double-stranded DNA is treated with ligase prior to transforming the host cell.
32. The method of claim 29, wherein the identifying step comprises sequencing DNA.
33. The method of claim 32, wherein the sequencing comprises transposon integration.
34. A kit for utilizing sequence from a partial nucleic acid fragment to identify flanking sequences contiguous to the partial nucleic acid fragment, the kit comprising: a) a DNA polymerase; b) a methylation sensitive restriction endonuclease; c) optionally, a control methylated circular double-stranded DNA molecule; d) optionally, a dilution buffer; and e) control first and second selection primers, wherein the first selection primer comprises a region that is complementary to the second selection primer, and wherein said first and second selection primers anneal specifically to the methylated circular double-stranded DNA molecule.
35. The kit of claim 34, wherein said methylation sensitive restriction endonuclease is DpnI.
36. The method of claim 2, wherein multiple distinct libraries are prescreened in order to identify the libraries that contain at least one insert comprising the partial nucleic acid fragment.
37. The method of claim 36, wherein the prescreen comprises the utilization of a first flanking primer and a second flanking primer in a polymerase chain reaction to identify at least one insert within the library that comprises the partial nucleic acid fragment, wherein the first flanking primer comprises a sequence from one end of the partial nucleic acid fragment and the second flanking primer comprises a sequence from the opposite strand at the opposite end of the partial nucleic acid fragment.
38. The method of claim 1, wherein the complex population of circular nucleic acid molecules comprises nucleic acids from two or more libraries that are pooled together as starting material in a high throughput assay.
39. The method of claim 1, wherein the first and second selection primer sequences are derived from partial nucleic acid fragments identified in any one of the group consisting of differential display assays, gene markers, expressed sequence tag (EST) analysis, single nucleotide polymorphism (SNP) analysis, partial cDNA clones, consensus sequences for at least one gene family, and consensus sequences for at least one domain type.
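
The flanking primers recited in claim 37 and the DpnI selection of claims 27 and 28 can be sketched computationally. The following is a minimal Python illustration and not part of the claimed method: the function names, the default primer length of 20 nucleotides, and the toy fragment are hypothetical, and the GATC recognition sequence of DpnI is taken from general knowledge of that enzyme rather than from the claims.

```python
from typing import List, Tuple

# Watson-Crick complement table for DNA (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand of seq, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def flanking_primers(fragment: str, length: int = 20) -> Tuple[str, str]:
    """Derive two flanking primers from a partial fragment (hypothetical helper).

    The first primer is taken from one end of the fragment; the second is
    taken from the opposite strand at the opposite end, as in claim 37.
    """
    if not 0 < length <= len(fragment):
        raise ValueError("primer length must be between 1 and the fragment length")
    first = fragment[:length]
    second = reverse_complement(fragment[-length:])
    return first, second


def gatc_sites(seq: str) -> List[int]:
    """Return 0-based start positions of every GATC motif in seq.

    Assumption: GATC is the DpnI recognition sequence. Sequence alone does
    not say whether a site is methylated, so this lists candidate sites only.
    """
    seq = seq.upper()
    sites = []
    pos = seq.find("GATC")
    while pos != -1:
        sites.append(pos)
        pos = seq.find("GATC", pos + 1)
    return sites


if __name__ == "__main__":
    toy_fragment = "ATGCCGTAAGCTTGGATCCA"  # hypothetical 20-nt fragment
    print(flanking_primers(toy_fragment, length=5))
    print(gatc_sites(toy_fragment))
```

The site list only indicates where digestion could occur. Whether a given strand is removed depends on its methylation state, which is the property that distinguishes the parental strands from the synthesized strands in claims 21 and 22.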
PCT/US2000/041851 1999-11-02 2000-11-01 Method of identifying a nucleic acid sequence WO2001032932A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU46093/01A AU4609301A (en) 1999-11-02 2000-11-01 Method of identifying a nucleic acid sequence

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US16294299P 1999-11-02 1999-11-02
US60/162,942 1999-11-02

Publications (2)

Publication Number Publication Date
WO2001032932A2 (en) 2001-05-10
WO2001032932A3 (en) 2002-06-13

Family

ID=22587769

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/041851 WO2001032932A2 (en) 1999-11-02 2000-11-01 Method of identifying a nucleic acid sequence

Country Status (2)

Country Link
AU (1) AU4609301A (en)
WO (1) WO2001032932A2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10280455B2 (en) * 2015-12-30 2019-05-07 Bio-Rad Laboratories, Inc. Split-cycle and tape amplification

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4994370A (en) * 1989-01-03 1991-02-19 The United States Of America As Represented By The Department Of Health And Human Services DNA amplification technique
WO1993012257A1 (en) * 1991-12-12 1993-06-24 Hybritech Incorporated Enzymatic inverse polymerase chain reaction library mutagenesis
US5286632A (en) * 1991-01-09 1994-02-15 Jones Douglas H Method for in vivo recombination and mutagenesis
WO1995025176A1 (en) * 1994-03-15 1995-09-21 Rapaport, Erich Assay for monitoring the progress of chronic myelogenous leukemia
WO1996038591A1 (en) * 1995-06-02 1996-12-05 Incyte Pharmaceuticals, Inc. IMPROVED METHOD FOR OBTAINING FULL-LENGTH cDNA SEQUENCES

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
BENKEL B F ET AL: "Long range-inverse PCR (LR-IPCR): extending the useful range of inverse PCR" GENETIC ANALYSIS: BIOMOLECULAR ENGINEERING, ELSEVIER SCIENCE PUBLISHING, US, vol. 13, no. 5, 1 November 1996 (1996-11-01), pages 123-127, XP004063526 ISSN: 1050-3862 *
COOLIDGE C J ET AL: "RUN-AROUND PCR: A NOVEL WAY TO CREATE DUPLICATIONS USING POLYMERASE CHAIN REACTION" BIOTECHNIQUES, EATON PUBLISHING, NATICK, US, vol. 18, no. 5, 1 May 1995 (1995-05-01), page 762,764 XP000509322 ISSN: 0736-6205 *
DAW-JEN TSUEI ET AL: "INVERSE POLYMERASE CHAIN REACTION FOR CLONING CELLULAR SEQUENCES ADJACENT TO INTEGRATED HEPATITIS B VIRUS DNA IN HEPATOCELLULAR CARCINOMAS" JOURNAL OF VIROLOGICAL METHODS, AMSTERDAM, NL, vol. 49, no. 3, 1994, pages 269-284, XP000603737 ISSN: 0166-0934 *
LEWIN: "REPLICATION CAN PROCEED THROUGH EYES, ROLLING CIRCLES, OR D LOOPS" , GENES 4, OXFORD, OUP, GB, PAGE(S) 336-338 XP002041180 * whole document * *
OCHMAN H ET AL: "AMPLIFICATION OF FLANKING SEQUENCES BY INVERSE PCR" , PCR PROTOCOLS GUIDE TO METHODS AND APPLICATIONS, SAN DIEGO, ACADEMIC PRESS, US, PAGE(S) 219-227 XP002015609 * whole document * *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ES2257895A1 (en) * 2003-05-30 2006-08-01 Consejo Sup. De Invest. Cientificas Optimization of oligonucleotides for 5'- and 3'-rapid amplification of cDNA ends (race) comprises checking of compatibility on amplification by PCR
CN103756997A (en) * 2014-01-18 2014-04-30 云南云科花卉有限公司 Method for cloning unknown flanking sequences of PS1 gene 5' of dianthus caryophyllus
CN103756997B (en) * 2014-01-18 2015-09-02 云南云科花卉有限公司 The cloning process of Dianthus caryophyllus L. PS1 gene 5' flanking unknown nucleotide sequence

Also Published As

Publication number Publication date
AU4609301A (en) 2001-05-14
WO2001032932A3 (en) 2002-06-13

Similar Documents

Publication Publication Date Title
US20230392191A1 (en) Selective degradation of wild-type dna and enrichment of mutant alleles using nuclease
JP4773338B2 (en) Amplification and analysis of whole genome and whole transcriptome libraries generated by the DNA polymerization process
US20070059700A1 (en) Methods and compositions for optimizing multiplex pcr primers
EP3601593B1 (en) Universal hairpin primers
CN114250274A (en) Amplification of primers with limited nucleotide composition
US20040126760A1 (en) Novel compositions and methods for carrying out multiple pcr reactions on a single sample
WO1997012061A1 (en) Method for characterizing nucleic acid molecules
WO1997004131A1 (en) Single primer amplification of polynucleotide hairpins
EP1044281B1 (en) Method for in vitro amplification of circular dna
KR101600039B1 (en) Method for Amplification Nucleic Acid Using Allele-Specific Reaction Primers
CN110603326A (en) Method for amplifying target nucleic acid
WO2001088174A1 (en) Novel compositions and methods for carrying out multiple pcr reactions on a single sample
WO2008131580A1 (en) Site-directed mutagenesis in circular methylated dna
KR20230124636A (en) Compositions and methods for highly sensitive detection of target sequences in multiplex reactions
US20210301333A1 (en) Methods and kits for highly multiplex single primer extension
WO2001032932A2 (en) Method of identifying a nucleic acid sequence
CN110468179B (en) Method for selectively amplifying nucleic acid sequences
WO2021156295A1 (en) Methods for amplification of genomic dna and preparation of sequencing libraries
JP2023517571A (en) Novel nucleic acid template structures for sequencing
KR20230163386A (en) Blocking oligonucleotides to selectively deplete undesirable fragments from amplified libraries
AU2022407332A1 (en) A method of capturing crispr endonuclease cleavage products
AU2019223315A1 (en) Method for introducing mutations

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase