WO1991018114A1 - Polynucleotide amplification - Google Patents

Polynucleotide amplification

Info

Publication number
WO1991018114A1
Authority
WO
WIPO (PCT)
Prior art keywords
primer
cassette
pcr
sequence
target
Prior art date
Application number
PCT/GB1991/000803
Other languages
French (fr)
Inventor
David Stephen Charnock Jones
Andre Rosenthal
Original Assignee
Medical Research Council
Priority date
Filing date
Publication date
Application filed by Medical Research Council
Publication of WO1991018114A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1096 Processes for the isolation, preparation or purification of DNA or RNA; cDNA synthesis; Subtracted cDNA library construction, e.g. RT, RT-PCR

Definitions

  • a known sequence within a genome is referred to as a target sequence.
  • a fragment of nucleic acid, for instance derived from the human genome, containing such a target sequence is referred to as a target fragment.
  • the method of the present invention allows one to walk up or down a genome starting from a target sequence.
  • the various strands are then separated from each other, again by denaturation, and, if necessary, more of the primers are added. Usually, in practice, there is already an excess of primers at the start of the reaction and so further amounts of primers need not be added.
  • the primers also bind to the previously formed primer extension products, resulting in the formation of further nucleic acid strands.
  • the amount of primer extension products (i.e. the target sequence) increases exponentially.
  • extension of the oligonucleotide primers occurs in a convergent manner relative to the target sequence, i.e. extension in the 5' to 3' direction occurs within the target sequence.
  • the extension direction in the PCR technique creates a drawback. It only allows amplification of the sections of nucleic acid located between the two primer annealing regions. It does not allow amplification of nucleic acid sequences that flank the target sequence. Moreover, if the sequence at only one end of the flanking region (i.e. the end adjacent the target sequence) is known, then suitable primers cannot be constructed to enable extension of the primers towards the target sequence.
  • the first is called the inverse polymerase chain reaction (Triglia, T. et al. [1988] NAR 16, 8186; Ochman, H. et al. [1988] Genetics 120, 621).
  • This technique allows the amplification of sections of nucleic acids, even of unknown sequences, that flank a target sequence.
  • the inverse polymerase chain reaction involves a first step of restricting a sample of double-stranded (ds) nucleic acid with a restriction endonuclease which forms sticky ends and which does not cut within the target sequence. This restriction produces a number of sticky-ended fragments, including one which contains the target sequence and has a flanking region of unknown sequence on each side.
  • the ends of each fragment thus produced are then ligated to each other, resulting in circularisation of the cleaved fragments.
  • the fragment containing the target sequence forms a circle wherein the unknown flanking regions form a continuous unknown region connecting the ends of the target sequence.
  • the continuous unknown region can then be subjected to exponential PCR amplification by inter alia the addition of the two primers which anneal to the respective ends of the target sequence in such a way that extension takes place into the unknown region. No amplification of circles not containing the target sequence will occur as there will be no site to which the primers can anneal.
  • 2. circularising the nucleic acid fragments
  • concatamers are due to the fact that each of the excised nucleic acid fragments, whether or not they contain the target sequence, has sticky ends that are complementary not only to each other but also to the ends of all the other fragments produced from the initial nucleic acid digest.
  • the sticky ends of the target fragment can thus ligate not only to themselves but also to the sticky ends of other target fragments or to the sticky ends of other fragments. Therefore, linear concatamers can be, and often are, produced which have complex and unpredictable structures. Also, some of the concatamers form enlarged circles that might later interfere with the subsequent PCR amplification.
  • each of the linear concatamers and enlarged circles containing the target sequence can undergo exponential PCR amplification because they contain the binding sites for the primers. This leads to the amplification of nonspecific products, which is clearly disadvantageous. IPCR is therefore critically dependent upon the formation of the correct circular nucleic acid fragment.
  • the size of the nucleic acid fragment is also a critical factor for the success of the IPCR technique. Ideally, the size of the initial linear excised target fragment and the corresponding circle must be within the range suitable for the polymerase chain reaction. This range is normally between 100 base pairs (bp) and several kilo base pairs (kb).
  • Another disadvantage of the IPCR technique is that both of the flanking sequences are amplified. There may be situations in which only one of the flanking regions needs to be amplified, particularly as there is an increasingly important need to carry out gene walking in one direction only.
  • each of these reported schemes employs the use of a special cassette for ligation to each end of the sticky-ended nucleic acid fragments formed by digestion of a nucleic acid sample with a restriction enzyme. It is to be noted that each of the special cassettes ligates not only to the ends of the target fragment but also to the ends of the other restriction fragments produced by the initial digestion of, for example, genomic DNA.
  • Each of the methods employs either a different cassette construction or a different sequence of reaction steps to achieve a degree of selectivity during the amplification reaction.
  • the Shyamala and Ames method therefore has only limited use and can really only be used for the amplification of "simple" DNA samples (e.g. from very simple prokaryotic organisms).
  • the method cannot really be used for amplifying a fragment within a complex genomic DNA mixture (e.g. from a eukaryotic organism) because, in addition to the target fragment (usually present in only one or a few copies), millions of unwanted fragments having ligated cassettes will also be exponentially amplified.
  • since the cassette primer is only present in a limited quantity, most of the fragments, including the target fragment, will not be amplified because the primer will soon be exhausted. This is disadvantageous. Furthermore, if an excess of cassette primer were present, a mixture of millions of different fragments would be amplified. This is again clearly disadvantageous.
  • the cassette comprises two complementary oligonucleotides that form a double-stranded piece of DNA having a sticky end.
  • one oligonucleotide does not have a phosphate group at its 5' end. Therefore, during the ligation reaction, only one of the oligonucleotides will covalently link to the excised nucleic acid fragments.
  • the unligated oligonucleotide is then removed by selective ethanol precipitation in the presence of ammonium acetate. The ligated cassette thus becomes single stranded.
  • Next, PCR amplification is carried out in the presence of both primers, i.e. the primer complementary to a region of the target sequence and the primer complementary to a portion of the cassette.
  • the first cycle of the PCR amplification comprises only a linear extension of the primer annealed to the target sequence.
  • the cassette primer can take part in the exponential PCR cycles by hybridising to the extension product. Therefore, and according to the reported method, unwanted nucleic acid fragments should not be amplified either in a linear or exponential fashion.
  • the proposed scheme does have some drawbacks.
  • when the technique is used to amplify prokaryotic genomic DNA, a large background of amplified fragments is observed, with only a slight excess of the amplified target fragment.
  • the Roux method follows a pattern similar to the Kalman method, wherein a mechanism is introduced to produce, in theory, only one strand suitable for exponential PCR amplification.
  • the cassette consists of two oligonucleotides of different lengths.
  • the short strand is known as the tailed linker or incomplete strand.
  • the longer strand, which is at least 15 to 20 nucleotides longer, is known as the anchor template or complete strand.
  • the incomplete strand has a number of bases that are non-complementary to the complete strand.
  • the ligated cassette-target fragment itself is therefore not suitable for exponential PCR amplification but it is this linear extension product that is suitable for exponential PCR amplification. Thus, it is only after this first linear extension step that the second primer can hybridize to the extension product and create exponential amplification.
  • the Markham method like the Kalman and Roux methods, includes the use of synthetic oligonucleotide cassettes (which are defined as vectorette cassettes) that are ligated to both ends of both the target fragments and the unwanted fragments present in the digested original sample, e.g. genomic DNA.
  • the cassettes are designed to enhance the specificity of the PCR amplification step.
  • the Markham method has two variants which employ two types of oligonucleotide cassette.
  • Each cassette is constructed so that, after ligation to the fragments, a cassette primer cannot hybridise to the cassette itself. Instead, the cassette primer should, in theory, only hybridise to an extended primer product. This effect is achieved by constructing cassettes comprising two oligonucleotides that are only partially complementary. The primary structures of both oligonucleotides have non-complementary middle portions which remain single-stranded.
  • both of the oligonucleotides still possess a certain degree of non-complementarity but they are each more than 50 nucleotides long.
  • the primer complementary to the cassette can hybridise to the extended DNA strand and thus take part in the PCR reaction to exponentially amplify the target fragment.
  • the PCR amplification steps are usually carried out in the presence of both primers, i.e. a primer that is complementary to the target sequence and a primer that is complementary to the cassette portion.
  • the primer complementary to the target sequence should be linearly extended. This should create a template suitable for exponential PCR because the specific primer that is complementary to the cassette can now hybridize to the first extended PCR product and can thus be extended itself.
  • only the target fragment should be amplified, whereas all unwanted restriction fragments should not be amplified.
  • the amplification should, therefore, be highly specific because the special cassette design should exclude unwanted fragment amplification.
  • the Markham method can include the optional step of conducting two separate amplification reactions, namely a linear amplification followed by an exponential amplification. That is to say linear amplification for several cycles using only the primer complementary to the target sequence followed by the addition of the primer complementary to the cassette leading to exponential PCR amplification in the presence of the two primers.
  • the Markham method can also include the addition of S1 nuclease after several linear amplification steps have been carried out with the primer specific for the target
  • S1 nuclease should in theory degrade all of the remaining single-stranded unwanted DNA fragments. This should increase the relative concentration of the ligated cassette-target fragments over the remaining background levels of the unwanted nucleic acids that are present.
  • S1 nuclease is known to attack double-stranded DNA to a significant extent. Therefore, this approach is not always a practical way to reduce the complexity of the reaction mixture before starting exponential PCR amplification.
  • the cassettes are designed to achieve a specific amplification of the target fragment following a linear PCR amplification step.
  • the first step should be a linear amplification of the target DNA fragment starting from the annealed specific primer. In all the other PCR cycles both primers take part and the amplification is therefore exponential.
  • the non-specificity is due to the clustering of the restriction sites in genomic DNA of complex eukaryotic organisms. Therefore, many quite small restriction fragments will be present in the mixture which, certainly after denaturing, will serve as primers during PCR and can cause a high degree of non-specificity.
  • the Markham method apparently includes an optional step of isolating the extended primer strand by use of a gel.
  • the isolation of the amplified target fragments is carried out after the exponential amplification stages.
  • gel separation methods are not only laborious and time consuming, but they are really only effective for detecting and isolating large quantities of large sized strands. Small strands are difficult to separate from each other using gels.
  • the reported Markham method includes no other method for isolating the extended primer strands. In particular, there is no disclosure of an isolation step dependent upon the use
  • the present method does not require the use of a circularisation step or the use of specially designed cassettes (e.g. cassettes having incomplete strands).
  • the present method allows one to pick out specific target nucleic acid strands containing the target sequence from a reaction mixture containing many different strands.
  • the specific picking out of the target strands ensures that exponential PCR amplification only occurs on the target strand.
  • fingerprinting as well as their sequencing could be greatly improved if one could easily walk and sequence through gaps of a given library by using an efficient, fast and specific in vitro amplification method starting from a target sequence. This would avoid the previously necessary construction of different libraries from the DNA of one organism (or part of it) using different vector systems.
  • the present method is generally well suited for walking and sequencing along any piece of genomic DNA without the need for cloning.
  • the present method can also be used for the detection of point mutations, deletions, and insertions within any genomic region of interest. This is especially advantageous for the detection of any modifications in the coding or non-coding regions of genes associated with genetic disorders, cellular disorders or infectious diseases. This gives one the potential to design specific diagnostics. It also allows the early determination of polymorphism for both alleles in many individuals, even down to the nucleotide level.
  • the method has applications in the identification and sequencing of unclonable loci, the identification of YAC termini for physical mapping and the extension of partial cDNA clones. Also, since the method involves only straightforward biochemical reactions, it can easily be automated.
  • the present method thereby overcomes and avoids most of the aforementioned problems.
  • a polynucleotide amplification method comprising the steps of:
  • forming a ligation product by ligating a target fragment, having sticky ends and including a first primer annealing region of known sequence, with a cassette, having a sticky end complementary to one of the sticky ends of the target fragment, the cassette including a second primer annealing region of known sequence, such that in the ligation product the known second primer annealing region is remote from the first primer annealing region,
  • iv. adding nucleotides to the bound primer by use of a polymerase enzyme to form an extension product
  • a linear PCR amplification step is carried out first of all and independently from an exponential PCR amplification step. This is achieved by introducing a label in the linear amplification step that allows purification of the target fragment on a solid support before the exponential PCR amplification.
  • Therefore, because only the labelled products (i.e. the primer extension products corresponding to the target fragment) will bind to the matrix, any unlabelled products (which will not bind to the matrix) can be washed away by using suitable solutions (e.g. buffers, alkaline solutions etc.).
  • the isolation step allows the easy isolation and rapid purification of only the labelled fragment (i.e. the strand corresponding to the target fragment).
  • the labelled fragments can then be subjected to exponential PCR amplification in the absence of any of the unwanted fragments. This leads to an efficient, effective and highly specific method for isolating and amplifying a target fragment having regions of unknown sequence from complex genomic DNA mixtures of restriction fragments - i.e. it reduces a complex mixture of restriction fragments to a single fragment or multiplex of fragments.
  • a linear PCR step is used to introduce a specific binding label into a strand that will be complementary to the ligated cassette target fragment.
  • one linear PCR cycle with a labelled primer should be sufficient to separate the labelled fragment from a complex genomic mixture before exponential amplification takes place.
  • the present method therefore reduces a complex DNA mixture (e.g. genomic DNA) to a very simple mixture.
  • the target fragment is derived from a digestion of a sample of DNA.
  • the sample of DNA could be total genomic DNA of a prokaryotic or eukaryotic organism, mixtures of total genomic DNA from different organisms or different individuals, a DNA fragment cloned in a vector like a phage, cosmid, YAC or mixture of different cloned
  • the sample of DNA is digested with a suitable restriction enzyme or with a combination of different restriction enzymes.
  • steps 3 to 5 are repeated up to 100 times; advantageously up to 50 times.
  • a more specialised exponential PCR amplification step could be conducted using a third specific PCR primer.
  • This third primer would be a nested primer with respect to the first primer.
  • the target fragment would have a third known primer annealing region distanced from the original first primer annealing region.
  • this third region is situated between the first primer annealing site and the second primer annealing site of the cassette.
  • the third primer would also have on it a separating label so that the amplified fragments can also readily be separated. This will be especially useful if there is a possibility that the first primer also bound to fragments other than the target fragment.
  • the presence of the third primer annealing region in the target strand enables the exponential PCR amplification step to be conducted using a primer that is specifically complementary to either the first primer annealing region or the third primer annealing region.
  • This is particularly advantageous because, by using the third primer annealing region, a further selection mechanism is introduced wherein PCR amplification only occurs with the target fragment and not with any unwanted fragments that may have become inadvertently bound to the support. It therefore introduces a means for ensuring that only the target fragment is exponentially amplified.
  • the nested third primer may furthermore be used in a preferred reamplification step, which advantageously facilitates the isolation of the target fragment in adequate purity and quantity for direct sequencing.
  • an aliquot of the exponentially amplified mixture is reamplified using the cassette primer and a nested primer specific for the target fragment.
  • the nested target fragment-specific primer may be the third primer.
  • it may be a different primer, hybridising to a further known primer annealing region distanced from both the first and second known primer annealing regions.
  • this different primer may be a fourth primer, but it is envisaged that any number of further nested primers may be advantageously employed. Equally advantageously, each different primer will hybridise to a separate further known primer annealing region. Thus a fourth primer would hybridise to a fourth known primer annealing region on the target fragment.
  • the aliquot may be taken from a dilution of the exponentially amplified mixture, for example a dilution between 1:1 and 1:100, most preferably a dilution of 1:50.
  • the aliquot will measure between 0.1 and 10 µl, preferably 1 µl.
  • the reamplification step provides added levels of specificity to the amplification reaction through the use of nested primers.
  • the presence of matrix-bound DNA templates is eliminated, and thus the inefficiency of amplification associated with such templates is resolved.
  • the separating label is attached to the 5' end of the first primer.
  • the label can also be attached to one or more heterocyclic bases of the first primer.
  • the separating label is a biotin label and the support matrix comprises streptavidin-coated beads.
  • other forms of labels and support matrices would suffice, e.g. proteins and protein binding groups, antibodies and antibody binding groups, GCN4 and other DNA binding proteins.
  • the matrix need not be in the form of a bead.
  • the matrix can be in any appropriate form.
  • the target strands could be isolated simply by dipping the rod into the reaction mixture and then removing it. In this case, only the target strands will bind to the rod, which can then, if necessary, be washed. This set-up would be ideal for an automated machine.
  • the matrix could be the surface of a well of a microtiter dish, so that target strands from many different samples could be easily isolated simply by handling the whole microtiter dish. Again, this set-up would be ideal for an automated machine.
  • the present cassette comprises two complementary oligonucleotides having 3 or 4 nucleotide overhangs such that a sticky end is formed.
  • the oligonucleotides can be in the range of 20 to 30 nucleotides long. These oligonucleotides will be easy to synthesise, particularly as it is known that labelled specific primers (e.g. biotin-labelled primers) can easily be synthesised in a two-step procedure. However, it is to be appreciated that current methods allow incorporation of biotin or other labels during automated DNA synthesis. Moreover, there would be no need to purify the primers before using them as cassettes for the ligation reaction.
  • the cassette sequence contains a universal primer sequence.
  • cassettes that can be ligated to the target strands include:
  • [N = any of the four bases G, A, T, C].
  • the isolated extension product can be exponentially PCR amplified while still in the matrix support-bound state.
  • appropriate primers are repeatedly annealed to the primer annealing regions to form double stranded nucleic acids on the addition of nucleotides to the annealed primers by use of a polymerase enzyme. Then, on denaturing the formed double stranded nucleic acids, the extension products simply fall away from the polymer-bound template into the solution. These extension products could then serve as ordinary templates in further PCR cycles.
  • the products are also easy to collect and do not need to be separated by means of gels etc.
  • the use of the polymer-bound DNA template for exponential PCR has the extra advantage that it can be kept for long periods under appropriate storage conditions. This allows one to return to the bound fragments at a later stage to conduct any further experiments or amplification steps.
  • the bound ligation product can be removed from the matrix before undergoing exponential PCR amplification.
  • the PCR products can be sequenced by any of the existing dideoxy termination or chemical degradation techniques using radio- or fluorescently-labelled nucleotides or primers. It is known that the probability of undertaking a successful walk along a piece of genomic DNA from a target sequence into an unknown region depends on the distribution pattern of suitable restriction sites (within the PCR range). Since this distribution pattern is completely unknown, it is better to choose several different restriction endonucleases to digest the genomic DNA. Generally, between 2 and 30 different restriction enzymes should be used to find one which produces a target fragment having a size within the PCR range.
  • a number of appropriate cassettes, each comprising specific sticky ends for the given restriction endonucleases, must then be separately ligated to the restriction fragments in a series of independent, parallel ligation reactions.
  • the present invention has the advantage that it is particularly well suited for undertaking a successful walk along a piece of genomic DNA from a known site into an unknown region.
  • the excision and ligation steps can be conducted in the same vessel.
  • all of the multi-ligation products can be pooled (i.e. multiplexed). They can then be easily isolated at the same time or in turn (see below) before commencing the other steps of the present procedure (i.e. linear PCR amplification using labelled primers that are complementary to the target sequence followed by isolation and purification of the labelled
  • the method is just as efficient for the simultaneous walking from different target nucleic acid fragments into a particular unknown region.
  • This special type of "multiplexing" is performed using several specific oligonucleotide primers for the first linear PCR amplification step. Each primer will be complementary to a respective specific region within the different target fragments from which one wishes to walk into the unknown regions. Each of the primers can then be linearly extended at the same time and, also, in the same reaction tube.
  • primers could carry the same separating labels. If the labels are the same, the different extension products can be isolated by use of the same support matrix. Each of the extension products can then be subjected to an exponential PCR amplification step using the cassette primer and a mixture of nested specific primers each carrying different separation labels.
  • a specific amplified fragment can be obtained and sequenced from the mixture by using a support matrix which allows specific isolation of the appropriate label.
  • the linear extension product of interest is isolated from the mixture by its separation on a solid support matrix with an appropriate binding group thereon. It can then be subjected to exponential PCR.
  • the present invention is particularly useful for a method called "multiplexing", wherein a number of different target nucleic acid fragments can be produced at the same time by the addition of a number of different restriction enzymes. Primers with appropriate labels can then be annealed to the known regions of the target fragments. Each of the primers can then be extended at the same time and the extension products can then be isolated by use of the same support matrix.
  • the present method therefore allows simultaneous exponential amplification of different specific DNA fragments by a single PCR amplification step.
  • Specific DNA fragments can be isolated and purified out of this mixture by using different solid support and affinity binding mechanisms.
  • different labels can be used in the initial linear amplification step.
  • the present invention is therefore particularly useful for preparing nucleic acids and allowing genomic walking along a large section of genomic nucleic acid, e.g. human nucleic acid.
  • the sequences obtained from the first primer, i.e. the primer that anneals to the target sequence
  • the second primer i.e. the primer that anneals to the cassette
  • a ligation product which ligation product comprises a target fragment ligated to a cassette, in a method of genomic walking in any direction along the genomic nucleic acid, wherein the target fragment includes a first primer annealing region of known sequence and has annealed thereto a primer which has attached thereto a separating label, and wherein the cassette includes a second primer annealing region of known sequence.
  • the target fragment can be a fragment excised from genomic nucleic acid.
  • a first kit comprising:
  • (h) at least any one of the following: a buffer, a polymerase, a washing solution and a nucleotide solution.
  • the sample of genomic nucleic acid is DNA and this can include the DNA from one or more different organisms or segments of genomic nucleic acids.
  • the excising means is a restriction enzyme or a group of different restriction enzymes, including, optionally, appropriate digestion buffers.
  • the kit further comprises a number of cassettes ligatable to the excised fragment and having second primer annealing regions of known sequence.
  • the kit further comprises incubation buffer and/or a sample of T4 DNA ligase.
  • the kit comprises a number of first primers, each annealable to the first primer annealing regions, and having attached thereto the same or a different separating label.
  • the kit comprises a number of second primers, each annealable to the second primer annealing regions.
  • the kit can include a number of support matrices, each having attached thereto the same or a different group that is cooperatively bindable to the separating labels.
  • the kit further comprises a number of third primers, each annealable to the third primer annealing regions of known sequence that are situated on the target fragments, preferably located between the first and the second primer annealing regions.
  • the kit further comprises fourth and further primers hybridisable to fourth or further primer annealing regions of known sequence which are situated on the target fragment, between the first and second primer annealing regions.
  • the kit can also include at least any one of the following: a buffer for in vitro amplification, a deoxynucleotide triphosphate solution, a polymerase, light mineral oil, one or more washing solutions, and means to attach a separating label to a (or any) first oligonucleotide primer that is of specific interest to the user in application of this kit.
  • a second kit comprising:
  • a ligation product comprising a target fragment of genomic nucleic acid ligated to a cassette, wherein the fragment includes a first primer annealing region of known sequence and the cassette includes a second primer annealing region of known sequence;
  • (c) a second primer annealable to the second primer annealing region
  • at least any one of the following: a buffer, a polymerase, a washing solution and a nucleotide solution.
  • the second kit has a number of ligation products.
  • a method for extending cDNA clones using the PCR amplification method of the first aspect of the invention can be extended to both the 5' and 3' termini using specific primers hybridising to known regions of the cDNA and general oligonucleotides which are hybridisable to the termini of the cDNA.
  • the method of the fifth aspect of the invention comprises the following steps:
  • v) amplifying the unknown regions of the target cDNA between the known region and the 5' and/or the 3' terminus, by exponential amplification with an internal primer hybridising to the known region, and a general primer, which for the PCR product extended from the antisense strand-binding primer will comprise poly-dN where N is a base complementary to the dNTP used in step (iv) and for the PCR product extended from the sense strand-binding primer will comprise a primer complementary to or substantially identical to at least part of the first primer used in step (i).
  • the products of the PCR reaction may advantageously be visualised on an agarose gel, and directly sequenced if desired.
  • the first primer hybridising to the poly-A tail may further comprise the sequence of the RACE primer (5'(ATCGATGGATCCGCGGCCGC(T)20)3'; M.A. Frohman et al. (1988), PNAS 85, p. 8998-9002).
  • the linear amplification of the cDNA may proceed for between 1 and 100 cycles, preferably 50 cycles.
  • the dNTP used in step (iv) is dGTP; and the poly-dN in step (v) is poly-dC.
  • the general primers used in step (v) for the extension product of the antisense strand-binding primer may comprise a restriction endonuclease cleavage site attached to poly-dN.
  • the cleavage site may be a BamHI cleavage site.
  • the primer may be AACGAT(C)15.
  • the general primer may be the RACE primer, ATCGATGGATCCGCGGCCGC.
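The primer shorthand used above, such as AACGAT(C)15 or ATCGATGGATCCGCGGCCGC(T)20, can be expanded mechanically. The following is a minimal sketch; the helper name `expand_primer` is an assumption made for illustration and is not part of the specification.

```python
import re


def expand_primer(shorthand):
    """Expand homopolymer shorthand such as 'AACGAT(C)15' into a full sequence."""
    # Every '(X)n' group is replaced by the base X repeated n times.
    return re.sub(
        r"\(([ACGT])\)\s*(\d+)",
        lambda m: m.group(1) * int(m.group(2)),
        shorthand,
    )


# Sequences as quoted in the text; the variable names are illustrative only.
race_oligo_dt = expand_primer("ATCGATGGATCCGCGGCCGC(T)20")
poly_c_primer = expand_primer("AACGAT(C)15")

print(race_oligo_dt, len(race_oligo_dt))
print(poly_c_primer, len(poly_c_primer))
# GGATCC is the BamHI recognition sequence mentioned above as a possible cleavage site.
print("BamHI site in RACE adapter:", "GGATCC" in race_oligo_dt)
```

Expanding the shorthand in this way makes it easy to check primer lengths and to confirm that the adapter portion of the RACE primer carries a BamHI recognition sequence.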
  • a third kit comprising:
  • i) reagents suitable for the synthesis of cDNA from an RNA preparation and, optionally, an RNA preparation;
  • ii) a first strand priming primer hybridisable to the poly-A tail of an RNA species
  • the kit contains primers hybridisable to both cDNA strands, in order to extend the cDNA in both directions.
  • the present invention has several advantages. Firstly, the present method allows the linear PCR amplification of a target strand within a complex genomic nucleic acid mixture followed by an effective isolation and purification of the extended product from this mixture. Thus, only target strands can be exponentially amplified in complete isolation from any other fragment.
  • the method therefore introduces a way of amplifying, with a very high degree of specificity, any target sequence from any complex genomic nucleic acid mixture. This is in direct contrast with all of the other known procedures, wherein the step of exponential amplification of the desired target sequence takes place in a very complex mixture of genomic nucleic acid fragments and/or genomic restriction fragments. This naturally leads to a certain degree of non-specificity.
  • the present cassette constructions are very simple and cheap to manufacture. They consist of two complementary oligonucleotides having sticky 3 or 4 nucleotide overhangs. These oligonucleotides are in the 20 to 30 bp range. They can be easily synthesised and can be used for ligation without any purification.
  • the present method can be done with any preparation of a genomic nucleic acid fragment.
  • the method is independent of the size and molecular weight of the fragment.
  • the PCR products can be easily sequenced from either end.
  • the sequence obtained from the first primer i.e. the primer that anneals to the primer annealing sequence in the target sequence
  • the sequence obtained from the second primer i.e. the primer that anneals to the annealing sequence in the cassette
  • the cassette contains the universal primer sequence (e.g. a M13 sequencing consensus sequence)
  • commercially available primers including fluorescently labelled primers can be used to sequence the first 300 to 500 bp from the cassette into the unknown region. If the amplified target fragment is quite large in size it cannot be sequenced in one go from both ends (i.e. from the first and second primer annealing region) and "walking primers" have to be synthesized and used for the DNA sequencing.
  • an important advantage of the present technique is that it, unlike the earlier methods, allows different samples to be pooled and processed simultaneously through the steps of linear amplification, isolation, purification and, finally, exponential amplification all at the same time and in the same reaction tube.
  • This "multiplex" strategy is based on digesting genomic nucleic acid with different restriction endonucleases, ligating appropriate cassettes to the produced restriction fragments, pooling the ligation products and processing them in a simultaneous fashion through all of the subsequent amplification, isolation and purification steps.
  • the efficiency of the method can still further be increased by using a number of different, specific primers that anneal to different primer annealing sequences on the target fragment. These primers can carry different separating labels for the exponential PCR step.
  • the labelled amplified target fragments can then be isolated and purified from this mixture by using solid supports with appropriate binding groups. Multiplexing large numbers of different samples is not possible by using any of the earlier methods.
  • a further advantage of the new technique is that any or a number of restriction endonucleases can be used to cut the genomic DNA into a number of different fragments.
  • the endonuclease can be chosen so that it cuts the genomic DNA
  • the cassettes can be designed so that they can anneal to the restriction endonuclease cut ends, as well as including the desired primer annealing regions.
  • the present method has an even further advantage that it eliminates or reduces the possibility of amplifying unwanted fragments.
  • the specific primer exhibits non-specific hybridisation within a complex genomic DNA mixture (which can also happen with each of the aforementioned methods) one will get some nonspecific binding of certain DNA fragments on the solid support during separation. This will lead to a mixture of several different DNA fragments. However, these impurities will be present in quantities that are much lower than those described in each of the earlier methods.
  • the stringent separation conditions in the present method e.g. high salt, alkaline conditions
  • Figure 1 is a general scheme of one use of the present cassette-mediated PCR technique, namely the exponential amplification of an unknown nucleic acid sequence within a gene;
  • Figure 2 is a schematic diagram of the method of the invention comprising the step of reamplification of a sample of the amplified mixture using a nested third primer;
  • Figure 3 is a general scheme portraying the application of the method of the invention to cDNA clone extension
  • Figure 4 is a representation of the result of a successful walk of a 1 kb nucleic acid sequence within the nematode unc 31 gene contained in a YAC clone within total yeast genomic DNA, following the general scheme of figure 1;
  • Figure 5 is a representation of the result of a successful walk of an about 600bp nucleic acid fragment within the Duchenne Muscular Dystrophy (DMD) gene extending the known nucleotide sequence of intron 50 by about 400 bp using total human genomic DNA, following the general scheme of figure 1.
  • Figure 6 is a representation of the result of PCR walks extending human microclone M54 (Mackinnon et al., 1990 Am. J. Hum. Genet. 47, 181-186) by 700 bp in either direction, following the general scheme of figure 2.
  • the symbol R represents a restriction site
  • the symbol KL represents a known locus
  • the symbol UKL represents an unknown locus
  • the symbol ds NS represents a double stranded nucleotide sequence (such as genomic DNA)
  • the symbol ss NS represents a single stranded nucleotide sequence
  • the symbol OC represents an oligo-cassette
  • the symbol B represents a biotin labelled specific primer
  • the symbol SB represents a streptavidin coated bead
  • the symbols OHC and OHR represent nucleotide overhangs
  • the symbols P1 and P2 represent appropriate primers for exponential amplification by PCR.
  • a target fragment of nucleic acid (ds NS) is first excised from a larger sequence at restriction sites (R) by the use of an appropriate restriction enzyme (see step 1).
  • oligo-cassettes are ligated to the ends of the excised fragment (see step 2).
  • Each cassette (OC) has a blunt end (E) and a nucleotide overhang (OHC) at its other end.
  • the overhang (OHC) is complementary to the nucleotide overhang
  • linear PCR amplification of the ligated cassette-target fragment is conducted using a specific biotin-labelled primer (B) (see step 3).
  • the products of the linear PCR amplification step are then isolated by, for example, admixing the reaction mixture with streptavidin-coated magnetic beads (SB) (see step 4).
  • the biotin-labelled PCR products (from step 3) selectively bind to the streptavidin-coated magnetic beads (SB). The products are thus easily separated.
  • the separated biotin-labelled PCR products, while still bound to the coated magnetic beads, are denatured and then exponentially amplified by the PCR technique using the two appropriate primers (P1, P2) (see step 5).
  • P2 need not have the same sequence as B. If it does not, then a further specificity is introduced into the scheme, wherein it is ensured that only the fragment of interest is subjected to PCR amplification.
  • the double stranded nucleic acid products can then be sequenced by any standard sequencing technique (see step 6).
  • the neighbouring fragment of nucleic acid can be isolated and sequenced by repeating the above steps.
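The six steps of the Figure 1 scheme can be followed on a toy example. The sketch below is a string-level model only: it treats a single strand, ignores overhang chemistry and strand polarity, and uses hypothetical sequences (`KNOWN`, `UNKNOWN`, `CASSETTE`, `GENOME`) that do not come from the specification.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognises GAATTC and cuts after the G

# Hypothetical toy sequences, chosen only to make the example easy to follow.
KNOWN = "CCATGGTACCAGT"              # known locus (KL) where the labelled primer anneals
UNKNOWN = "TTAGGCATCA"               # unknown locus (UKL) that the walk should reveal
CASSETTE = "AATTACTGGCCGTCGTTTTACA"  # placeholder cassette strand, not taken from the text
GENOME = "AAGCTAGAATTCGGCTA" + KNOWN + UNKNOWN + "GAATTCTTGCA"


def digest(seq, site=ECORI_SITE, cut_offset=1):
    """Step 1: cut one strand cut_offset bases into every occurrence of site."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


fragments = digest(GENOME)
# Step 2: ligate a cassette to every fragment (only the downstream end is modelled).
ligated = [frag + CASSETTE for frag in fragments]
# Steps 3 and 4: only fragments containing the known locus are extended from the
# labelled primer and therefore captured on the beads.
captured = [lig for lig in ligated if KNOWN in lig]
# Step 5: amplification spans the region from the known locus to the cassette.
product = captured[0][captured[0].index(KNOWN):]
# Step 6: the newly read sequence lies between the known locus and the cassette.
new_sequence = product[len(KNOWN):product.index(CASSETTE)]
print(new_sequence)  # TTAGGCATCAG
```

In this model the new sequence is the unknown locus plus the single base left by the restriction cut, which is the information a next round of walking would start from.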
  • primer B is the nested primer
  • primer A is the biotinylated first primer
  • primer C is the cassette-hybridising primer
  • Linear amplification is allowed to proceed as for the method of Figure 1.
  • the DNA sequences of interest are isolated on streptavidin-coated beads, and exponentially amplified using primers A and C.
  • An aliquot of the product of the exponential amplification is then re-amplified using nested primer B and cassette primer C.
  • The nested primer B, which is complementary to a portion of the known sequence, adds a further level of specificity and improves the purity of the final product. This product can be sequenced directly without the need for cloning.
  • Figure 3 is a schematic representation of the application of the method of the invention to the extension of cDNA clones.
  • Step I comprises the synthesis of double stranded cDNA, the first strand having been primed with the RACE primer, 5'(ATCGATGGATCCGCGGCCGC(T)20)3', which hybridises to the poly-A tail of the mRNA.
  • the region of the cDNA shown shaded black is the region whose sequence is known, while the unshaded regions represent unknown cDNA sequences.
  • Primers A and B, which are biotin-labelled, and C and D are constructed complementary to regions of the known cDNA sequence as shown.
  • In step II, the cDNA is split into two aliquots and linearly amplified for 50 cycles using only the biotinylated primers A or B, as shown. Each primer hybridises to a different strand of the cDNA.
  • Step III involves the isolation of the biotinylated product on streptavidin beads.
  • Step IV is carried out only on the aliquot which has been amplified using the primer hybridised to the antisense strand of the cDNA.
  • the 5' end of the resulting extension product is tailed with dGTP using terminal transferase.
  • Step V is exponential amplification of the two cDNA populations using two primers.
  • Primers C and D, which are nested within the biotinylated primers A and B and the terminal primers, are used together with a poly-dC terminal primer for the antisense strand product and a RACE primer for the sense strand product.
  • the products of the exponential amplification may be directly sequenced and cloned if necessary.
  • figure 4 records the result of a successful walk of 1 kb within the nematode unc 31 gene contained in a YAC clone and figure 5 records the result of a successful walk along a segment of total human DNA from exon 51 of the Duchenne Muscular Dystrophy (DMD) gene into the adjacent intron and within this intron itself.
  • Figure 6 records the results of a bidirectional walk of 700 bp in each direction from human microclone M54.
  • Lanes P in panel A and B show two PCR products which extend the microclone M54 at both ends by approximately 700 bp towards two Pst sites. Lanes E and H in panel B also show other PCR products which are caused by hybridisation of the biotinylated primers to similar regions in the genome.
  • the total genomic DNA contained yeast genomic DNA (a recombinant YAC with the unc31 gene), nematode genomic DNA and human genomic DNA. 250-500 ng of the total genomic DNA was digested to completion in a 20 µl solution with the six restriction enzymes EcoRI, HindIII, XbaI, BglII, PstI and HinfI which were then inactivated by heating (equivalent to step 1 in figure 1).
  • Half of the digested DNA (125-250 ng) was ligated either to 5 pmol of the appropriate EcoRI, HindIII, XbaI, BglII or PstI oligo-cassettes or to 50 pmol of a HinfI oligo-cassette in a total volume of 20 µl (i.e. 10 µl of the appropriate digested genomic DNA, 2 µl of 10 times T4 DNA ligase buffer containing 200 mM Tris/HCl pH 7.4, 100 mM MgCl2 and 100 mM DTT, 2 µl of 6 mM rATP, 4 µl water and 1 µl of the appropriate oligo-cassette in a concentration 5 or 50 pmol/µl) (equivalent to step 2 in figure 1).
  • Each of the double-stranded cassettes were 28 nucleotides long and had appropriate 4 nucleotide overhangs (EcoRI, HindIII, XbaI, BglII and PstI) or a 3 nucleotide (HinfI) overhang.
  • the cassettes were prepared from crude oligonucleotides by mixing together equimolar amounts of both oligonucleotides representing the upper and lower strands of the cassette, heating the mixture for 10 min to 80°C and slowly cooling the solution to room temperature over a period of 30 minutes. The reaction volume was then diluted with water to 100 µl and heated for 10 min to 75°C to heat-inactivate T4 DNA ligase.
  • 1 µl (1.25 to 2.5 ng) of the ligated product was amplified by linear PCR steps using a specific primer having a biotinylated 5'-end.
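The template amount quoted for the linear PCR follows from the dilution described above: 125 to 250 ng of digested DNA enters the ligation, the reaction is diluted to 100 µl, and 1 µl is taken. A short arithmetic check, with variable names chosen here for illustration:

```python
# Amounts taken from the text; variable names are illustrative.
ligated_dna_ng = (125.0, 250.0)   # DNA carried into the ligation reaction
diluted_volume_ul = 100.0         # volume after dilution with water
aliquot_ul = 1.0                  # volume used as template for linear PCR

# Mass of DNA in the aliquot = total mass * (aliquot volume / total volume).
template_ng = tuple(ng * aliquot_ul / diluted_volume_ul for ng in ligated_dna_ng)
print(template_ng)  # (1.25, 2.5)
```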
  • the linear PCR step was carried out in 1 x PCR buffer (10 mM Tris-HCl, pH 8.3; 25°C; 50 mM KCl; 1.5 mM MgCl2; 0.01% gelatin) (Cetus) with 250 µM dNTPs, 0.5 µM specific biotinylated primer and 2.5 units Taq polymerase (Cetus).
  • the PCR profile was 50 cycles of 95°C for 0.5 minutes, 55°C for 1 minute, and 72°C for 1 minute.
  • biotin-labelled products were then isolated by mixing them with 25 µl of washed streptavidin-coated beads (Dynal S.A.) (equivalent to step 4 in figure 1).
  • the beads were washed three times with 40 µl of 1 M NaCl in TE buffer followed by three washes with 1 x TE buffer. After each washing stage, the supernatants were carefully removed.
  • the bead-bound DNA was then denatured by heat and subjected to exponential PCR amplification (equivalent to step 5 in figure 1) .
  • the conditions for exponential PCR amplification were similar to those for the linear PCR steps except that 35 cycles were carried out (instead of 50) and that two primers were present (instead of one) , each of which was unbiotinylated.
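The difference between the 50-cycle linear step and the 35-cycle exponential step can be made concrete with an idealised yield calculation. This is a sketch under simplifying assumptions: no reagent depletion or plateau is modelled, and the `efficiency` parameter is a hypothetical per-cycle efficiency introduced here, not a value from the text.

```python
def linear_copies(cycles):
    """One primer: each cycle copies only the original template strand."""
    return cycles


def exponential_copies(cycles, efficiency=1.0):
    """Two primers: products also act as templates; efficiency lies in [0, 1]."""
    return (1.0 + efficiency) ** cycles


print(linear_copies(50))                 # 50
print(f"{exponential_copies(35):.2e}")   # 3.44e+10
```

Even under ideal conditions the linear step yields only tens of copies per template strand, which is why it serves to label and capture the target rather than to produce bulk product.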
  • cassettes were used: a 21 bp long universal primer complementary to a part of the second primer annealing region within the cassette having the sequence of: 5' d(CGT TGT AAA ACG GCC AGT) 3'
  • nucleotide nematode unc31-specific primer complementary to and third primer annealing region within the target fragment having the sequence:
  • PCR amplification products were then separated on 1% low melting point (LMP) agarose, isolated and sequenced (equivalent to step 6) .
  • the sequencing was performed by both a standard chemical degradation and by a dideoxy termination technique using radioactive and/or fluorescent labels.
  • oligonucleotide cassette 28 nucleotides long having an additional 4 nucleotide overhang.
  • the "upper" oligonucleotide is the same for all cassette constructions and contains the (-21) M13 primer sequence: 5' d(CGT TGT AAA ACG GCC AGT GCC AAG T)3'.
  • the "lower" oligonucleotides were synthesised for the restriction enzymes EcoRI, HindIII, XbaI, and BglII and their nucleotide sequences are:
  • the lower oligonucleotides are not phosphorylated and therefore are not covalently bound to the restriction fragment during ligation.
  • we have used a double-stranded oligonucleotide cassette 28 nucleotides long having a 4 nucleotide overhang.
  • the "lower" oligonucleotide is the same for all cassette constructions and contains the complementary (-21) M13 primer sequence:
  • EcoRI, HindIII, XbaI, BglII and PstI oligo-cassettes are prepared at 5 pmol/µl concentration by dissolving approximately 500 pmol of the upper and the respective lower oligonucleotide in 100 µl water.
  • the cassettes are heated to 80°C for 5 min and then slowly cooled down to RT over a period of 30 min before using in the ligation reactions.
  • the cassettes are stored at -20°C and thawed on ice before use.
  • Linear PCR was performed in 20 µl using 0.5 ml test tubes.
  • the biotinylated product was recovered and purified by the addition of 25 µl beads directly to the PCR mixture. After incubation at room temperature with occasional mixing, the beads were sedimented with a strong magnet. The supernatant and the oil were removed and the beads washed 3 times with 40 µl TE/0.1 M NaCl and 3 times with TE. The beads were finally resuspended in 4.5 µl H2O.
  • Two primers were used to amplify the single-stranded template bound to the beads: the locus-specific primer from step 1 of the method but without biotin (primer A, figure 2) and the cassette-specific primer (primer C, figure 2) with the following sequence: d(TGT AAA ACG ACG GCC AGT GCC) containing the M13 universal forward primer sequence.
  • Exponential PCR is performed in 20 µl comprising 9.0 µl water, 2.0 µl 10 x PCR buffer, 2.0 µl 2.5 mM dNTP mix, 1.0 µl 10 µM cassette-specific primer, 1.0 µl 10 µM locus-specific primer, 4.5 µl bead bound DNA template, 0.5 µl Taq polymerase. After overlaying with light mineral oil the following cycles were performed: 95°C 90s, [95°C 30s, 55°C 60s, 72°C 60s] x 35, 72°C 180s.
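The recipe above can be checked for internal consistency: the component volumes should add up to 20 µl, and the stock concentrations should give the working concentrations quoted earlier for the PCR buffer (250 µM dNTPs, 0.5 µM primer). The sketch below does this arithmetic; the dictionary keys and the helper `final_conc` are illustrative names, and the cycling time counts hold times only, ignoring ramping.

```python
# Component volumes in µl, as listed for the exponential PCR.
mix_ul = {
    "water": 9.0,
    "10x PCR buffer": 2.0,
    "2.5 mM dNTP mix": 2.0,
    "10 uM cassette-specific primer": 1.0,
    "10 uM locus-specific primer": 1.0,
    "bead-bound DNA template": 4.5,
    "Taq polymerase": 0.5,
}
total_ul = sum(mix_ul.values())


def final_conc(stock, volume_ul, total=total_ul):
    """Concentration after dilution: stock * (added volume / total volume)."""
    return stock * volume_ul / total


dntp_um = final_conc(2500.0, mix_ul["2.5 mM dNTP mix"])              # stock in µM
primer_um = final_conc(10.0, mix_ul["10 uM locus-specific primer"])  # stock in µM
buffer_x = final_conc(10.0, mix_ul["10x PCR buffer"])                # stock in "x"

# Hold times in seconds: initial denaturation, 35 three-step cycles, final extension.
cycling_s = 90 + 35 * (30 + 60 + 60) + 180

print(total_ul, dntp_um, primer_um, buffer_x, cycling_s)
```

The totals agree with the stated 20 µl reaction volume and with the 1 x buffer, 250 µM dNTP and 0.5 µM primer conditions given for the linear step.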
  • Prior to direct sequencing, the PCR product was purified by gel electrophoresis using LMP agarose, in order to remove excess nucleotides and primers, as well as minor DNA contaminants.
  • the DNA can be recovered using a Qiagen gel extraction kit, a Gene Clean II or Mermaid kit (Bio 101) . Sequencing of PCR products was performed by linear
  • cDNA was synthesised from 0.1 µg of human fetal brain mRNA using AMV reverse transcriptase (Anglian Biotechnology, Colchester UK) under standard conditions but using the RACE-oligo-dT primer: 5'ATCGATGGATCCGCGGCCGCTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT3'.
  • the reverse transcriptase was heat-inactivated and the cDNA diluted to 100 µl. A portion of this single stranded cDNA (1 µl) was then used for PCR.
  • reaction constituents 32.5 µl water, 5 µl 10 x PCR buffer (Cetus), 5 µl 2.5 mM dNTP mix, 2.5 µl of each primer 5' (TTT GTC GAC. and primer 5' d(TTT GTC GAC) (the underlined portions representing a tail containing a SalI site), 2 µl human fetal brain cDNA (2 ng), and 1 µl (5 units) Taq polymerase (Cetus).
  • the mixture was overlayed with 40 µl of light mineral oil.
  • Standard PCR cycles were 95°C 60s, [95°C 30s, 55°C 30s, 72°C 180s] x 40. 5 µl of the PCR product were sized on a 1% agarose gel. The remaining 45 µl were passed through a Stratagene PrimeErase™ column according to the manufacturer's recommendation in order to remove excess nucleotides, primers and polymerase, digested with SalI and subsequently cloned into pUCl ⁇. Recombinant colonies having
  • Figure 4 shows a successful 1 kb walk along the nematode unc31 gene within total yeast DNA.
  • lane 1 of figure 4 shows the predicted 1 kb band resulting from the exponential amplification of the target DNA between the nematode unc31-specific primer:
  • Lane 2 shows the result of exponential amplification with a nested nematode unc31-specific primer:
  • Lane 3 shows a range of molecular weight markers (123 base pair ladder, Bethesda Research Laboratories) .
  • the PCR product from the exponential PCR amplification step with the cassette specific primer and the nested nematode unc31-specific primer (lane 2) was subjected to DNA sequencing, the results of which show the expected nucleotide sequence confirming the walk.
  • the short molecular weight bands in lanes 1 and 2 represent side products which are often observed during PCR and are due to some side reaction with the applied primers. These bands are not caused by the walking method itself.
  • Figure 5 shows a successful 600 bp walk along the human DMD gene within total human genomic DNA.
  • lane 1 shows a 600bp walk along the DMD intron 50 resulting from exponential amplification with a nested human DMD intron 50-specific primer:
  • Lane 2 shows a range of molecular weight markers (123 base pair ladder, Bethesda Research Laboratories) .
  • the PCR product from the exponential PCR amplification step with the cassette specific primer and the nested human DMD intron 50-specific primer was subjected to DNA sequencing, the results of which show the expected overlapping nucleotide sequence confirming the walk. Around 400 bp were obtained from the end of the cassette and this represented new DMD intron 50 sequence. This was used to synthesise a new biotinylated specific primer for the next cycle.
  • Figure 6 shows the extended sequence derived from human microclone M54 of the human CAM-L1 gene.
  • a single PCR product was produced after the reamplification step 4 (figure 6, panel A, lane P). This suggests that a PstI site is located approximately 700 bp upstream of M54. The failure to amplify a product after digestion with the other enzymes suggests that these cut too far away from the locus.
  • the use of a second primer set, directed downstream of M54, resulted in three different PCR products as can be seen from the agarose gel (figure 6 panel B, lanes E, H and P) and would suggest that three restriction sites (EcoRI, HindIII and PstI) might be located downstream from M54.
  • the 800 bp long HindIII-PCR product (figure 6, panel B, lane H) is not an extension of the M54 microclone. It was identified as an L1 repeat.
  • the sequence of the HindIII product matches the 11 kb long human L1 repeat located in the intergenic region of the epsilon and gamma globin gene between nucleotide positions 7744 and 8544 and shows about 80% homology.
  • the human L1 repeat has a predicted HindIII site at position 7744-7750.
  • the sequence of the microclone M54 is not contained within the above mentioned L1 repeat nor does the primer set used for walking show any significant match. Hybridisation experiments with M54, on the other hand, show that the human genome has more copies of this microclone.
  • the HindIII walk represents an extension from such an M54-like sequence into an adjacent L1 repeat.

Abstract

The polynucleotide amplification method described includes the use of a labelled primer that is complementary to a specific known sequence in a target strand. A linear polymerase chain reaction (PCR) step is first conducted with the labelled primer. The labelled linear extension products are then isolated by means of a suitable support matrix that cooperatively binds to the label. The labelled extension products can then be subjected to exponential PCR in the absence of any other strands.

Description

POLYNUCLEOTIDE AMPLIFICATION
Field of the Invention
The present invention relates to polynucleotide amplification.
Background to the Invention
There has been much interest recently in determining the sequence of the human genome. The sequence of many genes and their location within the human genome are already known. It has been proposed that the sequences of unknown areas of the genome could be determined by first determining the sequence of areas flanking the known genes. In order to do so, it will be necessary to determine the sequence on either side of a known sequence and then, by a series of similar steps, "walk" up or down the genome. In the present specification, a known sequence within a genome is referred to as a target sequence. Also a fragment of nucleic acid, for instance derived from the human genome, containing such a target sequence is referred to as a target fragment. The method of the present invention allows one to walk up or down a genome starting from a target sequence.
In particular, the invention relates to polynucleotide amplification by a cassette mediated polymerase chain reaction technique and to a kit for the same. The term "cassette" means a short section of double stranded (ds) nucleic acid having a sticky end and a blunt end. The cassette has a known sequence.
The polymerase chain reaction (PCR) is an extremely powerful biochemical technique because it leads to the in vitro production of many copies of a target sequence, thereby avoiding cloning. It is particularly useful for producing acceptable quantities of a target sequence when only a very small amount of the target sequence is naturally available.
The PCR technique has been used to detect nucleic acid sequences associated with infectious diseases, genetic disorders or cellular disorders such as cancer. A typical example is the use of the technique in the prenatal diagnosis of sickle cell anaemia using DNA obtained from foetal cells.
The key initial work on the PCR technique was conducted by Saiki and his co-workers between 1985 and 1987 (Saiki, R. et al. [1985] Science 230 1350 and [1987] Science 239 487). However, despite its recent introduction to biotechnology, it is already a well established technique. A number of patent applications have now been published on the PCR technique and these include EP-A-0258017, EP-A-0201184, EP-A-0200362 and GB-A-2221909.
In the standard PCR technique, a sample of double stranded nucleic acid (e.g. duplex DNA) having within it a target sequence is first denatured, usually by heating, so that the two strands of the ds nucleic acid become separated from each other. Two primers are then added. These primers are single stranded oligonucleotides, one of which has a sequence complementary to a region at the 5' end of the sense strand of the target sequence. The other primer is complementary to a region at the 5' end of the antisense strand of the target sequence. There need not be exact correspondence between the primers and the strand regions. The conditions are then altered to allow the primers to anneal to their respective strands.
Next, a DNA polymerase (e.g. the Klenow fragment of T4 DNA polymerase, the thermally stable polymerase from Thermus aquaticus, the thermally stable polymerase from Bacillus stearothermophilus, T7 DNA polymerase or modified versions thereof) and the four deoxynucleotide triphosphates are added. Each primer then becomes extended by the synthesis of a new nucleic acid strand complementary to the strand to which the primer is annealed. Primer extension products are thus formed.
The various strands are then separated from each other, again by denaturation, and, if necessary, more of the primers are added. Usually, in practice, there is already an excess of primers at the start of the reaction and so further amounts of primers need not be added.
The same sequence of reaction steps is then repeated. In repeating the steps, the primers also bind to the previously formed primer extension-products resulting in the formation of further nucleic acid strands.
By repeatedly recycling the products through the reaction steps, the amount of primer extension products (i.e. the target sequence) increases exponentially.
The PCR technique therefore includes six key steps:
1. preparing a solution of single stranded nucleic acid, for example from the denaturation of a double stranded (ds) nucleic acid,
2. binding a primer at the 5' end of each of the strands of the target sequence,
3. forming double stranded nucleic acid by adding nucleotides to the primers bound on the target nucleic acid strands by use of a polymerase enzyme,
4. denaturing the double stranded nucleic acid thus formed,
5. repeating steps 2 to 4, leading to exponential amplification in the production of double stranded nucleic acid, and
6. isolating the prepared nucleic acid.
It will be seen from this that in order to carry out a PCR reaction, the sequence of the target nucleic acid, at least at its 5' and 3' ends, must be known.
There are many factors which are known to influence the specificity of a given PCR reaction. Some factors include cycling time and temperature, cycling profile of temperature, PCR buffer quality and strength, additions (e.g. cations) to the PCR buffer, nucleotide triphosphate quality and concentration, DNA polymerase quality and concentration, primer length and concentration. These parameters can all be optimized.
Other factors also play an important role in the specificity of a PCR amplification, for example the primary structure of the PCR primers (GC content) , formation of secondary structures during the PCR amplification and complexity of the DNA mixture and concentration of a given template. However, only some of the aforementioned factors have been investigated so far and, in general, little is known about the detailed mechanism of PCR amplification.
It is to be noted that in the PCR technique extension of the oligonucleotide primers occurs in a convergent manner relative to the target sequence, i.e. extension in the 5'-3' direction occurs within the target sequence.
The extension direction in the PCR technique creates a drawback. It only allows amplification of the sections of nucleic acid located between the two primer annealing regions. It does not allow amplification of nucleic acid sequences that flank the target sequence. Moreover, if the sequence at only one end of the flanking region (i.e. the end adjacent the target sequence) is known, then suitable primers cannot be constructed to enable extension of the primers towards the target sequence.
Some techniques have been recently developed to overcome this drawback. The first is called the inverse polymerase chain reaction (Triglia, T. et al [1988] NAR 16. 8186; Ochman, H. et al [1988] Genetics 120 621) . This technique allows the amplification of sections of nucleic acids, even of unknown sequences, that flank a target sequence.
Essentially, the inverse polymerase chain reaction (IPCR) involves a first step of restricting a sample of ds nucleic acid with a restriction endonuclease which forms sticky ends and which does not cut within the target sequence. This restriction produces a number of sticky ended fragments, including one which contains the target sequence and has a flanking sequence of unknown sequence at each side.
The sticky ends of each fragment thus produced are then ligated to each other, resulting in circularisation of the cleaved fragments. In this circularisation step, the fragment containing the target sequence forms a circle wherein the unknown flanking regions form a continuous unknown region connecting the ends of the target sequence. The continuous unknown region can then be subjected to exponential PCR amplification by inter alia the addition of the two primers which anneal to the respective ends of the target sequence in such a way that extension takes place into the unknown region. No amplification of circles not containing the target sequence will occur as there will be no site to which the primers can anneal.
The IPCR technique therefore includes the following three key steps:
1. preparing a solution of linear double stranded nucleic acid fragments having sticky ends, one of which fragments contains the target sequence,
2. circularising the nucleic acid fragments, and
3. amplifying the circularised target fragment by the PCR technique.
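The three IPCR steps can be modelled roughly as follows. The sketch assumes exact primer matches, ignores the sticky-end sequence itself and treats circularisation as a simple joining of the two fragment ends; the function names and sequences are hypothetical.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def inverse_pcr_product(fragment: str, outward_right: str,
                        outward_left: str) -> Optional[str]:
    """Model IPCR on one linear restriction fragment (top strand, 5'->3').

    outward_right: primer identical to the top strand at the right-hand end of
                   the known target sequence (extends into the right flank).
    outward_left:  primer extending into the left flank, i.e. the reverse
                   complement of the top strand at the left-hand end.
    """
    right_start = fragment.find(outward_right)
    left_site = reverse_complement(outward_left)
    left_start = fragment.find(left_site)
    if right_start == -1 or left_start == -1:
        return None  # fragment lacks the target sequence: no amplification
    if left_start + len(left_site) > right_start:
        return None  # primers are not arranged back to back
    # Step 2: circularise, then read the circle starting at outward_right.
    circle = fragment[right_start:] + fragment[:right_start]
    # Step 3: extension runs through the right flank, across the ligation
    # junction and through the left flank up to the other primer site.
    end = len(fragment) - right_start + left_start + len(left_site)
    return circle[:end]


left_flank, core, right_flank = "TTTTTT", "ACGTACGGGCCCTTAGCA", "GGGGGG"
fragment = left_flank + core + right_flank
product = inverse_pcr_product(fragment, "TTAGCA", reverse_complement("ACGTAC"))
```

In the example the product contains both flanks joined at the ligation junction, which illustrates why IPCR necessarily amplifies both flanking sequences together.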
However, there are a number of problems inherent in and encountered with the IPCR technique, largely because circularization occurs only under very specific conditions which have not been investigated in detail and are therefore almost unknown.
Firstly, it is known that the circularisation step (step 2) is dependent on at least two factors, principally the concentration and the size of the nucleic acid fragments.
In this regard, it is known that only very dilute solutions of nucleic acid fragments bearing sticky ends can be ligated to form circles. Moreover, the formation of circles is by no means certain and the yield may vary considerably during the ligation step. In particular, the circularization is always accompanied by the formation of linear concatamers of nucleic acid fragments with sticky ends.
The formation of concatamers is due to the fact that each of the excised nucleic acid fragments, whether or not they contain the target sequence, has sticky ends that are complementary not only to each other but also to the ends of all the other fragments produced from the initial nucleic acid digest. The sticky ends of the target fragment can thus ligate not only to themselves but also to the sticky ends of other target fragments or to the sticky ends of other fragments. Therefore, linear concatamers can be, and often are, produced which have complex and unpredictable structures. Also, some of the formed concatamers form enlarged circles that, later on, might interfere with the subsequent PCR amplification.
The formation of concatamers is clearly an unwanted side reaction, particularly as, under certain circumstances, concatamer production can be dominant, leading to a high excess of linear concatamers over circles.
In practice, each of the linear concatamers and enlarged circles containing the target sequence can undergo exponential PCR amplification because they contain the binding sites for the primers. This leads to the amplification of nonspecific products, which is clearly disadvantageous. IPCR is therefore critically dependent upon the formation of the correct circular nucleic acid fragment.
The size of the nucleic acid fragment is also a critical factor for the success of the IPCR technique. Ideally, the size of the initial linear excised target fragment and the corresponding circle must be within the range suitable for the polymerase chain reaction. This range is normally between 100 base pairs (bp) and several kilo base pairs (kb).
However, it is obvious that the actual size of the target fragment will very much depend upon the distribution pattern of the recognition sites for the restriction endonuclease used in the digestion of the original sample, for example of genomic DNA. In practice fragments greater than 2 or 3 kb are often formed. These large fragments cannot therefore be amplified by the IPCR technique.
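The dependence of fragment size on the distribution of recognition sites can be estimated with a simple calculation. The sketch below assumes a random sequence with equal base frequencies, which real genomic DNA only approximates; under that assumption a recognition site of n bases occurs on average once every 4^n bases and fragment lengths are approximately geometrically distributed. The function names are illustrative only.

```python
def expected_fragment_length(site_length: int) -> int:
    """Mean distance in bp between recognition sites of `site_length` bases,
    assuming a random sequence with equal base frequencies."""
    return 4 ** site_length


def fraction_longer_than(site_length: int, length_bp: int) -> float:
    """Approximate fraction of fragments longer than `length_bp`, treating each
    position as an independent chance of starting a recognition site."""
    p_site = 0.25 ** site_length
    return (1.0 - p_site) ** length_bp


mean_six_cutter = expected_fragment_length(6)      # 4096 bp
share_over_3kb = fraction_longer_than(6, 3000)     # roughly one half
```

Under these assumptions a restriction endonuclease with a six-base recognition site leaves close to half of all fragments longer than 3 kb, which is consistent with the observation that large fragments are often formed.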
Another disadvantage of the IPCR technique is that one achieves amplification of both of the flanking sequences. There may be situations when one needs to amplify just one of the flanking regions, particularly as there is now more and more an important need to do gene walking only in one direction.
It would therefore be advantageous to have a PCR technique that does not include a circularisation step but allows exponential amplification of a flanking sequence. A number of such schemes have been reported in the literature (e.g. Shyamak and Ames [1989] Gene 84. 1; Kalman et al [1990] Biochem and Biophys Res Comm 167 504; Roux et al [1990] Biotechniques 8. 48; and Markham et al GB-A-2221909).
Each of these reported schemes employs the use of a special cassette for ligation to each end of the sticky-ended nucleic acid fragments formed by digestion of a nucleic acid sample with a restriction enzyme. It is to be noted that each of the special cassettes ligates not only to the ends of the target fragment but also to the ends of the other restriction fragments produced by the initial digestion of, for example, genomic DNA.
Each of the methods employs either a different cassette construction or a different sequence of reaction steps to achieve a degree of selectivity during the amplification reaction.
In the Shyamak and Ames method ([1989] Gene 84. 1), the cassette (which is called a vector) includes one of the primer annealing regions. The excised target nucleic acid fragment includes the required other primer annealing region. Therefore, in theory, one could amplify, by exponential PCR, the nucleic acid region located between the cassette and the primer annealing region in the target sequence. In this method, the ligated cassette-target fragment is exposed to both of the primers at the same time.
However, problems arise with this scheme because all of the excised fragments have ligated cassettes at each end which contain one of the primer annealing regions. Thus, unwanted fragments will be amplified during the exponential PCR amplification of the target fragment because the cassette primer, once hybridised to the cassette, can be extended by the polymerase from both ends of the unwanted fragments. This, of course, leads to the presence of a large number of unwanted nucleic acid fragments in the final mixture, making further analysis impossible. This reduces the efficiency and precision of the method.
The Shyamak and Ames method therefore has only a limited use and can only really be used for the amplification of "simple" DNA samples (e.g. from very simple prokaryotic organisms) . The method cannot really be used for amplifying a fragment within a complex genomic DNA mixture (e.g. from a eukaryotic organism) because, in addition to the target fragment (usually present in only one or a few copies) , millions of unwanted fragments having ligated cassettes will also be exponentially amplified.
Furthermore, since the cassette primer is only present in a limited quantity, most of the fragments including the target fragment will not be amplified because the primer will soon be exhausted. This is disadvantageous. Furthermore, if theoretically an excess of cassette primer is present, a mixture of millions of different fragments would be amplified. This is again clearly disadvantageous.
The reported Kalman and Roux methods (Kalman et al [1990] Biochem and Biophys Res Comm 167 504 and Roux et al [1990] Biotechniques 8. 48) seek to overcome some of the above mentioned problems. However, there are similar problems associated with these methods (see below) which prevent them from being used for any given amplification problem. In each of these methods, synthetic oligonucleotide cassettes with sticky ends are used in the ligation reaction following the digestion of genomic DNA with a given restriction endonuclease. These cassettes ligate to the ends of all of the nucleic acid fragments (i.e. the target fragment and the unwanted fragments).
In the Kalman method, the cassette comprises two complementary oligonucleotides that form a double-stranded piece of DNA having a sticky end. However, one oligonucleotide does not have a phosphate group at its 5'-end. Therefore, during the ligation reaction, only one of the oligonucleotides will covalently link to the excised nucleic acid fragments. The unligated oligonucleotide is then removed by selective ethanol precipitation in the presence of ammonium acetate. The ligated cassette thus becomes single stranded.
Next PCR amplification is carried out in the presence of both of the primers i.e. the primer complementary to a region of the target sequence and the primer complementary to a portion of the cassette.
The first cycle of the PCR amplification comprises only a linear extension of the primer annealed to the target sequence. In theory, it is only after this linear reaction step that the cassette primer can take part in the exponential PCR cycles by hybridising to the extension product. Therefore, and according to the reported method, unwanted nucleic acid fragments should not be amplified either in a linear or exponential fashion.
However, the proposed scheme does have some drawbacks. In particular, when the technique is used in the amplification of prokaryotic genomic DNA, a large background of amplified fragments is observed, with only a slight excess production of the amplified target fragment.
Another drawback of the technique relates to the fact that the unligated oligonucleotide can never be quantitatively removed by precipitation. Even if the removal is 99% or 99.9% complete, there will be enough unligated oligonucleotides associated with unwanted fragments to allow linear amplification of these unwanted fragments.
A third drawback stems from the fact that the incomplete DNA strand in the cassette serves as a primer and can therefore be elongated by polymerase in the presence of all four deoxynucleotide triphosphates to yield a complete double-stranded cassette at the ends of all excised fragments.
Taking into account each of these drawbacks, non-specific amplification will be a major part of the amplification step when both of the primers (i.e. one specific for the target sequence and one specific for the cassette) are present in the reaction mixture.
Therefore, the proposed Kalman method will only work successfully for very simple prokaryotic mixtures and, even then, high background levels of unwanted products will be observed.
The Roux method follows a pattern similar to the Kalman method, wherein a mechanism is introduced to produce, in theory, only one strand suitable for exponential PCR amplification.
In the Roux method, the cassette consists of two oligonucleotides of different lengths. The short strand is known as the tailed linker or incomplete strand. The longer strand, which is at least 15 to 20 nucleotides longer, is known as the anchor template or complete strand. The incomplete strand has a number of bases that are non-complementary to the complete strand. These mismatches should, in theory, prevent a filling-in reaction during the PCR amplification step by the action of the polymerase. Moreover, because there is not a region in the short strand that is complementary to the primer annealing region in the complete strand, there should only be one template produced by a linear PCR amplification step in the presence of the two primers (i.e. the primer complementary to the target sequence and the primer for the cassette). The ligated cassette-target fragment itself is therefore not suitable for exponential PCR amplification but it is this linear extension product that is suitable for exponential PCR amplification. Thus, it is only after this first linear extension step that the second primer can hybridize to the extension product and create exponential amplification.
However, there are problems encountered with the Roux method. In particular, it is not uncommon for the incomplete strand to be filled-in to yield a strand that is complementary to the remainder of the complete strand. This filling-in reaction should be orders of magnitude less than in the Kalman method. However, in spite of the mismatches, this filling-in reaction will still produce templates of all of the unwanted fragments that are suitable for exponential PCR amplification.
In summary, the Kalman and the Roux methods do not overcome the problems experienced with the Shyamak and Ames method.
The method of Markham et al (GB-A-2221909) apparently tries to overcome the problems associated with each of the above described methods. In brief, the Markham method, like the Kalman and Roux methods, includes the use of synthetic oligonucleotide cassettes (which are defined as vectorette cassettes) that are ligated to both ends of both the target fragments and the unwanted fragments present in the digested original sample, e.g. genomic DNA. The cassettes are designed to enhance the specificity of the PCR amplification step.
The Markham method has two variants which employ two types of oligonucleotide cassette. Each cassette is constructed so that, after ligation to the fragments, a cassette primer cannot be hybridised to the cassette itself. Instead, the cassette primer should, in theory, only hybridise to an extended primer product. This effect is achieved by constructing cassettes comprising two oligonucleotides that are only partially complementary. The primary structures of both oligonucleotides have non-complementary middle portions which remain single-stranded.
In the first variant, the cassette comprises a short oligonucleotide and a long oligonucleotide. The short oligonucleotide is blocked at its 3'-end with either a dideoxynucleotide or another suitable nucleotide derivative. This prevents a filling-in reaction during the PCR amplification steps. The long oligonucleotide is at least 15 to 20 nucleotides longer at its 3'-end.
In the second variant, both of the oligonucleotides still possess a certain degree of non-complementarity but they are each more than 50 nucleotides long.
Therefore, it should only be after the primer complementary to the target sequence has been extended by one linear PCR cycle that the primer complementary to the cassette can hybridise to the extended DNA strand and thus take part in the PCR reaction to exponentially amplify the target fragment.
In the Markham method, as in the other earlier methods, the PCR amplification steps are usually carried out in the presence of both primers - i.e. a primer that is complementary to the target sequence and a primer that is complementary to the cassette portion.
In the first PCR cycle, only the primer complementary to the target sequence should be linearly extended. This should create a template suitable for exponential PCR because the specific primer that is complementary to the cassette can now hybridize to the first extended PCR product and can thus be extended itself.
In theory, therefore, by repeating the whole process several times (usually 20 to 40 cycles), only the ligated cassette-target fragment should be amplified, whereas all unwanted restriction fragments should not be amplified. The amplification should, therefore, be highly specific because the special cassette design should exclude unwanted fragment amplification.
The Markham method can include the optional step of conducting two separate amplification reactions, namely a linear amplification followed by an exponential amplification. That is to say linear amplification for several cycles using only the primer complementary to the target sequence followed by the addition of the primer complementary to the cassette leading to exponential PCR amplification in the presence of the two primers.
One possible reason for splitting the original method into two separate amplifications might be that the amount of target fragment could be slightly enlarged (by a factor of 2 to 100 if up to 100 linear PCR cycles are carried out) before exponential PCR amplification starts. Therefore, this two stage process might only be necessary if one starts from a few copies of a target fragment or just one single target fragment (e.g. from egg or sperm cells) .
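The difference between the two amplification regimes can be made concrete with a short calculation. The sketch assumes ideal behaviour, i.e. every template is copied in every cycle, which is not achieved in practice; the function names are illustrative.

```python
def linear_products(templates: int, cycles: int) -> int:
    """Extension products after `cycles` rounds with a single primer.

    Only the original templates are copied, so the yield grows by one product
    per template per cycle (idealised, 100% efficiency assumed).
    """
    return templates * cycles


def exponential_products(templates: int, cycles: int) -> int:
    """Copies after `cycles` rounds with both primers present.

    Every product also serves as a template, so the number of copies doubles
    each cycle (idealised, 100% efficiency assumed).
    """
    return templates * 2 ** cycles


after_linear = linear_products(1, 100)           # 100 products
after_exponential = exponential_products(1, 20)  # 1048576 copies
```

On these assumptions 100 linear cycles yield about as much product as seven exponential cycles (2^7 = 128), which illustrates why the linear stage only slightly enlarges the amount of target fragment.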
However, even though it has been shown that exponential PCR will work on a single DNA molecule, it has also been shown that too much DNA template often leads to a dramatic increase in non-specificity during exponential PCR amplification. Therefore, the second version of the Markham method will not significantly differ from the first version because exponential PCR will also take place in the original complex mixture of digested human genomic DNA which, in turn, will cause non-specificity during the PCR amplification steps.
The Markham method can also include the addition of S1 nuclease after several linear amplification steps have been carried out with the primer specific for the target fragment. The S1 nuclease should in theory degrade all of the remaining single-stranded unwanted DNA fragments. This should increase the relative concentration of the ligated cassette-target fragments over the remaining background levels of the unwanted nucleic acids that are present. However, S1 nuclease is known to attack double-stranded DNA to a significant extent. Therefore, this approach is not always a practical solution to reduce the complexity of the reaction mixture before starting exponential PCR amplification.
It initially appears that the Markham method offers certain advantages over the earlier mentioned methods. In particular, the cassettes are designed to achieve a specific amplification of the target fragment following a linear PCR amplification step. Thus, in theory, even though the PCR amplifications are performed in the presence of both primers (i.e. the specific primer which hybridizes to the target sequence and the cassette primer which hybridizes to the extended product and not to the cassette itself) , the first step should be a linear amplification of the target DNA fragment starting from the annealed specific primer. In all the other PCR cycles both primers take part and the amplification is therefore exponential.
However, there are problems associated with the Markham method. For example, if it is used for the amplification of genomic DNA, the PCR amplification steps have to be carried out in the presence of the whole genomic DNA mixture of restriction fragments. This genomic mixture will contain millions of different fragments, particularly in the case of complex genomes. These fragments will cause some degree of nonspecific amplification.
The non-specificity is due to the clustering of the restriction sites in genomic DNA of complex eukaryotic organisms. Therefore, many quite small restriction fragments will be present in the mixture which, certainly after denaturing, will serve as primers during PCR and can cause a high degree of non-specificity.
Accordingly, and even though the Markham method initially appears technically easier to perform, it does lead to some degree of non-specificity with both complex genomic DNA mixtures (like human genomic DNA) and simpler DNA mixtures (e.g. prokaryotic DNA).
Aside from the non-specificity of the Markham method, it does have some further drawbacks. For example, special oligonucleotide blocking groups (e.g. dideoxynucleotides at the 3'-end) or very long oligonucleotides (e.g. above 50 nucleotides in size) are necessary for the construction of the cassettes. These are expensive and difficult to produce, especially if many different cassettes are to be used.
Moreover, it is also not feasible to conduct exponential amplification of different specific nucleotide fragments at the same time. Thus, as with the other earlier mentioned cassette-mediated PCR amplifications, this method is not suitable for a multiplexing process (see below).
The Markham method apparently includes an optional step of isolating the extended primer strand by use of a gel. However, it is to be noted that the isolation of the amplified target fragments is carried out after the exponential amplification stages. Also, it is a widely recognised fact that gel separation methods are not only laborious and time consuming, but they are really only effective for detecting and isolating large quantities of large sized strands. Small strands are difficult to separate from each other using gels.
The reported Markham method includes no other method for isolating the extended primer strands. In particular, there is no disclosure of an isolation step dependent upon the use of a labelled primer.
An isolation procedure, using a labelled primer, has been reported in the literature (Hultman et al [1989] N.A.R. 17 4937). However, in the Hultman method, the labelling is only carried out during the exponential PCR amplification of a target DNA strand. This labelling step is then followed by binding the labelled DNA extension products to a polymer, separating the strands and subsequently sequencing the polymer-bound DNA strands.
In order to overcome the problems associated with each of the above techniques and methods, we have developed a modified PCR method which allows the amplification of nucleic acid regions that flank a target sequence. In particular, the present method does not require the use of a circularisation step or the use of specially designed cassettes (e.g. cassettes having incomplete strands) .
More importantly, the present method allows one to pick out specific target nucleic acid strands containing the target sequence from a reaction mixture containing many different strands. The specific picking out of the target strands ensures that exponential PCR amplification only occurs on the target strand.
Also, most of the walking and sequencing methods at present are based on DNA cloned into plasmids, phages, cosmids or yeast artificial chromosomes (YACs). Primary cloning and subcloning are very tedious and time-consuming. It has been shown that some regions of genomic DNA cannot be cloned at all or prove to be very difficult to clone. These portions of genomic DNA cannot therefore be analysed by presently available walking or sequencing techniques.
Also, other regions containing repetitive or other structures are difficult to clone in certain vector systems. Ordering of individual clones of one library by mapping or fingerprinting as well as their sequencing could be greatly improved if one could easily walk and sequence through gaps of a given library by using an efficient, fast and specific in vitro amplification method starting from a target sequence. This would avoid the earlier necessary construction of different libraries from the DNA of one organism (or part of it) using different vector systems.
The present method is generally well suited for walking and sequencing along any piece of genomic DNA without the need for cloning. The present method can also be used for the detection of point mutations, deletions, and insertions within any genomic region of interest. This is especially advantageous for the detection of any modifications in the coding or non-coding regions of genes associated with genetic disorders, cellular disorders or infectious diseases. This gives one the potential to design specific diagnostics. It also allows the early determination of polymorphism for both alleles in many individuals, even down to the nucleotide level. Furthermore, the method has applications in the identification and sequencing of unclonable loci, the identification of YAC termini for physical mapping and the extension of partial cDNA clones. Also, since the method contains only straightforward biochemical reactions, it can easily be automated.
The present method thereby overcomes and avoids most of the aforementioned problems.
Summary of the Invention
According to a first aspect of the present invention there is provided a polynucleotide amplification method comprising the steps of:
i. forming a ligation product by ligating a target fragment, having sticky ends and including a first primer annealing region of known sequence, with a cassette, having a sticky end complementary to one of the sticky ends of the target fragment, the cassette including a second primer annealing region of known sequence, such that in the ligation product the known second primer annealing region is remote from the first primer annealing region,
ii. denaturing the ligation product,
iii. annealing only a first primer to the first primer annealing region, the first primer having attached thereto a separating label,
iv. adding nucleotides to the bound primer by use of a polymerase enzyme to form an extension product,
v. denaturing the ds nucleic acid extension product thus formed,
vi. optionally repeating steps iii to v, leading to linear amplification in the production of single stranded (ss) nucleic acid having the separating label attached thereto,
vii. isolating the prepared ss nucleic acid by binding the attached label to a support matrix having a group cooperatively bindable with the label, and
viii. subjecting the isolated nucleic acid to exponential PCR amplification.
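The logic of steps i to vii can be sketched as a toy simulation. This is a simplified model under stated assumptions, not the claimed method itself: fragments are represented only by their top strands, primer annealing requires an exact match, ligation is reduced to string concatenation, and all names and sequences are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Strand:
    sequence: str   # written 5'->3'
    labelled: bool  # True if the strand carries the separating label


def isolate_labelled_products(fragments: List[str], first_primer: str,
                              cassette: str, linear_cycles: int = 1) -> List[Strand]:
    """Return the strands that would be retained on the support matrix."""
    # Step i: the cassette ligates to every fragment, wanted or not.
    ligated = [Strand(f + cassette, labelled=False) for f in fragments]
    pool = list(ligated)
    # Steps ii to vi: only the labelled first primer is present, so only
    # templates containing its annealing region give an extension product,
    # which runs from the primer through the flank into the cassette.
    for _ in range(linear_cycles):
        for template in ligated:
            site = template.sequence.find(first_primer)
            if site != -1:
                pool.append(Strand(template.sequence[site:], labelled=True))
    # Step vii: unlabelled strands do not bind the matrix and are washed away.
    return [strand for strand in pool if strand.labelled]


fragments = ["AAAACCCC", "TTGATTACATTGG", "CCCCGGGG"]
captured = isolate_labelled_products(fragments, "GATTACA", "GGCCAATT", linear_cycles=3)
```

In this model the captured strands all derive from the one fragment containing the first primer annealing region, and each carries the cassette sequence at its far end, so that a cassette primer can anneal in the subsequent exponential amplification of step viii.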
It is important to note that in the present method, a linear PCR amplification step is carried out first of all and independently from an exponential PCR amplification step. This is achieved by introducing a label in the linear amplification step that allows purification of the target fragment on a solid support before the exponential PCR amplification.
Therefore, because only the labelled products (i.e. the primer extension products corresponding to the target fragment) will bind to the matrix, any unlabelled products (which will not bind to the matrix) can be washed away by using suitable solutions (e.g. buffers, alkaline solutions etc).
The isolation step allows the easy isolation of and rapid purification of only the labelled fragment (i.e. the strand corresponding to the target fragment) . The labelled fragments can then be subjected to exponential PCR amplification in the absence of any of the unwanted fragments. This leads to an efficient, effective and highly specific method for isolating and amplifying a target fragment having regions of unknown sequence from complex genomic DNA mixtures of restriction fragments - i.e. it reduces a complex mixture of restriction fragments to a single fragment or multiplex of fragments.
Thus, in the present method, a linear PCR step is used to introduce a specific binding label into a strand that will be complementary to the ligated cassette target fragment. In principle, one linear PCR cycle with a labelled primer should be sufficient to separate the labelled fragment from a complex genomic mixture before exponential amplification takes place. The present method therefore reduces a complex DNA mixture (e.g. genomic DNA) to a very simple mixture. Furthermore, because the separation of the labelled strand is completely specific, only the correct template will be present for the exponential PCR amplification step.
Preferably, the target fragment is derived from a digestion of a sample of DNA. For example, the sample of DNA could be total genomic DNA of a prokaryotic or eukaryotic organism, mixtures of total genomic DNA from different organisms or different individuals, a DNA fragment cloned in a vector like a phage, cosmid or YAC, or a mixture of different cloned DNA fragments.
Preferably, the sample of DNA is digested with a suitable restriction enzyme or with a combination of different restriction enzymes.
Preferably, steps iii to v are repeated up to 100 times; advantageously up to 50 times.
To enhance further the specificity of the amplification, a more specialised exponential PCR amplification step could be conducted using a third specific PCR primer. This third primer would be a nested primer with respect to the first primer. In this case, the target fragment would have a third known primer annealing region distanced from the original first primer annealing region. Preferably, this third region is situated between the first primer annealing site and the second primer annealing site of the cassette.
If desired, the third primer would also have on it a separating label so that the amplified fragments can also readily be separated. This will be especially useful if there is a possibility that the first primer also bound to fragments other than the target fragment.
The presence of the third primer annealing region in the target strand enables the exponential PCR amplification step to be conducted using a primer that is specifically complementary to either the first primer annealing region or the third primer annealing region. This is particularly advantageous because, by using the third primer annealing region, a further selection mechanism is introduced wherein PCR amplification only occurs with the target fragment and not with any unwanted fragments that may have become inadvertently bound to the support. It therefore introduces a means for ensuring that only the target fragment is exponentially amplified.
This would also be preferable if, for example, the first labelled specific primer hybridized to several different places within the genomic mixture and was then extended from all of the annealed points. The nested primer ensures that only the target fragment is amplified.
The nested third primer may furthermore be used in a preferred reamplification step, which advantageously facilitates the isolation of the target fragment in adequate purity and quantity for direct sequencing. In the reamplification step, an aliquot of the exponentially amplified mixture is reamplified using the cassette primer and a nested primer specific for the target fragment. Optionally, the nested target fragment-specific primer may be the third primer. Alternatively, it may be a different primer, hybridising to a further known primer annealing region distanced from both the first and second known primer annealing regions. Optionally, this different primer may be a fourth primer, but it is envisaged that any number of further nested primers may be advantageously employed. Equally advantageously, each different primer will hybridise to a separate further known primer annealing region. Thus a fourth primer would hybridise to a fourth known primer annealing region on the target fragment.
Preferably, the aliquot may be taken from a dilution of the exponentially amplified mixture, for example a dilution between 1:1 and 1:100, most preferably a dilution of 1:50. Advantageously the aliquot will measure between 0.1 and 10 μl, preferably 1 μl.
The reamplification step provides added levels of specificity to the amplification reaction through the use of nested primers. In addition, the presence of matrix-bound DNA templates is eliminated, and thus the inefficiency of amplification associated with such templates is resolved. Preferably, the separating label is attached to the 5' end of the first primer. The label can also be attached to one or more heterocyclic bases of the first primer.
Preferably, the separating label is a biotin label and the support matrix comprises streptavidin-coated beads. However, it is to be understood that other forms of labels and support matrices would suffice e.g. proteins and protein binding groups, antibodies and antibody binding groups, GCN4 and other DNA binding proteins.
It is also to be noted that the matrix need not be in the form of a bead. The matrix can be in any appropriate form. For example, if the matrix was in the form of a rod, the target strands could be isolated simply by dipping the rod into the reaction mixture and then removing it. In this case, only the target strands will bind to the rod which can then, if necessary, be washed. This set up would be ideal for an automated machine.
In a further example, the matrix could represent the surface of a well of a microtiter dish so that target strands of many different samples could be easily isolated, simply by handling the whole microtiter dish. Again, this set up would be ideal for an automated machine.
Preferably, the present cassette comprises two complementary oligonucleotides having 3 or 4 nucleotide overhangs such that a sticky end is formed. The oligonucleotides can be in the range of 20 to 30 nucleotides long. These oligonucleotides will be easy to synthesise, particularly as it is known that specific primers labelled (e.g. with biotin) can easily be synthesized in a two-step procedure. However, it is to be appreciated that the current new methods allow incorporation of biotin or other labels during automated DNA synthesis. Moreover, there would be no need to purify the primers before using them as cassettes for the ligation reaction.
Advantageously, the cassette sequence contains a universal primer sequence.
Examples of appropriate cassettes that can be ligated to the target strands include:
an EcoRI cassette
5' d(CGTTGTAAAACGGCCAGTGCCAAGT) 3'
3' d(GCAACATTTTGCCGGTCACGGTTCATTAA) 5'
a HindIII cassette
5' d(CGTTGTAAAACGGCCAGTGCCAAGT) 3'
3' d(GCAACATTTTGCCGGTCACGGTTCATCGA) 5'
a BglII cassette
5' d(CGTTGTAAAACGGCCAGTGCCAAGT) 3'
3' d(GCAACATTTTGCCGGTCACGGTTCACTAG) 5'
an XbaI cassette
5' d(CGTTGTAAAACGGCCAGTGCCAAGT) 3'
3' d(GCAACATTTTGCCGGTCACGGTTCAGATC) 5'
a PstI cassette
5' d(CGTTGTAAAACGGCCAGTGCCAAGTTGCA) 3'
3' d(GCAACATTTTGCCGGTCACGGTTCA) 5'
a HinfI cassette
5' d(CGTTGTAAAACGGCCAGTGCCAAGT) 3'
3' d(GCAACATTTTGCCGGTCACGGTTCATNA) 5'
[wherein N = any of the four bases G,A,T,C].
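The cassette listings can be checked mechanically. The following Python sketch is illustrative only and forms no part of the described method. It takes the two strands of each cassette as printed (upper strand read 5' to 3', lower strand read 3' to 5'), confirms that they pair base for base starting from the blunt end, and reports the protruding end. The names `complement` and `sticky_end` are assumptions introduced for this illustration.

```python
# Cassette strands as printed in the listing:
# (upper strand read 5'->3', lower strand read 3'->5').
CASSETTES = {
    "EcoRI":   ("CGTTGTAAAACGGCCAGTGCCAAGT", "GCAACATTTTGCCGGTCACGGTTCATTAA"),
    "HindIII": ("CGTTGTAAAACGGCCAGTGCCAAGT", "GCAACATTTTGCCGGTCACGGTTCATCGA"),
    "BglII":   ("CGTTGTAAAACGGCCAGTGCCAAGT", "GCAACATTTTGCCGGTCACGGTTCACTAG"),
    "XbaI":    ("CGTTGTAAAACGGCCAGTGCCAAGT", "GCAACATTTTGCCGGTCACGGTTCAGATC"),
    "PstI":    ("CGTTGTAAAACGGCCAGTGCCAAGTTGCA", "GCAACATTTTGCCGGTCACGGTTCA"),
    "HinfI":   ("CGTTGTAAAACGGCCAGTGCCAAGT", "GCAACATTTTGCCGGTCACGGTTCATNA"),
}

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def complement(seq):
    """Base-by-base complement, without reversing the string."""
    return "".join(PAIR[base] for base in seq)


def sticky_end(upper, lower_3to5):
    """Return (kind of end, overhang read 5'->3') for one cassette."""
    paired = min(len(upper), len(lower_3to5))
    # Both strands are aligned at the blunt (left-hand) end.
    if complement(upper[:paired]) != lower_3to5[:paired]:
        raise ValueError("strands are not complementary")
    if len(lower_3to5) > paired:
        # The lower strand is written 3'->5', so its extra bases sit at its
        # 5' end; reverse them to read the overhang 5'->3'.
        return "5' overhang", lower_3to5[paired:][::-1]
    if len(upper) > paired:
        return "3' overhang", upper[paired:]
    return "blunt", ""


for enzyme, strands in CASSETTES.items():
    kind, overhang = sticky_end(*strands)
    print(f"{enzyme}: {kind} {overhang}")
```

Each reported overhang (AATT, AGCT, GATC, CTAG, a 3' TGCA and ANT respectively) is the single-stranded end left by the enzyme after which the cassette is named.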
Examples of some appropriate primer sequences include:
the M13 Sequencing Primer (-21)
5' d(TGT AAA ACG GCC AGT) 3', and
the M13 Sequencing Primer (-40)
5' d(GTT TTC CCA GTC ACG AC) 3'.
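As a simple illustration of what is meant by a cassette containing a universal primer sequence, the short Python sketch below looks for each primer, as printed, within the upper strand that is common to the cassettes listed earlier. The function name `locate` is an assumption made for this example, and only the upper strand is searched.

```python
CASSETTE_UPPER = "CGTTGTAAAACGGCCAGTGCCAAGT"  # upper strand of the listed cassettes
M13_MINUS_21 = "TGT AAA ACG GCC AGT".replace(" ", "")
M13_MINUS_40 = "GTT TTC CCA GTC ACG AC".replace(" ", "")


def locate(primer, strand):
    """0-based start of the primer sequence within the strand, or -1 if absent."""
    return strand.find(primer)


print(locate(M13_MINUS_21, CASSETTE_UPPER))  # 3
print(locate(M13_MINUS_40, CASSETTE_UPPER))  # -1
```

On the sequences as printed, the (-21) primer sequence begins at the fourth nucleotide of the upper strand, whereas the (-40) sequence does not occur in that strand.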
In the present method, the isolated extension product can be exponentially PCR amplified while still in the matrix support-bound state. In this case appropriate primers are repeatedly annealed to the primer annealing regions to form double stranded nucleic acids on the addition of nucleotides to the annealed primers by use of a polymerase enzyme. Then, on denaturing the formed double stranded nucleic acids, the extension products simply fall away from the polymer-bound template into the solution. These extension products could then serve as ordinary templates in further PCR cycles. The products are also easy to collect and do not need to be separated by means of gels etc.
The use of the polymer-bound DNA template for exponential PCR has the extra advantage that it can be kept for long periods under appropriate storage conditions. This allows one to return to the bound fragments at a later stage to conduct any further experiments or amplification steps.
If desired, the bound ligation product can be removed from the matrix before undergoing exponential PCR amplification.
Following their preparation, the PCR products can be sequenced by any of the existing dideoxy termination or chemical degradation techniques using radio- or fluorescently-labelled nucleotides or primers. It is known that the probability of undertaking a successful walk along a piece of genomic DNA from a target sequence into an unknown region depends on the unknown distribution pattern of suitable restriction sites (within the PCR range). Since this distribution pattern is completely unknown, it is better to choose several different restriction endonucleases to digest genomic DNA. Generally between 2 and 30 different restriction enzymes should be used to find one which produces a target fragment having a size within the PCR range. In most cases 5 different restriction enzymes possessing 6 nucleotide long recognition sites are sufficient for this purpose, for example EcoRI, HindIII, XbaI, BglII and PstI. If these restriction enzymes do not produce suitable restriction fragments, an endonuclease recognising 4 or 5 nucleotides like HinfI could be used.
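The reasoning behind using several enzymes can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than part of the described method: the toy sequence, the `max_range` value and the function names are hypothetical, and the spacing estimate assumes equiprobable, independent bases, which real genomes do not strictly obey. For each enzyme the code finds the first recognition site downstream of a primer position and keeps those enzymes whose site lies within a chosen amplifiable distance.

```python
import re

# Recognition sequences of the enzymes named in the text (N = any base).
SITES = {
    "EcoRI": "GAATTC", "HindIII": "AAGCTT", "XbaI": "TCTAGA",
    "BglII": "AGATCT", "PstI": "CTGCAG", "HinfI": "GANTC",
}


def distance_to_site(seq, primer_pos, site):
    """Bases from primer_pos to the first downstream site, or None if absent."""
    pattern = re.compile(site.replace("N", "[ACGT]"))
    match = pattern.search(seq, primer_pos)
    return None if match is None else match.start() - primer_pos


def usable_enzymes(seq, primer_pos, max_range):
    """Enzymes whose nearest downstream site lies within max_range bases."""
    hits = {}
    for enzyme, site in SITES.items():
        distance = distance_to_site(seq, primer_pos, site)
        if distance is not None and distance <= max_range:
            hits[enzyme] = distance
    return hits


def mean_spacing(site):
    """Expected site spacing for equiprobable, independent bases (idealised)."""
    fixed = sum(1 for base in site if base != "N")
    return 4 ** fixed


# Hypothetical toy sequence: a PstI site 20 bases and an EcoRI site 36 bases
# downstream of position 0, and no site for the other four enzymes.
toy = "ACGT" * 5 + "CTGCAG" + "A" * 10 + "GAATTC"
print(usable_enzymes(toy, 0, 30))   # {'PstI': 20}
print(mean_spacing("GAATTC"), mean_spacing("GANTC"))  # 4096 256
```

Under the idealised assumption, a six-nucleotide site recurs about every 4096 bases and the HinfI site about every 256 bases, which is consistent with HinfI serving as a fallback when none of the six-nucleotide cutters yields a fragment of suitable size.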
After the genomic DNA has been digested into a number of restriction fragments by, for example, a suitable combination of different restriction enzymes, a number of appropriate cassettes, each comprising specific sticky ends for the given restriction endonucleases, must then be separately ligated to the restriction fragments in a series of independent, parallel ligation reactions.
The present invention has the advantage that it is particularly well suited for undertaking a successful walk along a piece of genomic DNA from a known site into an unknown region.
In particular, the excision and ligation steps can be conducted in the same vessel. In this way, all of the multi-ligation products can be pooled (i.e. multiplexed). They can then be easily isolated at the same time or in turn (see below) before commencing the other steps of the present procedure (i.e. linear PCR amplification using labelled primers that are complementary to the target sequence followed by isolation and purification of the labelled linear PCR products on a solid support matrix and, finally, exponential PCR amplification in the presence of two primers). Therefore, using the present process, the number of reactions is reduced to a minimum.
In addition, the method is just as efficient for the simultaneous walking from different target nucleic acid fragments into a particular unknown region. This special type of "multiplexing" is performed using several specific oligonucleotide primers for the first linear PCR amplification step. Each primer will be complementary to a respective specific region within the different target fragments from which one wishes to walk into the unknown regions. Each of the primers can then be linearly extended at the same time and, also, in the same reaction tube.
These primers could carry the same separating labels. If the labels are the same, the different extension products can be isolated by use of the same support matrix. Each of the extension products can then be subjected to an exponential PCR amplification step using the cassette primer and a mixture of nested specific primers each carrying different separation labels.
After the exponential PCR amplification stage, a specific amplified fragment can be obtained and sequenced from the mixture by using a support matrix which allows specific isolation of the appropriate label. The linear extension product of interest is isolated from the mixture by its separation on a solid support matrix with an appropriate binding group thereon. It can then be subjected to exponential PCR.
It is to be appreciated that in such a multiplexing reaction the appropriate primers need not have the same attached label. This allows one to pick out specific extension products if and when desired.
Of course, however, it is clear that the differently labelled amplification products could be isolated at the same time simply by adding a mixture of support matrices (with appropriate binding groups thereon) to the reaction mixture at the same time.
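The sorting of differently labelled products onto a mixture of support matrices can be pictured as a simple lookup. The Python sketch below is purely illustrative: biotin and streptavidin are the pair named in this specification, whereas `label_X` and `binder_X` are hypothetical placeholders for any second label and its binding group, and the product names are invented.

```python
from collections import defaultdict

# Separating label -> binding group carried by the support matrix.
# Biotin/streptavidin is the pair named in the text; "label_X"/"binder_X"
# is a hypothetical placeholder for any second pair.
BINDING_GROUP = {"biotin": "streptavidin", "label_X": "binder_X"}


def isolate(products, matrices):
    """Assign each (name, label) product to the matrix that binds its label.

    Products whose label has no matching matrix in the mixture stay in
    solution and are returned under the key None.
    """
    bound = defaultdict(list)
    for name, label in products:
        group = BINDING_GROUP.get(label)
        bound[group if group in matrices else None].append(name)
    return dict(bound)


products = [("walk_1", "biotin"), ("walk_2", "label_X"), ("walk_3", "biotin")]
print(isolate(products, {"streptavidin"}))
print(isolate(products, {"streptavidin", "binder_X"}))
```

With a single matrix only the matching products are withdrawn, which corresponds to picking out one extension product; with both matrices present every labelled product is captured in one step.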
Accordingly, large sections of unknown nucleic acids can be amplified, isolated and sequenced at the same time simply by picking out each of the fragments of interest. This is particularly advantageous if there is a need to walk and sequence from many different starting points on a genome into unknown directions. Also, if the initial target fragments are overlapping, the sequence of the larger fragment, from which the fragments were excised, can be determined at the same time.
In summation, the present invention is particularly useful for a method called "multiplexing", wherein a number of different target nucleic acid fragments can be produced at the same time by the addition of a number of different restriction enzymes. Cassettes with appropriate labels can then be annealed to the known regions of the target fragments. Each of the primers can then be extended at the same time and the extension products can then be isolated by use of the same support matrix.
The present method therefore allows simultaneous exponential amplification of different specific DNA fragments by a single PCR amplification step. Specific DNA fragments can be isolated and purified out of this mixture by using different solid support and affinity binding mechanisms. Also, different labels can be used in the initial linear amplification step.
The present invention is therefore particularly useful for preparing nucleic acids and allowing genomic walking along a large section of genomic nucleic acid, e.g. human nucleic acid.
In using the present method for genomic walking, the sequences obtained from the first primer (i.e. the primer that anneals to the target sequence) and the second primer (i.e. the primer that anneals to the cassette) confirm the overlapping sequences. This gives the necessary information to enable one to sequence the unknown region and to design new primers for the next genomic walking step.
According to a second aspect of the present invention there is provided the use of a ligation product, which ligation product comprises a target fragment ligated to a cassette, in a method of genomic walking in any direction along the genomic nucleic acid, wherein the target fragment includes a first primer annealing region of known sequence and has annealed thereto a primer which has attached thereto a separating label, and wherein the cassette includes a second primer annealing region of known sequence.
The target fragment can be a fragment excised from genomic nucleic acid.
According to a third aspect of the present invention, there is provided a first kit comprising:
(a) a sample of genomic nucleic acid;
(b) means for excising a target fragment of nucleic acid having a first primer annealing region of known sequence from the genomic nucleic acid;
(c) a cassette ligatable to the excised fragment and having a second primer annealing region of known sequence;
(d) a first primer, annealable to the first primer annealing region, having attached thereto a separating label;
(e) a second primer annealable to the second primer annealing region;
(f) a support matrix having attached thereto a group cooperatively bindable to the separating label; and optionally
(g) a third primer annealable to a third primer annealing region of known sequence upstream or downstream of the first primer annealing region; and further optionally
(h) at least any one of the following: a buffer, a polymerase, a washing solution and a nucleotide solution.
Preferably, the sample of genomic nucleic acid is DNA and this can include the DNA from one or more different organisms or segments of genomic nucleic acids.
Advantageously, the excising means is a restriction enzyme or a group of different restriction enzymes, including, optionally, appropriate digestion buffers.
Preferably, the kit further comprises a number of cassettes ligatable to the excised fragment and having second primer annealing regions of known sequence.
Advantageously, the kit further comprises incubation buffer and/or a sample of T4 DNA ligase.
Preferably, the kit comprises a number of first primers, each annealable to the first primer annealing regions, and having attached thereto the same or a different separating label.
Advantageously, the kit comprises a number of second primers, each annealable to the second primer annealing regions.
The kit can include a number of support matrices, each having attached thereto the same or a different group that is cooperatively bindable to the separating labels.
Preferably, the kit further comprises a number of third primers, each annealable to the third primer annealing regions of known sequence that are situated on the target fragments, preferably located between the first and the second primer annealing regions. Optionally, the kit further comprises fourth and further primers hybridisable to fourth or further primer annealing regions of known sequence which are situated on the target fragment, between the first and second primer annealing regions.
The kit can also include at least any one of the following: a buffer for in vitro amplification, a deoxynucleotide triphosphate solution, a polymerase, light mineral oil, one or more washing solutions, and means to attach a separating label to a (or any) first oligonucleotide primer that is of specific interest to the user in application of this kit.
According to a fourth aspect of the present invention, there is provided a second kit comprising:
(a) a ligation product comprising a target fragment of genomic nucleic acid ligated to a cassette, wherein the fragment includes a first primer annealing region of known sequence and the cassette includes a second primer annealing region of known sequence;
(b) a first primer annealable to the first primer annealing region and having attached thereto a separating label;
(c) a second primer annealable to the second primer annealing region;
(d) a support matrix having attached thereto a group cooperably bindable to the separating label; and optionally
(e) a third primer annealable to a third primer annealing region of known sequence upstream of the first primer annealing region; and further optionally
(f) at least any one of the following: a buffer, a polymerase, a washing solution and a nucleotide solution.
Preferably, the second kit has a number of ligation products.
According to a fifth aspect of the present invention, there is provided a method for extending cDNA clones using the PCR amplification method of the first aspect of the invention. A cDNA clone of which only a central portion has been sequenced can be extended to both the 5' and 3' termini using specific primers hybridising to known regions of the cDNA and general oligonucleotides which are hybridisable to the termini of the cDNA.
Preferably, the method of the fifth aspect of the invention comprises the following steps:
i) synthesising double-stranded cDNA with the first strand primed with a first primer which hybridises to the poly-A tail of the mRNA;
ii) linear amplification of an aliquot of the cDNA using a primer, the primer being hybridisable to only one strand of a target cDNA, and having a separating label attached thereto;
iii) isolation of the labelled target cDNA extension product by binding the label to a support matrix having a group cooperably bindable with the label;
iv) tailing the 5' end of the PCR products resulting from the linear extension of a primer hybridising to the antisense strand with a dNTP; and
v) amplifying the unknown regions of the target cDNA between the known region and the 5' and/or the 3' terminus, by exponential amplification with an internal primer hybridising to the known region, and a general primer, which for the PCR product extended from the antisense strand-binding primer will comprise poly-dN where N is a base complementary to the dNTP used in step (iv) and for the PCR product extended from the sense strand-binding primer will comprise a primer complementary to or substantially identical to at least part of the first primer used in step (i).
The products of the PCR reaction may advantageously be visualised on an agarose gel, and directly sequenced if desired.
Preferably, the first primer hybridising to the poly-A tail may further comprise the sequence of the RACE primer (5'(ATCGATGGATCCGCGGCCGC(T)20)3'; M.A. Frohman et al. (1988), PNAS 85, p. 8998-9002). Advantageously, the linear amplification of the cDNA may proceed for between 1 and 100 cycles, preferably 50 cycles.
Preferably, the dNTP used in step (iv) is dGTP; and the poly-dN in step (v) is poly-dC. Advantageously, the general primers used in step (v) for the extension product of the antisense strand-binding primer may comprise a restriction endonuclease cleavage site attached to poly-dN. Preferably, the cleavage site may be a BamHI cleavage site. Thus the primer may be AACGAT(C)15. For the extension product of the sense strand-binding primer, the general primer may be the RACE primer, ATCGATGGATCCGCGGCCGC.
The above technique has been found to be especially effective, more so than the original RACE technique (Frohman et al., Op. Cit.).
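The primer notation used for the cDNA extension method, in which a bracketed base is followed by a repeat count, can be expanded programmatically. The Python sketch below is illustrative only. The helper names `expand` and `revcomp` are assumptions, and the enzyme names attached to the adapter hexamers and octamer (ClaI, BamHI, NotI) are an annotation added for this example on the basis of their standard recognition sequences; apart from BamHI they are not named in the specification.

```python
import re

COMP = str.maketrans("ACGT", "TGCA")


def expand(shorthand):
    """Expand repeat shorthand such as '(T)20' into the full sequence."""
    return re.sub(r"\(([ACGT]+)\)(\d+)",
                  lambda m: m.group(1) * int(m.group(2)), shorthand)


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMP)[::-1]


race_primer = expand("ATCGATGGATCCGCGGCCGC(T)20")  # first-strand primer
poly_dc_primer = expand("AACGAT(C)15")             # primer for the dG-tailed end

# The oligo-dT stretch pairs with a run of A's; the oligo-dC stretch with G's.
print(revcomp(race_primer)[:20])
print(revcomp(poly_dc_primer)[:15])

# Standard recognition sequences located in the 20-nucleotide adapter part.
ADAPTER_SITES = {"ClaI": "ATCGAT", "BamHI": "GGATCC", "NotI": "GCGGCCGC"}
adapter = race_primer[:-20]
print({name: adapter.find(site) for name, site in ADAPTER_SITES.items()})
```

The expanded RACE primer is 40 nucleotides long, and the three recognition sequences tile its 20-nucleotide adapter portion end to end.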
According to a sixth aspect of the invention there is provided a third kit, comprising:
i) reagents suitable for the synthesis of cDNA from an RNA preparation and, optionally, an RNA preparation;
ii) a first strand priming primer hybridisable to the poly-A tail of an RNA species;
iii) a primer attached to a separatable label, hybridisable to a primer annealing region on one strand of a target mRNA; and, optionally, a further primer hybridisable to the alternative strand;
iv) a support matrix having attached thereto a group cooperably bindable to the separatable label;
v) a dNTP;
vi) a general primer hybridisable to a tail of the dNTPs in item (v);
vii) a 3' end general primer hybridisable with the first strand primer of item (ii); and optionally
viii) at least one nested internal primer hybridisable to one strand of the known region of the target mRNA.
Preferably, the kit contains primers hybridisable to both cDNA strands, in order to extend the cDNA in both directions.
The present invention has several advantages. Firstly, the present method allows the linear PCR amplification of a target strand within a complex genomic nucleic acid mixture followed by an effective isolation and purification of the extended product from this mixture. Thus, only target strands can be exponentially amplified in complete isolation from any other fragment.
The method therefore introduces a way of amplifying, with a very high degree of specificity, any target sequence from any complex genomic nucleic acid mixture. This is in direct contrast with all of the other known procedures, wherein the step of exponential amplification of the desired target sequence takes place in a very complex mixture of genomic nucleic acid fragments and/or genomic restriction fragments. This naturally leads to a certain degree of non-specificity.
Secondly, the present cassette constructions are very simple and cheap to manufacture. They consist of two complementary oligonucleotides having sticky 3 or 4 nucleotide overhangs. These oligonucleotides are in the 20 to 30 bp range. They can be easily synthesised and can be used for ligation without any purification.
Thirdly, ligation of the cassette to the target nucleic acid does not lead to any of the problems encountered with the known IPCR technique (e.g. formation of concatamers). This is because the cassettes have one sticky end (nucleotide overhang) and one blunt end (i.e. an end that does not have a nucleotide overhang). This prevents the formation of concatamers. Also, there is no need for a recircularisation step because the cassette provides the required first primer annealing region.
Fourthly, the present method can be done with any preparation of a genomic nucleic acid fragment. In particular, the method is independent of the size and molecular weight of the fragment.
Fifthly, the PCR products can be easily sequenced from either end. The sequence obtained from the first primer (i.e. the primer that anneals to the primer annealing sequence in the target sequence) confirms the overlapping sequence. The sequence obtained from the second primer (i.e. the primer that anneals to the annealing sequence in the cassette) represents the last part of the unknown region and provides the necessary nucleotide information to design and synthesize newly labelled primers for a next genomic walking step.
In sequencing an amplified target nucleic acid fragment, one has the choice of applying any of the known dideoxy termination or degradation sequencing techniques. In doing so, one can use radio-labelled or fluorescently labelled primers or terminators. The present method is very well suited for, and is compatible with, any of the known sequencing methods including enzymatic and chemical fluorescent procedures which allow automated on-line detection of the nucleotide sequence during electrophoresis.
This is an important advantage because direct sequencing of PCR products is not always simple and straightforward. For instance, noncoding PCR templates from spacer or intron regions often have high GC or AT content and dideoxy sequencing techniques (even using Taq DNA polymerase) do not always allow one to determine sequences unambiguously. In these cases, chemical degradation techniques have to be applied.
Since the cassette contains the universal primer sequence (e.g. a M13 sequencing consensus sequence), commercially available primers including fluorescently labelled primers can be used to sequence the first 300 to 500 bp from the cassette into the unknown region. If the amplified target fragment is quite large in size it cannot be sequenced in one go from both ends (i.e. from the first and second primer annealing region) and "walking primers" have to be synthesized and used for the DNA sequencing.
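A rough estimate of how many walking primers a large fragment requires follows from simple arithmetic. The Python sketch below is only an approximation under stated assumptions: the default read length of 400 bp is taken as the midpoint of the 300 to 500 bp range given above, the 50 bp overlap between successive reads is a hypothetical value, and the function name is invented for this example.

```python
import math


def walking_primers_needed(fragment_len, read_len=400, overlap=50):
    """Estimate the walking primers needed after reading in from both ends.

    The two end reads (from the first and second primer annealing regions)
    each cover read_len bases. Every further read starts `overlap` bases
    inside already known sequence, so it adds read_len - overlap new bases.
    """
    if not 0 <= overlap < read_len:
        raise ValueError("overlap must be non-negative and smaller than read_len")
    gap = fragment_len - 2 * read_len
    if gap <= 0:
        return 0  # the two end reads already meet
    return math.ceil(gap / (read_len - overlap))


print(walking_primers_needed(700))    # 0
print(walking_primers_needed(2000))   # 4
```

On these assumptions a 700 bp product is covered by the two end reads alone, whereas a 2 kb product leaves a 1200 bp gap that needs four further primers.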
As mentioned above, an important advantage of the present technique is that it, unlike the earlier methods, allows different samples to be pooled and processed simultaneously through the steps of linear amplification, isolation, purification and, finally, exponential amplification all at the same time and in the same reaction tube.
This "multiplex" strategy is based on digesting genomic nucleic acid with different restriction endonucleases, ligating appropriate cassettes to the produced restriction fragments, pooling the ligation products and processing them in a simultaneous fashion through all of the subsequent amplification, isolation and purification steps.
The efficiency of the method can still further be increased by using a number of different, specific primers that anneal to different primer annealing sequences on the target fragment. These primers can carry different separating labels for the exponential PCR step. The labelled amplified target fragments can then be isolated and purified from this mixture by using solid supports with appropriate binding groups. Multiplexing large numbers of different samples is not possible by using any of the earlier methods.
A further advantage of the new technique is that any or a number of restriction endonucleases can be used to cut the genomic DNA into a number of different fragments. The endonuclease can be chosen so that it cuts the genomic DNA within a region of known sequence and within an adjacent region of unknown sequence. The cassettes can be designed so that they can anneal to the restriction endonuclease cut ends, as well as including the desired primer annealing regions.
The present method has an even further advantage that it eliminates or reduces the possibility of amplifying unwanted fragments. For example, if the specific primer exhibits non-specific hybridisation within a complex genomic DNA mixture (which can also happen with each of the aforementioned methods) one will get some nonspecific binding of certain DNA fragments on the solid support during separation. This will lead to a mixture of several different DNA fragments. However, these impurities will be present in quantities that are much lower than those described in each of the earlier methods. Moreover, the stringent separation conditions in the present method (e.g. high salt, alkaline conditions) will reduce the content of the unwanted DNA fragments even further before the exponential PCR amplification steps take place.
Using the new method, we have successfully walked along the nematode unc 31 gene contained on a yeast artificial chromosome (YAC) clone within the background of total yeast DNA (see discussion below and in particular figure 4).
We have also used the technique to walk along total human DNA from exon 51 of the Duchenne Muscular Dystrophy (DMD) gene into the adjacent intron and within this intron itself (see discussion below and in particular figure 5).
Furthermore, we have used the method including the reamplification step to extend the human microclone M54 in both directions, deriving new sequence data from the human CAM-L1 gene (see figure 6). In all cases we have successfully elongated the known locus by several kilo base pairs (kb). Numerous control walks have also been made within known (already cloned and sequenced) parts of the nematode unc31 and the human DMD gene. In all cases, each round of cassette-mediated PCR walking was performed using multiple restriction enzymes (EcoRI, HindIII, XbaI, BglII, PstI and HinfI) and appropriate oligonucleotide cassettes for ligation.
Brief Description of the Drawings
Three specific embodiments of the present invention will now be described with reference to the accompanying drawings, in which:
Figure 1 is a general scheme of one use of the present cassette-mediated PCR technique, namely the exponential amplification of an unknown nucleic acid sequence within a gene;
Figure 2 is a schematic diagram of the method of the invention comprising the step of reamplification of a sample of the amplified mixture using a nested third primer;
Figure 3 is a general scheme portraying the application of the method of the invention to cDNA clone extension;
Figure 4 is a representation of the result of a successful walk of a 1 kb nucleic acid sequence within the nematode unc 31 gene contained in a YAC clone within total yeast genomic DNA, following the general scheme of figure 1;
Figure 5 is a representation of the result of a successful walk of an approximately 600 bp nucleic acid fragment within the Duchenne Muscular Dystrophy (DMD) gene extending the known nucleotide sequence of intron 50 by about 400 bp using total human genomic DNA, following the general scheme of figure 1; and
Figure 6 is a representation of the result of PCR walks extending human microclone M54 (Mackinnon et al., 1990 Am. J. Hum. Genet. 47, 181-186) by 700 bp in either direction, following the general scheme of figure 2.
Detailed Description of the Embodiments
In the general scheme of figure 1, the symbol R represents a restriction site, the symbol KL represents a known locus, the symbol UKL represents an unknown locus, the symbol ds NS represents a double stranded nucleotide sequence (such as genomic DNA), the symbol ss NS represents a single stranded nucleotide sequence, the symbol OC represents an oligo-cassette, the symbol B represents a biotin labelled specific primer, the symbol SB represents a streptavidin coated bead, the symbols OHC and OHR represent nucleotide overhangs, and the symbols P1 and P2 represent appropriate primers for exponential amplification by PCR.
Following the general scheme of figure 1, a target fragment of nucleic acid (ds NS) is first excised from a larger sequence at restriction sites (R) by the use of an appropriate restriction enzyme (see step 1).
Next, oligo-cassettes are ligated to the ends of the excised fragment (see step 2). Each cassette (OC) has a blunt end (E) and a nucleotide overhang (OHC) at its other end. The overhang (OHC) is complementary to the nucleotide overhang (OHR) that is created when the ds NS is cleaved at the restriction sites (R), thus enabling the cassette (OC) to be ligated to the ds NS.
In order to enable the required nucleic acid sequence to be isolated in a simple and straightforward manner, linear PCR amplification of the ligated cassette-target fragment is conducted using a specific biotin-labelled primer (B) (see step 3).
The products of the linear PCR amplification step are then isolated by, for example, admixing the reaction mixture with streptavidin-coated magnetic beads (SB) (see step 4). The biotin-labelled PCR products (from step 3) selectively bind to the streptavidin-coated magnetic beads (SB). The products are thus easily separated.
Next, the separated biotin-labelled PCR products, while still bound to the coated magnetic beads, are denatured and then exponentially amplified by the PCR technique using the two appropriate primers (P1, P2) (see step 5).
It is important to realise that P2 need not have the same sequence as B. If it does not, then a further specificity is introduced into the scheme, wherein it is ensured that only the fragment of interest is subjected to PCR amplification. The double stranded nucleic acid products can then be sequenced by any standard sequencing technique (see step 6).
Once the products have been sequenced the neighbouring fragment of nucleic acid can be isolated and sequenced by repeating the above steps.
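The scheme of figure 1 can be followed on paper with a small string model. The Python sketch below is a simplified illustration under stated assumptions, not a description of the laboratory procedure: only the newly copied strand is modelled, annealing is treated as exact string matching, the EcoRI cassette and M13 (-21) primer sequences are taken from the listings in this specification, and the genome, the first primer and the nested primer are hypothetical toy sequences. The function names are likewise invented for this example.

```python
COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMP)[::-1]


CASSETTE_UPPER = "CGTTGTAAAACGGCCAGTGCCAAGT"  # EcoRI cassette, upper strand
CASSETTE_PRIMER = "TGTAAAACGGCCAGT"           # M13 (-21) primer as listed
ECORI = "GAATTC"                              # cleaved as G^AATTC


def linear_extension(genome, first_primer):
    """Steps 1 to 3: strand copied from the labelled primer through the cassette."""
    start = genome.find(first_primer)
    if start < 0:
        raise ValueError("first primer does not anneal")
    site = genome.find(ECORI, start + len(first_primer))
    if site < 0:
        raise ValueError("no EcoRI site downstream of the primer")
    # Fragment up to the cut, the copied AATT overhang, then the copy of the
    # cassette strand ligated to the fragment.
    return genome[start:site + 1] + "AATT" + revcomp(CASSETTE_UPPER)


def exponential_product(template, target_primer, cassette_primer=CASSETTE_PRIMER):
    """Step 5: region bounded by the target-specific and cassette primers."""
    left = template.find(target_primer)
    right = template.find(revcomp(cassette_primer))
    if left < 0 or right < left:
        raise ValueError("primer pair does not bracket a product")
    return template[left:right + len(cassette_primer)]


# Hypothetical toy sequences.
first_primer = "GATTACAGGCTTACC"   # plays the role of the labelled primer B
nested = "CCTGAGTTCA"              # plays the role of a target-specific P2
unknown = "TTTGGGCCCAAATTT"        # region to be discovered
genome = "AAAC" + first_primer + "GG" + nested + unknown + ECORI + "GGGTTT"

template = linear_extension(genome, first_primer)
product = exponential_product(template, nested)
print(product)
print(len(product))  # 52
```

In this toy case the product starts at the nested primer, spans the unknown region up to the restriction site and ends at the cassette primer site, so sequencing it from the cassette end reads directly into the unknown region.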
In Figure 2, a scheme similar to that in figure 1 is shown, but comprising an added reamplification step. In Figure 2 primer B is the nested primer, while primer A is the biotinylated first primer and primer C is the cassette-hybridising primer.
Linear amplification is allowed to proceed as for the method of Figure 1. The DNA sequences of interest are isolated on streptavidin-coated beads, and exponentially amplified using primers A and C. An aliquot of the product of the exponential amplification is then re-amplified using nested primer B and cassette primer C. The use of a nested primer, which is complementary to a portion of the known sequence, adds a further level of specificity and improves the purity of the final product. This product can be sequenced directly without the need for cloning.
Figure 3 is a schematic representation of the application of the method of the invention to the extension of cDNA clones.
Step I comprises the synthesis of double stranded cDNA, the first strand having been primed with the RACE primer, 5'(ATCGATGGATCCGCGGCCGC(T)20)3', which hybridises to the poly-A tail of the mRNA. The region of the cDNA shown shaded black is the region whose sequence is known, while the unshaded regions represent unknown cDNA sequences. Primers A and B, which are biotin-labelled, and C and D are constructed complementary to regions of the known cDNA sequence as shown.
In step II, the cDNA is split into two aliquots and linearly amplified for 50 cycles using only the biotinylated primers A or B, as shown. Each primer hybridises to a different strand of the cDNA.
Step III involves the isolation of the biotinylated product on streptavidin beads.
Step IV is carried out only on the aliquot which has been amplified using the primer hybridised to the antisense strand of the cDNA. The 5' end of the resulting extension product is tailed with dGTP using terminal transferase.
This allows hybridisation of the 5' end of the strand with a poly-dC primer.
Step V is exponential amplification of the two cDNA populations using two primers. Primers C and D, which are nested within the biotinylated primers A and B and the terminal primers, are used together with a poly-dC terminal primer for the antisense strand product and a RACE primer for the sense strand product. The products of the exponential amplification may be directly sequenced and cloned if necessary.
As stated above, figure 4 records the result of a successful walk of 1 kb within the nematode unc31 gene contained in a YAC clone, and figure 5 records the result of a successful walk along a segment of total human DNA from exon 51 of the Duchenne Muscular Dystrophy (DMD) gene into the adjacent intron and within this intron itself.
Figure 6 records the results of a bidirectional walk of 700 bp in each direction from human microclone M54.
Total human genomic DNA was digested in parallel with five different restriction enzymes: EcoRI (E), HindIII (H), XbaI (X), BglII (B) and PstI (P). Appropriate oligo-cassettes were ligated to the ends of all restriction fragments and PCR walking was carried out in parallel as described in figure 2 using two pairs of M54-specific oligonucleotides. After exponential amplification (step 4 in figure 2) an aliquot of each PCR product was analysed by electrophoresis using a 1% agarose gel.
Lanes P in panels A and B show two PCR products which extend the microclone M54 at both ends by approximately 700 bp towards two PstI sites. Lanes E and H in panel B also show other PCR products which are caused by hybridisation of the biotinylated primers to similar regions in the genome.
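The dependence of the walk length on the position of the nearest restriction site can be illustrated with a short sketch. The Python below is an illustration only and not part of the described method: the recognition sequences are the standard ones for the five enzymes, while the template, the primer `toy_primer`, the function name `walk_lengths` and the `max_len` cut-off are hypothetical. The length contributed by the ligated cassette is ignored.

```python
# Standard recognition sequences of the five enzymes used in the parallel digests.
SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "XbaI": "TCTAGA",
    "BglII": "AGATCT",
    "PstI": "CTGCAG",
}


def walk_lengths(genome: str, primer: str, max_len: int = 2000) -> dict:
    """Approximate product length from the primer to the first downstream site.

    An enzyme maps to None when it has no site within max_len bases
    (max_len is a hypothetical limit on what can be amplified).
    """
    start = genome.find(primer)
    if start < 0:
        raise ValueError("primer not found in template")
    result = {}
    for enzyme, site in SITES.items():
        pos = genome.find(site, start + len(primer))
        if pos < 0:
            result[enzyme] = None
            continue
        dist = pos + len(site) - start
        result[enzyme] = dist if dist <= max_len else None
    return result


# Toy template: invented primer and filler, with a PstI site close to the
# primer and an EcoRI site far away. For illustration only.
toy_primer = "ACGTTGCAAGGCTTACCGTA"
toy_genome = "GGGG" + toy_primer + "AT" * 340 + "CTGCAG" + "AT" * 1500 + "GAATTC"
lengths = walk_lengths(toy_genome, toy_primer)
for enzyme, length in lengths.items():
    print(enzyme, length)
```

In this toy case only the PstI digest gives a product within the assumed limit, which mirrors the situation in which a single lane shows a band.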
The experimental details will now be discussed.
Experimental Details
Figures 4 and 5
A sample of total genomic DNA was prepared. The total genomic DNA contained yeast genomic DNA (a recombinant YAC with the unc31 gene), nematode genomic DNA and human genomic DNA. 250-500 ng of the total genomic DNA was digested to completion in a 20 μl solution with the six restriction enzymes EcoRI, HindIII, XbaI, BglII, PstI and HinfI, which were then inactivated by heating (equivalent to step 1 in figure 1).
Half of the digested DNA (125-250 ng) was ligated either to 5 pmol of the appropriate EcoRI, HindIII, XbaI, BglII or PstI oligo-cassettes or to 50 pmol of a HinfI oligo-cassette in a total volume of 20 μl (i.e. 10 μl of the appropriate digested genomic DNA, 2 μl of 10 times T4 DNA ligase buffer containing 200 mM Tris/HCl pH 7.4, 100 mM MgCl2 and 100 mM DTT, 2 μl of 6 mM rATP, 4 μl water and 1 μl of the appropriate oligo-cassette at a concentration of 5 or 50 pmol/μl) (equivalent to step 2 in figure 1). Each of the double-stranded cassettes was 28 nucleotides long and had an appropriate 4 nucleotide overhang (EcoRI, HindIII, XbaI, BglII and PstI) or a 3 nucleotide (HinfI) overhang. The cassettes were prepared from crude oligonucleotides by mixing together equimolar amounts of both oligonucleotides representing the upper and lower strands of the cassette, heating the mixture to 80°C for 10 min and slowly cooling the solution to room temperature over a period of 30 minutes. After ligation, the reaction volume was diluted with water to 100 μl and heated to 75°C for 10 min to heat-inactivate the T4 DNA ligase.
Next, 1 μl (1.25 to 2.5 ng) of the ligated product was amplified by linear PCR steps using a specific primer having a biotinylated 5'-end.
One of the walks along the nematode unc31 gene (figure 4) was performed using a 24 nucleotide long specific primer having the following sequence:
5' Biotin-d (CGT TTC GCC CGA TAC AAT AAC AAT) 3'.
In the case of the elongation of the DMD intron 50 region within total human DNA (figure 5), a 24 nucleotide primer was used having the following sequence:
5' Biotin-d (CAG CTG GGT TAT CAG AGG TGA GTG) 3'
(The addition of the labelled primers is equivalent to step 3 in figure 1.)
The linear PCR step was carried out in 1 x PCR buffer (10 mM Tris HCl, pH 8.3 at 25°C; 50 mM KCl; 1.5 mM MgCl2; 0.01% gelatin) (Cetus) with 250 μM dNTPs, 0.5 μM specific biotinylated primer and 2.5 units Taq polymerase (Cetus). The PCR programme was 50 cycles of 95°C for 0.5 minutes, 55°C for 1 minute, and 72°C for 1 minute.
The biotin-labelled products were then isolated by mixing them with 25μl of washed streptavidin-coated beads (Dynal S.A.) (equivalent to step 4 in figure 1).
Following an incubation period of 15 minutes at room temperature, the beads were washed three times with 40μl of 1 M NaCl in TE buffer followed by three washes with 1 x TE buffer. After each washing stage, the supernatants were carefully removed.
The bead-bound DNA was then denatured by heat and subjected to exponential PCR amplification (equivalent to step 5 in figure 1) . The conditions for exponential PCR amplification were similar to those for the linear PCR steps except that 35 cycles were carried out (instead of 50) and that two primers were present (instead of one) , each of which was unbiotinylated.
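The difference between the linear step (one primer, 50 cycles) and the exponential step (two primers, 35 cycles) can be made concrete with a small calculation. The following Python is a sketch under idealised assumptions: every template is extended in every cycle, and the `efficiency` parameter and both function names are hypothetical, not taken from the description.

```python
def linear_copies(templates: float, cycles: int) -> float:
    """One primer: each cycle copies only the original templates."""
    return templates * cycles


def exponential_copies(templates: float, cycles: int, efficiency: float = 1.0) -> float:
    """Two primers: products of earlier cycles serve as templates in later ones."""
    return templates * (1.0 + efficiency) ** cycles


# Cycle numbers as given for the linear and exponential steps.
per_template_linear = linear_copies(1, 50)
per_template_exponential = exponential_copies(1, 35)
print(per_template_linear, per_template_exponential)
```

Under these assumptions the linear step yields at most 50 labelled strands per template, so the bulk of the material comes from the exponential step, and the linear step serves mainly to attach the separating label to locus-specific strands.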
In the case of walking along the nematode unc31 gene, the following primers were used: a 21 bp long universal primer complementary to a part of the second primer annealing region within the cassette having the sequence of: 5' d(CGT TGT AAA ACG GCC AGT) 3'
and either an unlabelled 24 nucleotide nematode unc31- specific primer complementary to a part of the first primer annealing region within the target fragment having the sequence:
5' d(CAG CTG GGT TAT CAG AGG TGA GTG) 3'
or a nested 24 nucleotide nematode unc31-specific primer complementary to a third primer annealing region within the target fragment having the sequence:
5' d(CTA CTC GAA TTG CTA TCC TAA TCT) 3'
In the case of walking along the human DMD intron 50, the following primers were used: a 21 bp long universal primer complementary to a part of the second primer annealing region within the cassette having the sequence:
5' d(CGT TGT AAA ACG GCC AGT) 3'
and a nested DMD intron 50-specific unlabelled 24 nucleotide primer complementary to a third primer annealing region within the target sequence having the sequence:
5' d(GAG ACT CAC ACT GGA CAA CCA GTG) 3'.
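As a simple consistency check on the locus-specific primers quoted above, their lengths and base composition can be tabulated. The Python below is a minimal sketch: the sequences are copied from the text, while the dictionary labels and helper names (`clean`, `gc_fraction`) are chosen here for illustration only.

```python
def clean(seq: str) -> str:
    """Remove the triplet spacing used in the text."""
    return seq.replace(" ", "").upper()


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a primer."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


# Locus-specific primers as quoted in the experimental details.
PRIMERS = {
    "unc31 biotinylated": "CGT TTC GCC CGA TAC AAT AAC AAT",
    "DMD intron 50 biotinylated": "CAG CTG GGT TAT CAG AGG TGA GTG",
    "unc31 nested": "CTA CTC GAA TTG CTA TCC TAA TCT",
    "DMD intron 50 nested": "GAG ACT CAC ACT GGA CAA CCA GTG",
}

summary = {
    name: (len(clean(seq)), round(gc_fraction(seq), 2))
    for name, seq in PRIMERS.items()
}
for name, (length, gc) in summary.items():
    print(f"{name}: {length} nt, GC fraction {gc}")
```

All four sequences come out at the stated length of 24 nucleotides.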
The PCR amplification products were then separated on 1% low melting point (LMP) agarose, isolated and sequenced (equivalent to step 6) .
The sequencing was performed by both a standard chemical degradation and by a dideoxy termination technique using radioactive and/or fluorescent labels.
Figure 6
The basic principle of the method is outlined in figure 2.
It involves 5 steps including
1. Linear amplification of the desired DNA fragment using a biotinylated primer complementary to the known locus;
2. Isolation of biotinylated specific PCR products by separation using streptavidin-coated magnetic beads;
3. Exponential amplification of the isolated polymer-bound specific DNA fragments using two primers, a cassette-specific primer and the locus-specific primer but without biotin (primer A, figure 2);
4. Re-amplification of the desired fragment from a small aliquot of the PCR mixture from step 3 using the cassette-specific primer and a second locus-specific primer which lies internal to the first locus-specific primer (primer B, figure 2);
5. Direct sequencing of the PCR product.
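The relationship between the products of steps 3 and 4 can be sketched with a toy template. The Python below is illustrative only: the cassette-specific primer is the sequence quoted in this description, but `primer_a`, `primer_b`, the spacer, the "unknown" region and the function names are invented for the example. The template is written as the strand that carries the locus-specific primer sequences, so the cassette primer site appears on it as a reverse complement.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon(top_strand: str, forward: str, reverse: str) -> str:
    """Region bounded by a forward primer and the site of a reverse primer."""
    start = top_strand.find(forward)
    if start < 0:
        raise ValueError("forward primer site not found")
    site = top_strand.find(revcomp(reverse), start + len(forward))
    if site < 0:
        raise ValueError("reverse primer site not found")
    return top_strand[start:site + len(reverse)]


PRIMER_C = "TGTAAAACGACGGCCAGTGCC"   # cassette-specific primer from the text
primer_a = "ACGTTGCAAGGCTTACCGTA"    # hypothetical first locus-specific primer
primer_b = "GGATTCACCGTTAGCAATCG"    # hypothetical nested primer
unknown = "AT" * 100                 # stands in for the unknown flanking region

top = "TTTT" + primer_a + "CCAA" * 5 + primer_b + unknown + revcomp(PRIMER_C) + "GG"

step3_product = amplicon(top, primer_a, PRIMER_C)   # primers A and C
step4_product = amplicon(top, primer_b, PRIMER_C)   # primers B and C
print(len(step3_product), len(step4_product))
```

The nested product is contained within the first product and is shorter by the distance between the two locus-specific primers, while still spanning the whole of the unknown region.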
For ligation to restriction fragments possessing 5' overhangs we have used a double-stranded oligonucleotide cassette 28 nucleotides long having an additional 4 nucleotide overhang. The "upper" oligonucleotide is the same for all cassette constructions and contains the (-21) M13 primer sequence: 5' d(CGT TGT AAA ACG GCC AGT GCC AAG T) 3'.
The "lower" oligonucleotides were synthesised for the restriction enzymes EcoRI, HindIII, XbaI and BglII and their nucleotide sequences are:
5' d(AAT TAC TTG GCA CTG GCC GTC GTT TTA CAA CG) 3' EcoRI
5' d(AGC TAC TTG GCA CTG GCC GTC CTT TAA CAA CG) 3' HindIII
5' d(CTA GAC TTG GCA CTG GCC GTC GTT TAA CAA CG) 3' XbaI
5' d(GAT CAC TTG GCA CTG GCC GTC GTT TTA CAA CG) 3' BglII
(the overhang, underlined in the original, comprises the first four nucleotides of each sequence).
The lower oligonucleotides are not phosphorylated and therefore are not covalently bound to the restriction fragment during ligation. For ligation to restriction fragments possessing 3' overhangs we have used a double-stranded oligonucleotide cassette 28 nucleotides long having a 4 nucleotide overhang. The "lower" oligonucleotide is the same for all cassette constructions and contains the complementary (-21) M13 primer sequence:
5' d(ACT TGG CAC TGG CCG TCG TTT TAC AAC G) 3'. The "upper" oligonucleotide was synthesized for the restriction enzyme PstI and its sequence is: 5' d(CGT TGT AAA ACG ACG GCC AGT GCC AAG TTG CA) 3'.
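The pairing of the two strands of the PstI cassette can be checked by taking the reverse complement of the common lower oligonucleotide and comparing it with the PstI upper oligonucleotide. The Python below is a small sketch using the two sequences quoted immediately above; the variable and function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences as quoted for the 3'-overhang (PstI) cassette.
lower_common = "ACT TGG CAC TGG CCG TCG TTT TAC AAC G".replace(" ", "")
pst_upper = "CGT TGT AAA ACG ACG GCC AGT GCC AAG TTG CA".replace(" ", "")

# The strand that pairs with the lower oligonucleotide, read 5' to 3'.
duplex_top = revcomp(lower_common)

# The upper oligonucleotide should begin with that strand; whatever is left
# over at its 3' end is the single-stranded overhang.
paired = pst_upper.startswith(duplex_top)
overhang = pst_upper[len(duplex_top):]
print(len(duplex_top), paired, overhang)
```

With these sequences the duplex portion is 28 nucleotides long and the remaining four nucleotides form the 3' overhang, in agreement with the dimensions stated for the cassette.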
EcoRI, HindIII, XbaI, BglII and PstI oligo-cassettes are prepared at 5 pmol/μl concentration by dissolving approximately 500 pmol of the upper and the respective lower oligonucleotide in 100 μl water. The cassettes are heated to 80°C for 5 min and then slowly cooled down to RT over a period of 30 min before being used in the ligation reactions. The cassettes are stored at -20°C and thawed on ice before use.
Ten μl of digested genomic DNA (approximately 100 ng) were combined with 2 μl 10 x ligation buffer, 2 μl 10 mM ATP, 1 μl of the EcoRI, HindIII, XbaI, BglII, or PstI oligo-cassette (5 pmol/μl), 4 μl water and 1 μl T4 DNA ligase (1 unit). The mixtures were incubated overnight at 16°C. 80 μl of water were added and the mixture heated for 10 min at 70°C to destroy the ligase. The cassette-ligated DNA was aliquoted and stored at -20°C. This represents a stock for over 100 walking reactions.
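The volumes and amounts in the ligation step can be cross-checked with a few lines of arithmetic. The Python below is a sketch based only on the figures quoted above; the dictionary keys and variable names are illustrative, and the 1 μl used per walking reaction is taken from the linear PCR recipe.

```python
# Ligation mixture, volumes in microlitres as listed in the text.
ligation_ul = {
    "digested genomic DNA (about 100 ng)": 10.0,
    "10x ligation buffer": 2.0,
    "10 mM ATP": 2.0,
    "oligo-cassette (5 pmol/ul)": 1.0,
    "water": 4.0,
    "T4 DNA ligase": 1.0,
}

reaction_ul = sum(ligation_ul.values())          # ligation volume
stock_ul = reaction_ul + 80.0                    # after adding 80 ul water

atp_mM = 10.0 * ligation_ul["10 mM ATP"] / reaction_ul   # ATP in the ligation
dna_ng_per_ul = 100.0 / stock_ul                 # cassette-ligated DNA stock
aliquots = stock_ul / 1.0                        # 1 ul used per walking reaction

print(reaction_ul, stock_ul, atp_mM, dna_ng_per_ul, aliquots)
```

The diluted stock works out at about 1 ng/μl, which is consistent with the 1 μl (1 ng) of cassette-ligated DNA used per linear PCR, and at roughly one hundred 1 μl aliquots.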
Linear PCR was performed in 20 μl using 0.5 ml test tubes.
The following items were added: 13.5 μl water, 2.0 μl 10 x PCR buffer (Cetus), 2.0 μl 2.5 mM dNTP mix, 1.0 μl cassette-ligated DNA (1 ng), 1.0 μl 10 μM biotin-labelled locus-specific primer, and 0.5 μl Taq polymerase (2.5 units). After overlaying with light mineral oil the following cycles were performed: 95°C 90 s, [95°C 30 s, 55°C 60 s, 72°C 60 s] x 50, 72°C 180 s. All cycles were performed using the maximum heating and cooling rates possible with the Techne PHC-1 or PHC-2.
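The final concentrations in the 20 μl linear PCR follow directly from the stock concentrations and volumes listed above. The Python below is a minimal sketch of that calculation; the helper name `final_conc` and the dictionary labels are illustrative.

```python
# Linear PCR mixture, volumes in microlitres as listed in the text.
linear_pcr_ul = {
    "water": 13.5,
    "10x PCR buffer": 2.0,
    "2.5 mM dNTP mix": 2.0,
    "cassette-ligated DNA": 1.0,
    "10 uM biotinylated primer": 1.0,
    "Taq polymerase": 0.5,
}
total_ul = sum(linear_pcr_ul.values())


def final_conc(stock: float, volume_ul: float, total: float) -> float:
    """Concentration after dilution, in the same unit as the stock."""
    return stock * volume_ul / total


dntp_uM = final_conc(2500.0, linear_pcr_ul["2.5 mM dNTP mix"], total_ul)
primer_uM = final_conc(10.0, linear_pcr_ul["10 uM biotinylated primer"], total_ul)
buffer_x = final_conc(10.0, linear_pcr_ul["10x PCR buffer"], total_ul)

print(total_ul, dntp_uM, primer_uM, buffer_x)
```

The result, 250 μM dNTPs and 0.5 μM primer in 1 x buffer, agrees with the conditions given for the linear PCR step in the experiments of figures 4 and 5.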
The biotinylated product was recovered and purified by the addition of 25 μl beads directly to the PCR mixture. After incubation at room temperature with occasional mixing, the beads were sedimented with a strong magnet. The supernatant and the oil were removed and the beads washed 3 times with 40 μl TE/0.1 M NaCl and 3 times with TE. The beads were finally resuspended in 4.5 μl H2O.
Two primers were used to amplify the single-stranded template bound to the beads: the locus-specific primer from step 1 of the method but without biotin (primer A, figure 2) and the cassette-specific primer (primer C, figure 2) with the following sequence: d(TGT AAA ACG ACG GCC AGT GCC) containing the M13 universal forward primer sequence. Exponential PCR is performed in 20 μl comprising 9.0 μl water, 2.0 μl 10 x PCR buffer, 2.0 μl 2.5 mM dNTP mix, 1.0 μl 10 μM cassette-specific primer, 1.0 μl 10 μM locus-specific primer, 4.5 μl bead-bound DNA template and 0.5 μl Taq polymerase. After overlaying with light mineral oil the following cycles were performed: 95°C 90 s, [95°C 30 s, 55°C 60 s, 72°C 60 s] x 35, 72°C 180 s.
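The two cycling programmes differ only in the number of cycles, and their programmed hold times can be compared directly. The Python below is a sketch that encodes each programme as (temperature in °C, seconds) pairs; the function name is illustrative, and ramping between temperatures, which depends on the thermal cycler, is not included.

```python
def hold_seconds(initial, cycle, n_cycles, final):
    """Total programmed hold time, excluding temperature ramps."""
    return initial[1] + n_cycles * sum(seconds for _, seconds in cycle) + final[1]


# (temperature in degrees C, hold time in seconds), as given in the text.
INITIAL = (95, 90)
CYCLE = [(95, 30), (55, 60), (72, 60)]
FINAL = (72, 180)

linear_s = hold_seconds(INITIAL, CYCLE, 50, FINAL)        # linear PCR, 50 cycles
exponential_s = hold_seconds(INITIAL, CYCLE, 35, FINAL)   # exponential PCR, 35 cycles

print(linear_s / 60, exponential_s / 60)   # minutes
```

On these figures the linear step accounts for about 130 minutes of hold time and the exponential step for about 92 minutes.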
In order to isolate the extended product with sufficient purity and in adequate quantity for direct sequencing, a fraction (1 μl from a 1 in 50 dilution) of the first exponential amplification was reamplified using a nested locus specific primer and the cassette primer (primers B and C, figure 2) . Reaction conditions were similar to those described in the exponential amplification step.
Prior to direct sequencing, the PCR product was purified by gel electrophoresis using LMP agarose, in order to remove excess nucleotides and primers, as well as minor DNA contaminants. The DNA can be recovered using a Qiagen gel extraction kit, or a Gene Clean II or Mermaid kit (Bio 101). Sequencing of PCR products was performed by linear amplification sequencing using Taq polymerase and the fluorescent dye terminators from Applied Biosystems (Taq DyeDeoxy™ Terminator Cycle Sequencing Kit). After cycling, the fluorescent products were purified by G50 spin columns, lyophilized and loaded into a single lane of a 373A sequencer (Applied Biosystems). Reliable sequence information can be obtained in most cases from both ends of the PCR product using the M13 (-21) sequencing primer and the nested locus-specific primer. Walks along the PCR product were performed easily using 20mer synthetic oligonucleotides. The ends of each PCR product, including the primer and cassette sequence, were determined by solid-phase chemical degradation using radio-labelled, reamplified PCR products (16).
cDNA was synthesised from 0.1 μg of human fetal brain mRNA using AMV reverse transcriptase (Anglian Biotechnology, Colchester UK) under standard conditions but using the RACE-oligo-dT primer: 5' ATCGATGGATCCGCGGCCGCTTTTTTTTTTTTTTTTTTTT 3'.
The reverse transcriptase was heat-inactivated and the cDNA diluted to 100 μl. A portion of this single stranded cDNA (1 μl) was then used for PCR.
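The structure of the RACE-oligo-dT primer, an adapter followed by an oligo-dT tract, can be reproduced from the (T)20 notation used elsewhere in this description. The Python below is a sketch; the list of recognition sequences is an addition made here from standard enzyme specificities, since the text does not state which restriction sites the adapter was designed to contain.

```python
ADAPTER = "ATCGATGGATCCGCGGCCGC"   # adapter part of the RACE primer
race_primer = ADAPTER + "T" * 20   # adapter followed by (T)20

# Standard recognition sequences; which of these the adapter was designed
# around is not stated in the text, so this list is an assumption.
CANDIDATE_SITES = {
    "ClaI": "ATCGAT",
    "BamHI": "GGATCC",
    "SacII": "CCGCGG",
    "NotI": "GCGGCCGC",
}

# Zero-based position of each site within the adapter (-1 if absent).
found = {name: ADAPTER.find(site) for name, site in CANDIDATE_SITES.items()}

# The oligo-dT tract pairs with a run of A residues in the poly-A tail.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
poly_a_target = race_primer[len(ADAPTER):].translate(COMPLEMENT)[::-1]

print(len(race_primer), found, poly_a_target)
```

The adapter portion does not pair with the mRNA; it provides a known sequence at the end of the cDNA to which a RACE primer can later anneal.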
For PCR amplification in a total of 50 μl the following reaction constituents were combined: 32.5 μl water, 5 μl 10 x PCR buffer (Cetus), 5 μl 2.5 mM dNTP mix, 2.5 μl of each primer 5' (TTT GTC GAC. and primer 5' d(TTT GTC GAC) (the underlined portions representing a tail containing a SalI site), 2 μl human fetal brain cDNA (2 ng), and 1 μl (5 units) Taq polymerase (Cetus). The mixture was overlaid with 40 μl of light mineral oil. Standard PCR cycles were 95°C 60 s, [95°C 30 s, 55°C 30 s, 72°C 180 s] x 40. 5 μl of the PCR product were sized on a 1% agarose gel. The remaining 45 μl were passed through a Stratagene PrimeErase™ column according to the manufacturer's recommendation in order to remove excess nucleotides, primers and polymerase, digested with SalI and subsequently cloned into pUC18. Recombinant colonies having the correct insert size of about 2.4 kb were identified using either a shortened miniprep procedure (17) or by PCR amplification in a microtiter dish using universal forward and reverse primers and DNA prepared from a 200 ml culture according to the Qiagen protocol. Sequencing of double-stranded DNA was performed by linear amplification using the ABI dye terminators and a 373A sequencer. The insert was completely sequenced on both strands by adopting a walking protocol using 20-mer synthetic primers. Individual reads were between 350 and 400 bp.
Experimental Results
The results are recorded in figures 4, 5 and 6.
Figure 4 shows a successful 1 kb walk along the nematode unc31 gene within total yeast DNA. In particular, lane 1 of figure 4 shows the predicted 1 kb band resulting from the exponential amplification of the target DNA between the nematode unc31-specific primer:
5' d(CAG CTG GGT TAT CAG AGG TGA GTG) 3'
and a HindIII site to which the cassette was ligated.
Lane 2 shows the result of exponential amplification with a nested nematode unc31-specific primer:
5' d(CTA CTC GAA TTG CTA TCC TAA TCT) 3'.
Lane 3 shows a range of molecular weight markers (123 base pair ladder, Bethesda Research Laboratories) .
The PCR product from the exponential PCR amplification step with the cassette-specific primer and the nested nematode unc31-specific primer (lane 2) was subjected to DNA sequencing. The results show the expected nucleotide sequence, confirming the walk. The low molecular weight bands in lanes 1 and 2 represent side products which are often observed during PCR and are due to side reactions of the applied primers. These bands are not caused by the walking method itself.
Figure 5 shows a successful 600 bp walk along the human DMD gene within total human genomic DNA.
In particular, lane 1 shows a 600 bp walk along the DMD intron 50 resulting from exponential amplification with a nested human DMD intron 50-specific primer:
5' d(GAG ACT CAC ACT GGA CAA CCA GTG) 3'
towards a PstI site to which the cassette was ligated. This walk elongated the known portion of the DMD intron 50 by 400 nucleotides.
Lane 2 shows a range of molecular weight markers (123 base pair ladder, Bethesda Research Laboratories) .
The PCR product from the exponential PCR amplification step with the cassette-specific primer and the nested human DMD intron 50-specific primer (lane 1) was subjected to DNA sequencing. The results show the expected overlapping nucleotide sequence, confirming the walk. Around 400 bp were obtained from the end of the cassette and this represented new DMD intron 50 sequence. This was used to synthesise a new biotinylated specific primer for the next cycle.
From the results, it is seen that the sequences obtained from the present oligo-cassette mediated PCR technique are in agreement with those known for the nematode unc31 gene and those that are partially known for the human DMD gene.
Figure 6 shows the extended sequence derived from human microclone M54 of the human CAM-L1 gene.
A single PCR product was produced after the reamplification step 4 (figure 6, panel A, lane P). This suggests that a PstI site is located approximately 700 bp upstream of M54. The failure to amplify a product after digestion with the other enzymes suggests that these cut too far away from the locus. The use of a second primer set, directed downstream of M54, resulted in three different PCR products, as can be seen from the agarose gel (figure 6, panel B, lanes E, H and P), and would suggest that three restriction sites (EcoRI, HindIII and PstI) might be located downstream from M54. Direct sequencing of the ends of each fragment by solid-phase chemical degradation revealed that only the two 700 bp long PCR products containing a PstI site at their ends are the real extension products of the microclone M54, because they showed the correct overlapping sequence between the internal primer and the ends of the microclone.
The 800 bp long HindIII PCR product (figure 6, panel B, lane H) is not an extension of the M54 microclone. It was identified as an L1 repeat. The sequence of the HindIII product matches the 11 kb long human L1 repeat located in the intergenic region of the epsilon and gamma globin genes between nucleotide positions 7744 and 8544 and shows about 80% homology. The human L1 repeat has a predicted HindIII site at position 7744-7750. However, the sequence of the microclone M54 is not contained within the above mentioned L1 repeat, nor does the primer set used for walking show any significant match. Hybridisation experiments with M54, on the other hand, show that the human genome has more copies of this microclone. Presumably, the HindIII walk represents an extension from such an M54-like sequence into an adjacent L1 repeat.
The EcoRI PCR product (figure 6, panel B, lane E) did not show any significant matches with any sequences in the database.
It will of course be understood that the present invention has been described above by way of example only and that modifications and variations can be made by the skilled person without departing from the scope of the invention.

Claims

1. A polynucleotide amplification method comprising the steps of
i. forming a ligation product by ligating a target fragment, having sticky ends and including a first primer annealing region of known sequence, with a cassette, having a sticky end complementary to one of the sticky ends of the target fragment, the cassette including a second primer annealing region of known sequence, such that in the ligation product the known second primer annealing region is remote from the first primer annealing region,
ii. denaturing the ligation product,
iii. annealing only a first primer to the first primer annealing region, the first primer having attached thereto a separating label,
iv. adding nucleotides to the bound primer by use of a polymerase enzyme to form a double-stranded nucleic acid extension product,
v. denaturing the formed double stranded nucleic acid extension product,
vi. optionally repeating steps iii to v, leading to linear amplification in the production of single stranded nucleic acid having the separating label attached thereto,
vii. isolating the prepared nucleic acid by binding the attached label to a support matrix having a group cooperatively bindable with the label, and
viii. subjecting the isolated nucleic acid to exponential PCR amplification.
2. A method according to claim 1 wherein steps iii to iv are repeated up to 100 times.
3. A method according to claim 2 wherein the steps are repeated about 50 times.
4. A method according to any preceding claim wherein the exponential PCR amplification step is conducted using a nested PCR primer that anneals to a third primer annealing region distanced away from the original first primer annealing region.
5. A method according to claim 4 wherein the third known primer annealing region is situated between the first primer annealing site and the second primer annealing site of the cassette.
6. A method according to any preceding claim wherein the separating label is attached to the 5' end of the first primer.
7. A method according to any preceding claim wherein the separating label is a biotin label and the support matrix comprises a streptavidin coated matrix.
8. A method according to claim 7 wherein the matrix is in the form of a bead or rod.
9. A method according to claim 7 wherein the matrix is the surface of a well of a microtiter dish.
10. A method according to any preceding claim wherein the cassettes comprise two complementary oligonucleotides that are each in the range of 20 to 30 nucleotides long.
11. A method according to claim 10 wherein the cassette sequence contains a universal primer sequence.
12. A method according to claim 11 wherein the primer sequence is
5' d(TGT AAA ACG GCC AGT) 3', or
5' d(GTT TTC CCA GTC ACG AC) 3'.
13. A method according to any one of claims 10 to 12 wherein the cassette has the sequence of
5' d(CGTTGTAAAACGGCCAGTGCCAAGT) 3'
3' d(GCAACATTTTGCCGGTCACGGTTCATTAA) 5',

5' d(CGTTGTAAAACGGCCAGTGCCAAGT) 3'
3' d(GCAACATTTTGCCGGTACACGGTTCATCGA) 5',

5' d(CGTTGTAAAACGGCCAGTGCCAAGT) 3'
3' d(GCAACATTTTGCCGGTCACGGTTCACTAG) 5',

5' d(CGTTGTAAAACGGCCAGTGCCAAGT) 3'
3' d(GCAACATTTTGCCGGTCACGGTTCAGATC) 5',

5' d(CGTTGTAAAACGGCCAGTGCCAAGTTGCA) 3'
3' d(GCAACATTTTGCCGGTCACGGTTCA) 5', or

5' d(CGTTGTAAAACGGCCAGTGCCAAGT) 3'
3' d(GCAACATTTTGCCGGTCACGGTTCATNA) 5',
wherein N = any of the four bases G,A,T,C.
14. A method according to any preceding claim wherein the denatured ligation product is exponentially PCR amplified while still in the matrix support-bound state.
15. A method according to any preceding claim wherein the PCR products from step viii are sequenced by any of the existing dideoxy termination or chemical degradation techniques using radio- or fluorescently-labelled nucleotides or primers.
16. A method according to any preceding claim wherein a number of ligation products are formed in step i and a number of first primers are added in step iii and a number of second primers are added in step viii.
17. A method according to claim 16 wherein each of the first primers has a different separating label.
18. A method according to claim 17 wherein a number of support matrices are added, each matrix having attached thereto a respective different group.
19. A method according to any preceding claim wherein the target ds nucleic acid fragment is derived from a digestion of a sample of DNA.
20. A method according to claim 19 wherein the sample of DNA is genomic DNA.
21. The use of a ligation product, which product comprises a target fragment ligated to a cassette, in a method of genomic walking in any direction along the genomic nucleic acid, wherein the target fragment includes a first primer annealing region of known sequence and having annealed thereto a primer which has attached thereto a separating label, and wherein the cassette includes a second primer annealing region of known sequence.
22. A first kit comprising:
(a) a sample of genomic nucleic acid;
(b) means for excising a target fragment of nucleic acid having a first primer annealing region of known sequence from the genomic nucleic acid;
(c) a cassette ligatable to the excised fragment and having a region corresponding to a second primer annealing region of known sequence;
(d) a first primer, annealable to the first primer annealing region, having attached thereto a separating label;
(e) a second primer annealable to the second primer annealing region;
(f) a support matrix having attached thereto a group cooperatively bindable to the separating label; and optionally
(g) a third primer annealable to a third primer annealing region of known sequence upstream or downstream of the first primer annealing region; and further optionally
(h) at least any one of the following: a buffer, a polymerase, a washing solution and a nucleotide solution.
23. A kit according to claim 22 wherein the sample of genomic nucleic acid is DNA.
24. A kit according to claim 23 wherein the DNA is from one or more different organisms or segments of genomic nucleic acids.
25. A kit according to any one of claims 22 to 24 wherein the excising means is a restriction enzyme.
26. A kit according to claim 25 wherein a number of restriction enzymes are included.
27. A kit according to any one of claims 22 to 26 wherein a number of cassettes ligatable to the excised fragment(s) are provided.
28. A kit according to any one of claims 22 to 27 wherein the kit comprises a number of first primers, each annealable to the first primer annealing regions, and having attached thereto the same or a different separating label.
29. A kit according to any one of claims 22 to 28 wherein the kit comprises a number of second primers, each annealable to the second primer annealing regions.
30. A kit according to any one of claims 22 to 29 wherein the kit can include a number of support matrices, each having attached thereto the same or different group that is cooperatively bindable to the separating labels.
31. A kit according to any one of claims 22 to 30 wherein the kit further comprises a number of third primers, each being annealable to the third primer annealing regions of known sequences that are situated on the target fragments.
32. A kit according to any one of claims 22 to 31 wherein the kit includes any one of an amount of incubation buffer, a sample of T4 DNA ligase, a buffer for in vitro amplification, a deoxynucleotide triphosphate solution, a polymerase, light mineral oil, one or more washing solutions, and means to attach a separating label to a (or any) first oligonucleotide primer.
33. A second kit comprising:
(a) a ligation product comprising a fragment of target genomic nucleic acid ligated to a cassette, wherein the target fragment includes a first primer annealing region of known sequence and the cassette includes a second primer annealing region of known sequence;
(b) a first primer, annealable to the first primer annealing region, having attached thereto a separating label;
(c) a second primer annealable to the second primer annealing region;
(d) a support matrix having attached thereto a group cooperably bindable to the separating label; and optionally
(e) a third primer annealable to a third primer annealing region of known sequence upstream of the first primer annealing region; and further optionally
(f) at least any one of the following: a buffer, a polymerase, a washing solution and a nucleotide solution.
34. A kit according to claim 33 wherein the second kit has a number of ligation products.
35. A method for extending cDNA clones comprising a PCR amplification method as claimed in claim 1.
36. A method according to claim 35, comprising the steps of
i) synthesising double-stranded cDNA with the first strand primed with a first primer which hybridises to the poly-A tail of the mRNA;
ii) linear amplification of an aliquot of the cDNA using a primer, the primer being hybridisable to only one strand of a target cDNA, and having a separating label attached thereto;
iii) isolation of the labelled target cDNA extension product by binding the label to a support matrix having a group cooperably bindable with the label;
iv) tailing the 5'end of the PCR products resulting from the linear extension of a primer hybridising to the antisense strand with a dNTP; and
v) amplifying the unknown regions of the target cDNA between the known region and the 5' and/or the 3' terminus, by exponential amplification with an internal primer hybridising to the known region, and a general primer, which for the PCR product extended from the antisense strand-binding primer will comprise poly-dN where N is a base complementary to the dNTP used in step (iv) and for the PCR product extended from the sense strand-binding primer will comprise a primer complementary to or substantially identical to at least part of the first primer used in step (i).
37. A method according to claim 35 or claim 36, further comprising direct sequencing of the extension product.
38. A method according to any one of the claims 35 to 37, where the first primer comprises oligo-dT and the RACE primer.
39. A method according to any one of claims 35 to 38, wherein the first primer is
5' (ATCGATGGATCCGCGGCCGC(T)20) 3'.
40. A method according to any one of claims 35 to 39, wherein step (iv) is accomplished using a terminal transferase.
41. A method according to any one of claims 35 to 40, wherein the dNTP is dGTP, and the poly-dN is poly-dC.
42. A method according to any one of claims 35 to 41, wherein the poly-dC general primer further comprises a sequence encoding a restriction endonuclease cleavage site.
43. A method according to claim 42 where the primer is AACGAT(C)15.
44. A method according to any one of claims 38 to 43, wherein the general primer for the extension product of the sense-strand-binding primer is the RACE primer, ATCGATGGATCCGCGGCCGC.
45. A kit comprising:
(i) reagents suitable for the synthesis of cDNA from an RNA preparation and, optionally, an RNA preparation;
ii) a first strand priming primer hybridisable to the poly-A tail of an RNA species;
iii) a primer attached to a separatable label, hybridisable to a primer annealing region on one strand of a target mRNA; and, optionally, a further primer hybridisable to the alternative strand;
iv) a support matrix having attached thereto a group cooperably bindable to the separatable label;
v) a dNTP;
vi) a general primer hybridisable to a tail of the dNTPs in item (v) ;
vii) a 3' end general primer hybridisable with the first strand primer of item (ii) ; and optionally
viii) at least one nested internal primer hybridisable to one strand of the known region of the target mRNA.
46. A kit according to claim 45, comprising primers hybridisable to both strands such that the cDNA can be extended in both directions.
47. A kit according to claim 46 or claim 45, further comprising a terminal transferase.
PCT/GB1991/000803 1990-05-22 1991-05-22 Polynucleotide amplification WO1991018114A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB909011454A GB9011454D0 (en) 1990-05-22 1990-05-22 Polynucleotide amplification
GB9011454.7 1990-05-22

Publications (1)

Publication Number Publication Date
WO1991018114A1 WO1991018114A1 (en) 1991-11-28

Family

ID=10676376

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB1991/000803 WO1991018114A1 (en) 1990-05-22 1991-05-22 Polynucleotide amplification

Country Status (4)

Country Link
EP (1) EP0530243A1 (en)
JP (1) JPH05508313A (en)
GB (1) GB9011454D0 (en)
WO (1) WO1991018114A1 (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1993011261A1 (en) * 1991-11-25 1993-06-10 Keygene N.V. A novel pcr method with a single primer for nucleic acid analysis
EP0628640A1 (en) * 1993-06-04 1994-12-14 Becton, Dickinson and Company Simultaneous amplification of multiple targets
EP0645449A1 (en) * 1993-09-24 1995-03-29 Roche Diagnostics GmbH Method for specific cloning of nucleic acids
WO1996017082A2 (en) * 1994-11-28 1996-06-06 E.I. Du Pont De Nemours And Company Compound microsatellite primers for the detection of genetic polymorphisms
WO1997030156A2 (en) * 1996-02-14 1997-08-21 Idexx Laboratories, Inc. NUCLEOTIDES AND PEPTIDES CORRESPONDING TO THE CANINE IgE HEAVY CHAIN CONSTANT REGION AND RELATED METHODS
EP0981535A1 (en) * 1997-05-12 2000-03-01 Life Technologies, Inc. Methods for production and purification of nucleic acid molecules
WO2000024929A2 (en) * 1998-10-26 2000-05-04 Christof Von Kalle Linear amplification mediated pcr (lam pcr)
EP1001037A2 (en) * 1998-09-28 2000-05-17 Whitehead Institute For Biomedical Research Pre-selection and isolation of single nucleotide polymorphisms
US6120996A (en) * 1994-07-11 2000-09-19 New York Blood Center, Inc. Method of identification and cloning differentially expressed messenger RNAs
WO2001000820A2 (en) * 1999-06-30 2001-01-04 Incyte Pharmaceuticals, Inc. METHODS AND COMPOSITIONS FOR PRODUCING 5' ENRICHED cDNA LIBRARIES
US6399334B1 (en) 1997-09-24 2002-06-04 Invitrogen Corporation Normalized nucleic acid libraries and methods of production thereof
US7972778B2 (en) 1997-04-17 2011-07-05 Applied Biosystems, Llc Method for detecting the presence of a single target nucleic acid in a sample
US7985547B2 (en) 1994-09-16 2011-07-26 Affymetrix, Inc. Capturing sequences adjacent to type-IIs restriction sites for genomic library mapping
US9074244B2 (en) 2008-03-11 2015-07-07 Affymetrix, Inc. Array-based translocation and rearrangement assays
JP2018534950A (en) * 2015-11-30 2018-11-29 デューク ユニバーシティ Therapeutic targets and methods of use for modification of the human dystrophin gene by gene editing
US20210040460A1 (en) 2012-04-27 2021-02-11 Duke University Genetic correction of mutated genes
US11970710B2 (en) 2015-10-13 2024-04-30 Duke University Genome engineering with Type I CRISPR systems in eukaryotic cells

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0297379A2 (en) * 1987-06-30 1989-01-04 Miles Inc. Method for amplifying genes
WO1990001065A1 (en) * 1988-07-26 1990-02-08 Genelabs Incorporated Rna and dna amplification techniques
EP0356021A2 (en) * 1988-07-28 1990-02-28 Zeneca Limited A method for the amplification of nucleotide sequences

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Biochemical and Biophysical Research Communications, volume 167, no. 2, 16 March 1990, Academic Press Inc. M. Kalman et al.: "Polymerase chain reaction (PCR) amplification with a single specific primer", pages 504-506 *
Biotechniques, volume 8, January 1990, K.H. Roux et al.: "A strategy for single site PCR amplification of dsDNA: Priming digested cloned or genomic DNA from an anchor-modified restriction site and a short internal sequence", pages 48-57 *
Gene, volume 84, 1989, Elsevier Science Publishers B.V., V. Shyamala et al.: "Genome walking by a single-specific-primer polymerase chain reaction: SSP-PCR", pages 1-8 *
Nucleic Acids Research, volume 18, no. 10, 25 May 1990, (Eynsham, Oxford, GB) A. Rosenthal et al.: "Genomic walking and sequencing by oligo-cassette mediated polymerase chain reaction", pages 3095-3096 *
Proc. Natl. Acad. Sci., volume 86, August 1989, Biochemistry O. Ohara et al.: "One-sided polymerase chain reaction: the amplification of cDNA", pages 5673-5677 *

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1993011261A1 (en) * 1991-11-25 1993-06-10 Keygene N.V. A novel pcr method with a single primer for nucleic acid analysis
EP0628640A1 (en) * 1993-06-04 1994-12-14 Becton, Dickinson and Company Simultaneous amplification of multiple targets
EP0645449A1 (en) * 1993-09-24 1995-03-29 Roche Diagnostics GmbH Method for specific cloning of nucleic acids
US6120996A (en) * 1994-07-11 2000-09-19 New York Blood Center, Inc. Method of identification and cloning differentially expressed messenger RNAs
US7985547B2 (en) 1994-09-16 2011-07-26 Affymetrix, Inc. Capturing sequences adjacent to type-IIs restriction sites for genomic library mapping
WO1996017082A2 (en) * 1994-11-28 1996-06-06 E.I. Du Pont De Nemours And Company Compound microsatellite primers for the detection of genetic polymorphisms
WO1996017082A3 (en) * 1994-11-28 1996-08-08 Du Pont Compound microsatellite primers for the detection of genetic polymorphisms
WO1997030156A2 (en) * 1996-02-14 1997-08-21 Idexx Laboratories, Inc. NUCLEOTIDES AND PEPTIDES CORRESPONDING TO THE CANINE IgE HEAVY CHAIN CONSTANT REGION AND RELATED METHODS
WO1997030156A3 (en) * 1996-02-14 1997-10-09 Idexx Lab Inc Nucleotides and peptides corresponding to the canine ige heavy chain constant region and related methods
US8563275B2 (en) 1997-04-17 2013-10-22 Applied Biosystems, Llc Method and device for detecting the presence of a single target nucleic acid in a sample
US8859204B2 (en) 1997-04-17 2014-10-14 Applied Biosystems, Llc Method for detecting the presence of a target nucleic acid sequence in a sample
US8822183B2 (en) 1997-04-17 2014-09-02 Applied Biosystems, Llc Device for amplifying target nucleic acid
US9506105B2 (en) 1997-04-17 2016-11-29 Applied Biosystems, Llc Device and method for amplifying target nucleic acid
US8551698B2 (en) 1997-04-17 2013-10-08 Applied Biosystems, Llc Method of loading sample into a microfluidic device
US8278071B2 (en) 1997-04-17 2012-10-02 Applied Biosystems, Llc Method for detecting the presence of a single target nucleic acid in a sample
US7972778B2 (en) 1997-04-17 2011-07-05 Applied Biosystems, Llc Method for detecting the presence of a single target nucleic acid in a sample
US8257925B2 (en) 1997-04-17 2012-09-04 Applied Biosystems, Llc Method for detecting the presence of a single target nucleic acid in a sample
US8067159B2 (en) 1997-04-17 2011-11-29 Applied Biosystems, Llc Methods of detecting amplified product
EP0981535A1 (en) * 1997-05-12 2000-03-01 Life Technologies, Inc. Methods for production and purification of nucleic acid molecules
EP0981535A4 (en) * 1997-05-12 2000-11-29 Life Technologies Inc Methods for production and purification of nucleic acid molecules
US6399334B1 (en) 1997-09-24 2002-06-04 Invitrogen Corporation Normalized nucleic acid libraries and methods of production thereof
EP1001037A2 (en) * 1998-09-28 2000-05-17 Whitehead Institute For Biomedical Research Pre-selection and isolation of single nucleotide polymorphisms
EP1001037A3 (en) * 1998-09-28 2003-10-01 Whitehead Institute For Biomedical Research Pre-selection and isolation of single nucleotide polymorphisms
WO2000024929A3 (en) * 1998-10-26 2000-09-21 Kalle Christof Von Linear amplification mediated pcr (lam pcr)
US6514706B1 (en) 1998-10-26 2003-02-04 Christoph Von Kalle Linear amplification mediated PCR (LAM PCR)
WO2000024929A2 (en) * 1998-10-26 2000-05-04 Christof Von Kalle Linear amplification mediated pcr (lam pcr)
WO2001000820A2 (en) * 1999-06-30 2001-01-04 Incyte Pharmaceuticals, Inc. METHODS AND COMPOSITIONS FOR PRODUCING 5' ENRICHED cDNA LIBRARIES
WO2001000820A3 (en) * 1999-06-30 2001-09-20 Incyte Pharma Inc METHODS AND COMPOSITIONS FOR PRODUCING 5' ENRICHED cDNA LIBRARIES
US9074244B2 (en) 2008-03-11 2015-07-07 Affymetrix, Inc. Array-based translocation and rearrangement assays
US9932636B2 (en) 2008-03-11 2018-04-03 Affymetrix, Inc. Array-based translocation and rearrangement assays
US20210040460A1 (en) 2012-04-27 2021-02-11 Duke University Genetic correction of mutated genes
US11976307B2 (en) 2012-04-27 2024-05-07 Duke University Genetic correction of mutated genes
US11970710B2 (en) 2015-10-13 2024-04-30 Duke University Genome engineering with Type I CRISPR systems in eukaryotic cells
JP2018534950A (en) * 2015-11-30 2018-11-29 デューク ユニバーシティ Therapeutic targets and methods of use for modification of the human dystrophin gene by gene editing
EP3384055A4 (en) * 2015-11-30 2019-04-17 Duke University Therapeutic targets for the correction of the human dystrophin gene by gene editing and methods of use
JP7108307B2 (en) 2015-11-30 2022-07-28 デューク ユニバーシティ Therapeutic targets and methods of use for modification of the human dystrophin gene by gene editing

Also Published As

Publication number Publication date
GB9011454D0 (en) 1990-07-11
JPH05508313A (en) 1993-11-25
EP0530243A1 (en) 1993-03-10

Similar Documents

Publication Publication Date Title
US5876932A (en) Method for gene expression analysis
EP2048248B1 (en) Method of amplifying a target nucleic acid by rolling circle amplification
CA2119557C (en) Selective restriction fragment amplification: a general method for dna fingerprinting
US5262311A (en) Methods to clone polyA mRNA
US5994068A (en) Nucleic acid indexing
US5565340A (en) Method for suppressing DNA fragment amplification during PCR
US5525493A (en) Cloning method and kit
US5837468A (en) PCR-based cDNA substractive cloning method
JP4040676B2 (en) Amplification of simple repeats
WO1991018114A1 (en) Polynucleotide amplification
US6846626B1 (en) Method for amplifying sequences from unknown DNA
WO1998040518A9 (en) Nucleic acid indexing
EP1501944A2 (en) Amplification of dna to produce single-stranded product of defined sequence and length
WO1997004131A1 (en) Single primer amplification of polynucleotide hairpins
EP1853725A1 (en) Method for producing an amplified polynucleotide sequence
JP6718881B2 (en) Nucleic acid amplification and library preparation
WO1997042346A1 (en) Amplification of nucleic acids
JPS63500006A (en) Nucleic acid base sequencing method using exonuclease inhibition
JP2002535999A (en) Genome analysis method
US6090548A (en) Method for identifying and/or quantifying expression of nucleic acid molecules in a sample
US5952201A (en) Method of preparing oligonucleotide probes or primers, vector therefor and use thereof
KR20210104108A (en) Nucleic Acid Amplification and Identification Methods
WO1993011261A1 (en) A novel pcr method with a single primer for nucleic acid analysis
RU2811465C2 (en) Method of amplification and identification of nucleic acids
EP4012029B1 (en) Method for capturing nucleic acid molecule, preparation method for nucleic acid library, and a sequencing method

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IT LU NL SE

WWE Wipo information: entry into national phase

Ref document number: 1991909458

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1991909458

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1991909458

Country of ref document: EP