WO2004001015A2 - Method for sequencing nucleic acids

Info

Publication number: WO2004001015A2
Application number: PCT/US2003/019904
Authority: WO
Inventors: Lu Wang; Liu Xiangjun
Original assignee: Pel-Freez Clinical Systems, Llc
Priority date: 2002-06-25
Filing date: 2003-06-25
Publication date: 2003-12-31
Also published as: WO2004001015A3; AU2003256298A8; AU2003256298A1

Abstract

The present invention provides methods for performing sequencing by synthesis reactions. Some of the methods involve reducing the background noise level of sequencing by synthesis reactions by preventing nucleotide incorporation events that are not specific to the sequence being analyzed. Decreasing background results in longer read lengths in sequencing by synthesis methods. The present methods generally involve preventing extension of the 3' end of a target nucleic acid while performing the sequencing by synthesis reaction. Other methods involve the use of blocking agents to provide a defined endpoint in the sequencing reaction.

Description

METHOD FOR SEQUENCING NUCLEIC ACIDS

FIELD OF INVENTION

The present invention relates to sequencing nucleic acids. The present invention also relates to methods for reducing non-specific incorporation events in sequencing by synthesis of nucleic acids and increasing the reading length of sequencing by synthesis reactions. More particularly, the present invention provides such methods by preventing competitive extension of the 3' end of a target nucleic acid. The present invention also provides methods for multiplex sequencing by synthesis of nucleic acids using blocking agents.

BACKGROUND OF THE INVENTION

The detection of specific nucleic acids is an important tool for diagnostic medicine and molecular biology research. Nucleic acid identification currently plays an important role in identifying infectious organisms such as bacteria and viruses, in probing the expression of normal genes and identifying mutant genes such as oncogenes, in typing tissue for compatibility preceding tissue transplantation, in matching tissue or blood samples for forensic medicine and for exploring homology among genes from different species.

The "gold standard" for determining the identity of a nucleic acid is sequencing of the nucleic acid of interest. Although there are several different methods for directly sequencing nucleic acids, many of these methods remain cumbersome and take an extended period of time to obtain the nucleic acid sequence. So-called sequencing by synthesis methods allow for the determination of a nucleic acid sequence merely by synthesizing the nucleic acid. Generally, these methods work by detecting the incorporation of bases when synthesizing a nucleic acid and correlating the base incorporations with the nucleic acid sequence. These methods have proven relatively easy to implement while providing reliable, real-time results. However, the methods have encountered several drawbacks, not least of which is the practical problem that most of the sequencing by synthesis methods can only sequence a limited number of positions before the sequence information becomes unreadable or unreliable.

One major focus of tissue typing, paternity testing and disease association has been on the human leukocyte antigen (HLA) gene. The HLA alleles are the most diverse antigenic system in the human genome and encode literally hundreds of alleles that fall into several distinct subgroups or subfamilies. However, standard techniques for DNA typing have often proven inadequate in resolving many of these important alleles. Not only are techniques that are capable of unlocking this genetic diversity rare, such as sequencing analysis, they are also expensive and time consuming.

Accordingly, there is a need for improved sequencing by synthesis methods, for example methods that provide reliable and accurate results while increasing the total reading length of sequencing by synthesis reactions.

SUMMARY OF THE INVENTION

One aspect of the present invention provides a method of reducing non-specific incorporation events in a sequencing by synthesis reaction. The method involves performing a sequencing by synthesis reaction on a template nucleic acid; and preventing the 3' end of the template nucleic acid from undergoing extension, thereby reducing competing non-template specific incorporation events in the sequencing by synthesis reaction. In some methods, the sequencing by synthesis reaction can involve some, all or none of:

(i) hybridizing a sequencing primer with the template nucleic acid;

(ii) elongating the sequencing primer by the addition of a nucleotide corresponding to one of adenine, cytosine, guanine, thymine or uracil, or an analog thereof, wherein the nucleotide will only elongate the sequencing primer when the deoxynucleotide is complementary to the corresponding base in the template nucleic acid adjacent to the last position of the sequencing primer;

(iii) removing substantially all unincorporated deoxynucleotides;

(iv) repeating (ii) one or more times with an additional deoxynucleotide corresponding to one of adenine, cytosine, guanine, thymine or uracil or an analog thereof;

(v) detecting incorporation of the deoxynucleotides onto the sequencing primer; and

(vi) determining the sequence of the template nucleic acid based on the order and amount in which the nucleotides are incorporated onto the sequencing primer.
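
By way of illustration only, steps (ii) through (vi) above can be sketched in Python as a simulation of nucleotide dispensations. This is a simplified model and not part of the disclosed methods: it assumes ideal, complete incorporation, DNA bases only, and a template written in the order in which the polymerase encounters it; the names simulate_dispensations and call_sequence are hypothetical.

```python
# Illustrative sketch only: idealized sequencing by synthesis on one template.
# Assumes complete incorporation and complete removal of unincorporated
# nucleotides between dispensations (no background signal).

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simulate_dispensations(template, dispensation_order):
    """Return a list of (nucleotide, amount incorporated) per dispensation.

    template: template bases adjacent to the primer's 3' end, listed in the
        order the polymerase reads them.
    dispensation_order: nucleotides added one at a time, e.g. "ACGTACGT".
    """
    position = 0
    signals = []
    for nucleotide in dispensation_order:
        incorporated = 0
        # A dispensed nucleotide elongates the primer only while it is
        # complementary to the next template base (homopolymers add up).
        while (position < len(template)
               and COMPLEMENT[template[position]] == nucleotide):
            incorporated += 1
            position += 1
        signals.append((nucleotide, incorporated))
    return signals


def call_sequence(signals):
    """Rebuild the extension product from the order and amount of signals."""
    return "".join(nucleotide * amount for nucleotide, amount in signals)


if __name__ == "__main__":
    signals = simulate_dispensations("TTGCA", "ACGTACGT")
    print(signals)
    print(call_sequence(signals))
```

In this sketch the amount recorded for each dispensation plays the role of a peak height in a pyrogram, and the called sequence is the complement of the template region that was read.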

In some aspects of the present methods preventing the 3' end of the template nucleic acid from undergoing extension can occur with an agent that prevents extension of the 3' end of the nucleic acid, such as with a non-extendible nucleotide. In other aspects preventing the 3' end of the template nucleic acid from undergoing extension can include some or all of:

(i) hybridizing a non-extendible nucleic acid to the 3' end of the template nucleic acid;

(ii) placing a homopolynucleotide sequence on the 3' end of the template nucleic acid;

(iii) binding a protein to the 3' end of the template nucleic acid;

(iv) binding the 3' end of the template nucleic acid to a solid support; or

(v) adding a self-annealing sequence to the 3' end of the template nucleic acid, such as a palindromic sequence or universal bases, wherein the terminal 3' nucleotide does not self-hybridize to the template nucleic acid.

The present invention further provides a nucleic acid whose sequence has at least a 5' terminus and a 3' terminus, wherein the 3' terminus of the nucleic acid comprises a sequence which can hybridize with itself provided that the 3' terminal base of the nucleic acid is not capable of being extended by a nucleic acid polymerase. One aspect of the present invention provides methods for determining the sequence of a nucleic acid. The method involves hybridizing a sequencing primer with a nucleic acid, elongating the sequencing primer in a template dependent manner with nucleotides and removing unincorporated nucleotides. Elongation of the sequencing primer is terminated at a predetermined position using a blocking agent. Nucleotide incorporation is detected and the sequence of the nucleic acid is determined based on the order and amount in which the deoxynucleotides were incorporated into the sequencing primer. The blocking event enhances the sequence determination of the polymorphic nucleic acid. The blocking agent can be a non-extendable nucleotide or a blocking oligonucleotide. Where the blocking agent is an oligonucleotide, the blocking oligonucleotide and the sequencing primer can be joined together in the same moiety. The present methods are particularly useful for multiplex sequencing by synthesis where the above steps are performed simultaneously on one or more additional nucleic acids, often in the same reaction.

In some of the above methods, the nucleic acid sequence is determined by comparing the orders and amounts in which the deoxynucleotides were incorporated with a database of calculated or known values for the orders and amounts in which the deoxynucleotides would be incorporated for known nucleic acids wherein the controlled elongation of one of the sequencing primers by one or more deoxynucleotides compared to the other sequencing primer enhances the sequence determination of the nucleic acids.
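
As a non-limiting sketch, the comparison of observed orders and amounts against a database of calculated or known values might be written as follows. The scoring rule (sum of squared differences), the function name closest_known_pattern and the allele names and values in the example are assumptions made for illustration; the disclosure does not specify a particular comparison metric.

```python
# Illustrative sketch only: pick the known nucleic acid whose expected
# incorporation amounts best match the observed amounts.
# The least-squares score is an assumption, not taken from the disclosure.


def closest_known_pattern(observed, database):
    """Return (name, score) of the best-matching database entry.

    observed: measured amount incorporated at each dispensation.
    database: dict mapping a name to the expected amounts for the same
        dispensation order (hypothetical, precomputed values).
    """
    best_name, best_score = None, float("inf")
    for name, expected in database.items():
        if len(expected) != len(observed):
            continue  # entry was computed for a different dispensation order
        score = sum((o - e) ** 2 for o, e in zip(observed, expected))
        if score < best_score:
            best_name, best_score = name, score
    return best_name, best_score


if __name__ == "__main__":
    # Hypothetical expected amounts for the dispensation order A, C, G, T.
    database = {"allele_1": [2, 1, 0, 1], "allele_2": [1, 1, 1, 1]}
    print(closest_known_pattern([1.9, 1.1, 0.1, 0.9], database))
```

A lower background signal brings the observed amounts closer to the ideal integer values, which makes such a comparison more discriminating.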

The present invention also provides kits for performing the present methods.

One aspect of the present invention provides a method for determining the sequence of a polymorphic nucleic acid comprising:

(a) hybridizing a sequencing primer with a nucleic acid;

(b) elongating the sequencing primer by the addition of a nucleotide corresponding to one of adenine, cytosine, guanine, thymine, uracil or an analog thereof wherein the nucleotide will only elongate the sequencing primer when the deoxynucleotide is complementary to the corresponding base in the nucleic acid adjacent to the last position of the sequencing primer;

(c) removing substantially all unincorporated deoxynucleotides;

(d) repeating (b) one or more times with an additional deoxynucleotide corresponding to one of adenine, cytosine, guanine, thymine or uracil;

(e) terminating the elongation of the sequencing primer at a predetermined position with a blocking agent;

(f) detecting incorporation of the deoxynucleotides onto the sequencing primer; and

(g) determining the sequence of the nucleic acid based on the order and amount in which the deoxynucleotides were incorporated into the sequencing primer wherein terminating the elongation of the sequencing primer with the blocking agent enhances the sequence determination of the polymorphic nucleic acid.

Another aspect of the present invention provides that the incorporation of the deoxynucleotide releases phosphate in a quantity which is proportional to the amount of the deoxynucleotide incorporated and the proportion of the released phosphate is correlated with the amount of the deoxynucleotide that is incorporated. Still another aspect of the invention provides the blocking agent as a non-extendable nucleotide or a blocking oligonucleotide. With some of these methods the blocking oligonucleotide has a 5' end which is resistant to exonuclease degradation. Additionally the blocking oligonucleotide and the sequencing primer can be joined together in the same moiety. In certain of these aspects the blocking oligonucleotide portion of the moiety hybridizes more strongly with the polymorphic nucleic acid than the sequencing primer portion of the moiety.

Other aspects of the above methods provide for simultaneously performing the steps above on one or more additional nucleic acids. This simultaneous performance of the methods on the one or more polymorphic nucleic acids can occur in the same reaction. Additionally or alternatively, different sequencing primers are hybridized to each of the different nucleic acids or the same sequencing primer is used for some or all of the nucleic acids. In certain aspects (b) and (d) further comprise adding the deoxynucleotides in an order such that one of the sequencing primers is elongated by one or more deoxynucleotides compared to one of the other sequencing primers. In yet additional aspects (b) and (d) are performed such that the deoxynucleotides are not added in a repetitive manner supplying each of the deoxynucleotides in each elongation step. Step (g) can also include comparing the orders and amounts in which the deoxynucleotides were incorporated with a database of calculated or known values for the orders and amounts in which the deoxynucleotides would be incorporated for known nucleic acids wherein the controlled elongation of one of the sequencing primers by one or more deoxynucleotides compared to the other sequencing primer enhances the sequence determination of the nucleic acids. In certain of these aspects the deoxynucleotides are added in an order which corresponds to the putative sequence of the combination of nucleic acids and is based on the database of calculated or known values for the orders and amounts in which the deoxynucleotides would be incorporated for known combinations of nucleic acids.
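
For illustration only, the simultaneous sequencing of several nucleic acids in the same reaction, with elongation of one primer terminated at a predetermined position, can be modeled as below. The sketch assumes ideal incorporation, equal amounts of each template, and a detected signal equal to the sum of incorporations over all templates; the parameter stop_after is a hypothetical stand-in for the effect of a blocking agent and is not a term used in this disclosure.

```python
# Illustrative sketch only: multiplex sequencing by synthesis in one reaction.
# The detected signal per dispensation is modeled as the total number of
# incorporations summed over all templates (an idealizing assumption).

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def multiplex_signals(templates, dispensation_order, stop_after=None):
    """Return the summed incorporation amount for each dispensation.

    templates: dict of name -> template bases in the order the polymerase
        reads them, starting next to the primer's 3' end.
    stop_after: optional dict of name -> number of template positions that
        can be read before a blocking agent terminates elongation.
    """
    stop_after = stop_after or {}
    limits = {name: min(len(seq), stop_after.get(name, len(seq)))
              for name, seq in templates.items()}
    positions = {name: 0 for name in templates}
    totals = []
    for nucleotide in dispensation_order:
        total = 0
        for name, seq in templates.items():
            # Each primer elongates independently on its own template.
            while (positions[name] < limits[name]
                   and COMPLEMENT[seq[positions[name]]] == nucleotide):
                positions[name] += 1
                total += 1
        totals.append(total)
    return totals


if __name__ == "__main__":
    templates = {"t1": "TGCA", "t2": "TCCA"}
    print(multiplex_signals(templates, "ACGT"))
    print(multiplex_signals(templates, "ACGT", stop_after={"t2": 2}))
```

Comparing the two printed patterns shows how terminating one primer early removes its contribution from later dispensations, leaving signal attributable to the other template.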

In yet other aspects of the above methods the database comprises values determined by:

(i) selecting two different nucleic acids from a group of nucleic acids;

(ii) selecting a first sequencing primer for the first nucleic acid;

(iii) selecting a second sequencing primer for the second nucleic acid; and

(iv) determining a controlled order in which the deoxynucleotides can be added to determine the sequence of the first nucleic acid, second nucleic acid or both.

The methods can also include repeating (i) - (iv) with different combinations of nucleic acids.

In certain preferred aspects the nucleic acids are polymorphic nucleic acids and/or alleles. In some of these aspects the one or more polymorphic positions at which elongation of the sequencing primer is terminated is a single nucleotide polymorphism.

Other aspects of the present invention provide kits for determining the sequence of a polymorphic nucleic acid comprising instructions for carrying out any of the previous aspects and one or more reagents for carrying out the method. The one or more reagents can be selected from the group consisting of:

(i) one or more sequencing primers;

(ii) one or more reaction buffers;

(iii) one or more oligonucleotides suitable for nucleic acid amplification;

(iv) one or more enzymes;

(v) one or more solid supports; and

(vi) one or more pieces of lab equipment.

The present invention also provides kits for performing the present methods. All aspects of the present invention can be used with any other suitable aspect of the present invention described herein.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a pyrogram resulting from the experiment described in Example 1; and

FIG. 2 is a pyrogram resulting from the experiment described in Example 2.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

This application relates to U.S. Provisional patent application nos. 60/391,509 and 60/391,267, both of which were filed June 25, 2002. The entire contents of both these applications are hereby incorporated by reference.

The present invention provides techniques for increasing the reading length, i.e., the number of positions that can be accurately sequenced, of sequencing by synthesis methods. The present invention also provides techniques for determining the sequence of closely related nucleic acids using blocking primers with sequencing by synthesis methods, either alone or in conjunction with increasing the reading length of those methods.

The present methods generally involve reducing incorporation events that are not specific to a sequence of interest in a target nucleic acid, i.e., the nucleic acid being sequenced, by making the 3' end of a target nucleic acid non-extendible. Surprisingly, it has been determined that by reducing incorporation events that are not specific to the sequence of interest not only is the accuracy of the results increased but the background level of the sequencing by synthesis reaction is reduced and a greater number of positions can be sequenced. Reducing background signal results in improved accuracy in peak height detection, which is particularly important when detecting polymorphic sequences, such as with heterozygote templates, and in a higher net sequence specific signal (i.e., a sequence specific signal minus background signal), which increases reading length in sequencing by synthesis reactions. Some of the present methods reduce background signal by at least half compared to controls. More preferably, the backgrounds achieved by the present invention range from one-third to one-twentieth, one-fourth to one-fifteenth or one-fifth to one-tenth those achieved by controls run under the same conditions except where the 3' end of the template sequence is not modified to prevent extension. Such background signal reductions give average reading lengths of about 40 bases, but can provide maximal reading lengths up to about 75, 80, 85, 90, 100 or more bases.

Sequencing by synthesis methods generally involve adding a target nucleic acid, e.g. single or double-stranded nucleic acid, which is to be sequenced and a sequencing primer that targets a sequence of interest in the target nucleic acid together in a reaction mixture which permits hybridization of the target nucleic acid and the sequencing primer. Nucleotides, or their analogs that can be incorporated onto the sequencing primer such as deoxynucleotides, that correspond to one of the naturally occurring bases, adenine, cytosine, guanine, thymine or uracil, are dispensed into the reaction mixture one at a time. The respective nucleotides are incorporated onto, and elongate, the sequencing primer if the nucleotide is complementary to the adjacent corresponding base in one of the nucleic acids. Generally, as will be understood by those skilled in the art, the reaction mixture will contain additional components that facilitate the sequencing including those that facilitate incorporation of the nucleotides onto the sequencing primer, such as DNA polymerases, and those that aid in the detection of the incorporation of the nucleotides onto the sequencing primer, such as luciferase, sulfurylase, ATP, and the like. Addition of specific nucleotides is repeated as desired to further elongate the sequencing primer. Nucleotides which have not been incorporated into the sequencing primer are removed from the reaction mixture, such as through enzymatic degradation or washing, so that they do not interfere with later nucleotide incorporation by providing false signals. In order to determine the sequence of the target nucleic acid, incorporation of the nucleotides into the sequencing primer is detected and the sequence of the target nucleic acid is determined from the order and amount in which the nucleotides are incorporated onto the sequencing primers. Examples of these techniques can be found in U.S. Patent Nos.
4,863,849; 5,405,746; 6,210,891; and 6,258,568 and PCT applications WO 98/13523, WO 98/28440, WO 00/43540, WO 01/42496, WO 02/20836 and WO 02/20837. Average reading lengths achieved by these techniques are about 20 bases and generally range from 15 to 45 bases. Accordingly, some of the present methods can increase average and maximal reading lengths to at least 125%, 150%, 175%, 200% or more when compared to control reactions.

As is well known in the art, polymerases polymerize nucleic acid molecules in a 5'→3' direction; they thus extend the 3' terminus of a complementary primer in a template dependent manner. Thus nucleic acid synthesis is dictated by complementary base pairing. In such a reaction, the target molecule serves as a template for the extension of the sequencing primer, such that the primer extension product has a sequence that is complementary to that of the template. As used within the context of the primer extension reaction, the term nucleotide refers to any of the naturally occurring deoxynucleotides (i.e., dATP, dTTP, dUTP, dGTP and dCTP), dideoxynucleotides, or derivatives of the foregoing, so long as the nucleotide can be incorporated at the 3'-end of a primer during template-dependent primer extension. Hence, a nucleotide can be an extendible nucleotide and/or a non-extendible nucleotide. An extendible nucleotide refers to nucleotides to which another nucleotide can be attached at the 3' position of the ribose moiety. Thus, extendible nucleotides include the naturally occurring deoxynucleotides dATP, dTTP, dUTP, dCTP and dGTP, as well as derivatives of these nucleotides that are extendible. Nucleoside triphosphate analogues, and the like (Piccirilli, J. A. et al, Nature 343:33-37 (1990)) can be substituted or added to those specified above, provided that the base pairing, polymerase and strand displacing functions are not adversely affected to the point that the amplification does not proceed to the desired extent. It is also well known that upstream designates a region situated on the side of the 5' end of the nucleic acid or of the polynucleotide sequence in question, and the expression downstream designates a region situated on the side of the 3' end of the said nucleic acid or of the said polynucleotide sequence.

Generally, the present methods involve reducing the background level in sequencing by synthesis reactions. To this end, the present inventors have discovered that one cause of incorporation events that are not specific to the sequence of interest in a target or template nucleic acid, and attendant background noise caused by these nucleotide incorporations, results from the 3' end of the single stranded target nucleic acid annealing onto itself or a different nucleic acid and acting as an extension point for a polymerase. This increased background reduces the number of nucleotide incorporations which can be accurately detected, generally because the specificity decreases when background signal increases, thus resulting in a shorter reading length for the sequencing by synthesis reaction. Reading lengths as described herein are generally measured by nucleotide dispensation events, although nucleotide incorporations can also provide a measure of reading length. The present methods overcome this problem by rendering the 3' end of the target nucleic acid non-extendible and substantially eliminating the cause of some incorporation events that are not specific to the sequence of interest in the target. Alternatively, the 3' end of the target nucleic acid can be prevented from participating in an extension reaction. In the present methods the sequence of interest is preferably internal in the target nucleic acid and not at the 3' end thereof.

According to the present methods, the method of rendering the 3' end non-extendible or preventing extension of the 3' end is not particularly limited. In some of the present methods an agent that prevents extension of the 3' end of the nucleic acid can be used to perform these functions. In other aspects, non-extendible nucleotides, such as dideoxynucleotides, can be used to terminate the 3' end of the target sequence. The 3' end of the target nucleic acid can also be a stretch of homopolynucleic acids, such as poly-A, poly-C, poly-G or poly-T. A nucleic acid, peptide nucleic acid, locked nucleic acid, protein, solid support or other moiety can also be hybridized or bound to the 3' end of the target nucleic acid thereby rendering the 3' end of the nucleic acid non-extendible. Peptide nucleic acids are discussed for example in U.S. Patent Nos. 5,539,082; 5,773,571; 6,395,474; and 6,403,763 and locked nucleic acids are discussed in U.S. Patent Nos. 6,391,592; 6,251,639; and Kumar et al, Bioorg. Med. Chem Lett. 1998, 8(16):2219-22; and Wahlestedt et al, Proc. Natl. Acad. Sci. USA 2000, 97(10):5633-8. Another means for carrying out the present methods involves providing a target nucleic acid whose sequence immediately upstream of the 3' end comprises a sequence that can anneal to itself provided that the terminal 3' base or bases are unmatched and remain unhybridized so that extension of the 3' end of the target nucleic acid is not possible. Such a 3' self-annealing sequence can take many forms including a palindromic sequence or a number of universal bases or combinations thereof. Combinations of these approaches can also be used in performing the present methods.
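
Purely as an illustrative aid, a candidate 3' self-annealing sequence with an unmatched terminal base can be screened with a short script such as the following. The model is deliberately crude and is an assumption of this sketch: it considers only Watson-Crick pairing of standard bases, a fixed stem length, a minimum loop size and a single unmatched terminal base, and it ignores thermodynamics and universal bases. The function name has_blocked_hairpin and its parameters are hypothetical.

```python
# Illustrative sketch only: does the 3' end of a sequence (written 5'->3')
# fold back on itself so that the bases just upstream of the 3' terminus are
# paired while the terminal 3' base itself is left unmatched?

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def has_blocked_hairpin(seq, stem=4, min_loop=3):
    """Return True if a stem of `stem` base pairs ends one base before the
    3' terminus and the terminal base faces a non-complementary base."""
    n = len(seq)
    three_prime_arm = seq[n - 1 - stem:n - 1]   # excludes the terminal base
    target = reverse_complement(three_prime_arm)
    # Latest start of the upstream arm that still leaves room for the loop.
    last_start = n - 1 - stem - min_loop - stem
    for i in range(1, last_start + 1):
        upstream_arm_matches = seq[i:i + stem] == target
        # seq[i - 1] is the base that would sit opposite the 3' terminus.
        terminal_unmatched = COMPLEMENT[seq[i - 1]] != seq[-1]
        if upstream_arm_matches and terminal_unmatched:
            return True
    return False


if __name__ == "__main__":
    print(has_blocked_hairpin("AGACCTTTTGGTCA"))  # terminal A faces A
    print(has_blocked_hairpin("AGACCTTTTGGTCT"))  # terminal T pairs with A
```

In the first example sequence the stem GACC/GGTC forms while the terminal A is mismatched; in the second, the terminal T would pair and could in principle be extended, so the check fails.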

At times, the target nucleic acid having the sequence of interest will be obtained and/or isolated from a patient having the sequence of interest. In this instance, and others, it is often desired to amplify a small amount of target nucleic acid in order to produce a sufficient amount of target nucleic acid for performing the sequencing by synthesis reaction. This amplification step can be exploited to facilitate the present methods by providing the target nucleic acid whose 3' end is non-extendible. For example, a target nucleic acid can be amplified by any suitable method, such as PCR, and the amplified nucleic acids can be terminated at the appropriate position by the incorporation of the agent that prevents extension of the 3' end of the primer onto the 3' end of the nucleic acid. The agent that prevents extension of the 3' end of the nucleic acid is an agent that is capable of generally preventing the replication of a nucleic acid template by a nucleic acid polymerase (DNA polymerase, RNA polymerase, reverse transcriptase or replicase). An agent that prevents extension of the 3' end of the nucleic acid may be a nucleic compound (for example a modified nucleotide), or a non-nucleic compound, which is not recognized as template by the relevant polymerase. In one aspect of the present methods, termination includes incorporating a moiety, such as a nucleotide analog, into the nucleic acid so that it is no longer capable of undergoing further extension. Preferably, this can include incorporating a non-extendible nucleotide at the desired position of interest thus preventing elongation of the nucleic acid by the polymerase. However any moiety can be used so long as it can be incorporated onto and prevent extension of the nucleic acid.
A non-extendible nucleotide refers to nucleotide analogs that once incorporated into the primer cannot be extended further, i.e., there is no 3' hydroxyl group or the 3' hydroxyl group has been modified such that another nucleotide cannot be attached at the 3' position. Thus, suitable non-extendible nucleotides include nucleotides in which the 3' hydroxyl group is substituted with a different moiety such that another nucleotide cannot be joined to a primer once the non-extendible nucleotide is incorporated into the primer. Such moieties include, but are not limited to, -H, -SH and other substituent groups. Specific examples of non-extendible nucleotides include dideoxynucleotides and arabinoside triphosphates as discussed in U.S. Patent No. 6,355,433.

The amplified nucleic acid can also be chosen or designed such that, when amplified, the 3' terminus has a homopolynucleic acid sequence or self-annealing sequence as described above where the 3' terminal base is unmatched upon annealing. Additionally, any or all such sequences or agents that prevent extension of the 3' end of the nucleic acid described herein can be ligated to the 3' end of the target nucleic acid through methods well known in the art. In one method the self-annealing sequence is provided by incorporating a sufficient number of universal bases, such as 4, 5, 6, 7, 8, 9, 10 or more, onto the 3' end, but not at the 3' terminus itself, of the amplified target nucleic acid instead of "normal" bases so that the universal bases are then capable of annealing onto the target nucleic acid sequence or themselves. As used herein, universal nucleotide, base, nucleoside or the like, refers to a molecule that can bind to two or more, i.e., 3, 4, or all 5, naturally occurring bases in a relatively indiscriminate or non-preferential manner. Preferably, the universal base can bind to all of the naturally occurring bases in this manner, such as 2'-deoxyinosine (inosine). Most preferably, the universal base can bind all of the naturally occurring bases with equal affinity, such as 3-nitropyrrole 2'-deoxynucleoside (3-nitropyrrole) and those disclosed in U.S. Patent Nos. 5,438,131 and 5,681,947. Generally, when the base is "universal" for only a subset of the natural bases, that subset will generally either be purines (adenine or guanine) or pyrimidines (cytosine, thymine or uracil). Examples of nucleotides that can be considered universal for purines are known as the "K" base (N6-methoxy-2,6-diaminopurine), as discussed in Bergstrom et al, Nucleic Acids Res. 25:1935 (1997) and pyrimidines are known as the "P" base (6H,8H-3,4-dihydropyrimido[4,5-c][1,2]oxazin-7-one), as discussed in Bergstrom et al, supra, and U.S. Patent No. 6,313,286.
Other suitable universal nucleotides include 5-nitroindole (5-nitroindole 2'-deoxynucleoside), 4-nitroindole (4-nitroindole 2'-deoxynucleoside), 6-nitroindole (6-nitroindole 2'-deoxynucleoside) or 2'-deoxynebularine. A partial order of duplex stability has been found as follows: 5-nitroindole > 4-nitroindole > 6-nitroindole > 3-nitropyrrole. Combinations of these universal bases can also be used as desired.

In other methods, the agent that prevents extension of the 3' end of the nucleic acid can be an oligonucleotide. When such an agent is an oligonucleotide (i.e. a blocker oligonucleotide) it can be designed to hybridize to the 3' end of the target nucleic acid of interest thereby depriving the polymerase of a template to extend on the 3' end of the nucleic acid. As will be understood by the skilled artisan, preferably the blocker oligonucleotide does not hybridize to the sequence of interest within the target nucleic acid. The blocker oligonucleotide can be any length and is selected to be complementary to a portion of a target molecule. In some embodiments, the blocker oligonucleotide is substantially incapable of serving as a primer. Thus, the 3' terminus of the blocker oligonucleotide is preferably modified so that it cannot be extended. Any compound which accomplishes this objective can be used. Exemplary blocking groups are biotin and di-deoxynucleotide triphosphates ("ddNTPs"), also referred to as chain terminating ddNTPs. In other methods, the oligonucleotide can be made non-extendable by adding bases to the 3' end that are not complementary to the target sequence and therefore do not base-pair and cannot be enzymatically extended. These agents and methods are also useful for the 3' end of the target nucleic acid. In several preferred embodiments, the blocking group is detectably labeled. The blocker oligonucleotide is preferably between about 10 and about 40 nucleotides and more preferably between about 15 and about 35 nucleotides. However, the blocker oligonucleotide can be as small as desired, for example two nucleotides in length (where the nucleotide at the 3' end comprises a blocking moiety). Likewise, the 3' terminus of the blocker oligonucleotide will typically be capable of hybridizing to the target molecule; however, the 3' terminus of the blocker oligonucleotide need not be capable of such hybridization.
The length of the blocker oligonucleotide can vary depending upon the experimental needs of the investigator and a recognition that the Tm decreases as the length decreases (i.e. preferential hybridization cannot be assured). Tm is the temperature at which 50% of the base pairing between two strands of a nucleic acid is disrupted. Tm is a function of the length of single stranded DNA and the base composition thereof. Accordingly, the length and/or sequence of the blocker oligonucleotide can be adjusted such that the Tm of the blocker oligonucleotide will be a desired value, for example between about 37°C and about 98°C, more preferably between about 70°C and about 90°C.
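
To illustrate the dependence of Tm on length and base composition, the following sketch uses the widely cited Wallace rule of thumb (2°C per A or T, 4°C per G or C). This rule is not taken from the present disclosure, is only a rough approximation intended for short oligonucleotides, and ignores salt and oligonucleotide concentration; the function name wallace_tm and the example sequences are hypothetical.

```python
# Illustrative sketch only: rough Tm estimate by the Wallace rule,
# Tm (deg C) ~= 2 * (A + T) + 4 * (G + C).
# A rule of thumb for short oligonucleotides, not the method of the
# disclosure; more accurate estimates would use nearest-neighbor models.


def wallace_tm(oligo):
    """Return an approximate melting temperature in degrees Celsius."""
    oligo = oligo.upper()
    at = sum(oligo.count(base) for base in "AT")
    gc = sum(oligo.count(base) for base in "GC")
    if at + gc != len(oligo):
        raise ValueError("oligo must contain only A, C, G and T")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    primer = "ACGTACGTAC"             # hypothetical 10-mer
    blocker = "ACGTACGTACGGCCGGCC"    # hypothetical longer, GC-richer oligo
    print(wallace_tm(primer), wallace_tm(blocker))
```

Under this approximation, lengthening an oligonucleotide or raising its G+C content both raise the estimated Tm, which is the adjustment described above for reaching a desired blocker Tm.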

Significantly, the sequence information of the sequence of interest in the target nucleic acid is not required. Thus, the target sequence may be partially or fully undefined. In this embodiment, the sequencing primer and blocker oligonucleotides are designed such that, upon hybridization to the target molecule, the 3' terminus of the primer and the 5' terminus of the blocker oligonucleotide will be separated by a gap (which may contain either known or unknown sequences, or a combination of known and unknown sequences). Such a gap may be from 1 to about 10,000 nucleotides in length. The gap between the sequencing primer and the blocking oligonucleotide is then determined using sequencing by synthesis techniques.

As will be understood by the skilled artisan, in order to prevent extension of the 3' end of the target nucleic acid, the blocker oligonucleotide should hybridize to the target sequence prior to the sequencing primer or the initiation of the sequencing reaction. One skilled in the art will also understand that the 5' end of the blocking oligonucleotide should not be readily displaced from the 3' end of the target nucleic acid.

One advantage of adding a self-annealing sequence onto the terminus of a nucleic acid template is that all the PCR amplicons will have the self-annealing sequence. This results in more complete prevention of non-specific annealing of the 3' end to other regions of target DNA, which can otherwise serve as a point of extension for a polymerase. Other methods that modify the 3' end of a PCR amplicon, usually after the PCR reaction is completed, are less preferred as they do not provide as complete a modification of the 3' end of the template nucleic acid. However, while the sequencing by synthesis reaction generally uses a single stranded DNA as a template, methods for using a double-stranded DNA template are also envisioned, such as those discussed in Nordstrom, T. et al, Anal. Biochem. 282, 186-193 (2000). Modification of both 3' ends of a PCR amplicon is advantageous for preventing non-specific 3' end annealing.

In preferred embodiments of the present invention, the 5'-end of the blocking oligonucleotide is also resistant to enzymatic 5'-3' exonuclease activity. For example, one such blocking oligonucleotide can be easily synthesized by replacing the first (5'-most) phosphodiester bond with a thioester bond that resists exonucleolytic cleavage. Other, analogous strategies might also be used to prevent exonucleolytic removal of the blocking oligonucleotide. Additionally or alternatively, enzymes that lack 5'-exonuclease activity can be used in the present methods. A few such enzymes are referred to as the Klenow fragment or Stoffel fragment of Taq polymerase, which are known and commercially available DNA polymerases capable of adding nucleotides to the extending end of a primer, but lacking 5' exonuclease activity. As a result, once the oligonucleotide blocker anneals to its complementary region on a DNA template, it is difficult to remove.

In a preferred embodiment of the invention, the binding of a non-extendable oligonucleotide blocker is favored over the binding of the sequencing primer. This is achieved by using a non-extendable oligonucleotide blocker designed to have a higher melting point than the melting points possessed by the primers. This non-extendable oligonucleotide blocker can be manufactured by making the blocker significantly longer than the primers. The higher melting point gives the blocker a kinetic advantage at certain temperatures over that of its associated primers and assures its presence as a stable duplex when encountered by the extending primers. Accordingly, it is preferred that the length of the sequencing primer be less than about 75% of the length of the blocker oligonucleotide, more preferably less than about 60% of the length of the blocker oligonucleotide and most preferably less than about 50% of the length of the blocker. Alternatively, it is preferred that the Tm of the sequencing primer be less than about 75% of the Tm of the blocker oligonucleotide, more preferably less than about 60% of the Tm of the blocker oligonucleotide and most preferably less than about 50% of the Tm of the blocker. One skilled in the art will realize, however, that the blocking oligonucleotide and the primer can be substantially the same length, such as where the respective oligonucleotides are within about 90% or 95% of the other's length. In fact, the primer can be longer than the blocker. This is particularly true where modified nucleic acids such as PNAs and LNAs are used as blockers. By ensuring that the primer is shorter than the blocker, there is increased probability of blocker oligonucleotide hybridization occurring before sequencing primer hybridization. 
An equivalent approach to satisfy the objective of hybridization of blocker oligonucleotide to the target before the sequencing primer is to add the moieties in a serial fashion with the blocker oligonucleotide being added to the reaction mixture before the primer. Alternatively, it should be noted that the order of binding can also be controlled by altering the ratio and/or concentration of reactants.
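
The primer-to-blocker length preferences stated above (less than about 75%, 60% and 50%) can be expressed as a simple check. This is a minimal sketch; the function name `length_preference` and the returned labels are hypothetical, and the "about" in the text is treated as a hard threshold purely for illustration.

```python
def length_preference(primer_len: int, blocker_len: int) -> str:
    """Classify a primer/blocker pair by the ratio of their lengths.

    Thresholds follow the preferred ratios in the text (primer shorter
    than about 75%, 60% or 50% of the blocker length).
    """
    if primer_len <= 0 or blocker_len <= 0:
        raise ValueError("lengths must be positive")
    ratio = primer_len / blocker_len
    if ratio < 0.50:
        return "most preferred"
    if ratio < 0.60:
        return "more preferred"
    if ratio < 0.75:
        return "preferred"
    return "outside preferred range"


print(length_preference(15, 35))  # most preferred
print(length_preference(30, 35))  # outside preferred range
```

A pair that falls outside the preferred range can still be used, for example by adding the blocker before the primer or by adjusting reactant concentrations as described above.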

Those skilled in the art will appreciate that the length of an oligonucleotide moiety, which is important to the Tm thereof vis-a-vis hybridization to a complementary sequence, can be manipulated in order to increase the speed of hybridization of the moiety to the complementary sequence. Thus, for example, given a target sequence having two regions of defined sequence, X and Y; a first oligonucleotide having a length X' complementary to region X; and a second oligonucleotide having a length Y' complementary to region Y, the first oligonucleotide will typically hybridize under more stringent conditions to the target faster than the second oligonucleotide when X'>Y' in length. This facet of oligonucleotide hybridization is amenable to efficient exploitation for the disclosed sequencing procedure.

The sequencing primer or oligonucleotide, e.g. the first oligonucleotide, of some of the methods possesses a 3' terminus which can be extended by a DNA polymerase. The oligonucleotide may be of any length ranging from about 5 nucleotides to several hundred. Preferably, the primer oligonucleotide will have a length of greater than 10 nucleotides, and more preferably, a length of from about 12-50 nucleotides, such as 12-20 or 14-17. The primer oligonucleotides can also be chosen to have a desired melting temperature, such as about 40 to about 80°C, about 50 to about 70°C, about 55 to about 65°C, or about 60°C. The length of the primer oligonucleotide must be sufficient to permit the primer oligonucleotide to be capable of hybridizing to the target molecule. The sequence of the primer oligonucleotide is selected such that it is complementary to a predetermined sequence of the target molecule. This predetermined sequence may be a previously determined sequence (such as a gene, regulatory sequence, etc.) or may be a hypothetical sequence (such as a restriction endonuclease recognition site, a combination of such sites, etc.).

Preferably, the target nucleic acid or sequence of interest forms part of a coding region in a gene associated with a genetic disease, and the primer oligonucleotide's sequence is selected such that its extension will form a desired sequence that contains the genetic mutation that characterizes the disease. As described herein, by suitably targeting the sequences of the oligonucleotides of the present invention, it is possible to diagnose or predict genetic disease in individuals whose gene sequences differ by as few as one nucleotide from the corresponding sequences of those who do not have the disease.

In one embodiment of the present methods a sequencing or extension primer is hybridized upstream and adjacent to or near a position or sequence of interest on a target nucleic acid. Elongation of the extension primer is performed by sequential, stepwise addition of nucleotides to the reaction mixture. Preferably, the nucleotides are added to the mixture in a non-cyclical or non-repetitive manner so that all of the nucleotides are not added in each elongation step. Alternatively, nucleotides can be added to the reaction in a cyclical or repetitive manner. Unincorporated nucleotides are removed from the reaction mixture after each attempted incorporation event so that they do not become incorporated later and provide false signals. Removal of the unincorporated nucleotides can be performed by washing, such as where the nucleic acid and primers are attached to a solid support and the surrounding mixture is removed and washed after each reaction step. Preferably, however, the unincorporated nucleotides are enzymatically degraded, generally at a rate that is slower than nucleotide incorporation but that is fast enough that the unincorporated nucleotides do not interfere with later incorporation events. A suitable enzyme meeting these criteria is apyrase.
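
The stepwise dispensation described above can be sketched as a small simulation. This is an idealized model that assumes complete incorporation and complete removal of unincorporated nucleotides between steps; the function name `simulate_dispensation` and the example sequence are hypothetical.

```python
def simulate_dispensation(read: str, order: str) -> list[tuple[str, int]]:
    """Simulate stepwise nucleotide addition to a sequencing primer.

    `read` is the sequence to be synthesized downstream of the primer
    (the complement of the template). `order` is the dispensation
    order, one nucleotide per step. Each step incorporates every
    consecutive matching base, so homopolymer runs give counts above 1.
    """
    pos = 0
    signals = []
    for nucleotide in order:
        incorporated = 0
        while pos < len(read) and read[pos] == nucleotide:
            pos += 1
            incorporated += 1
        signals.append((nucleotide, incorporated))
    return signals


# Non-cyclical order tailored to the expected sequence: no empty steps.
print(simulate_dispensation("GATTACA", "GATACA"))
# Cyclical order: many steps incorporate nothing.
print(simulate_dispensation("GATTACA", "ACGT" * 3))
```

In this model the tailored, non-cyclical order reads all seven bases in six dispensations, while three full ACGT cycles read only six of them.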

In contrast, incorporation of the nucleotides onto the sequencing primer is detected. In one preferred embodiment, each incorporation event releases pyrophosphate in an amount that is proportional to the amount of the nucleotide incorporated. Pyrophosphate can be determined by many different methods and a number of enzymatic methods have been described in the literature (Reeves et al, (1969), Anal. Biochem., 28, 282-287; Guillory et al, (1971), Anal. Biochem., 39, 170-180; Johnson et al, (1968), Anal. Biochem., 15, 273; Cook et al, (1978), Anal. Biochem. 91, 557-565; and Drake et al, (1979), Anal. Biochem. 94, 117-120). It is preferred to use luciferase and luciferin in combination to identify the release of pyrophosphate since the amount of light generated is substantially proportional to the amount of pyrophosphate released which, in turn, is directly proportional to the amount of base incorporated. The amount of light can readily be estimated by a suitable light sensitive device such as a luminometer. Luciferin-luciferase reactions to detect the release of pyrophosphate are well known in the art. In particular, a method for continuous monitoring of pyrophosphate release based on the enzymes ATP sulphurylase and luciferase has been developed by Nyren and Lundin (Anal. Biochem., 151, 504-509, 1985) and termed ELIDA (Enzymatic Luminometric Inorganic Pyrophosphate Detection Assay). The use of the ELIDA method to detect pyrophosphate is preferred according to the present invention. The method may however be modified, for example by the use of a more thermostable luciferase (Kaliyama et al, 1994, Biosci. Biotech. Biochem., 58, 1170-1171) and/or ATP sulfurylase (Onda et al, 1996, Bioscience, Biotechnology and Biochemistry, 60:10, 1740-42). The preferred detection enzymes involved in the phosphate detection reaction are thus ATP sulphurylase and luciferase.
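
Because the light signal is substantially proportional to the number of bases incorporated, raw luminometer readings can be converted to incorporation counts by dividing by the signal of a single incorporation. The sketch below assumes a known single-base reference intensity and simple rounding; the function name `signals_to_counts` and the readings are invented for illustration.

```python
def signals_to_counts(light: list[float], unit: float) -> list[int]:
    """Convert light intensities to integer incorporation counts.

    `unit` is the intensity assumed to correspond to one incorporated
    base (for example, from a known single-base reference position).
    """
    if unit <= 0:
        raise ValueError("unit intensity must be positive")
    return [round(value / unit) for value in light]


readings = [0.1, 10.2, 19.7, 0.3, 30.5]  # hypothetical luminometer values
print(signals_to_counts(readings, unit=10.0))  # [0, 1, 2, 0, 3]
```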

In some embodiments of the present invention described above and others, including those where the 3' end of the target nucleic acid is not prevented from undergoing extension, elongation of the extension primer is terminated by a blocking agent, either at the position of interest, at the 3'-end of the sequence of interest or just downstream (generally within 5 positions) of either of these positions. As used in the present methods, terminating elongation of the sequencing primer is an active termination step such that, under normal reaction conditions, the sequencing primer cannot undergo further elongation. Termination of the extension of the sequencing primer is detected, as is incorporation of the nucleotides onto the extension primer. The sequence of the target nucleic acid is then determined based on the order and amount of nucleotides incorporated onto the sequencing primer. The termination event, preferably at or immediately downstream of the position of interest, facilitates identification of the sequence of interest in the target nucleic acid by providing a discrete end point in the sequencing of the target nucleic acid.

In some of the present methods, with or without blocked extension of the 3' end of the target nucleic acid, termination of elongation of the sequencing primer is achieved via a blocking agent. The blocking agent is an agent that is capable of blocking the replication of a nucleic acid template by a nucleic acid polymerase (DNA polymerase, RNA polymerase, reverse transcriptase or replicase). A blocking agent may be a nucleic compound (for example a modified nucleotide), or a non-nucleic compound, which is not recognized as template by the relevant polymerase. In one aspect of the present methods, termination includes incorporating a moiety, such as a nucleotide analog, into the sequencing primer so that the primer is no longer capable of undergoing further extension. Preferably, this can include incorporating a non-extendible nucleotide at the desired position of interest onto the sequencing primer, thus preventing elongation of the primer by the polymerase; however, any moiety can be used so long as it can be incorporated onto and prevent extension of the primer. A non-extendible nucleotide refers to nucleotide analogs that once incorporated into the primer cannot be extended further, i.e., there is no 3' hydroxyl group or the 3' hydroxyl group has been modified such that another nucleotide cannot be attached at the 3' position. Thus, suitable non-extendible nucleotides include nucleotides in which the 3' hydroxyl group is substituted with a different moiety such that another nucleotide cannot be joined to a primer once the non-extendible nucleotide is incorporated into the primer. Such moieties include, but are not limited to, -H, -SH and other substituent groups. Specific examples of non-extendible nucleotides include dideoxynucleotides and arabinoside triphosphates. Preferably, although not necessarily, the terminating moiety can also identify the base at the position in the target nucleic acid complementary to the terminating moiety.
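
Termination by a non-extendible nucleotide can be sketched as a variant of stepwise dispensation in which some dispensations are terminators. This is an idealized model assuming complete incorporation; the function name `extend_with_terminators` and the example order are hypothetical.

```python
def extend_with_terminators(
    read: str, order: list[tuple[str, bool]]
) -> list[int]:
    """Simulate primer extension where some dispensations are terminators.

    `read` is the sequence to be synthesized. Each item of `order` is
    (nucleotide, is_terminator). A matching terminator (for example a
    dideoxynucleotide) is incorporated exactly once and then prevents
    all further elongation, giving a defined end point.
    """
    pos = 0
    terminated = False
    counts = []
    for nucleotide, is_terminator in order:
        incorporated = 0
        if not terminated:
            if is_terminator:
                if pos < len(read) and read[pos] == nucleotide:
                    pos += 1
                    incorporated = 1
                    terminated = True  # no 3' hydroxyl left to extend
            else:
                while pos < len(read) and read[pos] == nucleotide:
                    pos += 1
                    incorporated += 1
        counts.append(incorporated)
    return counts


order = [("G", False), ("A", False), ("T", True), ("T", False), ("A", False)]
print(extend_with_terminators("GATTACA", order))  # [1, 1, 1, 0, 0]
```

Here the terminating T is incorporated once even though the read contains two consecutive T bases, and the later T and A dispensations give no signal.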

In other methods, the blocking agent can be an oligonucleotide. Suitable oligonucleotides include nucleic acids, peptide nucleic acids and locked nucleic acids. Peptide nucleic acids are discussed for example in U.S. Patent Nos. 5,539,082; 5,773,571; 6,395,474; and 6,403,763 and locked nucleic acids are discussed in U.S. Patent Nos. 6,391,592; 6,251,639; and Kumar et al, Bioorg. Med. Chem Lett. 1998, 8(16):2219-22; and Wahlestedt et al, Proc. Natl. Acad. Sci. USA 2000, 97(10):5633-8. When the blocking agent is an oligonucleotide, the blocker oligonucleotide is designed to hybridize downstream of the sequencing primer in an orientation such that the 3' terminus of the sequencing primer can be extended to abut the 5' terminus of the blocker oligonucleotide when both molecules are hybridized to the same strand of the target molecule. In some embodiments of the invention, the 5' terminus of the blocker oligonucleotide is designed such that, when hybridized, the 5' terminal nucleotide of the blocker oligonucleotide will oppose a predetermined site in another nucleic acid molecule. A nucleotide of a hybridized oligonucleotide is said to oppose another nucleotide if the two nucleotides are opposite one another in the hybridized product (i.e. positioned such that they would base pair with one another if they were complementary). Thus, the function of the blocker oligonucleotide is to block the 3' terminus of the sequencing primer from being extended past the 5' terminus of the blocker oligonucleotide. In some preferred embodiments, the 5' terminus of the blocking oligonucleotide is designed to hybridize either at the position of interest on the target nucleic acid or immediately downstream thereof, i.e. at the next position. Accordingly, the blocking oligonucleotide stops the sequencing of the target nucleic acid, providing easy identification of the last nucleotide on which sequencing is performed. 
Identification of the position or sequence of interest can be based on the nucleotides incorporated onto the sequencing primer alone, or in conjunction with the sequence of the blocking oligonucleotide.

This blocker oligonucleotide can be any length and is selected to be complementary to a portion of a target molecule. Preferably, the blocker oligonucleotide is substantially incapable of serving as a primer. Other embodiments of blocker oligonucleotides are described herein.

As will be understood by the skilled artisan, in order to block extension of the sequencing primer, the blocker oligonucleotide should hybridize to a target sequence before the sequencing primer, or before the primer extension product has been extended to a site beyond the site at which the 5' terminus of the blocker oligonucleotide can hybridize. One skilled in the art will also understand that the blocking oligonucleotide should not be readily displaced by the extension of the sequencing primer.

In another embodiment of the present methods, the sequencing primer and blocking agent can be joined together by a linker and both be part of the same moiety. In one such embodiment, the entire moiety is an integrated oligonucleotide where both the sequencing primer and the blocker are oligonucleotides joined together by an oligonucleotide sequence. In this embodiment, it will be readily apparent that one of the termini of the oligonucleotide can be the 3'-terminus of the sequencing primer whereas the other terminus of the oligonucleotide can form the 5'-terminus of the blocking oligonucleotide. Generally, there is no limitation on the length or composition of the linker, noting, however, that the linker between the primer and blocker should be long enough and flexible enough to permit both primer and blocker to hybridize or bind the target nucleic acid with sufficient strength to perform their respective functions. When the linker is composed of nucleic acids, it is preferred that the linker have a sequence that does not bind nucleic acids in the reaction and thus interfere with sequencing.

Alternatively, the linker region can be some other flexible chemical structure, such as a substituted or unsubstituted alkyl or aryl group having an appropriate number of carbon atoms, ribose or 1', 2'-dideoxyribose chains. Preferably, the linker does not interact or interfere with the target nucleic acid or other nucleic acids in the sample.

As will be apparent to the skilled artisan, inhibition of amplification in the presence of a blocker with a specific sequence implies that the blocker has bound to a sequence in the inter-primer region of the template. If a template or templates were used that possessed the same first and second primer regions, but one lacked the region complementary to the blocker, then sequencing of the template lacking the blocker would take place. Thus the blocker allows one to distinguish between two templates with the same primer regions but that differ in their putative blocking regions in this way. It also can allow selective inhibition of amplification of one or more such templates (all possessing substantially the same primer regions) in the same reaction mixture, while allowing amplification to proceed uninhibited from other templates, depending on the presence or absence of sequences complementary to those of the blocker.

Thus, the present methods can also be used for genotyping by blocking sequencing of one allotype differing from another allotype by only a short insertion to which a blocker could be directed. The allotype lacking that insertion would then be detectible in the presence of the other allotype if both were amplifiable by the same pair of primers. Such a scheme is also applicable to sequences that have different primer regions.

The sequence of the target nucleic acid can be deciphered by analyzing the order and amount in which the nucleotides are incorporated onto the sequencing primer. In a preferred embodiment of the present invention, the sequence of the nucleic acid is determined by comparing the order and amount in which the nucleotides are incorporated onto the sequencing primer with a database of theoretical, calculated or known values for the orders and amounts in which the nucleotides would be incorporated for known nucleic acids.
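
The comparison against a database of expected incorporation patterns can be sketched as a nearest-pattern lookup. The description does not specify a comparison metric; the sum of squared differences used here is an assumption, and the function name `best_match`, the allele names and the values are hypothetical.

```python
def best_match(observed: list[float], database: dict[str, list[int]]) -> str:
    """Return the database entry whose expected incorporation counts
    are closest to the observed signals.

    All patterns are assumed to use the same dispensation order, so
    position i of every list refers to the same dispensed nucleotide.
    Closeness is measured by the sum of squared differences (an
    assumed metric chosen for illustration).
    """
    if not database:
        raise ValueError("database is empty")

    def distance(expected: list[int]) -> float:
        if len(expected) != len(observed):
            raise ValueError("pattern length does not match observation")
        return sum((o - e) ** 2 for o, e in zip(observed, expected))

    return min(database, key=lambda name: distance(database[name]))


database = {
    "allele_A": [1, 0, 2, 1],  # hypothetical expected counts per dispensation
    "allele_B": [1, 1, 1, 1],
}
print(best_match([0.9, 0.1, 2.2, 1.0], database))  # allele_A
```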

The present methods can be extended to multiplexing formats in which the identity of several nucleotides, or the identity of a nucleotide at multiple variant sites, is determined in a single reaction. Such formats allow for rapid sequence determinations in many loci and/or individuals simultaneously. The multiple sites can be multiple sites on the same target nucleic acid, such sites being within the same gene or at sites in different genes. Alternatively, the multiple sites can be different sites on target nucleic acids obtained from different individuals. Likewise, one sequencing primer can be used for all of the sites, different primers can be used for each of the sites or multiple sites can use the same primer. In certain multiplexed methods, primers for each of the different variant sites are annealed to their respective binding sites. In the same manner one blocking agent can be used for each site, different blocking agents can be used for different sites or multiple sites can share the same blocking agent. The sequencing primers and/or blocking agents that are specific to a template nucleic acid or a target sequence of interest can be present in different concentrations compared to other sequencing primers and/or blocking agents present in the same reaction. In preferred multiplexing methods, the 1-5 bases immediately upstream of the position or sequence of interest in the nucleic acids are different for each of the nucleic acids and their identity is known, thereby providing a reference or control for each of the nucleic acid sequences as well as allowing ready discrimination between the nucleic acids. In the present multiplexing methods, sequencing primers can be hybridized to a plurality of sites on one or more nucleic acids having target sequences of interest. The sequencing primers are elongated as described above until elongation of each of the primers is terminated with a blocking agent. 
Each of the sequencing reactions can, in turn, be blocked simultaneously or sequentially as desired. For example, all sequencing reactions can be stopped simultaneously by adding a mixture of non-extendable nucleotides to the reaction mixture at the same time. In contrast, the sequencing reactions can be stopped sequentially, by one of many methods including: adding a non-extendible nucleotide that is specific to one of the sequencing reactions and not the others; using combinations of primers, blocking agents, distance between primers and blocking agents and specific nucleotide dispensation orders to terminate the sequencing reactions in a desired order; or combinations of these. Preferential extension or blocking of the sequencing of one or more nucleic acids can also be achieved by using primers and blocker oligonucleotides that have different Tm and altering the reaction conditions and/or temperature accordingly. Preferably, in methods where sequencing of more than one nucleic acid occurs simultaneously, the nucleic acids are not closely related. Here too, the blocking agents that prevent extension of the primer serve to identify the nucleotide(s) present at the variant sites of the target nucleic acids.

In the methods where more than one nucleic acid is sequenced in the same reaction, out-of-phase elongation of the primers can enhance the sequence determination of the nucleic acids. In out-of-phase elongation one of the sequencing primers is elongated one or more positions compared to the other primer. For example, one sequencing primer can be elongated by x nucleotides, where x is an arbitrary number, whereas the other sequencing primer is elongated by x ± y nucleotides, where y is an arbitrary number that is not zero, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc. In an alternative arrangement, the sequencing primers can be positioned at corresponding positions on the two nucleic acids, such as where a portion of the nucleic acids share a common sequence, and one of the primers can then be elongated one or more positions past the shared sequence and the other primer. In this alternative arrangement, it can be irrelevant how many positions the sequencing primers have previously been elongated relative to one another. The out-of-phase elongation of the primers can, but need not, be maintained through one or more subsequent elongation steps as desired. It will be readily apparent to one skilled in the art that in order to elongate one of the primers out-of-phase relative to the other primer, the sequences of the nucleic acids must differ at least at that position. In some embodiments of the present methods, the first and second sequencing primers are the same primer, e.g., they have exactly the same sequence. However, the sequencing primers need not have the same sequence and can hybridize to different sequences in the different nucleic acid targets in different places. Using different sequencing primers can thereby automatically ensure that elongation of the primers will proceed out-of-phase or they can be used to discern similar sequences in the two nucleic acids that have differing sequences immediately adjacent to the part of the nucleic acid to be sequenced. 
When using two different primers the primers can be closely related so that they have roughly the same sequence or length except for a small number, such as 1, 2, 3, 4 or 5, of different nucleotides. Generally, the different nucleotides between the primers will correspond to differences in the primer annealing region in the first and second nucleic acids, such as at polymorphic positions. In a similar manner, the primers can have the same sequence except that one of the primers can be shorter or longer than the other primer by a given number of nucleotides, such as 1-5.

Generally, in the out-of-phase elongation the dispensation order of the nucleotides is designed so that each nucleic acid analyzed provides 1-10, i.e. 1, 2, 3, 4, 5, 6, etc., or more nucleotide incorporation events that are free from interference of the incorporation events of the other nucleic acid(s) in the reaction, i.e. certain nucleotide incorporations are unique to each of the nucleic acids being sequenced.
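
Whether a given dispensation order yields interference-free incorporation events can be checked by simulating each nucleic acid separately and comparing the results step by step. The sketch below is an idealized model for two templates; the function names (`incorporation_counts`, `unique_events`) and the sequences are hypothetical.

```python
def incorporation_counts(read: str, order: str) -> list[int]:
    """Bases incorporated at each dispensation for one template.

    `read` is the sequence synthesized downstream of the primer.
    """
    pos = 0
    counts = []
    for nucleotide in order:
        n = 0
        while pos < len(read) and read[pos] == nucleotide:
            pos += 1
            n += 1
        counts.append(n)
    return counts


def unique_events(read_a: str, read_b: str, order: str) -> dict[str, list[int]]:
    """Indices of dispensations where only one of two templates
    incorporates, i.e. events free from interference by the other."""
    a = incorporation_counts(read_a, order)
    b = incorporation_counts(read_b, order)
    return {
        "a": [i for i, (x, y) in enumerate(zip(a, b)) if x > 0 and y == 0],
        "b": [i for i, (x, y) in enumerate(zip(a, b)) if y > 0 and x == 0],
    }


# Two reads differing at the second position, dispensed in the order G, T, A, C.
print(unique_events("GATC", "GTTC", "GTAC"))  # {'a': [2], 'b': [1, 3]}
```

With the order G, A, T, C the same two reads give a unique event only for the first read, so the choice of dispensation order determines how many interference-free events each nucleic acid provides.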

The primers and oligonucleotides of the present kits and methods may be prepared using any suitable method using, e.g., the methods described in Beaucage, S. et al, Tetrahedron Letters 22:1859-1862 (1981). Commercially available instruments capable of generating oligonucleotide moieties are preferred, as these are widely utilized and typically time and cost effective.

The nucleic acid sequence that can be sequenced by the methods of the present invention can be DNA or RNA. Where the sequence is initially present as DNA, such DNA need not be either transcribed or translated. Thus, the present invention may be used to identify and/or amplify non-transcribed DNA or non-translated DNA, as well as DNA that is transcribed or translated. Likewise, where the desired sequence is initially present in an RNA molecule such RNA need not be translated.

Although the nucleic acid molecule which is to be sequenced can be in either a double-stranded or single-stranded form, if the nucleic acid is double-stranded at the start of the sequencing reaction it is preferably first treated to render the two strands into a single-stranded, or partially single-stranded, form. Methods are known to render double-stranded nucleic acids into single-stranded, or partially single-stranded, forms, such as heating, or by alkali treatment, or by enzymatic methods (such as by helicase action, etc.), or by binding proteins, etc. General methods for accomplishing this treatment are provided by Sambrook, J. et al. (In: Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989)) and by Haymes, B. D., et al. (In: Nucleic Acid Hybridization. A Practical Approach, IRL Press, Washington, D.C. (1985)).

As will be understood by the skilled artisan, the present methods are not limited simply to sequencing a nucleic acid but can be used any time when a sequencing by synthesis reaction is performed. As such, the present methods can also be useful for several applications other than simply nucleic acid sequencing. For example, the present methods can also be used for quantification of DNA template, sequencing at the nucleotide level of a nucleic acid and/or polymorphism sequence typing, including SNP typing.

Most preferably, the RNA or DNA sequence that is to be sequenced will be amplified via a DNA polymerase or a reverse transcriptase to form a DNA amplification product; however, in embodiments in which an RNA amplification product is desired, an RNA polymerase may be employed. Suitable polymerase enzymes are reviewed in Watson, J. D., In: Molecular Biology of the Gene, 4th Ed., W. A. Benjamin, Inc., Menlo Park, Calif. (1987), which reference is incorporated herein by reference, and similar texts. Examples of suitable DNA polymerases include the large proteolytic fragment of the DNA polymerase I of the bacterium E. coli, commonly known as "Klenow" polymerase, E. coli DNA polymerase I, and the bacteriophage T7 DNA polymerase. Where desired, "thermostable enzymes" may be employed. As used herein, a "thermostable enzyme" is an enzyme which can catalyze a reaction at temperatures of between about 50°C and about 100°C. Exemplary thermostable polymerases are described in European Patent Appln. 0258017, incorporated herein by reference. Thermostable "Taq" DNA polymerase is available from Cetus Corp. Examples of suitable RNA polymerases include E. coli RNA polymerase, T7 RNA polymerase, etc. Reverse transcriptases are discussed by Sambrook, J. et al. (In: Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989)) and by Noonan, K. F. et al. (Nucleic Acids Res. 16:10366 (1988)). All of the enzymes used in the amplification reactions of the present invention can be combined in the presence of a suitable buffer, such that the amplification process of the present invention can be done in a single reaction volume without any change of conditions such as addition of reactants.

The molecules which may be sequenced include any naturally occurring prokaryotic (for example, pathogenic or non-pathogenic bacteria, Escherichia, Salmonella, Clostridium, Agrobacter, Staphylococcus and Streptomyces, Streptococcus, Rickettsiae, Chlamydia, Mycoplasma, etc.), eukaryotic (for example, protozoa and parasites, fungi, yeast, higher plants, lower and higher animals, including mammals and humans) or viral (for example, Herpes viruses, HIV, influenza virus, Epstein-Barr virus, hepatitis virus, polio virus, etc.) or viroid nucleic acid. The nucleic acid molecule can also be any nucleic acid molecule which has been or can be chemically synthesized. Thus, the nucleic acid sequence may or may not be found in nature. In sum, the methods of the present invention are capable of identifying or amplifying any nucleic acid molecule, and do not require that the molecules to be amplified have any particular sequence or origin. Accordingly, the invention places no restrictions on the nature of the sample being evaluated. Such samples may, for example, be derived from an animal (such as a human or other mammal), or a plant, or may be synthetically derived.

In particular, the invention may be used to identify and amplify nucleic acid molecules present in blood (and blood products, such as serum, plasma, platelets), stool, sputum, mucus, serum, urine, saliva, teardrop, biopsy samples, histology tissue samples, PAP smears and other vaginal swabs, skin scrapes, semen, moles, warts, etc. Similarly, it may be used to identify and amplify nucleic acid molecules present in plant tissue.

The nucleic acids of such samples may be wholly unpurified, partially purified, or fully purified from any other component naturally associated with the sample. Typically, however, the sample will have been treated to a sufficient degree such that extraneous materials which might otherwise interfere with amplification of the nucleic acids are removed. Protocols and techniques are readily available for such purification and are known to those skilled in the art.

Since the invention places no constraints on the nature of the nucleic acid sequence that is to be identified and/or amplified, the invention is capable of identifying nucleic acid molecules that are naturally found in the sample, as well as sequences which, though produced by the source animal or plant, are indicative of disease (such as a gene sequence encoding a hemoglobin histopathy, or an oncogene product expressed exclusively or preferentially by neoplastic cells). Moreover, the invention may also be used to determine whether gene sequences of pathogenic bacteria, mold, fungi or viruses are present in a tissue sample. The present methods may therefore be used to diagnose disease, or to establish pedigree and identity, as well as to assess the purity of agricultural products (milk, processed foodstuff, etc.), waste water, drinking water, air, etc.

The methods can be used in a number of different applications. For example, in the medical field, the methods of the invention can be used to determine which allele is present at a single nucleotide polymorphic (SNP) site or to detect mutations at a particular site. Because many diseases are associated with SNPs or mutations, the methods can be used in a variety of diagnostic, research and prognostic applications. In addition, for diploid subjects, the methods can be used to determine if the individual is homozygous or heterozygous for a particular allele, i.e., to determine the genotype of the individual. This is an important capability because individuals that are homozygous for an allele associated with a disease are at greater risk than individuals that are heterozygous or homozygous for the allele that is not linked to the disease. Furthermore, individuals that are homozygous for an allele associated with a particular disease sometimes suffer the symptoms of the disease to a greater extent than heterozygotes. The ability of the methods to interrogate particular sites also finds value for identification purposes, including, for example, in resolving forensic and paternity cases. The methods also have utility in detecting the presence of nucleic acids from particular pathogens (e.g., certain viruses, bacteria or fungi). Thus, the present methods can be used in conducting genotyping analyses and can be performed in multiplexing formats. The methods and kits have utility in diverse applications including, for example, analyzing point mutations and single nucleotide polymorphisms, detection of pathogens, paternity disputes, prenatal testing and forensic investigations.

Any or all of the oligonucleotides can be labeled, and for many purposes, it is desirable that at least one of the oligonucleotides be labeled. Additionally, the dNTPs can be labeled. Beneficially, when an oligonucleotide is labeled, the label can be conjugated to the 3' end thereof such that elongation from the 3' end thereof is not possible. Specifically, when the blocker oligonucleotide is labeled, the label can be conjugated to the 3' end thereof such that the blocker oligonucleotide can hybridize with the target whereby elongation from the 3' end thereof is not possible. Exemplary labeling protocols are well known; see, e.g., European Patent Appln. 292128.

The labels can facilitate either the direct, proximal or indirect detection and/or capture of the amplified product. Additionally, two of the moieties can be part of a unitary structure such that only two oligonucleotide moieties are utilized in the amplification reaction. As used herein, a label that is directly detectable produces a signal which is capable of detection either directly or through its interaction with a substance such as a substrate (in the case of an enzyme), a light source (in the case of a fluorescent compound) or a photomultiplier tube (in the case of a radioactive or chemiluminescent compound). Examples of preferred direct labels include radioisotopic labels, e.g., the use of oligonucleotides which have incorporated ³²P, ³⁵S, ¹²⁵I, ³H, ¹⁴C. One approach for direct labeling of oligonucleotides is the end-labeling approach whereby T4 polynucleotide kinase is used to introduce a label into the 5' terminus of the oligonucleotide (See, e.g., Richardson, C. C., The Enzymes, Vol. XIV, Nucleic Acids Part A, Ed. Boyer, P. D., Acad. Press, p. 299 (1981)). Alternatively, terminal deoxynucleotidyl transferase can be utilized to add a series of supplied deoxynucleotides onto the 3' terminus of the oligonucleotide; single nucleotide labeling methods can also be used (See, e.g., Bollum, F. J., The Enzymes, Vol. X, Ed. Boyer, P. D., Acad. Press, (1974); Yousaf, S. I. et al., Gene 27:309 (1984); and Wahl, G. M. et al., Proc. Natl. Acad. Sci. (U.S.A.) 76:3683-3687 (1979)). Labeled ddNTPs, e.g., α-³²P ddATP, can also be utilized.

A label that is indirectly detectable does not in and of itself provide a detectable signal; however, it can be used to identify an oligonucleotide to which the indirectly detectable label is attached. Biotin, antibodies, enzymes, ferritin, antigens, haptens, etc., when conjugated to a dNTP or ddNTP, comprise examples of indirectly detectable labels. Preferred non-radioactive direct labels include fluorescein-11-dUTP (see Simmonds, A. C. et al., Clin. Chem. 37:1527-1528 (1991), incorporated herein by reference) and digoxigenin-11-dUTP (see Muhlegger, K. et al., Nucleosides & Nucleotides 8:1161-1163 (1989), incorporated herein by reference). Additionally, non-radioactively labeled oligonucleotides, such as hapten labeled oligonucleotides, may be used (See, e.g., Adams, C. W., PCT Patent Appln. WO 91/19729). A detection scheme involving such hapten-labels includes utilization of antibodies to the hapten, the antibodies being labeled. Biotin is an especially preferred indirect label, whereby the detection of biotinylated nucleic acid molecules is accomplished using labeled or insolubilized avidin, streptavidin, anti-biotin antibodies, etc. Biotinylated molecules can also be readily separated from non-biotinylated molecules by contacting the molecules with insoluble or immobilized avidin.

In this regard, for example, biotin-11-dUTP can be utilized in lieu of dTTP, or biotin-14-dATP in lieu of dATP (See, generally, Langer, P. R. et al., Proc. Natl. Acad. Sci. (U.S.A.) 78:6633-6637 (1981), which is incorporated herein by reference). Biotinylated phosphoramidites can also be used (Misiura, K. et al., Nucl. Acids Res. 18:4345-4354 (1990), which is incorporated herein by reference). Such phosphoramidites allow for precise incorporation thereof at desired locations along the growing oligonucleotide moiety during the synthesis thereof.

Chemiluminescent substrates can also be used as the indirect label. Enzymes, such as horseradish peroxidase ("HRP"), alkaline phosphatase ("AP"), etc., which can be directly cross-linked to nucleic acids, may be employed (see Renz, M. and Kurz, C., Nucl. Acids Res. 12:3435-3444 (1984), incorporated herein by reference). Luminol, a substrate for HRP, and substituted dioxetanes, substrates for AP, can be utilized as chemiluminescent substrates. Exemplary of the HRP labeling protocol is the ECL system available from Amersham (Arlington Heights, Ill., USA).

In lieu of direct or indirect labels, a proximity label may be employed. Such a label is a chemical moiety which produces a signal only in the presence of a second label which interacts with it. Typically, a first proximity label is used in combination with a corresponding second proximity label.

The methods can also be utilized in pooling studies to determine the allele frequency of a variant site in a study population. Typically, in this type of experiment, the DNA samples from different individuals are pooled together. The method of this invention can then be used to analyze the presence of each allele in the mixed templates. By comparing the signal intensities of each allele with a reference set (for example, the heterozygotes, the homozygotes or a mixture of both at a known ratio), the prevalence of the alleles in the population can be determined. (For a general discussion of pooling studies see, e.g., Breen, G. et al., BioTechniques 28:464-468 (2000); Risch, N. and Teng, J., Genome Res. 8:1273-1288 (1998); Shaw, S. H. et al., Genome Res. 8:111-123 (1998); and Scott, D. A. et al., Am. J. Hum. Genet. 59:385-391 (1996), each of which is incorporated by reference in its entirety).
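As a rough illustration of comparing pooled signal intensities with a heterozygote reference, the following Python sketch estimates the frequency of one allele in a pool. The function name, its parameters and the simple ratio correction are assumptions made for illustration only; they are not taken from the cited pooling studies, and the signal values are invented.

```python
def pooled_allele_frequency(signal_a, signal_b, het_signal_a, het_signal_b):
    """Estimate the frequency of allele A in a pooled sample.

    signal_a, signal_b: signals measured for alleles A and B in the pool.
    het_signal_a, het_signal_b: signals for the same alleles in a
    heterozygote reference, where the true allele ratio is 1:1.
    """
    # In a heterozygote both alleles are present in equal amounts, so any
    # deviation of this ratio from 1 reflects unequal signal per allele.
    correction = het_signal_a / het_signal_b
    corrected_a = signal_a / correction
    return corrected_a / (corrected_a + signal_b)


# Hypothetical signals (arbitrary units), for illustration only.
uncorrected = pooled_allele_frequency(30.0, 10.0, 10.0, 10.0)
corrected = pooled_allele_frequency(30.0, 10.0, 12.0, 10.0)
print(uncorrected, corrected)
```

With an unbiased reference the estimate reduces to the plain signal ratio; a reference in which allele A reads higher than allele B lowers the estimate for allele A accordingly.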

It will be understood by those skilled in the art that the present invention readily lends itself to automation. As an automated process, several samples containing the same, or different, combinations of nucleic acids can be sequenced in parallel using the same or different primers and nucleotide dispensation orders. The nucleotide incorporations can then be analyzed to determine the sequences of the nucleic acids in each reaction. Sequencing according to the present methods can also be monitored in real time.

In a preferred embodiment, the nucleotide dispensation order is not random or cyclical, but is based on the putative sequences of the nucleic acids believed to be in the reaction. The dispensation orders can be contained in a suitable database, such as in a computer readable format accessible internally to a computer, via software or over a local or global network. One skilled in the art will readily understand this aspect of the invention presupposes a general knowledge of the sequences of the nucleic acids to be sequenced. A skilled artisan will also readily understand that this can be easily achieved by focusing the present methods on a desired nucleic acid or DNA target, such as an HLA allele, for example using amplification or separation techniques that provide the nucleic acids of interest.

A database can contain the sequences of one or more classes of nucleic acids having sequences of interest. The database can also contain the nucleotide dispensation order or orders for each of the nucleic acid sequences in the database. Nucleotide dispensation orders for two or more nucleic acids sequenced in the same reaction can be stored in a database. Generally, the nucleotide dispensation order for multiple nucleic acids is built by additive comparison of the nucleotide dispensation orders for each of the separate nucleic acids individually. Suitable sequencing primers for each of the nucleic acids can then be selected as desired. The nucleotide dispensation order for sequencing the nucleic acids can then be programmed based on the above knowledge. The nucleotide dispensation order for multiple combinations of nucleic acids can readily be determined by repeating this process for the different combinations of nucleic acids. All such combinations can be accurately and expediently determined by the use of computer automation.

Because several different nucleotide dispensation orders can be envisioned for sequencing the same combination of nucleic acids, preferably the primers for each of the nucleic acids are chosen so that each nucleic acid analyzed provides 1, 2, 3, 4, 5, 6 or more nucleotide incorporation events that are free from interference of the incorporation events of the other nucleic acid(s) in the reaction, i.e., certain nucleotide incorporations are unique to each of the nucleic acids being sequenced. Where possible, it is preferred to select sequencing primers that provide a nucleotide dispensation order which is unique to the combination of nucleic acids being sequenced. In other words, the nucleotide dispensation order will not readily provide sequencing of any other combination of nucleic acids outside of the combination being sequenced.
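One way a computer program might combine per-template extension sequences into a single dispensation order, and flag the incorporation events that are unique to one nucleic acid, is sketched below in Python. The greedy rule used here (dispense the base wanted next by the most templates, breaking ties alphabetically) is an assumption chosen for illustration and is not the procedure of the present disclosure; the two short extension sequences are hypothetical.

```python
from collections import Counter


def build_dispensation_order(extensions):
    """Greedy sketch. `extensions` maps a template name to the bases
    (5'->3') that would be added to its primer."""
    position = {name: 0 for name in extensions}
    order = []
    while any(position[n] < len(s) for n, s in extensions.items()):
        # Count the base each unfinished template needs next.
        wanted = Counter(
            s[position[n]] for n, s in extensions.items() if position[n] < len(s)
        )
        # Most requested base; alphabetical order breaks ties.
        base = max(sorted(wanted), key=wanted.get)
        incorporated = {}
        for name, s in extensions.items():
            start = position[name]
            # Extend through a homopolymer run in one dispensation.
            while position[name] < len(s) and s[position[name]] == base:
                position[name] += 1
            if position[name] > start:
                incorporated[name] = position[name] - start
        order.append((base, incorporated))
    return order


# Hypothetical extension sequences for two templates in one reaction.
order = build_dispensation_order({"allele_1": "GATG", "allele_2": "ACGA"})
dispensed = "".join(base for base, _ in order)
# Dispensations at which exactly one template incorporates a nucleotide.
unique_events = [i for i, (_, inc) in enumerate(order) if len(inc) == 1]
print(dispensed, unique_events)
```

In this toy case the first two dispensations extend only one template and the last two extend only the other, which is the kind of interference-free event the primers are preferably chosen to provide.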

Where overlap of nucleotide dispensation orders occurs, the sequencing methods can be repeated on the same combination of nucleic acids using different primers and/or nucleotide dispensation orders to confirm the identity of the nucleic acids. Such repeats can occur simultaneously or sequentially as needed. In a preferred embodiment, the nucleic acids to be sequenced can be obtained from an individual and are preferably alleles of interest in the individual's genome. Thus the present methods can be used to provide the tissue type of the individual and the individual may be homozygous or heterozygous for the alleles of interest. The present methods can also be used to simultaneously sequence alleles from different individuals, such as prospective tissue donors/recipients or in paternity testing. Accordingly, preferred methods focus on identifying HLA alleles. The alleles of the HLA loci are classified as Class I - HLA-A, HLA-B, HLA-C, HLA-E, HLA-F and HLA-G, or Class II - HLA-DRA, HLA-DRB1, HLA-DRB2-9, HLA-DQA1, HLA-DQB1, HLA-DPA1, HLA-DPB1, HLA-DMA, HLA-DMB, HLA-DOA and HLA-DOB. There are over a hundred identified alleles that fall in some of these loci, and these alleles are closely related and can differ in sequence by only one, or a few, positions. The HLA gene is discussed by Schreuder et al. in Tissue Antigens, 58:109 (2001) and the references disclosed therein, all of which are incorporated by reference. Additional information regarding HLA alleles, and in particular sequence information, is available at www.ebi.ac.uk/imgt/hla and www.anthonynolan.org.uk/research.html.

The present methods, primers, blocking agents, kits and the like disclosed herein also apply generally to in vitro amplification and/or sequencing techniques, such as real time techniques including quantitative (fluorescent) PCR, although the present methods, primers, blocking agents, kits and the like disclosed herein are not limited to PCR techniques. In these embodiments the present methods can be used to identify and quantitate the different nucleic acids present in a sample by selecting the appropriate primers and blocking agents. These methods can also provide population data for the nucleic acids present in a sample.

The present methods can be carried out by performing any of the described steps herein, either alone or in various combinations. Additionally, one skilled in the art will realize that the present invention also encompasses variations of the present methods, compositions and kits that specifically exclude one or more of the steps described above.

The present invention also provides kits for carrying out the methods described herein. In one embodiment, the kit is made up of instructions for carrying out any of the methods described herein. The instructions can be provided in any intelligible form through a tangible medium, such as printed on paper, computer readable media, or the like. The present kits can also include one or more reagents, buffers, hybridization media, nucleic acids, primers, nucleotides, molecular weight markers, enzymes, solid supports, databases, computer programs for calculating dispensation orders and/or disposable lab equipment, such as multi-well plates, in order to readily facilitate implementation of the present methods. Enzymes that can be included in the present kits include nucleotide degrading enzymes, such as apyrase, luciferase, sulfurylase, DNA polymerases, and the like. Solid supports can include beads and the like whereas molecular weight markers can include conjugatable markers, for example biotin and streptavidin or the like. Examples of preferred kit components can be found in the description above and in the following examples.

Unless otherwise specified, "a" or "an" means "one or more".

EXAMPLES

Example 1

The present example illustrates the present methods for reducing background in sequencing by synthesis reactions. In this example a self-annealing sequence with a mismatched 3' terminal base was provided onto the template having the target sequence. Adding the self-annealing sequence was accomplished by PCR amplification of a template using a PCR primer that had the following sequence at its 5' end: 5'-aggactgtctaGTCCCCACAGCACGTTTCTTGGAGTAC-3' (SEQ ID NO: 1).

The PCR primer was labeled with biotin at its 5' end for easy recovery and identification. The above amplification produced a template nucleic acid having the following sequence (which is complementary to SEQ ID NO: 1) at its 3' end:

3'-tcctgacagatCAGGGGTGTCGTGCAAAGAACCTCATG-5' (SEQ ID NO: 2).
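The stated complementarity between SEQ ID NO: 1 and SEQ ID NO: 2 can be confirmed with a few lines of Python. This is only a convenience check; the helper name is an assumption, and the two strings are copied from the text above.

```python
COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}


def complement_3to5(seq_5to3):
    """Return the complementary strand written 3'->5', i.e. the
    base-by-base complement in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in seq_5to3.lower())


SEQ_ID_1 = "aggactgtctaGTCCCCACAGCACGTTTCTTGGAGTAC"  # written 5'->3'
SEQ_ID_2 = "tcctgacagatCAGGGGTGTCGTGCAAAGAACCTCATG"  # written 3'->5'

is_complementary = complement_3to5(SEQ_ID_1) == SEQ_ID_2.lower()
print(is_complementary)
```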

The target sequence of the template was as follows: CGAGTACTGGAACAGCCAGAAGGACCTCCTGGAGCAGAAGC (SEQ ID NO: 3).

Sequencing of the target sequence of the template was performed by separating the double-stranded nucleic acids resulting from the PCR amplification and annealing a sequencing primer with the sequence GGCGGCCTGATGC (SEQ ID NO: 4) to one of the single-stranded nucleic acids. Sequencing of the target sequence was performed using a PSQ 96 System developed by Pyrosequencing AG (Uppsala, Sweden) using the manufacturer's recommended instructions. Annealing of the sequencing primer was performed according to the manufacturer's recommended instructions. The nucleotide dispensation order of the sequencing reaction was provided as cyclical repeats of TGAC. The pyrogram resulting from this example is shown in FIG. 1. As will be understood by the skilled artisan, the 3' end of the template nucleic acid has a sequence which is capable of annealing back on itself, forming a stem-loop structure, except that the 3' terminal base is unmatched:

[Diagram: stem-loop formed by the 3' end of the template, 3'-tcctgacagatCAGGGGTGTCGTGCAAAGAACCTCATG-5', with the 3' terminal t left unpaired.]

As the 3' terminus is unmatched, the 3' end of the template nucleic acid is substantially incapable of acting as a point of initiation for the polymerase in the sequencing reaction.
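The fold-back with an unpaired 3' terminal base can be checked computationally. The Python sketch below looks for an upstream partner of the bases immediately preceding the 3' terminal base and then asks whether the terminal base itself could pair. The function name, the stem length of 5 and the minimum loop of 3 are assumptions chosen for this sequence, not parameters given in the text.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def three_prime_end_is_paired(strand_5to3, stem_len=5, min_loop=3):
    """Return True or False for whether the 3' terminal base can pair when
    the stem_len bases just before it fold back onto the strand, or None
    if no such stem partner is found upstream."""
    seq = strand_5to3.upper()
    stem = seq[-(stem_len + 1):-1]          # bases just 5' of the terminal base
    partner = reverse_complement(stem)
    search_end = len(seq) - (stem_len + 1) - min_loop
    start = seq.rfind(partner, 0, search_end)
    if start < 1:
        return None
    # The terminal base would sit opposite the base just 5' of the partner.
    return COMPLEMENT[seq[-1]] == seq[start - 1]


SEQ_ID_2_3TO5 = "tcctgacagatCAGGGGTGTCGTGCAAAGAACCTCATG"  # as written, 3'->5'
template_5to3 = SEQ_ID_2_3TO5[::-1]

terminal_paired = three_prime_end_is_paired(template_5to3)
# Hypothetical control: the same strand with a 3' terminal c instead of t.
control_paired = three_prime_end_is_paired(template_5to3[:-1] + "c")
print(terminal_paired, control_paired)
```

For the template of this example the check reports an unpaired terminus, while the hypothetical control with a matching terminal base reports a paired one.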

One skilled in the art will readily appreciate that the target sequence can also be discerned by sequencing the complementary nucleic acid strand having the complement of the target sequence and appropriately selected primers.

Example 2

In this example a sequencing reaction of the target sequence in Example 1 was performed as described in that Example except that the 3' end of the template nucleic acid was not modified to be non-extendable. The reaction conditions for this Example were the same as shown in Example 1 thus providing a comparative example for the present methods. The pyrogram resulting from this Example is shown in FIG. 2.

RESULTS

The nucleotide dispensation order and expected nucleotide incorporation events are shown for the Examples in Table 1 below:

Table 1

As can be seen from a comparison of FIGS. 1 and 2 in light of the above table, the present methods significantly reduce the background signal in the pyrosequencing reaction. For example, the fifth, sixth, eighth, ninth and eleventh nucleotide dispensation events would not be expected to provide an incorporation signal, as shown in Table 1. This expectation is well represented in FIG. 1, in which all of these nucleotide dispensations provided signal values below 5 signal units, whereas in FIG. 2 the same nucleotide dispensations provided significant signal exceeding 10 signal units in all cases, and several dispensations provided 15 signal units or more of signal. Thus background can be reduced by the present methods by a factor of at least 2, 3, 4, 5 or more, up to a factor of 10, 15 to 25. This reduction in non-target sequence specific background signal (signal not resulting from sequencing of the target sequence) provides for easier interpretation of the sequencing by synthesis results by reducing false positive incorporation signals.

Example 3

The present example illustrates the present methods used in the multiplex sequencing of four different single nucleotide polymorphisms simultaneously in the same tube. Sequencing of the following SNPs is performed using a PSQ 96 System developed by Pyrosequencing AG (Uppsala, Sweden) using the manufacturer's recommended instructions:

This example is designed such that the 3' end of the sequencing primer anneals at a position that allows one to design an out-of-phase nucleotide dispensation order. One such example is to initiate the nucleotide incorporation onto each SNP containing template using different dNTPs (dATP or dGTP or dCTP or dTTP).

This example can be carried out as follows:

Tube 1 contains SNP target 1 (for example about 1 pmole of oligonucleotide 1T and about 1 pmole of oligonucleotide 1C):

oligonucleotide 1T: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTCTACTTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 1C: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTCTACCTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

Tube 2 contains SNP target 2 (for example about 1 pmole of oligonucleotide 2C and about 1 pmole of oligonucleotide 2A):

oligonucleotide 2C: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTTGCTCCGCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 2A: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTTGCTACGCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

Tube 3 contains SNP target 3 (for example about 1 pmole of oligonucleotide 3G and about 1 pmole of oligonucleotide 3T):

oligonucleotide 3G: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTGACAGCTCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 3T: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTGACATCTCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

Tube 4 contains SNP target 4 (for example about 1 pmole of oligonucleotide 4A and about 1 pmole of oligonucleotide 4C):

oligonucleotide 4A: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTATGCAGACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 4C: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTATGCCGACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

To one well of a 96-well plate add SNP target 1, SNP target 2, SNP target 3 and SNP target 4, which can each be added in an amount of about 1.5 pmoles. Four pyrosequencing primers with the sequences:

5' AGCCGTATAGCAGATG 3'

5' GATAGCCGTATAGCAA 3'

5' TAGCCGTATAGCACTG 3'

5' ATAGCCGTATAGCATA 3'

are also added to the same well according to the manufacturer's recommended conditions. Each primer will hybridize to the sequences at the bracketed positions indicated below.

SNP 1 5' AGCCGTATAGCAGATG 3'

3' ATGCTAGCTAGGCTAGCTA[TCGGCATATCGTCTAC]TTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

3' ATGCTAGCTAGGCTAGCTA[TCGGCATATCGTCTAC]CTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

SNP 2 5' GATAGCCGTATAGCAA 3'

3' ATGCTAGCTAGGCTAG[CTATCGGCATATCGTT]GCTCCGCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

3' ATGCTAGCTAGGCTAG[CTATCGGCATATCGTT]GCTACGCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

SNP 3 5' TAGCCGTATAGCACTG 3'

3' ATGCTAGCTAGGCTAGCT[ATCGGCATATCGTGAC]AGCTCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

3' ATGCTAGCTAGGCTAGCT[ATCGGCATATCGTGAC]ATCTCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

SNP 4 5' ATAGCCGTATAGCATA 3'

3' ATGCTAGCTAGGCTAGC[TATCGGCATATCGTAT]GCAGACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

3' ATGCTAGCTAGGCTAGC[TATCGGCATATCGTAT]GCCGACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'
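The bracketed annealing positions, and the first nucleotide each primer would accept, can be cross-checked with a short Python sketch. The helper name and the way the templates are assembled from their shared flanking sequences are conveniences assumed here for brevity; the sequences themselves are those listed above.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# All eight templates (written 3'->5') share the same flanks and differ
# only in a 7-base central region.
LEFT = "ATGCTAGCTAGGCTAGCTATCGGCATATCGT"
RIGHT = "CTAGAGTCTAATCGAGCTAGCTAGGCTATAC"
CENTRAL = {
    "1T": "CTACTTA", "1C": "CTACCTA",
    "2C": "TGCTCCG", "2A": "TGCTACG",
    "3G": "GACAGCT", "3T": "GACATCT",
    "4A": "ATGCAGA", "4C": "ATGCCGA",
}
TEMPLATES = {name: LEFT + mid + RIGHT for name, mid in CENTRAL.items()}

# Sequencing primers (written 5'->3'), keyed by SNP number.
PRIMERS = {
    "1": "AGCCGTATAGCAGATG",
    "2": "GATAGCCGTATAGCAA",
    "3": "TAGCCGTATAGCACTG",
    "4": "ATAGCCGTATAGCATA",
}


def first_incorporation(template_3to5, primer_5to3):
    """Return (annealing start index, first nucleotide added to the primer)."""
    site = "".join(COMPLEMENT[base] for base in primer_5to3)
    start = template_3to5.find(site)
    if start < 0:
        raise ValueError("primer does not anneal to this template")
    next_template_base = template_3to5[start + len(site)]
    return start, COMPLEMENT[next_template_base]


results = {
    name: first_incorporation(template, PRIMERS[name[0]])
    for name, template in TEMPLATES.items()
}
for name, (start, base) in results.items():
    print(name, start, base)
```

Under these sequences the two SNP 1 alleles differ at the very first incorporated nucleotide, while SNP 2, SNP 3 and SNP 4 each take one or more shared nucleotides before reaching their polymorphic position.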

A pyrosequencing reaction is run using the following nucleotide dispensation order (NDO) where dd indicates a chain terminating dideoxynucleotide. Nucleotide incorporation onto each template according to complementary base pairing and the net nucleotide incorporation as detected by PSQ96 are indicated below (- indicates chain termination has occurred previously).

It is clear that the first incorporation events, ddA and/or ddG, terminate elongation of SNP1 and detection of these incorporation events identifies the polymorphisms in SNP1. It will be readily apparent to one skilled in the art that the dispensation order of ddA and ddG could be reversed and the same results obtained. Next, the dispensation of C elongates the primers on SNP2 and SNP4 but not SNP3 (SNP1 no longer being capable of extension). Addition of T elongates SNP3 primer only, which is thereafter terminated by addition of ddC and/or ddA. Addition of G elongates the primers on SNP2 and SNP4 and the SNP4 primer is terminated by addition of ddT and/or ddG. Further elongation of the primer on SNP2 can then occur as shown. As is apparent from this example, not all of the nucleic acid sequences in a multiplex reaction need to be subjected to termination by a blocking agent.

Example 4

This example is designed such that the 3' end of the sequencing primer anneals at a position that allows one to design an out-of-phase nucleotide dispensation order. One such example is to initiate the nucleotide incorporation onto each SNP containing template using different dNTPs (dATP or dGTP or dCTP or dTTP). The advantage of such a method is that it minimizes the net number of nucleotide incorporations, generally to less than 8, from the maximal number of polymorphic templates, so that the output signal (light) does not saturate. Sequencing of the following SNPs can be performed using a PSQ 96 System developed by Pyrosequencing AG (Uppsala, Sweden) using the manufacturer's recommended instructions.

This example can be carried out as follows:

Tube 1 contains SNP target 1 (for example about 1 pmole of oligonucleotide 1T and about 1 pmole of oligonucleotide 1C):

oligonucleotide 1T: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]CTACTTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 1C: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]CTACCTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

Tube 2 contains SNP target 2 (for example about 1 pmole of oligonucleotide 2C and about 1 pmole of oligonucleotide 2A):

oligonucleotide 2C: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]TGCTCCGCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 2A: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]TGCTACGCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

Tube 3 contains SNP target 3 (for example about 1 pmole of oligonucleotide 3G and about 1 pmole of oligonucleotide 3T):

oligonucleotide 3G: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]GACAGCTCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 3T: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]GACATCTCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

Tube 4 contains SNP target 4 (for example about 1 pmole of oligonucleotide 4A and about 1 pmole of oligonucleotide 4C):

oligonucleotide 4A: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]ATGCAGACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 4C: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]ATGCCGACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

To one well of a 96-well plate add SNP target 1, SNP target 2, SNP target 3 and SNP target 4, which can each be added in an amount of about 1.5 pmoles. A pyrosequencing primer with the sequence 5' CGATAGCCGTATAGCA 3' is also added to the same well according to the manufacturer's recommended conditions. The primer will hybridize to the sequences at the bracketed positions above.

A pyrosequencing reaction is run using the following nucleotide dispensation order (NDO). Nucleotide incorporation onto each template according to complementary base pairing and the net nucleotide incorporation as detected by PSQ96 are indicated below.

The underlined sequences indicate the sequences that have elongated during the pyrosequencing reaction.

oligonucleotide 1T: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]CTACTTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 1C: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]CTACCTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 2C: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]TGCTCCGCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 2A: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]TGCTACGCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 3G: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]GACAGCTCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 3T: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]GACATCTCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 4A: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]ATGCAGACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 4C: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]ATGCCGACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'
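To illustrate how net incorporation per dispensation can be tallied for a single shared primer, the following Python sketch extends the primer on all eight templates of this example. The dispensation order used (G, A, C, T) is a hypothetical order chosen for illustration and is not the nucleotide dispensation order of this example; the function name and the assumption of equal template amounts are likewise assumptions.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Templates (written 3'->5') assembled from shared flanks for brevity.
LEFT = "ATGCTAGCTAGGCTAGCTATCGGCATATCGT"
RIGHT = "CTAGAGTCTAATCGAGCTAGCTAGGCTATAC"
CENTRAL = {
    "1T": "CTACTTA", "1C": "CTACCTA",
    "2C": "TGCTCCG", "2A": "TGCTACG",
    "3G": "GACAGCT", "3T": "GACATCT",
    "4A": "ATGCAGA", "4C": "ATGCCGA",
}
TEMPLATES = {name: LEFT + mid + RIGHT for name, mid in CENTRAL.items()}
PRIMER = "CGATAGCCGTATAGCA"  # written 5'->3'


def net_incorporation(templates_3to5, primer_5to3, dispensation_order):
    """Return the number of nucleotides incorporated at each dispensation,
    summed over all templates (one copy of each template assumed)."""
    site = "".join(COMPLEMENT[base] for base in primer_5to3)
    needed = {}
    for name, template in templates_3to5.items():
        start = template.find(site)
        if start < 0:
            raise ValueError("primer does not anneal to " + name)
        # Bases the polymerase must add, in order, after the primer.
        needed[name] = [COMPLEMENT[b] for b in template[start + len(site):]]
    position = {name: 0 for name in needed}
    totals = []
    for base in dispensation_order:
        total = 0
        for name, bases in needed.items():
            i = position[name]
            # A homopolymer run is extended within one dispensation.
            while i < len(bases) and bases[i] == base:
                i += 1
            total += i - position[name]
            position[name] = i
        totals.append(total)
    return totals


# Hypothetical dispensation order, for illustration only.
totals = net_incorporation(TEMPLATES, PRIMER, "GACT")
print(totals)
```

With this hypothetical order each dispensation yields fewer than 8 net incorporations across the eight templates, in line with the stated aim of keeping the light signal below saturation.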

Example 5

This example is designed such that the 3' end of the sequencing primer anneals at a position that allows one to design an out-of-phase nucleotide dispensation order. This example should be carried out as Example 4 above with the noted changes. In this example the sequencing primer is designed so that hybridization of the 5' end of the primer can occur to a position immediately downstream of the SNP site. The advantage of such a method is that it minimizes the net number of nucleotide incorporations, generally to less than 8, from the maximal number of polymorphic templates, so that the output signal (light) does not saturate. Sequencing of the following SNPs can be performed using a PSQ 96 System developed by Pyrosequencing AG (Uppsala, Sweden) using the manufacturer's recommended instructions. The same set of target DNAs are used for this experiment as in Example 2.

Tube 1 contains SNP target 1 (for example about 1 pmole of oligonucleotide 1T and about 1 pmole of oligonucleotide 1C):

oligonucleotide 1T: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]CTACTTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 1C: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]CTACCTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

Tube 2 contains SNP target 2 (for example about 1 pmole of oligonucleotide 2C and about 1 pmole of oligonucleotide 2A):

oligonucleotide 2C: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]TGCTCCGCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 2A: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]TGCTACGCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

Tube 3 contains SNP target 3 (for example about 1 pmole of oligonucleotide 3G and about 1 pmole of oligonucleotide 3T):

oligonucleotide 3G: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]GACAGCTCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 3T: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]GACATCTCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

Tube 4 contains SNP target 4 (for example about 1 pmole of oligonucleotide 4A and about 1 pmole of oligonucleotide 4C):

oligonucleotide 4A: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]ATGCAGACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 4C: 3' ATGCTAGCTAGGCTA[GCTATCGGCATATCGT]ATGCCGACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

To one well of a 96-well plate add SNP target 1, SNP target 2, SNP target 3 and SNP target 4, which can each be added in an amount of about 1.5 pmoles. A pyrosequencing primer with the sequence:

5' TCTCAGATTAGCTCGCGATCGATTAGTTCTAGTCGTAGCCGTATAGCA 3'

is also added to the same well according to the manufacturer's recommended conditions. The underlined 5' end of the primer will hybridize to the target sequence on the template nucleic acid downstream of the SNP, the italicized (middle) portion of the primer will not hybridize to the template nucleic acid but will act as a flexible linker and the underlined 3' end of the primer will hybridize to the target sequence on the template nucleic acid upstream of the SNP as depicted below:

SNP 1

CGATAGCCGTATAGCA 3' 5' GATCTCAGATTAGCTC

oligonucleotide 1T: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTCTACTTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 1C: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTCTACCTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

SNP 2

CGATAGCCGTATAGCA 3' 5' GATCTCAGATTAGCTC

oligonucleotide 2C: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTTGCTCCGCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 2A: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTTGCTACGCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

SNP 3

CGATAGCCGTATAGCA 3' 5' GATCTCAGATTAGCTC

oligonucleotide 3G: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTGACAGCTCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 3T: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTGACATCTCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

SNP 4

CGATAGCCGTATAGCA 3' 5' GATCTCAGATTAGCTC

oligonucleotide 4A: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTATGCAGACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

oligonucleotide 4C: 3' ATGCTAGCTAGGCTAGCTATCGGCATATCGTATGCCGACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5'

Example 6

This example is designed such that the 3' end of the sequencing primer anneals at a position that allows one to design an out-of-phase nucleotide dispensation order. In this example the sequencing primer is designed so that hybridization of the 5' end of the primer can occur to a position immediately downstream of the SNP site. The advantage of this Example over Example 1 is that the reagent-loading cartridge can be readily used and no reengineering to add in more cells is needed. Sequencing of the following SNPs can be performed using a PSQ 96 System developed by Pyrosequencing AG (Uppsala, Sweden) using the manufacturer's recommended instructions. The same set of target DNAs are used for this experiment as in Example 2.

The same set of template DNAs is used for this experiment as used above. To the same well containing all 4 target DNAs, add the following 4 pyrosequencing primers:

5' ATGATCTCAGATTAGCTCGATTAGTTCTAGTAGCCGTATAGCAGATG 3'

5' GCTAGCCGTATAGCAATCGATTAGTTCTAGTGATAGCCGTATAGCAA 3'

5' GAGCCGTATAGCACTGTCGATTAGTTCTAGTTAGCCGTATAGCACTG 3'

5' CTAGCCGTATAGCATATCGATTAGTTCTAGTATAGCCGTATAGCATA 3'

The underlined 5' end of the primers will hybridize to the target sequence on the template nucleic acid downstream of the SNP, the italicized (middle) portion of the primer will not hybridize to the template nucleic acid but will act as a flexible linker and the underlined 3' end of the primer will hybridize to the target sequence on the template nucleic acid upstream of the SNP as depicted below:

3 ' ATGCTAGCTAGGCTAGCTATCGGCATATCGTCTACTTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5 '

3 ' ATGCTAGCTAGGCTAGCTATCGGCATATCGTCTACCTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5 '

3 ' ATGCTAGCTAGGCTAGCTATCGGCATATCGTTGCTCCGCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5 '

3 ' ATGCTAGCTAGGCTAGCTATCGGCATATCGTTGCTACGCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5 '

3 ' ATGCTAGCTAGGCTAGCTATCGGCATATCGTGACAGCTCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5 '

3 ' ATGCTAGCTAGGCTAGCTATCGGCATATCGTGACATCTCTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5 '

3 ' ATGCTAGCTAGGCTAGCTATCGGCATATCGTATGCAGACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5 '

3 ' ATGCTAGCTAGGCTAGCTATCGGCATATCGTATGCCGACTAGAGTCTAATCGAGCTAGCTAGGCTATAC 5 '
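The two-armed primer design described above can be illustrated for the first pair of templates. The following Python sketch is a hedged illustration, not the applicants' procedure. It assumes that the 16 legible bases at each end of the first listed primer are its hybridizing 5' and 3' portions, that the middle portion is the non-hybridizing linker, and that the first two templates listed above are the two alleles of the first SNP. The names used are hypothetical.

```python
# Illustrative sketch: arm boundaries are an assumption read from the
# legible ends of the first listed primer; names are not from the source.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the base-by-base complement, without reversing."""
    return seq.translate(COMPLEMENT)


def gap_between_arms(template_3to5: str, arm_5prime: str, arm_3prime: str) -> str:
    """Template bases left unpaired between the two arms of one primer.

    The template is written 3'->5' and each arm 5'->3', so each arm
    pairs left to right with its complement in the template string.
    """
    upstream = template_3to5.find(complement(arm_3prime))
    downstream = template_3to5.find(complement(arm_5prime))
    if upstream < 0 or downstream < 0:
        raise ValueError("an arm does not pair with this template")
    return template_3to5[upstream + len(arm_3prime):downstream]


TEMPLATE_1 = "ATGCTAGCTAGGCTAGCTATCGGCATATCGTCTACTTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC"
TEMPLATE_2 = "ATGCTAGCTAGGCTAGCTATCGGCATATCGTCTACCTACTAGAGTCTAATCGAGCTAGCTAGGCTATAC"
ARM_5PRIME = "ATGATCTCAGATTAGC"  # assumed 5' hybridizing portion
ARM_3PRIME = "AGCCGTATAGCAGATG"  # assumed 3' hybridizing portion

for template in (TEMPLATE_1, TEMPLATE_2):
    gap = gap_between_arms(template, ARM_5PRIME, ARM_3PRIME)
    # The nucleotide complementary to the gap would be the one added
    # to the 3' arm when it is extended across the SNP position.
    print(gap, complement(gap))
```

Under these assumptions the two arms leave a single unpaired template base between them, T in the first template and C in the second, so extension from the 3' arm would add A or G respectively at the SNP position.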

As will be understood by one skilled in the art, for any and all purposes, particularly in terms of providing a written description, all ranges disclosed herein also encompass any and all possible subranges and combinations of subranges thereof. Any listed range can be easily recognized as sufficiently describing and enabling the same range being broken down into at least equal halves, thirds, quarters, fifths, tenths, etc. As a non-limiting example, each range discussed herein can be readily broken down into a lower third, middle third and upper third, etc. As will also be understood by one skilled in the art all language such as "up to," "at least," "greater than," "less than," "more than" and the like include the number recited and refer to ranges which can be subsequently broken down into subranges as discussed above. In the same manner, all ratios disclosed herein also include all subratios falling within the broader ratio.

One skilled in the art will also readily recognize that where members are grouped together in a common manner, such as in a Markush group, the present invention encompasses not only the entire group listed as a whole, but each member of the group individually and all possible subgroups of the main group. Accordingly, for all purposes, the present invention encompasses not only the main group, but also the main group absent one or more of the group members. The present invention also envisages the explicit exclusion of one or more of any of the group members in the claimed invention.

All references disclosed herein are specifically incorporated by reference thereto.

While preferred embodiments have been illustrated and described, it should be understood that changes and modifications can be made therein in accordance with ordinary skill in the art without departing from the invention in its broader aspects as defined in the following claims.

Claims

CLAIMS

What is claimed is:

1. A method of reducing non-specific incorporation events in a sequencing by synthesis reaction, comprising:

(a) performing a sequencing by synthesis reaction on a template nucleic acid; and

(b) preventing the 3' end of the template nucleic acid from undergoing extension thereby reducing competing non-template specific incorporation events in the sequencing by synthesis reaction.

2. The method of claim 1 wherein (a) comprises:

(i) hybridizing a sequencing primer with the template nucleic acid;

(ii) elongating the sequencing primer by the addition of a nucleotide corresponding to one of adenine, cytosine, guanine, thymine or uracil, wherein the nucleotide will only elongate the sequencing primer when the deoxynucleotide is complementary to the corresponding base in the template nucleic acid adjacent to the last position of the sequencing primer;

(iii) removing substantially all unincorporated deoxynucleotides;

(iv) repeating (b) one or more times with an additional deoxynucleotide corresponding to one of adenine, cytosine, guanine, thymine or uracil; and

(v) detecting incorporation of the deoxynucleotides onto the sequencing primer.

3. The method of claim wherein (a) further comprises:

(vi) determining the sequence of the template nucleic acid based on the order and amount in which the nucleotides are incorporated onto the sequencing primer.

4. The method of claim 1 further comprising blocking the 3' end of the template nucleic acid with an agent that prevents extension of the 3' end of the nucleic acid.

5. The method of claim 1 wherein the 3' end of the template nucleic acid comprises a non-extendible nucleotide.

6. The method of claim 4 wherein the non-extendible nucleotide is a dideoxynucleotide.

7. The method of claim 1 wherein (b) comprises:

(i) hybridizing a non-extendible nucleic acid to the 3' end of the template nucleic acid.

8. The method of claim 1 wherein the 3' end of the template nucleic acid comprises a homopolynucleotide sequence.

9. The method of claim 1 wherein (b) comprises:

(i) binding a protein to the 3' end of the template nucleic acid.

10. The method of claim 1 wherein (b) comprises:

(i) binding the 3' end of the template nucleic acid to a solid support.

11. The method of claim 1 wherein (b) comprises:

(i) adding a self-annealing sequence to the 3' end of the template nucleic acid wherein the terminal 3' nucleotide does not self-hybridize to the template nucleic acid.

12. The method of claim 11 wherein the self-annealing sequence comprises a palindromic sequence.

13. The method of claim 11 wherein the self-annealing sequence comprises universal bases.

14. A nucleic acid, comprising a sequence having a 5' terminus and a 3' terminus, wherein the 3' terminus of the nucleic acid comprises a sequence which can hybridize with itself provided that the 3' terminal base of the nucleic acid is not capable of being extended by a nucleic acid polymerase.

15. The nucleic acid of claim 14 wherein the self-annealable sequence comprises a palindromic sequence.

16. A method for determining the sequence of a polymorphic nucleic acid comprising:

(a) hybridizing a sequencing primer with a nucleic acid;

(b) elongating the sequencing primer by the addition of a nucleotide corresponding to one of adenine, cytosine, guanine, thymine or uracil, wherein the nucleotide will only elongate the sequencing primer when the deoxynucleotide is complementary to the corresponding base in the nucleic acid adjacent to the last position of the sequencing primer;

(c) removing substantially all unincorporated deoxynucleotides;

(d) repeating (b) one or more times with an additional deoxynucleotide corresponding to one of adenine, cytosine, guanine, thymine or uracil;

(f) detecting incorporation of the deoxynucleotides onto the sequencing primer; and

(g) determining the sequence of the nucleic acid based on the order and amount in which the deoxynucleotides were incorporated into the sequencing primer wherein terminating the elongation of the sequencing primer with the blocking agent enhances the sequence determination of the polymorphic nucleic acid.

17. The method of claim 16 wherein the incorporation of the deoxynucleotide releases phosphate in a quantity which is proportional to the amount of the deoxynucleotide incorporated and the proportion of the released phosphate is correlated with the amount of the deoxynucleotide that is incorporated.

18. The method of claim 16 wherein the blocking agent is non-extendable nucleotide or a blocking oligonucleotide.

19. The method of claim 18 wherein the blocking oligonucleotide has a 5' end which is resistant to exonuclease degradation.

20. A kit for determining the sequence of a nucleic acid comprising:

(a) instructions for carrying out the method of claim 1 or claim 16; and

(b) one or more reagents for carrying out the method.