WO2002006527A2 - Method for detection of truncated proteins - Google Patents

Publication number
WO2002006527A2
Authority
WO
WIPO (PCT)
Prior art keywords
detector
sequence
nucleotide sequence
nucleic acid
mutation
Prior art date
Application number
PCT/US2001/010672
Other languages
French (fr)
Other versions
WO2002006527A3 (en
Inventor
David Hornby
Maryam Matin
Original Assignee
Transgenomic, Inc.
Application filed by Transgenomic, Inc.
Priority to AU2001253092A
Publication of WO2002006527A2
Publication of WO2002006527A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6897: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids involving reporter genes operably linked to promoters

Definitions

  • the present invention is directed to methods and materials useful for the analysis of DNA and other polynucleotides.
  • the invention is particularly suited to applications involving the detection and/or characterization of a genetic mutation.
  • Genetic mutations have been shown to be involved in many diseases and pathological conditions. It follows that methods and reagents that enable one to detect and characterize a genetic mutation in an individual are inherently useful to the medical community. For example, identification and characterization of mutations can be used to predict the propensity of an individual to develop a disease, e.g., cancer, or the likelihood that an individual will pass such a propensity on to his or her children, e.g., an inheritable disease.
  • a genetic mutation is an alteration in the DNA sequence of an individual's genome, relative to the normally occurring, or wild type sequence. While some genetic mutations are benign and do not affect the phenotype of the affected individual, other mutations can cause serious physiological consequences leading to pathologies and sometimes death. These medically relevant mutations often act by causing an alteration in the amino acid sequence of a protein encoded by a mutated nucleotide sequence, i.e., a protein mutation, or by causing a change in the expression level of one or more genes.
  • the first type is frequently caused by a mutation in a stretch of DNA that encodes a protein sequence, but can also be caused by mutations outside the coding region, e.g., a mutation that affects the splicing of exons.
  • the second type is typically caused by mutation in a regulatory region.
  • a number of different types of DNA mutations have been identified. For example, a point mutation is one involving the substitution of one nucleotide residue, or "base," for another.
  • point mutations that introduce a translation stop codon (i.e., opal, amber or ochre) into a protein coding sequence are referred to as "nonsense" mutations.
  • a nonsense mutation causes translation to terminate prematurely, resulting in truncation of the entire length of the protein downstream from the mutation, often leading to a total loss of function for the affected protein. This loss of function can have devastating medical consequences.
  • Nonsense mutations figure prominently in those mutations that have been found to be associated with genetic disease.
  • Another class of mutation consists of insertions and deletions of one or more bases in a DNA sequence. Like nonsense mutations, insertion and deletion mutations can be particularly potent and are often associated with genetic disease. These mutations typically exert their effect by causing a shift in reading frame for the sequence downstream from the mutation. Unless the insertion or deletion consists of three bases, or some multiple of three, the reading frame for any codons subsequent to the mutation will be out of register, resulting in a complete disruption of the encoded amino acid sequence. This type of mutation will in many instances eliminate the proper stop codon and introduce a stop codon at some other location in the sequence, thereby altering the molecular weight of the mutated protein. A frame shift mutation that results in the introduction of a stop codon is referred to as a "truncating mutation." As used herein, nonsense mutations and truncating mutations are encompassed by the term "stop codon-introducing mutations."
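The effect of the two classes of stop codon-introducing mutation described above can be illustrated with a minimal sketch. The codon table is the standard genetic code; the 18-base `WILD_TYPE` sequence and the function name `translate` are hypothetical and chosen only so that each mutation type produces a visible result.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # opal, amber or ochre
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical coding sequence: ATG GCA TTG AAC GGT CAA
WILD_TYPE = "ATGGCATTGAACGGTCAA"

# Nonsense mutation: TTG -> TAG (single base substitution).
NONSENSE = WILD_TYPE[:7] + "A" + WILD_TYPE[8:]

# Frameshift: deleting one base brings a TGA stop codon into frame.
FRAMESHIFT = WILD_TYPE[:3] + WILD_TYPE[4:]

# In-frame deletion of three bases: one residue lost, frame preserved.
IN_FRAME_DELETION = WILD_TYPE[:3] + WILD_TYPE[6:]

for name, seq in [
    ("wild type", WILD_TYPE),
    ("nonsense", NONSENSE),
    ("frameshift", FRAMESHIFT),
    ("in-frame deletion", IN_FRAME_DELETION),
]:
    print(f"{name:18s} {translate(seq)}")
```

In this toy case the nonsense and frameshift variants both yield a two-residue product, whereas the three-base deletion removes a single residue and leaves the downstream sequence intact.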
  • BRCA1 and BRCA2 genes are well known examples of genes wherein mutations have been shown to have seriously deleterious repercussions. The mutation of these genes has been found to play a critical role in a large number of cases of breast and/or ovarian cancer.
  • BRCA1 is a tumour suppressor gene located on the long arm of chromosome
  • Tumour suppressor genes play a role in regulating cell growth.
  • a woman is predisposed to breast or ovarian cancer (Narod et al., Cancer 74:2341-46 (1994)), when one copy of the BRCA1 gene is inherited in a defective (mutant) form.
  • Development of cancer in either organ involves a number of additional mutations, at least one of which involves the other copy (allele) of BRCA1.
  • a woman who inherits one mutant allele of BRCA1 from either her mother or father has a high risk of developing breast cancer.
  • the gene encoding BRCA1 is relatively large with a transcription unit of 24 exons that spans approximately 81,000 bp of genomic DNA (Miki et al., 1994; Smith et al., Genome Research 6:1029-49 (1996)).
  • the BRCA1 protein comprises 1863 amino acids and has a molecular weight of 220 kDa.
  • the majority of the exons are small, often less than 100 bp in size, but there is one large exon (exon 11) which comprises 61% of the entire coding region (Hogervorst et al., Nature Genetics 10:208-12 (1995)). In amplifying the smaller exons, a significant proportion of incidental intronic sequence is included in diagnostics.
  • Genetic mutations in diploid organisms can be either homozygous or heterozygous.
  • a homozygous mutation is one that appears in both gene copies, while a heterozygous mutation occurs in only one of the two copies.
  • an individual that is heterozygous for a mutation that results in a loss or impairment of function for the encoded protein will often experience no deleterious phenotype, because the other, wild-type copy of the gene encodes enough normal protein to maintain function.
  • an individual homozygous for the same mutation, e.g., the progeny of two heterozygous individuals, will be incapable of producing the normal protein and this can result in a disease or lethal phenotype. Familiar examples of such genetic conditions include sickle cell anemia and Tay-Sachs disease.
  • the presence of a heterozygous mutation can also render an individual more prone to certain diseases, cancer constituting an important example.
  • the development of many cancers has been linked to an inherited heterozygous mutation in the p53 cancer suppressor gene (Akashi and Koeffler, Clin Obstet Gynecol. 41:172-99 (1998)). It is believed that the inactivation of one copy of the gene in itself does not normally result in cancer, since the other copy encodes enough wild-type p53 protein to provide protection.
  • the individual is susceptible to cancer due to the possibility of a mutation arising in the other copy of the gene in some cell of the individual, in which case no wild-type p53 protein is encoded and its cancer suppression function is lost.
  • a related problem is that some mutation detection methods have difficulty determining whether an identified mutation is heterozygous or homozygous. Thus, in any method of mutation detection it is valuable to be able to identify and distinguish between heterozygous and homozygous mutations, and if possible quantify the amount of mutant allele present in a tissue sample. The ability to quantify facilitates the detection of allelic imbalances such as loss-of-heterozygosity in the presence of a substantial background of normal cells.
  • Sequence heterogeneity in a DNA sample can also result from a cell preparation that is not homogeneous.
  • a cancer cell sample is often infiltrated and contaminated by normal cells, with the normal cells sometimes present in large excess over the cancer cells. It can be extremely difficult to detect a mutation in the cancer cells because of an excess of wild-type sequences in the sample arising from the normal cells. Any method able to identify and characterize a cancer-related mutation from a sample dominated by normal cells would thus be of extreme relevance to the medical community.
  • PCR: polymerase chain reaction
  • DNA hybridization microarrays represent another technology that has found increasing application in mutation detection. While the relative ease and economy of microarray methods often makes them a better choice than direct sequencing for large-scale mutation screening, these methods have experienced difficulty detecting certain mutations, particularly frameshift mutations and mutations in heterogeneous samples.
  • The protein truncation test (PTT) is a method specifically suited for the detection of premature translation termination mutations, i.e., stop codon-introducing mutations (Roest et al., Human Molecular Genetics 2:1719-21 (1993)).
  • This procedure is used for mutation screening in certain diagnostic testing facilities and involves the following steps: (1) isolation of genomic DNA and amplification of coding sequence of the target gene by PCR, or, alternatively, isolation of RNA and RT-PCR amplification of the target gene; (2) use of the amplified target as a template for the production of the radiolabeled product by in vitro transcription and translation; and (3) molecular weight (MW) analysis of the radiolabeled protein by SDS-PAGE.
  • a key feature of PTT is a specifically designed, tailed, forward primer used in the PCR amplification.
  • This primer contains four specific regions.
  • a restriction site is frequently engineered into the 5' end of the primer.
  • a transcription promoter sequence, e.g., a T7, SP6 or T3 promoter, is included to direct the transcription of RNA by the corresponding viral RNA polymerase.
  • the promoter sequence is followed by a spacer sequence of 3-6 bases and then a eukaryotic translation initiation sequence, i.e., a Kozak sequence (Kozak, Nucleic Acids Research 12:857-872 (1984)).
  • the 3' end contains 17-20 bases complementary to the target gene sequence and in frame with the ATG codon from the Kozak sequence.
  • the forward primer is necessarily relatively long, typically at least 56 bases in length.
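The layout of the tailed forward primer described above can be sketched as a simple concatenation. Every sequence constant here is an illustrative assumption rather than a value from the source: a BamHI site stands in for the engineered restriction site, the widely used T7 promoter sequence for the promoter, an arbitrary five-base spacer, and a Kozak-type sequence ending in the ATG start codon. The function name `build_ptt_forward_primer` is likewise hypothetical.

```python
# Illustrative building blocks (assumptions, not taken from the source).
RESTRICTION_SITE = "GGATCC"            # BamHI recognition site
T7_PROMOTER = "TAATACGACTCACTATAGGG"   # commonly used T7 promoter sequence
SPACER = "AACAG"                       # arbitrary spacer, 3-6 bases allowed
KOZAK = "GCCACCATG"                    # Kozak-type sequence ending in ATG


def build_ptt_forward_primer(target_5prime: str) -> str:
    """Assemble a tailed forward primer, written 5' to 3'.

    target_5prime holds the 17-20 gene-specific bases and is assumed to
    start on a codon boundary so that it is in frame with the ATG.
    """
    if not 3 <= len(SPACER) <= 6:
        raise ValueError("spacer must be 3-6 bases long")
    if not 17 <= len(target_5prime) <= 20:
        raise ValueError("target-specific region must be 17-20 bases long")
    return RESTRICTION_SITE + T7_PROMOTER + SPACER + KOZAK + target_5prime


# Hypothetical 18-base gene-specific region.
primer = build_ptt_forward_primer("GCATTGAACGGTCAAGTC")
print(len(primer), primer)
```

With these assumed elements the tail alone accounts for 40 bases, which is consistent with the observation that such primers are necessarily long.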
  • Detection of the encoded protein product is typically achieved by the incorporation of a radiolabeled amino acid during synthesis, using [35S]methionine, [35S]cysteine or [3H]leucine.
  • Alternatively, the newly synthesised proteins can be labeled by direct incorporation of biotin-labeled lysines. After SDS-PAGE separation and transfer to a membrane, the biotinylated proteins are detected by a streptavidin conjugate and visualised using a chemiluminescent substrate.
  • The complexity of PTT means that its application can require an experienced technician and considerable time, i.e., three days for the procedure and one day for the direct sequencing to characterize any mutations identified.
  • Mutation detection by PTT is known to suffer from a number of shortcomings. For one, early mutations (i.e., mutations near the N-terminus of the product of the test construct) can result in products which are too short for detection, either because little or no label is incorporated or because the electrophoretic migration falls outside the range of resolution. On the other hand, late mutations (occurring near the C-terminal end) might result in a minimal shift in size that cannot be resolved by conventional SDS-PAGE. Finally, mutations at translation initiation and termination sites represent a special case.
  • one object of the present invention is to provide improved methods for detecting mutations and other variants in a polynucleotide sequence, especially a DNA sample.
  • a further object of the invention is to provide nucleic acids and other reagents useful in practicing the improved methods described herein.
  • the invention comprises the steps of providing a detector nucleic acid which includes a detector nucleotide sequence, wherein the detector nucleotide sequence encodes a detector polypeptide having a detectable activity; inserting the target sequence within an open reading frame of the detector nucleotide sequence to give a chimeric sequence; expressing the chimeric sequence to give a chimeric polypeptide; determining the activity of the encoded chimeric polypeptide, and correlating the activity of the encoded chimeric polypeptide with the presence or absence of a mutation in the target nucleotide sequence.
  • the detector polypeptide can be an enzyme comprising two catalytically essential domains separated by a linker region, wherein additional polypeptide sequence can be inserted into the linker region to produce a chimeric enzyme which possesses a detectable catalytic activity as long as the catalytically essential domains remain linked, but wherein a decrease in the catalytic activity occurs if the insertion results in the loss of one of the catalytically essential domains from the detector polypeptide.
  • the loss of a catalytically essential domain can be the result of truncation or a frameshift in the linker region disrupting the reading frame of one of the catalytically essential domains.
  • the target sequence can be obtained by IP-RP-HPLC separation of a mixture of
  • the target nucleotide sequence comprises BRCA1 or BRCA2.
  • the detector nucleotide sequence comprises engineered bacteriophage multi-specific DNA methyltransferase M.SPRI.
  • the detector nucleic acid comprises a cloning vector including a detector nucleotide sequence that is a lethal gene and the target sequence is ligated into the detector nucleotide sequence.
  • the activity of the chimeric polypeptide is determined by determining the viability of a host transformed with the chimeric sequence.
  • the detector nucleotide sequence encodes a DNA methyltransferase, more preferably a modified cytosine (C-5)-specific DNA methyltransferase, more preferably a modified form of a precursor methyltransferase originally containing a plurality of target recognition domains, wherein the modification comprises the elimination of one of the target recognition domains.
  • the nucleotide sequence that originally encoded the eliminated target recognition domain can be replaced with a linker containing a plurality of restriction sites.
  • the detector nucleotide sequence comprises M.SPRX or a derivative thereof sharing at least 70% identity therewith.
  • the invention includes amplifying the target sequence under examination, using PCR primers containing 5' and 3' ends adapted for ligation into a vector; ligating the PCR products into an engineered MTase gene; transforming a host cell which has the property of being killed by active Mtase; and proliferating the cell to indicate the presence of a mutation within the sequence under examination.
  • the activity of the chimeric peptide is determined by assessing the methylation state of a DNA molecule derived from a host cell expressing the chimeric sequence.
  • the DNA molecule is a detector nucleic acid
  • the methylation state is assessed by treating the detector nucleic acid with a restriction endonuclease and determining the extent to which the detector nucleic acid is degraded by the endonuclease.
  • the detector nucleic acid is a vector, and the extent of degradation is determined by electrophoresis or by IP-RP-HPLC.
  • the detector nucleic acid comprises pSPRX and the method includes transforming an mcrBC- host, analyzing plasmids from mcrBC- transformants by HaeIII digestion, wherein if the insert is wild type, the plasmid is resistant to HaeIII cleavage, and wherein if the insert contains a mutation, the plasmid is at least partially degraded after HaeIII restriction digestion.
  • the invention includes purifying the chimeric polypeptide and carrying out size-based separation of the chimeric polypeptide.
  • the invention includes obtaining plasmids from the transformed host cells and sequencing the plasmids.
  • the invention includes a method for diagnosing, in an individual, a disease which is associated with a mutation in a target nucleotide sequence, which method includes performing a method as described in claim 1 using a sample of nucleic acid from that individual.
  • detector nucleic acid adapted for use in detecting a mutation in a target nucleotide sequence present in a sample nucleic acid
  • the detector nucleic acid includes a detector nucleotide sequence which encodes a detector polypeptide having a detectable activity
  • the detector polypeptide is an enzyme comprising two catalytically essential domains separated by a linker region, wherein additional polypeptide sequence can be inserted into the linker region to produce a chimeric enzyme which possesses a detectable catalytic activity so long as the catalytically essential domains remain linked, but wherein a decrease in the catalytic activity occurs if the insertion results in the loss of one of the catalytically essential domains from the detector polypeptide, the detector sequence comprising an insertion site wherein a target nucleotide sequence can be inserted to yield a chimeric sequence encoding the chimeric enzyme.
  • the detector nucleic acid is derived from a multi-component enzyme which can accommodate extra sequence within it with little or no loss of activity.
  • the detector polypeptide comprises an MTase, especially an MTase selected from the group consisting of M.SPRI, M.AquI and M.MSPI.
  • the detector nucleotide sequence comprises M.SPRX, e.g., the pSPRX vector described herein.
  • the detector nucleotide sequence comprises a target sequence, wherein the target sequence preferably comprises all or a diagnostic part of a eukaryotic gene correlated with a genetic disease, more preferably wherein the eukaryotic gene comprises a gene in which mutations have been characterized and correlated with disease states.
  • Yet another aspect of the invention is a method for diagnosing, in an individual, a disease which is associated with a mutation in a target nucleotide sequence, the method comprising providing a detector nucleic acid which includes a detector nucleotide sequence, wherein the detector nucleotide sequence encodes a detector polypeptide having a detectable activity; inserting the target sequence within an open reading frame of the detector nucleotide sequence to give a chimeric sequence; expressing the chimeric sequence to give a chimeric polypeptide; determining the activity of the encoded chimeric polypeptide and correlating the activity of the encoded chimeric polypeptide with the presence or absence of a mutation in the target nucleotide sequence.
  • Another aspect of the invention is a host cell harboring a detector nucleic acid of the invention.
  • the target nucleotide sequence is derived from genomic DNA, typically as the product of PCR amplification of a sequence of interest from genomic DNA.
  • the target nucleotide sequence is derived from mRNA, typically as the cDNA amplification product of RT-PCR.
  • FIG. 1 is a pseudo-three dimensional representation of the modified form of M.SPRI containing additional protein sequence such as that encoded by stretches of the BRCA1 gene and the product which is partially inactive as a consequence of truncation.
  • FIG. 2 is an overview of a protocol embodying one aspect of the present invention, involving the analysis of a heterozygous mutant.
  • FIG. 3 depicts the nucleotide sequence of M.SPRX with the polylinker region underlined.
  • FIG. 4(A) is a map of the plasmid pSPRX.
  • FIG. 4(B) depicts the nucleotide sequence of pSPRX.
  • FIG. 5 shows the gel electrophoretic separation of pSPRX-(8)1 isolated following transformation of an mcrBC- host and subjected to restriction analysis using HaeIII. The appearance of 7 degraded (D) and 5 undegraded (U) plasmids from a sample of twelve bacterial colonies from a transformation plate presents a striking demonstration that the patient is heterozygous for a stop codon-introducing mutation in BRCA1.
  • FIG. 6 shows SDS-PAGE gel analysis of size differences between the full-length chimeric polypeptide encoded by pSPRX-(8)2 (FIG. 5(A)) and the truncated chimeric polypeptide encoded by pSPRX-(8)1 (FIG. 5(B)), as described in Example 3.
  • the lanes preceding the purified sample are molecular weight markers and partially fractionated extracts of the recombinant strain expressing either normal or truncated protein.
  • FIG. 7 shows the bacterial transformations described in Example 5, illustrating the simplicity of the stop codon-introducing mutation test.
  • FIG. 8 shows superimposed chromatograms of pSPRX-(8)1 and pSPRX-(8)2
  • the pSPRX-(8)2 chromatogram is represented by a solid line, the pSPRX-(8)1 chromatogram by a dashed line.
  • the reduced activity of pSPRX-(8)2 is evidenced by the appearance of small peaks (fragments) in the corresponding curve.
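The colony-based reading of FIG. 5 (7 degraded and 5 undegraded plasmids among twelve colonies) lends itself to a short worked example. The sketch below is a simplified interpretation, not the source's procedure: it assumes every colony carries exactly one allele, that a heterozygous sample contributes both alleles with equal probability, and that the sample is not contaminated by other cells. The function names are hypothetical.

```python
def call_zygosity(degraded: int, undegraded: int) -> str:
    """Interpret per-colony digestion results under the stated assumptions."""
    if degraded < 0 or undegraded < 0 or degraded + undegraded == 0:
        raise ValueError("need non-negative counts and at least one colony")
    if degraded and undegraded:
        return "heterozygous"
    if degraded:
        return "homozygous mutant"
    return "no stop codon-introducing mutation detected"


def chance_of_single_class(colonies: int) -> float:
    """Probability that a heterozygous sample gives colonies of one class only.

    Assumes each colony independently carries either allele with
    probability 1/2 (an idealisation).
    """
    return 2 * 0.5 ** colonies


print(call_zygosity(degraded=7, undegraded=5))
print(chance_of_single_class(12))
```

Under these assumptions, picking twelve colonies leaves roughly a 1 in 2000 chance that a heterozygous mutation would be mistaken for a homozygous or wild-type result.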
  • the present invention provides novel methods and reagents that satisfy that need.
  • the invention is generally suited for a variety of applications pertaining to the analysis and characterization of DNA and other polynucleotides.
  • the present invention can be used to detect and characterize a genetic mutation in a nucleic acid sample of interest.
  • the invention can be used to detect both point mutations that result in premature translation termination and insertion/deletion mutations that cause a shift in translation reading frame without necessarily altering the mass of the encoded protein to an extent that can be detected by SDS-PAGE analysis. It thus represents a substantial improvement over some currently available methods, such as PTT, that only detect mutations that result in a substantial shift in SDS-PAGE mobility.
  • the invention can also be used to characterize and distinguish between heterozygous and homozygous mutations.
  • the invention is capable of detecting a mutation in the presence of non-mutant DNA, a problem which has proven refractory to a number of the currently available methodologies.
  • the invention is able to conveniently identify mutations in gene sequence regions that are refractory to some of the currently employed methods, e.g., mutations that occur early (N-terminal) or late (C-terminal) in a particular gene or gene segment, and mutations at translation initiation and termination sites.
  • the invention involves the use of a detector nucleotide sequence encoding a polypeptide, the activity of which can be attenuated or modified when a target sequence including a mutation is cloned into the gene, i.e., a chimeric sequence is formed.
  • nucleotide sequence indicates a chain of nucleotides, e.g., an oligonucleotide or polynucleotide.
  • the detector nucleotide sequence encodes a detector polypeptide comprising two or more catalytically-essential domains separated by a linker region, where the catalytically-essential domains can function in concert to catalyze a detectable reaction.
  • the catalytic activity of the detector polypeptide is such that it can tolerate the addition of some additional polypeptide sequence into the linker region while still maintaining its activity.
  • the ability of the catalytically-essential domains to work in concert to catalyze the reaction is to some extent independent of the length and amino acid sequence of the linker region.
  • elimination of one of the catalytically-essential domains, either by truncation or by alteration of its reading frame, will result in a detectable activity loss, if not complete inactivation.
  • a detector polypeptide that includes additional amino acid sequence in this linker region is referred to as a chimeric polypeptide.
  • the target sequence is inserted into that segment of the detector nucleotide sequence which encodes the linker region. So long as the insertion does not alter the reading frame of the downstream catalytically-essential domain (i.e., the insertion consists of a number of bases that is a multiple of 3, and does not include a stop codon), the resulting construct will encode a chimeric polypeptide containing the two catalytically-essential domains separated by a linker region extended by the number of amino acids encoded by the test sequence.
  • this insertion will not disrupt the enzyme's catalytic activity (assuming that the length of insertion does not exceed some maximum insertion capacity characteristic of that detector polypeptide).
  • if the target sequence contains a stop codon-introducing mutation, that mutation will result in truncation of the C-terminal catalytically-essential domain and a substantial loss of activity.
  • an insertion/deletion mutation can shift the coding frame of the C-terminal catalytically-essential domain, scrambling that domain's amino acid sequence and attenuating activity.
  • the existence of a mutation in the target nucleotide sequence results in a detectable reduction in catalytic activity.
  • the activity of the chimeric polypeptide is determined and correlated with the presence or absence of a mutation in the target nucleotide sequence.
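The decision rule implied by the preceding paragraphs (an insertion is tolerated only if it is a multiple of three bases long and contains no in-frame stop codon) can be written as a small predicate. This is a sketch under stated assumptions: the insert is taken to begin in frame with the upstream domain, the maximum insertion capacity of the detector polypeptide is not modelled, and the function name and test sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # ochre, amber, opal


def c_terminal_domain_retained(insert: str) -> bool:
    """Predict whether the downstream domain is still encoded in frame.

    Returns False if the insert shifts the reading frame or carries an
    in-frame stop codon, the two cases expected to reduce activity.
    """
    if len(insert) % 3 != 0:
        return False  # reading frame of the downstream domain is shifted
    codons = (insert[i:i + 3] for i in range(0, len(insert), 3))
    return not any(codon in STOP_CODONS for codon in codons)


# Hypothetical target sequences.
print(c_terminal_domain_retained("GCATTGAACGGT"))  # in frame, no stop
print(c_terminal_domain_retained("GCATAGAACGGT"))  # amber codon in frame
print(c_terminal_domain_retained("GCATGAACGGT"))   # one base deleted
```

A measured loss of activity would then be correlated with the cases in which this predicate is false.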
  • the “detector nucleic acid” is a nucleic acid comprising the detector polynucleotide.
  • the detector nucleic acid supplies functional elements that enable the transfer of the detector nucleic acid into a host cell, replication in a host cell and expression of the detector nucleotide sequence.
  • the detector nucleic acid will possess other functional elements, e.g., genetic elements that allow for positive or negative selection or that permit inducible gene expression. Such elements include origin of replication, promoter region, translation initiation sequence (e.g., Shine-Dalgarno sequence, ribosome binding site), and in some cases enhancer or other regulatory sequences.
  • the detector nucleic acid generally takes the form of a vector.
  • vector is defined to include, inter alia, any plasmid, cosmid, phage or the like which can transform a prokaryotic or eukaryotic host, generally by existing extrachromosomally within the host.
  • the vector used herein will be an autonomous replicating plasmid with an origin of replication recognized by the host.
  • those skilled in the art will be well able to construct vectors of the present invention based on those of the prior art.
  • reporter nucleotide sequence will in general be a polynucleotide sequence that encodes a detector polypeptide.
  • the portion of the detector nucleotide sequence that encodes the linker region of the detector polypeptide comprises a polylinker region containing one or more unique restriction sites, thereby facilitating insertion of a target DNA sequence.
  • the detector polypeptide and polylinker regions are described in more detail below.
  • the "detector polypeptide" will generally be an enzyme which causes a detectable change in a substrate. The nature of the invention requires that the detector polypeptide be able to accommodate an insertion of a foreign sequence into the linker region to form a "chimeric polypeptide" while still retaining a detectable level of activity.
  • the detector polypeptide can accommodate an insertion of up to about 10 amino acids, preferably about 50 amino acids, more preferably about 100 amino acids, still more preferably about 150 amino acids, still more preferably about 200 amino acids, still more preferably about 250 amino acids, still more preferably about 300 amino acids, and most preferably 330 or more amino acids while still retaining a detectable level of activity. While insertion of sequence will be tolerated, an insertion event that results in a loss of the C-terminal catalytically-essential domain, either by truncation or shift in reading frame, will cause a detectable loss of activity as compared with the level shown by the non-truncated and non-frame-shifted chimeric polypeptide.
  • a difference in the relative level of activity (e.g., truncated vs. non-truncated) generally suffices to indicate a mutation, and that a determination of absolute activity is typically not required.
  • the activities need not actually be quantified, but can also simply be compared, optionally with controls run contemporaneously or historically.
  • a depressed activity can then be correlated with a loss of the C-terminal catalytically essential domain, and hence the presence of a nonsense or frameshift mutation as a result of the introduction of the test nucleotide sequence.
  • catalytically-essential domain refers to one of two or more polypeptide domains that, when connected by means of a linker region of varying length, function in concert to catalyze a detectable reaction, i.e., they confer upon the detector polypeptide a detectable catalytic activity.
  • the loss of one such catalytically-essential domain e.g., through truncation or as the result of a shift in reading frame, results in a detectable decrease in activity.
  • the loss in activity need not be total, as long as it is possible to distinguish between chimeric polypeptides that contain the catalytically-essential domain and those wherein the catalytically-essential domain has been lost.
  • the loss of a catalytically-essential domain will result in essentially complete inactivation.
  • the detector polypeptide is an enzyme that can be lethal if expressed in an appropriate host cell.
  • the encoded detector polypeptide is a DNA methyltransferase (a "Mtase"). These enzymes catalyze the methylation of DNA bases at locations characteristic of the particular methyltransferase. Methylation alters the ability of DNA to act as a substrate for a number of DNA modifying enzymes, in particular restriction enzymes, i.e., enzymes that cleave DNA in a target dependent manner.
  • Mtase activity in a host cell can be measured by checking DNA derived from the cell for susceptibility to cleavage by an enzyme that selectively cleaves only non-methylated DNA.
  • in some cases, methylation will render DNA susceptible to cleavage by endogenous enzymes in the host cell that do not cleave non-methylated DNA. In such a situation, methylation can be lethal, which provides a convenient system of screening for Mtase activity.
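The cleavage-based readout of Mtase activity can be sketched as an in-silico digest. The model is deliberately all-or-nothing, which is an assumption: a plasmid from a cell with active Mtase is treated as fully protected, and one from a cell without activity as fully cut. HaeIII is used as the example enzyme (recognition site GGCC, blunt cut between GG and CC); the function name and the 18-base toy plasmid are hypothetical.

```python
HAEIII_SITE = "GGCC"   # HaeIII recognition sequence
CUT_OFFSET = 2         # blunt cut between GG and CC


def haeiii_fragment_lengths(plasmid: str, methylated: bool) -> list[int]:
    """Fragment lengths after HaeIII digestion of a circular plasmid.

    A protected or site-free plasmid is reported as one molecule of
    full length.
    """
    n = len(plasmid)
    if methylated:
        return [n]  # sites are protected, plasmid stays intact
    wrapped = plasmid + plasmid[:len(HAEIII_SITE) - 1]  # sites spanning the origin
    cuts = sorted(
        (i + CUT_OFFSET) % n
        for i in range(n)
        if wrapped[i:i + len(HAEIII_SITE)] == HAEIII_SITE
    )
    if len(cuts) < 2:
        return [n]  # uncut circle or a single linearised molecule
    return [
        (cuts[(k + 1) % len(cuts)] - cuts[k]) % n
        for k in range(len(cuts))
    ]


toy_plasmid = "AAGGCCTTTTGGCCAAAA"  # hypothetical, two HaeIII sites
print(haeiii_fragment_lengths(toy_plasmid, methylated=True))
print(haeiii_fragment_lengths(toy_plasmid, methylated=False))
```

On a gel or chromatogram, the protected plasmid would appear as a single intact species and the unprotected one as smaller fragments.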
  • the detector polypeptide is a 5-methylcytosine methyltransferase (m5C-methyltransferase).
  • The function of these enzymes is to transfer a methyl group from S-adenosyl-L-methionine (SAM) to the C5 position of a cytosine residue contained within a specific double-stranded DNA sequence.
  • Two classes of m5C-methyltransferases are known: mono-specific methyltransferases, which recognize and modify a single recognition sequence, and multi-specific methyltransferases, which recognize and modify cytosines in multiple sequence contexts.
  • Mono-specific enzymes are commonly found in bacteria, where they are involved in restriction/modification systems to protect host DNA from cleavage by the corresponding restriction endonucleases (Piekarowicz et al., Nucleic Acids Research 19:1831-35 (1991); Mi and Roberts, Nucleic Acids Research 20:4811-16 (1992)).
  • a number of bacteriophage have been shown to express multi-specific Mtases (see, e.g., Trautner et al., Nucleic Acids Research 16:6649-58 (1988); Noyer-Weidner, M. and Reiners-Schramm, L., Gene 66, 269-78 (1988)).
  • bacteria expressing an MTase also elaborate a cognate restriction endonuclease, which recognizes and cleaves DNA at the same sequence recognized and methylated by the Mtase. Cytosine methylation at the site will prevent restriction, however, so a cell's endogenous Mtase activity normally prevents cleavage of its own DNA.
  • foreign DNA present in the cell that has not been methylated at the restriction site will be susceptible to cleavage. The result is somewhat analogous to the immune system of higher organisms, since it allows the cell to specifically target foreign DNA for destruction while protecting its own DNA from cleavage.
  • the naming of Mtases generally indicates their relationship with their cognate restriction endonuclease.
  • the cognate Mtase of the restriction enzyme HaeIII is M.HaeIII.
  • Mono-specific Mtases typically contain a variable region capable of recognizing the specific sequence targeted for methylation, sometimes referred to as a Target Recognition Domain (TRD), surrounded by conserved motifs involved in catalyzing the methylation reaction.
  • Multispecific Mtases, capable of targeting more than one sequence for methylation, are particularly suited for use in certain preferred embodiments of the invention.
  • These enzymes include 10 conserved motifs involved in catalyzing the methylation reaction and a plurality of TRDs, each TRD corresponding to a particular target sequence and conferring upon the enzyme the ability to specifically methylate that sequence. In some cases it has proven possible to eliminate a TRD without abolishing the ability of the enzyme to methylate at sequences recognized by the remaining TRDs.
  • the detector polynucleotide is derived from a multispecific Mtase by substituting a nucleotide sequence encoding a TRD with a polylinker region capable of accepting a target nucleotide sequence.
  • the detector polynucleotide can be derived from a monospecific Mtase by introducing a polylinker region into some region of the enzyme that does not disrupt its Mtase activity.
  • a mutation in the inserted target DNA can be determined by assessing the ability of the encoded chimeric polypeptide to methylate DNA at the sequence motif recognized by that particular Mtase. Because methylation generally confers resistance to cleavage by the cognate restriction enzyme, Mtase activity can be assessed by determining to what extent DNA exposed to the chimeric polypeptide is susceptible to cleavage by that enzyme. For example, plasmid DNA isolated from a cell expressing chimeric polypeptide with M.HaeIII activity will be refractory to HaeIII degradation relative to plasmid DNA from a cell wherein the M.HaeIII activity is attenuated as the result of insertion of target DNA containing a mutation, e.g., a truncating mutation.
  • While C5-methylation protects against cleavage by certain restriction enzymes, it is also known that certain strains of bacteria specifically target DNA containing 5-methylcytosine for restriction. For example, many laboratory strains of Escherichia coli K-12 (e.g., DH5α) contain mcr (for modified cytosine restriction)
  • the detector polypeptide is derived from the CCCGGG-specific C5 Mtase M.AquI (Karreman and de Waard, J. of Bacteriol. 172:266-72 (1990)). M.AquI occurs naturally as a heterodimer made up of
  • catalytic subunit
  • DNA recognition subunit
  • ⁇ subunit is believed to associate with the ⁇ subunit by means of a hydrophobic arm
  • it is preferable to use a polynucleotide sequence that encodes a monomeric fusion product of the two subunits, attached to one another by means of a polypeptide linker.
  • The construction and expression of a plasmid containing such a hybrid (pREVENTI) is described in PCT Application No. WO97/01639. Insertion of target DNA into the region encoding the polypeptide linker does not attenuate the activity of the chimeric Mtase so long as the insertion does not result in a nonsense mutation or alter the reading frame of the downstream domain.
  • the detector polynucleotide is derived from the CCGG-specific C5 Mtase gene M.MspI, which, when active, methylates the outer cytosine residue and thereby elicits a potent mcrBC response (Raleigh and Wilson (1986)).
  • MTBS_BPSPR B. subtilis bacteriophage SPR
  • M.SPRI has a structure consisting of 10 well-conserved motifs surrounding the variable region containing the TRDs (Tran-Betcke et al. (1986) Gene 42:89-96; Lauster et al. (1989) J. Mol. Biol. 206:305-12).
  • the resulting modified gene, identified as M.SPRX, is well suited to serve as a detector polynucleotide in certain preferred embodiments of the invention.
  • any DNA sequence that is flanked by sites present in the linker, such as XhoI and EcoRI, can be inserted into the linker.
  • a target nucleotide sequence can be introduced into the M.SPRX gene without adversely affecting enzymatic activity.
  • the target nucleotide sequence is all or part of a gene suspected of being involved in a disease condition, e.g., a BRCA1 or BRCA2 gene.
  • FIG. 1 shows a pseudo-three dimensional, or "cartoon", representation of the modified form of M.SPRI containing additional protein sequence such as that encoded by stretches of the BRCA1 gene and the product which is inactive as a consequence of truncation.
  • derivatives or other variants of the detector polynucleotides specified above may also be used in the present invention, provided that they encode the requisite detectable activity. Generally speaking such variants will be substantially homologous to the 'wild type' or other sequence specified herein i.e. will share sequence similarity or identity therewith.
  • Similarity or identity may be at the nucleotide sequence and/or encoded amino acid sequence level, and will preferably be at least about 60%, or 70%, or 80%, most preferably at least about 90%, 95%, 96%, 97%, 98% or 99%. Sequence comparisons may be made using FASTA and FASTP (see Pearson & Lipman, 1988. Methods in Enzymology 183: 63-98). Parameters are preferably set, using the default matrix, as follows: Gapopen (penalty for the first residue in a gap): -12 for proteins / -16 for DNA; Gapext (penalty for additional residues in a gap): -2 for proteins / -4 for DNA; KTUP word length: 2 for proteins / 6 for DNA.
  • Tm = 81.5°C + 16.6 log[Na+] + 0.41 (% G+C) - 0.63 (% formamide) - 600/(number of bp in duplex) (Sambrook et al., supra).
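The melting-temperature formula above can be evaluated directly. The following is a minimal sketch, assuming the logarithm is base 10 and [Na+] is given in mol/L; the function name and the example values are hypothetical and are not taken from the specification.

```python
import math

def duplex_tm_celsius(na_molar, gc_percent, formamide_percent, duplex_bp):
    """Estimate the Tm (deg C) of a DNA duplex from the formula quoted above."""
    return (81.5
            + 16.6 * math.log10(na_molar)   # assumption: base-10 log, [Na+] in mol/L
            + 0.41 * gc_percent             # % G+C of the duplex
            - 0.63 * formamide_percent      # % formamide in the hybridization mix
            - 600.0 / duplex_bp)            # length correction

# Hypothetical example: 1 M Na+, 50% G+C, no formamide, 600 bp duplex
print(round(duplex_tm_celsius(1.0, 50.0, 0.0, 600), 1))
```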
  • practice of the present invention requires that means be available for introducing a target nucleotide sequence of interest into the linker region of the detector nucleotide sequence.
  • this means is provided by one or more restriction sites located within the polylinker, i.e., the sequence encoding linker region of the detector polypeptide. Cleavage at such a restriction site provides either blunt or overhanging ends to which the target polynucleotide can be ligated using standard techniques.
  • the restriction site or sites can be intrinsic to the gene, e.g., Mtase, used as the basis for the detector polynucleotide. More typically, restriction sites are introduced into the polylinker region.
  • multiple unique restriction sites are introduced into a relatively short stretch of the detector polynucleotide, resulting in a multiple cloning site.
  • a multiple cloning site will enhance the utility and convenience of the resulting detector nucleic acid.
  • construction of the polylinker region, including a multiple cloning site, will entail the excision of some portion of the native gene, e.g., the removal of a region encoding some domain of the gene product. Such a situation is described in more detail in the following examples.
  • suitable host strains are available, and the skilled artisan will be able to choose an appropriate host based on the specific nature of the detector nucleic acid to be employed.
  • the host will be a bacterial organism, although some detector systems lend themselves to use in non-bacterial host cells, e.g., yeast, insect, plant or mammalian cells.
  • where the detector nucleotide sequence is derived from an Mtase gene, the choice of host will depend to some extent on whether or not it is desired to measure the activity as a null (lethal) phenotype. Expression of an active Mtase will generally only be lethal in a host cell that has an mcr system that targets for cleavage DNA that is methylated by the Mtase activity.
  • strains of mcrA+ mcrBC+ E. coli are known, e.g., DH5 (Hanahan, D. (1983) J. Mol.
  • cells that are mcrA- mcrBC-, e.g., DH5αMCR (Grant et al. (1990)
  • Mtase activity is indicative of Mtase activity.
  • the presence of functional Mtase can be determined by assessing Mtase activity being expressed by the cell, for example by checking the methylation state of DNA derived from the cell by restriction digestion.
  • certain strains can "read through" stop codon-introducing mutations to a greater or lesser extent. For instance the supE genotype of almost all E. coli K-12 strains, e.g., DH5α and DH5αMCR
  • coli strains include W3110 (mcrA+ hsd+ mcrBC+) (described at http://gib.genes.nig.ac.jp/Ec/ ) and an mcrBC- derivative (NM679) with a large deletion that removes hsd and mcrBC (King and Murray (1995) Mol. Microbiol. 16:769-77). In some embodiments of the invention such strains are the preferred host.
  • the present invention can be used to analyze a target nucleic acid present in a sample nucleic acid.
  • the sample may be from any appropriate nucleic acid source, and may represent all or part of the source. Pooled samples may be used for further analysis if required.
  • the sample may derive from cDNA, genomic DNA (all or part of an intron or exon thereof), amplified portions of DNA (e.g., from a PCR reaction), libraries, artificial chromosomes, etc., or fragments of any of these (e.g., from a restriction digest).
  • the sample nucleic acid may be provided isolated and/or purified from its natural environment, in substantially pure or homogeneous form, or free or substantially free of other nucleic acids, e.g., a selected DNA fragment separated from a mixture of DNA fragments.
  • DNA chromatography HPLC
  • the target nucleotide is derived from mRNA rather than genomic DNA.
  • An advantage of looking for mutations at the mRNA level is that it allows for the detection of mutations that cause an alteration in RNA splicing.
  • This embodiment of the invention is preferably accomplished by isolating mRNA from a relevant sample and amplifying a cDNA copy by RT-PCR, which is then treated with the appropriate restriction enzymes and inserted into an appropriate detector nucleic acid. The entire gene, or a fragment thereof, can be amplified and checked for a mutation as described herein.
  • RT-PCR is described, e.g., in Myers & Gelfand (1991) Biochemistry 30:7661; Young et al.
  • the target sequence is a gene the mutation of which is known or suspected of being involved in genetic disease or a genetic predisposition to disease, or a relevant portion of such a gene.
  • the target sequence can be derived from a gene for which a mutation can lead to the development of cancer, or wherein a mutation can be used as an indicator of a particular type of cancer.
  • the target sequence is derived from the human BRCA1 or BRCA2 genes.
  • target polynucleotide is inserted into the linker region of a detector polynucleotide.
  • the target polynucleotide should be flanked by restriction sites that are ligation-compatible with sites in the linker region of the detector polynucleotide. Cleavage of those sites with the corresponding restriction enzymes will allow for the targeted introduction of the target sequence into the linker region of the detector polynucleotide by means of standard DNA ligation techniques.
  • the use of the same enzymes to cleave both target and linker region will not be necessary, so long as the ends are compatible and amenable to ligation. It is important that the polylinker region be of a known reading register, so that target polynucleotides can be selected wherein the normal (i.e., wild-type) form does not alter the reading frame downstream from the insertion event.
  • An important aspect of the invention concerns the length of the target sequence introduced.
  • In order to maintain the correct reading frame for coding sequence downstream of the site of insertion, the length of the inserted sequence must be a multiple of three bases. Therefore, in designing an experiment and selecting an appropriate target sequence, it is normally critical that the non-mutated target sequence results in the insertion of a multiple of three bases.
  • the resulting frameshift will lead to a detectable loss of activity that flags the presence of the mutation.
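The length rule above lends itself to a quick check at the design stage. The sketch below is illustrative only: it assumes the insert starts on a codon boundary of the linker reading frame, and the function name and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def check_insert(insert_seq):
    """Return (length_is_multiple_of_three, positions_of_in_frame_stop_codons).

    Assumes the first base of the insert is the first base of a codon
    in the reading frame of the detector polynucleotide.
    """
    seq = insert_seq.upper()
    in_frame = len(seq) % 3 == 0
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    stops = [i for i in range(0, usable, 3) if seq[i:i + 3] in STOP_CODONS]
    return in_frame, stops

# Hypothetical inserts: wild type, nonsense mutant, single-base deletion
print(check_insert("ATGGCCAAA"))
print(check_insert("ATGTAGAAA"))
print(check_insert("ATGGCCAA"))
```

An insert that passes both checks should leave the downstream domain in frame; one that fails either check would be expected to attenuate the detector activity.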
  • different portions of the gene under examination are tested separately so that any detected mutation can be localised to a region within a longer gene.
  • This can be achieved, for instance, using a multiplex PCR reaction (i.e., the simultaneous amplification of different target DNA sequences in a single PCR reaction).
  • Multiplex PCR is described, for example, in Edwards et al., “Multiplex PCR: Advantages, Development and Applications", PCR Methods and Applications, 3:56 ⁇ - ⁇ 7 ⁇ (1994) and Shuber et al., "A Simplified Procedure for Developing Multiplex PCRs” Genome Research, ⁇ :488-498 (1995).
  • standard non-multiplex PCR can be employed to generate the appropriate target nucleotide sequence or target nucleotide sequences.
  • Appropriate PCR conditions can be determined without undue experimentation by one of skill in the art.
  • a representative amplification buffer could be prepared using
  • n is normally set at 1 minute per Kbp. If a proofreading polymerase is to be used it is recommended that n be adjusted to 2 minutes per Kbp.
  • the particular amplification conditions can vary substantially, depending upon the nature of the primers and template used, as well as their concentrations. During a standard PCR reaction annealing is usually performed at approximately 5°C below the lowest Tm of the
  • Tm in order to avoid non-specific priming resulting from an annealing temperature too far below the Tm.
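The two rules of thumb above (extension time scaled to product length, annealing a few degrees below the lowest primer Tm) can be sketched as follows. The function names are hypothetical, and the 5°C default offset is an assumption based on common practice; treat it as a tunable parameter rather than a value fixed by the text.

```python
def extension_minutes(product_bp, proofreading=False):
    """Extension time n: 1 min per kbp, or 2 min per kbp for a proofreading polymerase."""
    minutes_per_kbp = 2.0 if proofreading else 1.0
    return minutes_per_kbp * product_bp / 1000.0

def annealing_temp_celsius(primer_tms, offset=5.0):
    """Annealing temperature set `offset` deg C below the lowest primer Tm (offset is assumed)."""
    return min(primer_tms) - offset

# Hypothetical 1.5 kbp product amplified with primers of Tm 58 and 62 deg C
print(extension_minutes(1500), extension_minutes(1500, proofreading=True))
print(annealing_temp_celsius([58.0, 62.0]))
```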
  • PCR Protocols A Guide to Methods and Applications, Academic Press (1990; ISBN: 0123721814) and Innis et al. PCR Applications: Protocols for Functional Genomics, Academic Press (1990; ISBN: 0123721857).
  • any of a variety of nucleic acid amplification and other molecular biology techniques known to the skilled artisan can be used to generate the desired DNA segments for mutation detection pursuant to the present invention.
  • regions of preferably 30 to 5001 bp, more preferably 99 to 2001 bp, and still more preferably 501 to 999 bp are tested from patient sample DNA against a control, using primers having common 5'-3' ends that introduce restriction sites into the amplification products suitable for cloning into the linker region of the detector polynucleotide, preferably a multiple cloning site.
  • the multiplex amplification is designed such that the products are of diverse lengths.
  • the difference in size of the resulting target polynucleotides will in some cases facilitate their separation from one another prior to cloning into a detector nucleic acid, e.g., when separation is by means of HPLC.
  • An example of an appropriate size series of amplimers would comprise sequences of 250 bases, 300 bases, etc., up to about 800 bases (the size need generally not be greater than 1000 bases).
  • PCR primers can be designed using principles known to one of skill in the art, as described, e.g., in Gelfand et al. and Innis et al, supra. Mutations located within the region of primer annealing will not be detected, so if the forward (sense) primer is designed to anneal at the start codon of a gene of interest the
  • primer should preferably include a minimum of sequence derived from the region 3'
  • the primer should also preferably be designed to include a suitable restriction site
  • Restriction sites should be incorporated in such a way as to maintain the reading frame of the gene following its insertion in pSPRX.
  • the forward (sense) primer is directed at a region other than the start codon, then the primer should be designed to anneal immediately upstream of the region of interest. All primers should be carefully checked to ensure that the reading frame of the target gene will be preserved after ligation to pSPRX and that no stop codons have been inadvertently introduced into the primers.
  • the reverse (antisense) primer is intended to anneal at the stop codon of the gene of interest then the primer should preferably be designed to remove or exclude the stop codon.
  • primer sequence necessary to extend the primer sequence to include between 10 and 20 bases 3' of
  • the primer should be designed to anneal immediately downstream of the region of interest. Mutations located within the region of primer annealing will not be detected. Restriction sites compatible with the polylinker of pSPRX should be
  • an appropriate selection of target nucleotide sequences are prepared from a nucleic acid sample by multiplex PCR, and the amplimers are then separated from each other by means of HPLC.
  • ion-pairing reverse-phase HPLC (IP-RP-HPLC) is used to separate the target nucleotide sequences from one another.
  • IP-RP-HPLC is a form of chromatography particularly suited to the analysis of DNA, and is characterized by the use of a reverse phase (i.e., hydrophobic) stationary phase and a mobile phase that includes an alkylated cation (e.g., triethylammonium) that is believed to form a bridging interaction between the negatively charged DNA and non-polar stationary phase.
  • the alkylated cation-mediated interaction of DNA and stationary phase can be modulated by the polarity of the mobile phase, conveniently adjusted by means of a solvent that is less polar than water, e.g., acetonitrile.
  • MIPC: Matched Ion Polynucleotide Chromatography
  • a selected nucleic acid target sequence for use in the methods of the present invention may be provided as follows: (a) providing a mixture of DNA fragments; and (b) separating the mixture of DNA fragments using reverse phase HPLC (preferably MIPC) on the basis of fragment length.
  • the methods herein can be based on any activity of the detector gene which distinguishes (qualitatively or quantitatively) the insertion of a sequence containing a mutation from the insertion of a sequence not containing a mutation.
  • the mutation is a stop codon-introducing mutation.
  • the mutation can be one that results in a shift in reading frame downstream of the mutation, without necessarily resulting in premature truncation of the gene product.
  • mutant genes, when inserted into the modified methyltransferase, generate a partially active enzyme and, as a consequence, these plasmids will not be fully methylated in the host cell.
  • This attenuation of activity can be detected by isolating the plasmid (e.g., by mini-prep), exposing it to a restriction enzyme that targets the sequence recognized by the functional form of the Mtase (e.g., HaeIII if the functional Mtase has M.HaeIII activity), and determining the extent to which the plasmid is degraded (e.g., by gel electrophoresis, capillary electrophoresis or IP-RP-HPLC). An increased level of degradation is correlated with a decreased methylation level and hence attenuated Mtase activity.
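To know what a fully degraded plasmid should look like on a gel or chromatogram, the expected fragment sizes can be predicted in silico. The sketch below is a simplified model, not part of the described method: it assumes a circular plasmid, complete digestion of entirely unmethylated DNA, and the known HaeIII specificity (blunt cut in the middle of GGCC). The function name and toy sequence are hypothetical.

```python
def haeiii_fragments(plasmid_seq):
    """Fragment lengths (bp) from complete HaeIII digestion of an unmethylated circular plasmid."""
    seq = plasmid_seq.upper()
    n = len(seq)
    wrapped = seq + seq[:3]                     # lets a site span the origin of the circle
    # HaeIII cuts GG^CC, i.e. two bases into each GGCC site
    cuts = sorted({(i + 2) % n for i in range(n) if wrapped[i:i + 4] == "GGCC"})
    if not cuts:
        return [n]                              # no site: plasmid stays intact
    fragments = []
    for k, start in enumerate(cuts):
        end = cuts[(k + 1) % len(cuts)]
        fragments.append((end - start) % n or n)  # a single cut linearizes the circle
    return sorted(fragments)

# Toy 18 bp "plasmid" with two GGCC sites
print(haeiii_fragments("AAGGCCAAAAGGCCAAAA"))
```

A fully methylated plasmid would be expected to stay uncut, so observed patterns between these two extremes suggest partial Mtase activity.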
  • DNA ladder was normally used which gives a range between 250 bp and 10 kb.
  • the intensity of the bands can be quantified by scanning and digital analysis.
  • plasmid degradation is evaluated by separating and quantifying DNA fragments by IP-RP-HPLC, particularly as described in U.S. Patent No. 6,066,258.
  • Typical conditions would include 0.1 M triethylammonium acetate (TEAA) as a counterion and fragment elution by acetonitrile gradient, with the column temperature set such that the DNA fragments are not denatured.
  • the plasmid or other detector nucleic acid will be checked by treatment with the appropriate restriction enzyme or enzymes to confirm that ligation of the insert has indeed taken place. This can be accomplished by using gel electrophoresis or IP-RP-HPLC to check for an insertion fragment of the correct length. This confirmation step is applied in Example 2 below.
  • the combination of two or more mutation analysis methods described herein may be desirable in certain instances, e.g., to increase the certainty of the assessment.
  • the activity of the chimeric polypeptide is such that it is detrimental to its bacterial host unless the foreign DNA within it contains a mutation that attenuates Mtase activity.
  • the chimeric polypeptide is lethal without the insertion of a polynucleotide including such a mutation. That is, if a fragment containing a stop codon is inserted into an open reading frame, a host cell (e.g., an mcr+ strain of E. coli) containing that gene can proliferate.
  • the Mtase activity results in DNA methylation and subsequent mcr-mediated restriction and death.
  • the method is performed by ligating the foreign target sequence into a cloning vector including a detector nucleotide sequence which is a lethal gene.
  • a "lethal gene" is a gene that encodes a protein that can be lethal in a particular host background, e.g., a methyltransferase in an mcr+ strain of E. coli, and need not be universally lethal in all host cells.
  • a host is then transformed with the vector, and the activity of the chimeric polypeptide is determined by assessing the viability of the transformed host.
  • the engineered Mtase gene may be in the form of a positive selection vector in which the Mtase can accommodate extra sequence (as described in US Provisional Application No. 60/157,072).
  • a turbidimetry measurement may be used to establish viability of the host (i.e. proliferation in the broth medium).
  • Turbidimetry involves measuring cell density by determining the extent of light scatter, i.e., the OD, at a defined wavelength, typically 600 nm. This may be performed, for instance, by placing the cells with broth in a clear-bottomed 96-well plate (Polyfiltronics) maintained at 37°C. The plate can be read using a 96-well plate reader, with an output of OD600 vs. time. This type of assay is normally best suited for the detection of a homozygous mutation.
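A viability call from an OD600 time course might be automated along the following lines. This is only a sketch: the function name and the 0.2 OD threshold are hypothetical choices, and a real assay would calibrate the threshold against controls run in the same plate.

```python
def host_proliferated(od600_readings, min_increase=0.2):
    """Return True if OD600 rose by at least `min_increase` over the first reading.

    `min_increase` is an assumed, illustrative threshold.
    """
    if not od600_readings:
        raise ValueError("at least one OD600 reading is required")
    return max(od600_readings) - od600_readings[0] >= min_increase

# Hypothetical wells: one that grows (attenuated Mtase) and one that does not
print(host_proliferated([0.05, 0.06, 0.30, 0.85]))
print(host_proliferated([0.05, 0.05, 0.06, 0.06]))
```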
  • the host cells containing the vector are plated out on an appropriate solid medium.
  • Such medium will preferably be prepared such that it will only support the growth of cells containing the detector nucleic acid (e.g., the Mtase encoding plasmid), typically by the inclusion of one or more antibiotics or other selection reagent.
  • Such methods may employ two bacterial hosts, one (mcr+) which will not tolerate methylated DNA and one (mcr-) which will.
  • cells include the mcrBC+ and mcrBC- host strains DH5α and DH5αMCR.
  • insertion of mutant DNA into the linker region of the modified methyltransferase generates a partially inactive enzyme and, as a consequence, colonies will appear on both plates.
  • the insertion of wild-type DNA does not attenuate Mtase activity, and thus colonies will only appear on the mcr- plates. However, even in these cases a quantitative assessment may have to be made. Therefore in certain contexts it may be desirable to use other detection methods.
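The two-plate readout described above reduces to a small decision rule. The sketch below encodes it with hypothetical names and labels; as the text notes, a quantitative assessment may still be needed, so the output is a first-pass call and not a final result.

```python
def interpret_plates(colonies_on_mcr_plus, colonies_on_mcr_minus):
    """First-pass interpretation of colony counts on mcr+ and mcr- host plates."""
    if colonies_on_mcr_minus == 0:
        # The permissive (mcr-) host should always yield transformants
        return "uninformative: no transformants on the mcr- host"
    if colonies_on_mcr_plus > 0:
        # Growth in the mcr+ host implies the Mtase is no longer lethal
        return "mutation suspected: Mtase activity attenuated"
    return "consistent with wild type: active Mtase"

print(interpret_plates(colonies_on_mcr_plus=0, colonies_on_mcr_minus=120))
print(interpret_plates(colonies_on_mcr_plus=85, colonies_on_mcr_minus=110))
```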
  • a preferred detection method is based on quantitative analysis of a substrate for the detector gene, from which changes in activity (or relative activity) can be assessed or inferred.
  • insertion of a target sequence into the linker region of pSPRX in frame with the M.SPRX gene does not destroy the TRD(H) (HaeIII) activity of the encoded Mtase.
  • if the insert has a deletion, an insertion or a stop codon, it will disrupt translation of the downstream M.SPRX sequence (by a shift in reading frame or by premature termination), leading to at least partial attenuation of the TRD(H) activity.
  • plasmids from the resulting mcrBC- transformants can be analyzed by HaeIII digestion. If the insert is wild type, i.e., does not introduce a stop codon or disrupt the reading frame of downstream sequence, the gene product retains TRD(H) activity and therefore the plasmid is resistant to HaeIII cleavage. If the sample contains a homozygous mutant, the TRD(H) activity is attenuated, either partially or wholly, and as a result the plasmid will be deficiently methylated and subject to restriction by HaeIII. Degradation can be partial, so long as it is distinguishable from the case where TRD(H) activity is fully functional.
  • if the sample contains a heterozygous mutation, it is predicted that approximately half of the colonies will contain active M.SPRX. The other colonies will possess attenuated TRD(H) activity.
  • the extent of HaeIII restriction can be assayed by any of a variety of techniques capable of determining the length of DNA fragments. Gel electrophoresis, as described, for example, by Ausubel et al., is a standard method for determining the length of DNA restriction fragments and can be used to assess the extent and nature of fragmentation. Alternatively, fragmentation can be assessed by HPLC analysis of the treated plasmid. In a preferred embodiment, the plasmid is analyzed by MIPC, as described above.
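The expectation that a heterozygous sample yields roughly half attenuated colonies suggests a simple classification by colony fraction. The sketch below is illustrative; the function name and the 0.2 tolerance band are hypothetical, and any real cut-offs would need to reflect colony numbers and assay noise.

```python
def classify_sample(attenuated_colonies, total_colonies, tolerance=0.2):
    """Classify a sample by the fraction of colonies whose plasmid is HaeIII-sensitive.

    `tolerance` is an assumed band around the ideal fractions 0, 0.5 and 1.
    """
    if total_colonies <= 0:
        raise ValueError("total_colonies must be positive")
    fraction = attenuated_colonies / total_colonies
    if fraction <= tolerance:
        return "wild type"
    if fraction >= 1 - tolerance:
        return "homozygous mutant"
    if abs(fraction - 0.5) <= tolerance:
        return "heterozygous mutant"
    return "indeterminate"

# Hypothetical screens of 20 colonies each
print(classify_sample(0, 20), classify_sample(9, 20), classify_sample(20, 20))
```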
  • the chimeric (fusion) polypeptide including detector and target sequence can be expressed, purified and analysed for truncation, for instance, by comparison with the full length fusion by SDS-PAGE.
  • This can entail inducing expression (if required), obtaining a cell pellet from culture, lysing the cells, purifying the chimeric detection polypeptide from the supernatant, and carrying out size-based separation of the products e.g., by SDS-PAGE. Mutations can also be confirmed and/or more precisely characterized by direct DNA sequencing.
  • FIG. 2 provides an overview of a protocol embodying one aspect of the present invention, involving the analysis of a heterozygous mutant. Mutation detection is achieved by means of a detector nucleotide sequence derived from M.SPRI, determination of Mtase activity by HaeIII digestion, and confirmation by direct sequencing and/or SDS-PAGE analysis of purified chimeric polypeptide.
  • EXAMPLE 1: Construction of pSPRX, an exemplary detector nucleic acid
  • the gene encoding the bacteriophage multi-specific DNA methyltransferase (M.SPRI) has been used as the basis for the construction of a detector polynucleotide, including a polylinker region comprising multiple unique restriction sites.
  • the targeted modification of the M.SPRI gene included the removal of a region encoding one of the TRDs, specifically TRD(M), which targets the MspI recognition sequence CCGG.
  • The product of the modified gene, referred to as M.SPRX, was shown to retain the ability to methylate at CC(A/T)GG and GGCC, the sequences targeted by the two remaining TRDs, TRD(E) and TRD(H), respectively.
  • the sequence which has been deleted can be replaced by any DNA sequence that is flanked by restriction sites uniquely represented in the polylinker region, e.g., EcoRI and XhoI.
  • stretches of sequence from the BRCA1 gene can be introduced into the M.SPRI gene without adversely affecting the enzymatic activity of the gene product. If, however, a stop (e.g., nonsense) codon arises aberrantly, protein synthesis will be terminated prematurely in vivo and a competent DNA methyltransferase will no longer be produced.
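The two sequences that M.SPRX is said to methylate, CC(A/T)GG and GGCC, can be located in a plasmid or insert with a short pattern search. The sketch below uses hypothetical names and a made-up test sequence. Both motifs are palindromic, so scanning one strand is sufficient, and a lookahead is used so that overlapping sites are not missed.

```python
import re

# Motifs recognized by the two TRDs retained in M.SPRX
SPRX_MOTIFS = {"TRD(E)": "CC[AT]GG", "TRD(H)": "GGCC"}

def find_target_sites(seq):
    """Map each TRD to the 0-based start positions of its motif in `seq`."""
    seq = seq.upper()
    return {
        trd: [m.start() for m in re.finditer(f"(?=({pattern}))", seq)]
        for trd, pattern in SPRX_MOTIFS.items()
    }

# Made-up 20-base sequence containing CCAGG, GGCC and CCTGG
print(find_target_sites("AACCAGGTTGGCCAACCTGG"))
```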
  • the nucleotide and amino acid was originally obtained in the recombinant plasmid pMS119EH (Walter et al., (1992) EMBO J. 11:4445-50).
  • E. coli strain DH5αMCR (mcrA- mcrBC-) was used as host
  • the M.SPRI gene was subcloned into pGEX-KG, a glutathione-S-transferase (GST) fusion expression vector (Guan and Dixon (1991) Analytical Biochemistry 192:262-67).
  • the M.SPRI gene was fused downstream of that encoding GST using BamHI and XhoI restriction sites to produce pGEXKG-SPR.
  • Such fusion proteins, when expressed, can be purified by a single affinity chromatography step (Smith and Johnson (1988) Gene 67:31-40).
  • the PstI site was removed from pGEXKG-SPR.
  • the vector was digested with AatII and AlwNI and the 4879 bp fragment was ligated with the fragment containing the ampicillin resistance gene (1398 bp) from pUC18, which had been cleaved with the same restriction enzymes.
  • the structure of the resulting construct, which is named pKG-SPR, was verified by the loss of the PstI site.
  • TRD(M) was replaced with a polylinker region comprising a multiple cloning site.
  • the nucleotide and amino acid sequences of the modified gene, named M.SPRX, are provided herein as SEQ ID NOS: 3 and 4, respectively.
  • the nucleotide sequence of M.SPRX is also presented in FIG. 3, with the polylinker region underlined.
  • the resulting plasmid was named pSPRX.
  • FIG. 4(A) is a map of the plasmid pSPRX
  • FIG. 4(B) depicts the nucleotide sequence of the plasmid.
  • the final construct was the result of six cloning steps, described below, and involved the construction of the following intermediary plasmids: (1) pSPR-XbaI, (2) pSPR-XbaIStuI, (3) pSPR-XbaIStuIKan, (4) pSPR-Linker, and (5) pSPR-Linker-dR.
  • Standard molecular biology protocols were used for all reactions and analysis, e.g., restriction digestions, PCR amplifications, ligations, electrophoresis, etc. Such techniques are known to those working in the field and are widely available in a variety of publications, including, for example, Sambrook et al. and Ausubel et al.
  • in PCR experiments a negative control was always carried out, to exclude the possibility of any contamination.
  • the following reagents were used for doing PCR: 10 ng (plasmid) or 100-
  • DNA polymerase was added and the contents of the tube were mixed by gentle agitation. A mineral oil overlay was not required since the GeneE thermal cycler used in all PCR amplifications possesses a heated lid. Reactions were placed in the thermal cycler and a pre-designed program was initiated. The cycling program used
  • 72°C for n min, where n depends on the size of the product and polymerase activity.
  • the optimal polymerisation temperature was usually 72°C. If Taq polymerase was
  • DNA fragments or oligodeoxynucleotides were ligated into plasmid vectors (containing compatible cohesive termini or both containing blunt ends) using the reaction catalysed by bacteriophage T4 DNA ligase (as described in (Sambrook
  • Ligation reactions were performed in 10 or 20 µl volumes using an
  • reaction was incubated at room temperature for 2 hours or at 16°C overnight, depending on the nature of the DNA ends.
  • an XbaI site was introduced into TRD(M) using two complementary mutagenic oligodeoxynucleotides, SPRXbaA (GCAGTTGAGTACTCTAGAAAAAGCGGGCTTG) and SPRXbaB (CAAGCCCGCTTTTTCTAGAGTACTCAACTGC) (SEQ ID NOS. 5 and 6,
  • pSPR-XbaI contains another XbaI site, which overlaps with a dam modification site, and would therefore not be cleaved in a plasmid that has been prepared from a dam+ strain.
  • the presence of the newly introduced restriction site was confirmed by XbaI digestion of mutant plasmid that
  • SPRStuA GCTTCTGACTGGAGGGCCTAGAATAGGAACCAAAAACAAAATGC
  • SPRStuB GCATTTTGTTTTTGGTTCCTATTCAGGCCTCTCCAGTCAGAAGC
  • X06404 a derivative of pUC4K containing the kanamycin resistance cassette, in which the following duplicate restriction sites have been added: XbaI, SmaI, SstI, KpnI, PvuII (Taylor and Rose (1988) Nucleic Acids Res. 16:358).
  • the resulting construct was named pSPR-XbaIStuIKan. Insertion of the kanamycin resistance cassette inactivates TRD(HaeIII) and therefore StuI digestion becomes possible.
  • the aim of the next step was to insert a polylinker including a number of unique restriction sites in place of the TRD(M) coding sequence.
  • pSPR-XS-Kan was cleaved with XbaI and StuI and the 6210 bp fragment, which had lost the kanamycin resistance gene, was ligated with an annealed pair of synthetic oligodeoxynucleotides coding for the "SPR polylinker" (CTAGATCTCTGCAGCTCGAGCCCGGGGCTAGCCATATGGAATTCAGAGG and CCTCTGAATTCCATATGGCTAGCCCCGGGCTCGAGCTGCAGAGAT, SEQ ID NOS. 11 and 12, respectively).
  • the SPR polylinker contains XbaI, BglII, PstI, XhoI, SmaI, NheI, NdeI, EcoRI and StuI restriction sites. Insertion of the polylinker did not change the reading frame of the M.SPRI gene.
  • the resulting plasmid was called pSPR-Linker.
  • the methylation capacity of the mutant M.SPRI encoded by this plasmid referred to as M.SPRX, was assessed by restriction analysis of the pSPR-Linker plasmid. In other words, the ability of the mutant M.SPRI encoded by this plasmid.
  • 1 unit of restriction endonuclease activity is the amount of
  • mutant methyltransferase M.SPRX retained the capacity to methylate the sequence recognized by HaeIII (GGCC), as shown by HaeIII digestion experiments in FIG. 5. All restriction enzymes were obtained from MBI Fermentas or New England Biolabs, Inc. (NEB). However, it was found that M.SPRX had selectively lost the ability of wildtype M.SPRI to methylate a CCGG sequence, i.e., the MspI sequence.
  • pSPR-Linker contains two XbaI, two XhoI and two EcoRI sites, and since these sites should be unique in the polylinker (to facilitate insertion mutagenesis), the redundant sites had to be removed from the vector. Removal of the XbaI, SacI and EcoRI sites from the region encoding motif VIII of the M.SPRI gene was accomplished in one step by site-directed
  • the sequences of the mutagenic oligonucleotides are provided as SEQ ID NOS: 13 and 14, respectively.
  • the resultant plasmid, named pSPR-Linker-dR, was shown to be refractory to HaeIII restriction, and therefore codes for an active M.SPRI methyltransferase.
  • the XhoI site at the 3'-end of the M.SPRI gene was replaced with a SacI site in order to make the XhoI site in the polylinker sequence of pSPR-Linker-dR unique. This was accomplished by site-directed mutagenesis using a pair of complementary oligodeoxynucleotides SPR ⁇ XholA
  • the sequences of the mutagenic oligonucleotides are provided as SEQ ID NOS: 15 and 16, respectively.
  • the resulting construct, pSPRX, was shown to be refractory to HaeIII restriction after isolation from a host cell, and therefore codes for an active methyltransferase.
  • a number of target polynucleotides containing stop codons and/or frameshift mutations were inserted to verify that the method is capable of detecting such mutations. Insertion of PCR product-(1) into pSPRX.
  • the first experiment involved the amplification of 300 bp from within exon 11 of the BRCA1 gene, using primers B1 (CGCGCGCGTCGACTTAACGAAACTGGACTCATTACTCCAAAT) and B3 (CGCGAATTCATTAATACTGGAGCCCACTTCATT), as shown in SEQ ID NOS: 17 and 18, respectively.
  • B1 is compatible with that part of the BRCA1 gene starting from base pair 3000; B3 starts at base pair 3299 of the BRCA1 coding sequence.
  • These oligodeoxynucleotides introduce SalI and EcoRI sites into the 5'- and 3'-ends of the PCR products, respectively.
  • the amplification product, which is called PCR product-(1), was cleaved with SalI and EcoRI and then inserted into pSPRX that had been pre-cut with XhoI and EcoRI.
  • For a double digest, the best buffer was selected to provide reaction conditions that were amenable to both restriction endonucleases.
  • the number of units of enzyme or incubation time of the reaction was adjusted to compensate for slower rate of cleavage. If no single buffer could be found to satisfy the buffer requirements of both enzymes, the reactions were carried out sequentially.
  • TRD(H) in the resultant construct, pSPRX-(1), was checked by digesting the plasmid with HaeIII. This was accomplished by isolating the plasmid, using the DNA Wizard miniprep system (by Promega) according to the
  • isolated plasmid was subjected to restriction digestion with the desired restriction enzyme and the extent of cleavage determined by analysis on an agarose gel.
  • PCR product-(2) was generated by amplification of the BRCA1 gene with primers B2 and B3.
  • Oligodeoxynucleotide B2 (CGCGCGCTCGAGAACGAAACTGGACTCATTACTCCA), shown in SEQ ID NO: 19, is compatible with that part of the BRCA1 gene starting from base pair 3000 and introduces an XhoI site at the 5'-end of the PCR product. After cleavage of the PCR product with XhoI and EcoRI, it was inserted into pSPRX that had been cut with the same enzymes.
  • PCR product-(4), which encodes a stop codon at its 3'-end, was generated by amplification of the BRCA1 gene with oligodeoxynucleotides B2 and B5. After XhoI-EcoRI restriction digestion, the PCR product was inserted into pSPRX that had been cleaved with the same enzymes. The only difference between PCR product-(2) and PCR product-(4) is the presence of a TAA stop codon in the latter. The TRD(H) activity of the encoded enzyme was assayed by determining the susceptibility of the plasmid to HaeIII cleavage.
  • the plasmid was partially susceptible to HaeIII cleavage, indicating that, as a result of the introduction of the nonsense mutation, the enzyme retained only partial TRD(H) activity. Thus, truncation of the enzyme results in loss of the TRD(H) domain and a partial loss of the TRD(H) activity.
  • This mutation contains the stop codon-introducing mutation 4184del4 (TCAA) in exon 11, codon 13 ⁇ 7, which has been associated with both ovarian and prostatic cancer and which leads to the expression of a truncated BRCA1 protein (Simard et al., Nature Genet. 8: 392-398, (1994)).
  • the mutation results in the introduction of a stop codon at the intron/exon border (TAg), and thus the insertion of an amplified segment of the gene that includes this region will result in a stop codon in the linker region.
  • B6 starts from base 3939 of the BRCA1 coding sequence and introduces an XhoI site into the 5'-end of the PCR product.
  • Reverse primer B7 starts in intron 12, after exon 11 and runs until the 3'-end of the exon (base 4214 of BRCA1 coding sequence) introducing an EcoRI site into the products.
  • PCR product-(5)1 was cut with XhoI and EcoRI and inserted into pSPRX which had been cleaved with the same enzymes. Recombinant plasmids were
  • PCR product (5)-2 was generated with the same primers but was derived from a template encoding the normal BRCA1 gene. After insertion of the PCR product into pSPRX, as explained above, recombinant plasmids
  • Primer B8 starts at base pair 2782 of the BRCA1 coding sequence.
  • After cleavage of PCR product-(6) with XhoI and EcoRI, it was inserted into pSPRX, which had been cut with the same enzymes.
  • the primers were designed in such a way that the insertion event would not change the reading frame of the M.SPRX gene.
  • HaeIII restriction digestion was employed to assess TRD(H) functionality. It was found that the resulting construct, pSPRX-(6), was resistant to HaeIII cleavage. Insertion of PCR product-(7) into pSPRX.
  • A 400 base PCR product containing a stop codon was generated (PCR product-(7)). It was amplified from a region of exon 11 of the BRCA1 gene using primers B8 and B5, described supra.
  • Primer B5 introduces a TAA stop codon in the PCR product.
  • pSPRX was first cleaved with XhoI and EcoRI and was then ligated with PCR product-(7) that had been cut with the same enzymes. TRD H in the resulting construct, referred to as pSPRX-(7), was determined to be partially active as shown by the HaeIII cleavage pattern of the plasmid.
  • PCR product-(8)1 was generated using oligodeoxynucleotides B6 and B7 as primers and the anonymous patient sample as template. The PCR product was inserted into pSPRX, as described above, via XhoI
  • PCR product-(9) was produced by amplification of the end of exon 11 of a normal BRCA1 gene, using primers B9 (36 bps) and B7.
  • Primer B9: CGCGCGCTCGAGTCTACTAGGCATAGCACCGTTGCT (SEQ ID NO:24)
  • A 500 bp PCR product (PCR product-(10)2) was amplified from the end of exon 11 of the BRCA1 gene, using primers B10 and B7.
  • Primer B10: CGCGCGCTCGAGAAGAAATTAGAGTCCTCAGAAGAG (SEQ ID NO:25)
  • Primer B11 starts at base 3421 of the BRCA1 coding sequence, and the resulting amplification product (PCR product-(11)) is 700 bases. Insertion mutagenesis was carried out as before.
  • the resultant plasmid, designated pSPRX-11, was shown to be resistant to HaeIII restriction digestion, thereby demonstrating that TRD H has remained active.
  • FIG. ⁇ shows a typical analysis of the susceptibility of an insert-containing pSPRX plasmid to restriction digestion. In particular, this is the result of analysis of transformants obtained after insertion of PCR product(8) into pSPRX, as described
  • the gel includes two lanes for each of 14 different transformant colonies, flanked by lanes containing molecular weight markers.
  • the first lane (reading from left to right) contains a double digestion of the plasmid with XhoI and EcoRI to check the size of the insert, which should be 300 bps.
  • the second lane contains the result of HaeIII digestion.
  • the three samples labeled N are seen to not contain the 300 bp insert, and were not analyzed further. Of the remaining 12 transformants that contain the proper size insert, 7 were degraded by HaeIII treatment (labeled D), while the other 5 were undegraded (labeled U).
  • 7 of the inserts contain mutations that attenuate the TRD(H) activity of the encoded Mtase and thus are not able to sufficiently methylate the plasmid to protect against HaeIII cleavage.
  • the 5 undegraded samples indicate that the corresponding inserts did not inactivate the TRD(H) functionality of the Mtase, i.e., no stop codon-introducing mutation was introduced. This is a good example of the results expected for a patient heterozygous for a stop codon-introducing mutation in BRCA1. In the case of a homozygous mutation, all of the resulting colonies are expected to express partially active enzyme and hence all plasmids containing the insert will be degraded.
  • Protein purification and analysis by SDS-PAGE. In order to confirm the results obtained by checking insert-containing plasmids for susceptibility to HaeIII cleavage, the proteins encoded by the M.SPRX gene (including insert) were purified and their lengths determined by SDS-PAGE.
  • the M.SPRX-GST fusion products encoded by constructs containing PCR product 8(1) and 8(2) inserts were expressed in E. coli and purified to greater than 90%
  • elution buffer 50 mM Tris-HCl, pH 8.0, 10 mM reduced glutathione. After 10 minutes the protein was eluted in four 0.5 ml fractions. Aliquots of each elution were analyzed by SDS-PAGE to ascertain purity.
  • pSPRX-(8)1 and pSPRX-(8)2 were directly sequenced by standard dideoxy sequencing methods using primer B6.
  • Template DNA was isolated and purified from E. coli cells using a Wizard Plus Minipreps kit (Promega). 200-500 ng double-stranded DNA template and 3.2 pmol of oligodeoxynucleotide primer were used for each reaction, using the Taq Dyedeoxy or BigDye Terminator Cycle Sequencing kit as described in the manufacturer's protocol. Extension products were purified using the ethanol precipitation procedure as described in the manufacturer's protocol. After the final 70% ethanol wash, samples were dried under vacuum and taken to the Biomolecular Synthesis Service, University of Sheffield.
  • Example 5 Detecting mutations by analyzing for rescue from a lethal phenotype
  • target DNA sequences were screened for the presence of a mutation by determining whether insertion of the sequence into pSPRX rendered the plasmid viable in an mcr+ host cell by inactivation of the gene product's methylation
  • the E. coli strains DH5α (mcr+) and DH5αMCR (mcr-) were transformed with
  • M.SPRN M.SPRI
  • the expressed enzyme is methyltransferase proficient and the plasmid is viable in the mcr- strain alone (see a and b).
  • When a gene fragment harbouring a stop codon is introduced into the same plasmid, the synthesis of the M.SPRI polypeptide is truncated and the plasmid is viable in both mcr+ (see e and g) and mcr- hosts.
  • M.SPRI modified methyltransferase with EcoRII and HaeIII specificities
  • M.SPRI+ M.SPRI containing extra polypeptide sequence from BRCA1
  • M.SPRIN M.SPRI containing a BRCA1 nonsense mutation.
  • Example 2 the HaeIII-induced degradation of pSPRX-(8)1 and pSPRX-
  • IP-RP-HPLC was used instead of gel electrophoresis, thereby simplifying, speeding up and automating the process.
  • the flow rate was 0.75 mL/min, detection UV at 260 nm, column temp. 50°C.
  • the pH was 7.0.
  • chromatogram is represented by a solid line, the pSPRX-(8)1 chromatogram by a dashed line.
  • the fully methylated plasmid (pSPRX-(8)2) is protected from digestion and appears as a single peak at approximately 18 minutes.
  • the chromatogram between 8 minutes and 16 minutes is flat, indicating that no digestion of the plasmid has occurred. This trace indicates that no truncating mutation is present.
  • the partially methylated plasmid (pSPRX-(8)1) has been partially digested and shows a smaller peak at 18 minutes possibly with one or more shoulders or partially resolved peaks as seen here.
  • the chromatogram between 8 minutes and 16 minutes shows a higher absorbance than seen with pSPRX-(8)2 and contains multiple small peaks indicative of DNA fragments resulting from the partial digestion of the plasmid. This chromatogram clearly indicates the presence of a truncating mutation.
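The description of the SPR polylinker above can be checked with a short script. The following is a minimal Python sketch, not part of the disclosed method: it scans the top-strand oligonucleotide quoted above for the standard recognition sequences of the nine enzymes listed for the polylinker, and checks that the second oligonucleotide is the reverse complement of the first minus its 4-base 5' overhang. The names `SITES`, `TOP`, `BOTTOM`, `revcomp` and `complete_sites` are illustrative only.

```python
# Standard recognition sequences of the enzymes listed for the SPR polylinker.
SITES = {
    "XbaI": "TCTAGA",
    "BglII": "AGATCT",
    "PstI": "CTGCAG",
    "XhoI": "CTCGAG",
    "SmaI": "CCCGGG",
    "NheI": "GCTAGC",
    "NdeI": "CATATG",
    "EcoRI": "GAATTC",
    "StuI": "AGGCCT",
}

# The two polylinker oligonucleotides as quoted in the text (SEQ ID NOS. 11 and 12).
TOP = "CTAGATCTCTGCAGCTCGAGCCCGGGGCTAGCCATATGGAATTCAGAGG"
BOTTOM = "CCTCTGAATTCCATATGGCTAGCCCCGGGCTCGAGCTGCAGAGAT"


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def complete_sites(seq: str) -> list[str]:
    """List the enzymes whose full recognition sequence occurs in seq."""
    return [name for name, site in SITES.items() if site in seq]


# Sites that lie entirely within the top-strand oligonucleotide.
internal = complete_sites(TOP)

# The annealed duplex should leave a 4-base 5' overhang (CTAG) on the top strand.
duplex_ok = revcomp(TOP[4:]) == BOTTOM

print(internal)
print(duplex_ok)
```

In this sketch the XbaI and StuI sites are not found as complete sequences, which is what one would expect if they are only regenerated on ligation: the oligonucleotide begins with CTAGA (an XbaI-compatible overhang) and ends with AGG (the first half of the blunt StuI site AGG/CCT).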

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The present invention can be used to detect and characterize a genetic mutation in a nucleic acid sample of interest. The invention involves the use of a detector nucleotide sequence encoding a polypeptide, the activity of which can be attenuated or modified when a target sequence including a mutation is cloned into a linker region of the detector nucleotide sequence. In one aspect, the detector nucleotide sequence encodes a detector polypeptide comprising two or more catalytically-essential domains separated by a linker region, where the catalytically-essential domains can function in concert to catalyze a detectable reaction. A mutation in the linker region that results in the elimination of one of the catalytically-essential domains, either by truncation or by substantially altering its amino acid sequence, will result in a detectable activity loss, thereby signaling the presence of a mutation.

Description

TITLE OF THE INVENTION
METHOD FOR DETECTION OF TRUNCATED PROTEINS
FIELD OF THE INVENTION
The present invention is directed to methods and materials useful for the analysis of DNA and other polynucleotides. The invention is particularly suited to applications involving the detection and/or characterization of a genetic mutation.
BACKGROUND OF THE INVENTION
Genetic mutations have been shown to be involved in many diseases and pathological conditions. It follows that methods and reagents that enable one to detect and characterize a genetic mutation in an individual are inherently useful to the medical community. For example, identification and characterization of mutations can be used to predict the propensity of an individual to develop a disease, e.g., cancer, or the likelihood that an individual will pass such a propensity on to his or her children, e.g., an inheritable disease.
A genetic mutation is an alteration in the DNA sequence of an individual's genome, relative to the normally occurring, or wild type sequence. While some genetic mutations are benign and do not affect the phenotype of the affected individual, other mutations can cause serious physiological consequences leading to pathologies and sometimes death. These medically relevant mutations often act by causing an alteration in the amino acid sequence of a protein encoded by a mutated nucleotide sequence, i.e., a protein mutation, or by causing a change in the expression level of one or more genes. The first type is frequently caused by a mutation in a stretch of DNA that encodes a protein sequence, but can also be caused by mutations outside the coding region, e.g., a mutation that affects the splicing of exons. The second type is typically caused by mutation in a regulatory region. A number of different types of DNA mutations have been identified. For example, a point mutation is one involving the substitution of one nucleotide residue, or "base," for another.
As a result of the redundancy of the genetic code, some point mutations do not change the amino acid encoded by the codon comprising the mutated base, and thus do not alter the amino acid sequence of the encoded protein. These mutations, referred to as "silent" mutations, are typically innocuous. On the other hand, DNA mutations that result in alteration in the amino acid sequence of a protein can sometimes have a deleterious effect, depending upon the context of the affected amino acid residue. A well-known example is the mutation responsible for sickle cell anemia, where the alteration of a single residue of hemoglobin has a dramatic impact on function.
Oftentimes the most potent point mutations are those that result in the introduction of a translation stop codon (i.e., opal, amber or ochre) in a protein coding sequence, often referred to as "nonsense" mutations. A nonsense mutation causes translation to terminate prematurely, resulting in truncation of the entire length of the protein downstream from the mutation, often leading to a total loss of function for the affected protein. This loss of function can have devastating medical consequences. Nonsense mutations figure prominently in those mutations that have been found to be associated with genetic disease.
Another class of mutation consists of insertions and deletions of one or more bases in a DNA sequence. Like nonsense mutations, insertion and deletion mutations can be particularly potent and are often associated with genetic disease. These mutations typically exert their effect by causing a shift in reading frame for the sequence downstream from the mutation. Unless the insertion or deletion consists of three bases, or some multiple of three, the reading frame for any codons subsequent to the mutation will be out of register, resulting in a complete disruption of the encoded amino acid sequence. This type of mutation will in many instances eliminate the proper stop codon and introduce a stop codon at some other location in the sequence, thereby altering the molecular weight of the mutated protein. A frame shift mutation that results in the introduction of a stop codon is referred to as a "truncating mutation." As used herein, nonsense mutations and truncating mutations are encompassed by the term "stop codon-introducing mutations."
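The effect of the mutation classes just described can be illustrated with a small script. The following is a minimal Python sketch using the standard genetic code; the 24-base coding sequence is an invented toy example, not a sequence from BRCA1 or from this disclosure, and the function name `translate` is illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy coding sequence: ATG GCA TTG ACC GAT AAG CTG TAA
WILD_TYPE = "ATGGCATTGACCGATAAGCTGTAA"

# Nonsense mutation: AAG -> TAG creates a premature stop codon.
nonsense = WILD_TYPE[:15] + "T" + WILD_TYPE[16:]

# One-base deletion: the downstream frame shifts and a new stop codon (TGA) appears.
frameshift = WILD_TYPE[:3] + WILD_TYPE[4:]

# Three-base deletion: one residue is lost but the reading frame is preserved.
in_frame = WILD_TYPE[:3] + WILD_TYPE[6:]

print(translate(WILD_TYPE))   # MALTDKL
print(translate(nonsense))    # MALTD
print(translate(frameshift))  # MH
print(translate(in_frame))    # MLTDKL
```

In this toy case both the nonsense mutation and the one-base deletion shorten the product, whereas the three-base deletion removes a single residue and leaves the rest of the sequence intact.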
The BRCA1 and BRCA2 genes are well known examples of genes wherein mutations have been shown to have seriously deleterious repercussions. The mutation of these genes has been found to play a critical role in a large number of cases of breast and/or ovarian cancer. BRCA1 is a tumour suppressor gene located on the long arm of chromosome 17 (Hall et al., Science 250:1684-89 (1990) and Miki et al., Science 266:66-71 (1994)). Tumour suppressor genes play a role in regulating cell growth. A woman is predisposed to breast or ovarian cancer (Narod et al., Cancer 74:2341-46 (1994)), when one copy of the BRCA1 gene is inherited in a defective (mutant) form. Development of cancer in either organ involves a number of additional mutations, at least one of which involves the other copy (allele) of BRCA1. A woman who inherits one mutant allele of BRCA1 from either her mother or father has a high risk of developing breast cancer.
The gene encoding BRCA1 is relatively large with a transcription unit of 24 exons that spans approximately 81,000 bps of genomic DNA (Miki et al., 1994; Smith et al., Genome Research 6:1029-49 (1996)). The BRCA1 protein comprises 1863 amino acids and has a molecular weight of 220 kDa. The majority of the exons are small, often less than 100 bp in size, but there is one large exon (exon 11) which comprises 61% of the entire coding region (Hogervorst et al., Nature Genetics 10:208-12 (1995)). In amplifying the smaller exons, a significant proportion of incidental intronic sequence is included in diagnostics. Techniques that rely on the detection of genetic variation via conformational changes, such as SSCP, will also detect neutral polymorphic alterations of little, if any, significance and which will only be precisely characterised by nucleotide sequencing. Of the mutations reported so far in BRCA1, 71% are frameshift mutations that are due to insertion or deletion of 1 to 59 bp, although this high frequency of detection may be due to the fact that such changes are more readily identifiable. 86% of alterations are predicted to lead to a truncated protein product as a result of frameshift, nonsense, splice or regulatory aberrations. The remaining 14% are missense mutations. There are no specific alterations which occur with a particularly high frequency, although 5 individual mutations (185 del AG; 5385 ins C; 1294 del 40; 4184 del 4; Cys61 Gly) represent approximately one third of all those reported to date. Finally, there is no indication of any strong correlation between phenotype and the site of mutation that could possibly suggest regions of likely alteration based on clinical data.
Genetic mutations in diploid organisms (organisms having two copies of each somatic chromosome and gene, e.g., humans) can be either homozygous or heterozygous. A homozygous mutation is one that appears in both gene copies, while a heterozygous mutation occurs in only one of the two copies. In many cases, an individual that is heterozygous for a mutation that results in a loss or impairment of function for the encoded protein will experience no deleterious phenotype due to the ability of the other wild-type copy of the gene to encode normal protein sufficient to maintain function. However, an individual homozygous for the same mutation, e.g., the progeny of two heterozygous individuals, will be incapable of producing the normal protein and this can result in a disease or lethal phenotype. Familiar examples of such genetic conditions include sickle cell anemia and Tay-Sachs disease.
The presence of a heterozygous mutation can also render an individual more prone to certain diseases, cancer constituting an important example. For example, the development of many cancers has been linked to an inherited heterozygous mutation in the p53 cancer suppressor gene (Akashi and Koeffler, Clin Obstet Gynecol. 41:172-99 (1998)). It is believed that the inactivation of one copy of the gene in itself does not normally result in cancer, since the other copy encodes enough wild-type p53 protein to provide protection. However, the individual is susceptible to cancer due to the possibility of a mutation arising in the other copy of the gene in some cell of the individual, in which case no wild-type p53 protein is encoded and its cancer suppression function is lost.
When attempting to quantify the amount of mutant allele in a sample (particularly in the context of potentially cancerous tissues), issues surrounding general homozygosity and heterozygosity become more complex. Take as an example a cell which possesses mutated and non-mutated copies of a tumor suppressor gene (which under normal circumstances would be considered heterozygous). The non-mutated copy can effectively be "knocked out" and replaced with the mutated copy through a process referred to as "loss-of-heterozygosity," a form of allelic imbalance. Therefore, within a tissue sample some cells may have the heterozygous situation described above, while other cells may have experienced "loss-of-heterozygosity" and are thus homozygous for the mutant allele. This will therefore influence quantitation of the mutant allele in the tissue sample.
Currently available methods of mutation detection often have difficulty identifying mutations in heterozygous samples due to the complicating presence of wild-type sequence in the sample. A related problem is that some mutation detection methods have difficulty determining whether an identified mutation is heterozygous or homozygous. Thus, in any method of mutation detection it is valuable to be able to identify and distinguish between heterozygous and homozygous mutations, and if possible quantify the amount of mutant allele present in a tissue sample. The ability to quantify facilitates the detection of allelic imbalances such as loss-of-heterozygosity in the presence of a substantial background of normal cells.
Sequence heterogeneity in a DNA sample can also result from a cell preparation that is not homogeneous. For example, a cancer cell sample is often infiltrated and contaminated by normal cells, with the normal cells sometimes present in large excess over the cancer cells. It can be extremely difficult to detect a mutation in the cancer cells because of an excess of wild-type sequences in the sample arising from the normal cells. Any method able to identify and characterize a cancer-related mutation from a sample dominated by normal cells would thus be of extreme relevance to the medical community.
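The dilution of mutant sequence described in the preceding paragraphs can be made concrete with a small calculation. The following Python sketch is illustrative only and rests on stated assumptions: every cell carries exactly two copies of the gene, heterozygous cells carry one mutant copy, cells that have undergone loss-of-heterozygosity carry two mutant copies (the wild-type copy being replaced by the mutant one, as described above), and all remaining cells are normal. The function name and parameters are hypothetical.

```python
def mutant_allele_fraction(frac_het: float, frac_loh: float) -> float:
    """Fraction of gene copies in a sample that carry the mutation.

    frac_het: fraction of cells heterozygous for the mutation (1 of 2 copies mutant)
    frac_loh: fraction of cells homozygous for the mutant allele after
              loss-of-heterozygosity (2 of 2 copies mutant)
    The remaining cells are assumed to be normal (0 of 2 copies mutant).
    """
    if frac_het < 0 or frac_loh < 0 or frac_het + frac_loh > 1:
        raise ValueError("cell fractions must be non-negative and sum to at most 1")
    return (1 * frac_het + 2 * frac_loh) / 2


# A purely heterozygous sample: half of all copies are mutant.
pure_het = mutant_allele_fraction(1.0, 0.0)

# A tumour sample in which only 10% of cells are (heterozygous) tumour cells
# and the remaining 90% are infiltrating normal cells.
diluted = mutant_allele_fraction(0.1, 0.0)

# The same 10% tumour content after loss-of-heterozygosity in every tumour cell.
diluted_loh = mutant_allele_fraction(0.0, 0.1)

print(pure_het, diluted, diluted_loh)
```

Under these assumptions, a sample containing 10% heterozygous tumour cells has only about 5% mutant copies, which illustrates why a method must tolerate a large excess of wild-type sequence.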
Methods of mutation detection have traditionally involved the isolation and purification of genomic DNA, digestion with restriction enzymes, gel electrophoresis and Southern blotting using radio-labeled probes. More recently, application of the polymerase chain reaction ("PCR") has led to the development of a number of methods for identifying genetic mutations, both direct and indirect. Direct methods, typified by nucleotide sequencing, can identify the specific location and nature of a mutation. While direct sequencing is often considered the "gold standard" in mutation detection, mutations can still be missed, particularly in the case of heterozygous mutations or where the sample is contaminated with wild-type DNA. Moreover, mutation detection by direct sequencing is relatively expensive and technically demanding, and therefore normally not appropriate for high-throughput or routine screening, even with the semi-automated sequencing systems which are currently available. DNA hybridization microarrays represent another technology that has found increasing application in mutation detection. While the relative ease and economy of microarray methods often make them a better choice than direct sequencing for large-scale mutation screening, these methods have experienced difficulty detecting certain mutations, particularly frameshift mutations and mutations in heterogeneous samples.
A number of indirect methods for mutation detection have been reported. These methods do not directly yield the specific location and nature of a mutation, but rather identify the presence of a mutation in some defined stretch of DNA sequence. The relatively low cost and high throughput capability of these methods render them particularly suited to large scale screening projects and screening for new mutations. Examples include single strand conformational polymorphism (SSCP) analysis and the protein truncation test (PTT). These methods offer the opportunity to examine lesions/alterations of DNA with no prior knowledge of the exact nature or the locus of the lesion.
SSCP analysis allows for the detection of genetic polymorphisms by determining shifts in mobility of single-stranded DNAs on neutral polyacrylamide gel electrophoresis (Orita et al. (1989) Proc Natl Acad Sci USA 86(8):2766-2770; Orita et al. (1989) Genomics 5:874-79). As originally reported by Orita et al., the method entails digestion of genomic DNA with restriction endonucleases, denaturation in alkaline solution, and electrophoresis on a neutral polyacrylamide gel. After transfer to a nylon membrane, the mobility shift due to a nucleotide substitution of a single-stranded DNA fragment is detected by hybridization with an appropriate probe. The mobility shift caused by nucleotide substitutions is thought to be due to a conformational change in the single-stranded DNAs. More recently, the use of silver staining and fluorescence detection, e.g., ethidium bromide (Yap & McGee (1992) Trends in Genetics 8:49) has speeded up the analysis. While SSCP analysis is relatively fast and simple to perform, it is less sensitive than other techniques.
PTT is a method specifically suited for the detection of premature translation termination mutations, i.e., stop codon-introducing mutations (Roest et al., Human Molecular Genetics 2:1719-21 (1993)). This procedure is used for mutation screening in certain diagnostic testing facilities and involves the following steps: (1) isolation of genomic DNA and amplification of coding sequence of the target gene by PCR, or, alternatively, isolation of RNA and RT-PCR amplification of the target gene; (2) use of the amplified target as a template for the production of the radiolabeled product by in vitro transcription and translation; and (3) MW analysis of the radiolabeled protein by SDS-PAGE.
A key feature of PTT is a specifically designed, tailed, forward primer used in the PCR amplification. This primer contains four specific regions. First, in order to facilitate cloning of the amplified product, a restriction site is frequently engineered into the 5' end of the primer. Next a transcription promoter sequence, e.g., a T7, SP6 or T3 promoter, is included to direct the transcription of RNA by the corresponding viral RNA polymerase. The promoter sequence is followed by a spacer sequence of 3-6 bases and then a eukaryotic translation initiation sequence, i.e., a Kozak sequence (Kozak, Nucleic Acids Research 12:857-872 (1984)). Finally, the 3' end contains 17-20 bases complementary to the target gene sequence and in frame with the ATG codon from the Kozak sequence. As a result, the forward primer is necessarily relatively long, typically at least 56 bases in length. Detection of the encoded protein product is typically achieved by the incorporation of a radiolabeled amino acid during synthesis, using [35S] methionine, [35S] cysteine or [3H] leucine. Alternatively, the newly synthesised proteins can be labeled by direct incorporation of biotin-labeled lysines. After SDS-PAGE separation and transfer to a membrane, the biotinylated proteins are detected by a streptavidin conjugate and visualised using a chemiluminescent substrate.
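The layout of the tailed forward primer described above can be sketched in a few lines of Python. This is only an illustration: the restriction site, spacer, Kozak-like context and target bases below are hypothetical choices made for the example and are not sequences taken from this disclosure, and the function name is likewise an assumption.

```python
# Illustrative element sequences; every one of these is an assumed choice.
RESTRICTION_SITE = "GGATCC"           # BamHI site, picked arbitrarily for the sketch
T7_PROMOTER = "TAATACGACTCACTATAGGG"  # widely used T7 promoter sequence
SPACER = "AAC"                        # arbitrary 3-base spacer (text allows 3-6 bases)
KOZAK = "GCCACCATG"                   # Kozak-like context ending in the ATG start codon


def build_ptt_forward_primer(target_5prime: str) -> str:
    """Concatenate the primer regions in 5'->3' order.

    The target-specific region is assumed to begin at a codon boundary so
    that it is in frame with the ATG at the end of KOZAK.
    """
    if not 17 <= len(target_5prime) <= 20:
        raise ValueError("target-specific region should be 17-20 bases")
    return RESTRICTION_SITE + T7_PROMOTER + SPACER + KOZAK + target_5prime.upper()


# Hypothetical 18-base target-specific region (six codons).
primer = build_ptt_forward_primer("GATTTATCTGCTCTTCGC")
print(len(primer), primer)
```

With these assumed element lengths the primer comes to 56 bases, which shows why such tailed primers are necessarily long.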
In general, analysis of the translation products reveals strong signals or bands for the desired translation products, which are accompanied by several weaker bands. These weaker products are often the result of translation initiation at alternate start codons and can obscure the analysis and detection of truncated fragments.
The complexity of PTT means that its application can require an experienced technician and considerable time, i.e., three days for the procedure and one day for the direct sequencing to characterize any mutations identified. Mutation detection by PTT is known to suffer from a number of shortcomings. For one, early mutations (i.e., mutations near the N-terminus of the product of the test construct) can result in products which are too short for detection, either because little or no label is incorporated or because the electrophoretic migration falls outside the range of resolution. On the other hand, late mutations (occurring near the C-terminal end) might result in a minimal shift in size that cannot be resolved by conventional SDS-PAGE. Finally, mutations at translation initiation and termination sites represent a special case. There thus exists a clear need for improved methods of mutation detection and analysis that are rapid and economical, but that at the same time avoid limitations inherent in current methods. In particular, improved methods are needed which can identify mutations at any point in a target, are sensitive to both point mutations and insertion/deletion mutations, can distinguish between heterozygous and homozygous mutations, and can identify mutations in the presence of an excess of non-mutant DNA. The novel methods and reagents described herein achieve these objectives and thus represent a valuable and timely contribution to the field of genetic analysis and the analysis of polynucleotides in general.
SUMMARY OF THE INVENTION
Accordingly, one object of the present invention is to provide improved methods for detecting mutations and other variants in a polynucleotide sequence, especially a DNA sample. A further object of the invention is to provide nucleic acids and other reagents useful in practicing the improved methods described herein.
In one aspect, the invention comprises the steps of providing a detector nucleic acid which includes a detector nucleotide sequence, wherein the detector nucleotide sequence encodes a detector polypeptide having a detectable activity; inserting the target sequence within an open reading frame of the detector nucleotide sequence to give a chimeric sequence; expressing the chimeric sequence to give a chimeric polypeptide; determining the activity of the encoded chimeric polypeptide, and correlating the activity of the encoded chimeric polypeptide with the presence or absence of a mutation in the target nucleotide sequence.
The detector polypeptide can be an enzyme comprising two catalytically essential domains separated by a linker region, wherein additional polypeptide sequence can be inserted into the linker region to produce a chimeric enzyme which possesses a detectable catalytic activity as long as the catalytically essential domains remain linked, but wherein a decrease in the catalytic activity occurs if the insertion results in the loss of one of the catalytically essential domains from the detector polypeptide. The loss of a catalytically essential domain can be the result of truncation or a frameshift in the linker region disrupting the reading frame of one of the catalytically essential domains. The target sequence can be obtained by IP-RP-HPLC separation of a mixture of DNA fragments.
In one aspect of the invention, the target nucleotide sequence comprises BRCA1 or BRCA2. In one aspect, the detector nucleotide sequence comprises engineered bacteriophage multi-specific DNA methyltransferase M.SPRI.
In another aspect, the detector nucleic acid comprises a cloning vector including a detector nucleotide sequence that is a lethal gene and the target sequence is ligated into the detector nucleotide sequence. In some cases the activity of the chimeric polypeptide is determined by determining the viability of a host transformed with the chimeric sequence.
In a preferred embodiment of the invention, the detector nucleotide sequence encodes a DNA methyltransferase, more preferably a modified cytosine (C-5)- specific DNA methyltransferase, more preferably a modified form of a precursor methyltransferase originally containing a plurality of target recognition domains, wherein the modification comprises the elimination of one of the target recognition domains.
In a preferred embodiment of the invention the nucleotide sequence that originally encoded the eliminated target recognition domain can be replaced with a linker containing a plurality of restriction sites. In a particularly preferred embodiment of the invention the detector nucleotide sequence comprises M.SPRX or a derivative thereof sharing at least 70% identity therewith.
In another aspect, the invention includes amplifying the target sequence under examination, using PCR primers containing 5' and 3' ends adapted for ligation into a vector; ligating the PCR products into an engineered MTase gene; transforming a host cell which has the property of being killed by active Mtase; and proliferating the cell to indicate the presence of a mutation within the sequence under examination.
In another aspect, the activity of the chimeric peptide is determined by assessing the methylation state of a DNA molecule derived from a host cell expressing the chimeric sequence. In a preferred embodiment, the DNA molecule is a detector nucleic acid, and the methylation state is assessed by treating the detector nucleic acid with a restriction endonuclease and determining the extent to which the detector nucleic acid is degraded by the endonuclease. In a particularly preferred embodiment of the invention the detector nucleic acid is a vector, and the extent of degradation is determined by electrophoresis or by IP-RP-HPLC.
In a preferred embodiment, the detector nucleic acid comprises pSPRX and the method includes transforming a mcrBC- host, analyzing plasmids from mcrBC- transformants by HaeIII digestion, wherein if the insert is wild type, the plasmid is resistant to HaeIII cleavage, and wherein if the insert contains a mutation, the plasmid is at least partially degraded after HaeIII restriction digestion.
In another aspect the invention includes purifying the chimeric polypeptide and carrying out size-based separation of the chimeric polypeptide.
In a further aspect the invention includes obtaining plasmids from the transformed host cells and sequencing the plasmids. In yet another aspect, the invention includes a method for diagnosing, in an individual, a disease which is associated with a mutation in a target nucleotide sequence, which method includes performing a method as described in claim 1 using a sample of nucleic acid from that individual.
Another aspect of the invention is a detector nucleic acid adapted for use in detecting a mutation in a target nucleotide sequence present in a sample nucleic acid, wherein the detector nucleic acid includes a detector nucleotide sequence which encodes a detector polypeptide having a detectable activity, wherein the detector polypeptide is an enzyme comprising two catalytically essential domains separated by a linker region, wherein additional polypeptide sequence can be inserted into the linker region to produce a chimeric enzyme which possesses a detectable catalytic activity so long as the catalytically essential domains remain linked, but wherein a decrease in the catalytic activity occurs if the insertion results in the loss of one of the catalytically essential domains from the detector polypeptide, the detector sequence comprising an insertion site wherein a target nucleotide sequence can be inserted to yield a chimeric sequence encoding the chimeric enzyme.
In a preferred embodiment, the detector nucleic acid is derived from a multi-component enzyme which can accommodate extra sequence within it with little or no loss of activity. Preferably, the detector polypeptide comprises an Mtase, especially an MTase selected from the group consisting of M.SPRI, M.AquI and M.MspI. In a particularly preferred embodiment, the detector nucleotide sequence comprises M.SPRX, e.g., the pSPRX vector described herein. In one aspect, the detector nucleotide sequence comprises a target sequence, wherein the target sequence preferably comprises all or a diagnostic part of a eukaryotic gene correlated with a genetic disease, more preferably wherein the eukaryotic gene comprises a gene in which mutations have been characterized and correlated with disease states.
Yet another aspect of the invention is a method for diagnosing, in an individual, a disease which is associated with a mutation in a target nucleotide sequence, the method comprising providing a detector nucleic acid which includes a detector nucleotide sequence, wherein the detector nucleotide sequence encodes a detector polypeptide having a detectable activity; inserting the target sequence within an open reading frame of the detector nucleotide sequence to give a chimeric sequence; expressing the chimeric sequence to give a chimeric polypeptide; determining the activity of the encoded chimeric polypeptide and correlating the activity of the encoded chimeric polypeptide with the presence or absence of a mutation in the target nucleotide sequence.
Another aspect of the invention is a host cell harboring a detector nucleic acid of the invention.
Another aspect is a diagnostic kit comprising a detector nucleic acid of the invention. In a preferred embodiment of the invention, the target nucleotide sequence is derived from genomic DNA, typically as the product of PCR amplification of a sequence of interest from genomic DNA. In another preferred embodiment of the invention, the target nucleotide sequence is derived from mRNA, typically as the cDNA amplification product of RT-PCR.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a pseudo-three dimensional representation of the modified form of M.SPRI containing additional protein sequence such as that encoded by stretches of the BRCA1 gene and the product which is partially inactive as a consequence of truncation.
FIG. 2 is an overview of a protocol embodying one aspect of the present invention, involving the analysis of a heterozygous mutant.
FIG. 3 depicts the nucleotide sequence of M.SPRX with the polylinker region underlined.
FIG. 4(A) is a map of the plasmid pSPRX. FIG. 4(B) depicts the nucleotide sequence of pSPRX.
FIG. 5 shows the gel electrophoretic separation of pSPRX-(8)1 isolated following transformation of an mcrBC- host and subjected to restriction analysis using HaeIII. The appearance of 7 degraded (D) and 5 undegraded (U) plasmids from a sample of twelve bacterial colonies from a transformation plate presents a striking demonstration that the patient is heterozygous for a stop codon-introducing mutation in BRCA1.
FIG. 6 shows SDS-PAGE gel analysis of size differences between the full-length chimeric polypeptide encoded by pSPRX-(8)2 (FIG. 5(A)) and the truncated chimeric polypeptide encoded by pSPRX-(8)1 (FIG. 5(B)), as described in Example 3. The lanes preceding the purified sample are molecular weight markers and partially fractionated extracts of the recombinant strain expressing either normal or truncated protein.
FIG. 7 shows the bacterial transformations described in Example 5, illustrating the simplicity of the stop codon-introducing mutation test.
FIG. 8 shows superimposed chromatograms of pSPRX-(8)1 and pSPRX-(8)2 isolated from DH5α MCR and treated with HaeIII, as described in Example 6. The pSPRX-(8)2 chromatogram is represented by a solid line, the pSPRX-(8)1 chromatogram by a dashed line. The reduced activity of pSPRX-(8)2 is evidenced by the appearance of small peaks (fragments) in the corresponding curve.
DETAILED DESCRIPTION OF THE INVENTION
As described above, the need exists for an economical, high-throughput method for detecting a genetic mutation that avoids the limitations inherent in currently available methods. The present invention provides novel methods and reagents that satisfy that need. Moreover, the invention is generally suited for a variety of applications pertaining to the analysis and characterization of DNA and other polynucleotides.
In a preferred embodiment, the present invention can be used to detect and characterize a genetic mutation in a nucleic acid sample of interest. The invention can be used to detect both point mutations that result in premature translation termination and insertion/deletion mutations that cause a shift in translation reading frame without necessarily altering the mass of the encoded protein to an extent that can be detected by SDS-PAGE analysis. It thus represents a substantial improvement over some currently available methods, such as PTT, that only detect mutations that result in a substantial shift in SDS-PAGE mobility. The invention can also be used to characterize and distinguish between heterozygous and homozygous mutations. Furthermore, the invention is capable of detecting a mutation in the presence of non-mutant DNA, a problem which has proven refractory to a number of the currently available methodologies. In addition, the invention is able to conveniently identify mutations in gene sequence regions that are refractory to some of the currently employed methods, e.g., mutations that occur early (N-terminal) or late (C-terminal) in a particular gene or gene segment, and mutations at translation initiation and termination sites.
The invention involves the use of a detector nucleotide sequence encoding a polypeptide, the activity of which can be attenuated or modified when a target sequence including a mutation is cloned into the gene, i.e., a chimeric sequence is formed. As used herein, the term "nucleotide sequence" indicates a chain of nucleotides, e.g., an oligonucleotide or polynucleotide. In one aspect, the detector nucleotide sequence encodes a detector polypeptide comprising two or more catalytically-essential domains separated by a linker region, where the catalytically-essential domains can function in concert to catalyze a detectable reaction. Importantly, the catalytic activity of the detector polypeptide is such that it can tolerate the addition of some additional polypeptide sequence into the linker region while still maintaining its activity. In other words, the ability of the catalytically-essential domains to work in concert to catalyze the reaction is to some extent independent of the length and amino acid sequence of the linker region. However, elimination of one of the catalytically-essential domains, either by truncation or by alteration of its reading frame, will result in a detectable activity loss, if not complete inactivation. A detector polypeptide that includes additional amino acid sequence in this linker region is referred to as a chimeric polypeptide.
To test for a mutation in a target nucleotide sequence, the target sequence is inserted into that segment of the detector nucleotide sequence which encodes the linker region. So long as the insertion does not alter the reading frame of the downstream catalytically-essential domain (i.e., the insertion consists of a number of bases that is a multiple of 3, and does not include a stop codon), the resulting construct will encode a chimeric polypeptide containing the two catalytically-essential domains separated by a linker region extended by the number of amino acids encoded by the test sequence. Owing to the choice and design of the detector polypeptide, this insertion will not disrupt the enzyme's catalytic activity (assuming that the length of insertion does not exceed some maximum insertion capacity characteristic of that detector polypeptide). However, if the target sequence contains a stop codon-introducing mutation, that mutation will result in truncation of the C-terminal catalytically-essential domain and a substantial loss of activity. Alternatively, an insertion/deletion mutation can shift the coding frame of the C-terminal catalytically-essential domain, scrambling that domain's amino acid sequence and attenuating activity. In either case, the existence of a mutation in the target nucleotide sequence results in a detectable reduction in catalytic activity. In a preferred embodiment of the invention, the activity of the chimeric polypeptide is determined and correlated with the presence or absence of a mutation in the target nucleotide sequence.
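The reading-frame condition in the preceding paragraph can be expressed as a small check. The Python sketch below is a simplified model offered for illustration only: it assumes the insert begins at a codon boundary of the linker, uses the standard stop codons, and the function name and return labels are hypothetical rather than part of the disclosed method.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons (DNA form)


def classify_insert(insert: str) -> str:
    """Predict the fate of the downstream (C-terminal) domain.

    Returns "truncated" if an in-frame stop codon occurs in the insert,
    "frameshift" if the insert length is not a multiple of 3, and
    "intact" otherwise. The insert is assumed to start in frame.
    """
    insert = insert.upper()
    usable = len(insert) - len(insert) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        if insert[i:i + 3] in STOP_CODONS:
            return "truncated"
    if len(insert) % 3 != 0:
        return "frameshift"
    return "intact"


print(classify_insert("GATTTATCT"))  # three sense codons
print(classify_insert("GATTAATCT"))  # second codon is TAA
print(classify_insert("GATTTATC"))   # one base deleted
```

Under this model only the "intact" case is expected to leave the chimeric polypeptide active; both other outcomes correspond to loss of the C-terminal catalytically-essential domain.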
The "detector nucleic acid" is a nucleic acid comprising the detector polynucleotide. In general, the detector nucleic acid supplies functional elements that enable the transfer of the detector nucleic acid into a host cell, replication in a host cell and expression of the detector nucleotide sequence. In some embodiments of the invention the detector nucleic acid will possess other functional elements, e.g., genetic elements that allow for positive or negative selection or that permit inducible gene expression. Such elements include origin of replication, promoter region, translation initiation sequence (e.g., Shine-Dalgarno sequence, ribosome binding site), and in some cases enhancer or other regulatory sequences. The detector nucleic acid generally takes the form of a vector. As used herein, "vector" is defined to include, inter alia, any plasmid, cosmid, phage or the like which can transform a prokaryotic or eukaryotic host, generally by existing extrachromosomally within the host. Typically the vector used herein will be an autonomously replicating plasmid with an origin of replication recognized by the host. Generally speaking, in the light of the present disclosure, those skilled in the art will be well able to construct vectors of the present invention based on those of the prior art. For further details see, for example, Molecular Cloning: a Laboratory Manual: 2nd edition, 3 Volumes, Sambrook et al., 1989, Cold Spring Harbor Laboratory Press (or later editions of the same work) or Current Protocols in Molecular Biology, Second Edition, Ausubel et al. eds., John Wiley & Sons, 1992, both of which are specifically incorporated herein by reference. The "detector nucleotide sequence" will in general be a polynucleotide sequence that encodes a detector polypeptide.
In a preferred embodiment of the invention, the portion of the detector nucleotide sequence that encodes the linker region of the detector polypeptide comprises a polylinker region containing one or more unique restriction sites, thereby facilitating insertion of a target DNA sequence. The detector polypeptide and polylinker regions are described in more detail below. The "detector polypeptide" will generally be an enzyme which causes a detectable change in a substrate. The nature of the invention requires that the detector polypeptide be able to accommodate an insertion of a foreign sequence into the linker region to form a "chimeric polypeptide" while still retaining a detectable level of activity. In a preferred embodiment of the invention, the detector polypeptide can accommodate an insertion of up to about 10 amino acids, preferably about 50 amino acids, more preferably about 100 amino acids, still more preferably about 150 amino acids, still more preferably about 200 amino acids, still more preferably about 250 amino acids, still more preferably about 300 amino acids, and most preferably 330 or more amino acids while still retaining a detectable level of activity. While insertion of sequence will be tolerated, an insertion event that results in a loss of the C-terminal catalytically-essential domain, either by truncation or shift in reading frame, will cause a detectable loss of activity as compared with the level shown by the non-truncated and non-frame-shifted chimeric polypeptide. Note that a difference in the relative level of activity (e.g., truncated vs. non-truncated) generally suffices to indicate a mutation, and that a determination of absolute activity is typically not required. Thus, the activities need not actually be quantified, but can also simply be compared, optionally with controls run contemporaneously or historically. 
A depressed activity can then be correlated with a loss of the C-terminal catalytically essential domain, and hence the presence of a nonsense or frameshift mutation as a result of the introduction of the test nucleotide sequence. The term "catalytically-essential domain" refers to one of two or more polypeptide domains that, when connected by means of a linker region of varying length, function in concert to catalyze a detectable reaction, i.e., they confer upon the detector polypeptide a detectable catalytic activity. The loss of one such catalytically-essential domain, e.g., through truncation or as the result of a shift in reading frame, results in a detectable decrease in activity. The loss in activity need not be total, as long as it is possible to distinguish between chimeric polypeptides that contain the catalytically-essential domain and those wherein the catalytically-essential domain has been lost. In a preferred embodiment, the loss of a catalytically-essential domain will result in essentially complete inactivation. In a preferred embodiment of the invention, the detector polypeptide is an enzyme that can be lethal if expressed in an appropriate host cell. In a preferred embodiment of the invention, the encoded detector polypeptide is a DNA methyltransferase (a "Mtase"). These enzymes catalyze the methylation of DNA bases at locations characteristic of the particular methyltransferase. Methylation alters the ability of DNA to act as a substrate for a number of DNA modifying enzymes, in particular restriction enzymes, i.e., enzymes that cleave DNA in a target dependent manner. It is this alteration of DNA as a substrate for DNA modifying enzymes that renders Mtases particularly suited for use as detector polypeptides. For example, Mtase activity in a host cell can be measured by checking DNA derived from the cell for susceptibility to cleavage by an enzyme that selectively cleaves only non-methylated DNA.
Alternatively, in some host cells methylation will render DNA susceptible to cleavage by endogenous enzymes in the host cell that do not cleave non-methylated DNA. In such a situation, methylation can be lethal, which provides a convenient system of screening for Mtase activity. In a particularly preferred embodiment of the invention, the detector polypeptide is a 5-methylcytosine methyltransferase (m5C-methyltransferase). The function of these enzymes is to transfer a methyl group from S-adenosyl-L-methionine (SAM) to the C5 position of a cytosine residue contained within a specific double stranded DNA sequence. Two classes of m5C-methyltransferases are known: mono-specific methyltransferases, which recognize and modify a single recognition sequence, and multi-specific methyltransferases, which recognize and modify cytosines in multiple sequence contexts. Mono-specific enzymes are commonly found in bacteria, where they are involved in restriction/modification systems to protect host DNA from cleavage by the corresponding restriction endonucleases (Piekarowicz et al., Nucleic Acids Research 19:1831-35 (1991); Mi and Roberts, Nucleic Acids Research 20:4811-16 (1992)). A number of bacteriophage have been shown to express multi-specific Mtases (see, e.g., Trautner et al., Nucleic Acids Research 16:6649-58 (1988); Noyer-Weidner, M. and Reiners-Schramm, L., Gene 66:269-78 (1988)).
Typically, bacteria expressing an MTase also elaborate a cognate restriction endonuclease, which recognizes and cleaves DNA at the same sequence recognized and methylated by the Mtase. Cytosine methylation at the site will prevent restriction, however, so a cell's endogenous Mtase activity normally prevents cleavage of its own DNA. However, foreign DNA present in the cell that has not been methylated at the restriction site will be susceptible to cleavage. The result is somewhat analogous to the immune system of higher organisms, since it allows the cell to specifically target foreign DNA for destruction while protecting its own DNA from cleavage. Thus, the DNA of most bacteria is modified at specific sequences, but the sequences modified are different for different strains and species because they carry different restriction enzymes. The nomenclature of Mtases generally indicates their relationship with their cognate restriction endonuclease. For example, the cognate Mtase of the restriction enzyme HaeIII is M.HaeIII. Mono-specific Mtases typically contain a variable region capable of recognizing the specific sequence targeted for methylation, sometimes referred to as a Target Recognition Domain (TRD), surrounded by conserved motifs involved in catalyzing the methylation reaction.
Multispecific Mtases, capable of targeting more than one sequence for methylation, are particularly suited for use in certain preferred embodiments of the invention. These enzymes include 10 conserved motifs involved in catalyzing the methylation reaction and a plurality of TRDs, each TRD corresponding to a particular target sequence and conferring upon the enzyme the ability to specifically methylate that sequence. In some cases it has proven possible to eliminate a TRD without abolishing the ability of the enzyme to methylate at sequences recognized by the remaining TRDs. In a preferred embodiment of the invention, the detector polynucleotide is derived from a multispecific Mtase by substituting a nucleotide sequence encoding a TRD with a polylinker region capable of accepting a target nucleotide sequence. Alternatively, the detector polynucleotide can be derived from a monospecific Mtase by introducing a polylinker region into some region of the enzyme that does not disrupt its Mtase activity.
When the detector polypeptide is an MTase, a mutation in the inserted target DNA can be determined by assessing the ability of the encoded chimeric polypeptide to methylate DNA at the sequence motif recognized by that particular Mtase. Because methylation generally confers resistance to cleavage by the cognate restriction enzyme, Mtase activity can be assessed by determining to what extent DNA exposed to the chimeric polypeptide is susceptible to cleavage by that enzyme. For example, plasmid DNA isolated from a cell expressing chimeric polypeptide with M.HaeIII activity will be refractory to HaeIII degradation relative to plasmid DNA from a cell wherein the M.HaeIII activity is attenuated as the result of insertion of target DNA containing a mutation, e.g., a truncating mutation. While C5-methylation protects against cleavage by certain restriction enzymes, it is also known that certain strains of bacteria specifically target DNA containing 5-methylcytosine for restriction. For example, many laboratory strains of Escherichia coli K-12 (e.g., DH5α) contain mcr (for modified cytosine restriction) systems (Raleigh and Wilson, Proc. Natl. Acad. Sci. USA 83:9070-74 (1986)). Two distinct systems, i.e., the mcrA and mcrBC systems, have been identified (PCT Application No. WO 97/01639; Raleigh, Mol. Microbiol. 6:1079-86 (1992)). These systems are encoded by distinct regions of the bacterial genome and have been found to have different sequence context requirements for the recognition of 5-methylcytosine-containing sequences. In particular, the product of the mcrBC gene targets sequences that include 5-methylcytosine adjacent to a 3' cytosine (PumC) (see, e.g., Stewart et al. (2000) J. of Mol. Biol. 298:611-22), while the product of the mcrA gene targets 5-methylcytosine followed by a guanine base (CmG) (see, e.g., U.S. Patent No. 5,405,760). Transformation of such strains with DNA encoding an MTase can be lethal as a result of Mtase-catalyzed methylation of the host DNA, which targets it for restriction by the mcr system. Raleigh and Wilson, supra, have reported the mcr phenotypes of some commonly used strains of E. coli K-12.
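The restriction-based readout of Mtase activity described above can be modelled in a few lines. The following Python sketch is an idealized illustration, not the protocol of the Examples: it assumes a circular plasmid, complete digestion, all-or-nothing protection of GGCC sites, and a blunt HaeIII cut between the second G and the first C. The function name and the toy sequence are hypothetical.

```python
def haeiii_fragment_lengths(plasmid: str, protected: bool) -> list:
    """Fragment lengths after HaeIII digestion of a circular plasmid.

    protected=True models GGCC sites fully methylated by an active
    chimeric Mtase, so the plasmid stays uncut.
    """
    plasmid = plasmid.upper()
    n = len(plasmid)
    if protected or n == 0:
        return [n]
    # Extend by 3 bases so that sites spanning the origin are found.
    extended = plasmid + plasmid[:3]
    cuts = []
    start = extended.find("GGCC")
    while start != -1 and start < n:
        cuts.append((start + 2) % n)  # HaeIII cuts GG^CC
        start = extended.find("GGCC", start + 1)
    if not cuts:
        return [n]
    cuts.sort()
    lengths = [b - a for a, b in zip(cuts, cuts[1:])]
    lengths.append(n - cuts[-1] + cuts[0])  # fragment spanning the origin
    return lengths


toy_plasmid = "AAGGCCTTTTGGCCAAAAAA"  # 20-bp toy sequence with two GGCC sites
print(haeiii_fragment_lengths(toy_plasmid, protected=True))
print(haeiii_fragment_lengths(toy_plasmid, protected=False))
```

In this model a plasmid carrying a wild-type insert yields a single full-length species, while a plasmid from a clone with attenuated Mtase activity yields several smaller fragments, which is the difference a gel or chromatogram would be expected to show.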
In a preferred embodiment of the invention the detector polypeptide is derived from the CCCGGG-specific C5-Mtase M.AquI (Karreman and de Waard, J. of Bacteriol. 172:266-72 (1990)). M.AquI occurs naturally as a heterodimer made up of two distinct subunits: a catalytic subunit (α) and a DNA recognition subunit (β). The β subunit is believed to associate with the α subunit by means of a hydrophobic arm at the C-terminus. For use in the present invention, it is preferable to use a polynucleotide sequence that encodes a monomer fusion product of the two subunits attached to one another by means of a polypeptide linker. The construction and expression of a plasmid containing such a hybrid (pREVENTI) is described in PCT Application No. WO97/01639. Insertion of target DNA into the region encoding the polypeptide linker does not attenuate the activity of the chimeric Mtase so long as the insertion does not result in a nonsense mutation or alter the reading frame of the downstream domain.
In another preferred embodiment of the invention the detector polynucleotide is derived from the CCGG-specific C5-Mtase gene M.MspI, which, when active, methylates the outer cytosine residue and thereby elicits a potent mcrBC response (Raleigh and Wilson (1986)). In some applications of the instant invention, it is desirable to maximize the number of modified cytosines for enhanced detection; in that case an enzyme that recognizes a four base motif (e.g., M.MspI) can be preferred over an Mtase that recognizes a six base sequence (e.g., M.AquI), since statistical probability predicts that a shorter recognition sequence will occur more frequently. Increased methylation is particularly preferable in embodiments of the invention where the Mtase is used as a 'lethal' gene producing a null phenotype, as described below. An example of a particularly preferred C5-Mtase is M.SPRI (sometimes referred to as M.SPR), a multispecific enzyme derived from the B. subtilis bacteriophage SPR (MTBS_BPSPR, P00476, EC 2.1.1.73). This enzyme contains three TRDs which target for methylation DNA sites recognised respectively by the restriction enzymes EcoRII, MspI and HaeIII, i.e., CC(A/T)GG, CCGG, and GGCC respectively (Trautner et al., Nucleic Acids Research 16:6649-58 (1988)). M.SPRI has a structure consisting of 10 well-conserved motifs surrounding the variable region containing the TRDs (Tran-Betcke et al. (1986) Gene 42:89-96; Lauster et al. (1989) J. Mol. Biol. 206:305-12). For the purposes of illustration, the strategy used to engineer the gene encoding the bacteriophage multi-specific DNA methyltransferase M.SPRI into a detector nucleotide sequence is described. This illustrative example is not intended to limit the scope of the invention, and it should be recognized that there are a variety of mutational strategies that one of skill in the art could without undue experimentation devise to achieve an equivalent outcome.
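The statistical argument made above for preferring a four-base over a six-base recognition motif can be made concrete with a short calculation. The Python sketch below assumes a random sequence with equal frequencies of the four bases, which real genomes only approximate; the function name and the example length are hypothetical.

```python
def expected_sites(recognition_len: int, sequence_len: int) -> float:
    """Expected number of occurrences of one specific motif of the given
    length in a random sequence with equal base frequencies."""
    return sequence_len / 4 ** recognition_len


length = 4096  # arbitrary example length in base pairs
four_base = expected_sites(4, length)  # e.g., CCGG (M.MspI)
six_base = expected_sites(6, length)   # e.g., CCCGGG (M.AquI)
print(four_base, six_base, four_base / six_base)
```

Under this assumption a four-base motif is expected about once every 256 bp and a six-base motif about once every 4096 bp, a sixteen-fold difference in the number of potential methylation sites.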
In this case, M.SPRI was mutated strategically to eliminate the TRD that recognizes CCGG, namely TRD M. The remaining two TRDs, responsible for targeting methylation to the sequences CC(A/T)GG (TRD E) and GGCC (TRD H), remain functional. In addition, a linker, i.e., polylinker, containing a number of restriction sites, i.e., a multiple cloning site, was introduced in place of the TRD M coding sequence. The resulting modified gene, identified as M.SPRX, is well suited to serve as a detector polynucleotide in certain preferred embodiments of the invention. In practice, any DNA sequence that is flanked by sites present in the linker, such as XhoI and EcoRI, can be inserted into the linker. In this way a target nucleotide sequence can be introduced into the M.SPRX gene without adversely affecting enzymatic activity. In a preferred embodiment, the target nucleotide sequence is all or part of a gene suspected of being involved in a disease condition, e.g., a BRCA1 or BRCA2 gene. If, however, a stop codon arises aberrantly, protein synthesis will be terminated prematurely in vivo and a fully active DNA methyltransferase will no longer be produced. Alternatively, a mutation that causes a shift in reading frame of the downstream (C-terminal) portion of the encoded Mtase (the detector polypeptide) will likewise result in a less than fully active Mtase.
FIG. 1 shows a pseudo-three dimensional, or "cartoon", representation of the modified form of M.SPRI containing additional protein sequence such as that encoded by stretches of the BRCA1 gene and the product which is inactive as a consequence of truncation.
In all cases, as those skilled in the art will appreciate, derivatives or other variants of the detector polynucleotides specified above may also be used in the present invention, provided that they encode the requisite detectable activity. Generally speaking, such variants will be substantially homologous to the 'wild type' or other sequence specified herein, i.e., will share sequence similarity or identity therewith. Similarity or identity may be at the nucleotide sequence and/or encoded amino acid sequence level, and will preferably be at least about 60%, or 70%, or 80%, most preferably at least about 90%, 95%, 96%, 97%, 98% or 99%. Sequence comparisons may be made using FASTA and FASTP (see Pearson & Lipman, 1988. Methods in Enzymology 183: 63-98). Parameters are preferably set, using the default matrix, as follows: Gapopen (penalty for the first residue in a gap): -12 for proteins / -16 for DNA; Gapext (penalty for additional residues in a gap): -2 for proteins / -4 for DNA; KTUP word length: 2 for proteins / 6 for DNA. Analysis for similarity can also be carried out using hybridization. One common formula for calculating the stringency conditions required to achieve hybridization between nucleic acid molecules of a specified sequence homology is: Tm = 81.5°C + 16.6 log[Na+] + 0.41 (% G+C) - 0.63 (% formamide) - 600/(number of bp in duplex) (Sambrook et al., supra).
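By way of illustration only, the Tm formula quoted above can be evaluated numerically. The following is a minimal sketch; the function name, the argument names and the example inputs (1 M Na+, 50% G+C, no formamide, a 600 bp duplex) are assumptions chosen for the example and are not taken from the specification. The logarithm is assumed to be base 10.

```python
import math

def melting_temperature(na_molar, gc_percent, formamide_percent, duplex_bp):
    """Tm in degrees C from the formula quoted in the text (Sambrook et al.)."""
    return (81.5
            + 16.6 * math.log10(na_molar)   # salt term, [Na+] in mol/L
            + 0.41 * gc_percent             # % G+C of the duplex
            - 0.63 * formamide_percent      # % formamide in the buffer
            - 600.0 / duplex_bp)            # length correction

# Hypothetical inputs: 1 M Na+, 50% G+C, no formamide, 600 bp duplex.
print(round(melting_temperature(1.0, 50.0, 0.0, 600), 1))  # 101.0
```

Lowering the salt concentration or adding formamide lowers the computed Tm, which is how the stringency of a hybridization is typically adjusted.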
In general, practice of the present invention requires that means be available for introducing a target nucleotide sequence of interest into the linker region of the detector nucleotide sequence. In preferred embodiments, this means is provided by one or more restriction sites located within the polylinker, i.e., the sequence encoding the linker region of the detector polypeptide. Cleavage at such a restriction site provides either blunt or overhanging ends to which the target polynucleotide can be ligated using standard techniques. The restriction site or sites can be intrinsic to the gene (e.g., an Mtase gene) used as the basis for the detector polynucleotide. More typically, restriction sites are introduced into the polylinker region. In a particularly preferred embodiment of the invention, multiple unique restriction sites are introduced into a relatively short stretch of the detector polynucleotide, resulting in a multiple cloning site. A multiple cloning site will enhance the utility and convenience of the resulting detector nucleic acid. In many cases, construction of the polylinker region, including a multiple cloning site, will entail the excision of some portion of the native gene, e.g., the removal of a region encoding some domain of the gene. Such a situation is described in more detail in the following examples.
A wide variety of suitable host strains are available, and the skilled artisan will be able to choose an appropriate host based on the specific nature of the detector nucleic acid to be employed. In preferred embodiments of the invention the host will be a bacterial organism, although some detector systems lend themselves to use in non-bacterial host cells, e.g., yeast, insect, plant or mammalian cells. In an embodiment of the invention where the detector nucleotide sequence is derived from an Mtase gene, the choice will depend to some extent on whether or not it is desired to measure the activity as a null (lethal) phenotype.
Expression of an active Mtase will generally only be lethal in a host cell that has an mcr system that targets for cleavage DNA that is methylated by the Mtase activity. A number of strains of mcrA+ mcrBC+ E. coli are known, e.g., DH5α (Hanahan, D. (1983) J. Mol. Biol. 166:557-580, available from Life Technologies, Inc. and New England Biolabs, Inc.). Alternatively, cells that are mcrA- mcrBC-, e.g., DH5αMCR (Grant et al. (1990) Proc. Natl. Acad. Sci. USA 87:4645-4649), are useful in instances where it is desirable to maintain cell viability even in the presence of Mtase activity. In this case, the presence of functional Mtase can be determined by assessing the Mtase activity being expressed by the cell, for example by checking the methylation state of DNA derived from the cell by restriction digestion.
Additionally, as the skilled person is aware, certain strains can "read through" stop codon-introducing mutations to a greater or lesser extent. For instance, the supE genotype of almost all E. coli K-12 strains (e.g., DH5α and DH5αMCR) will cause read-through of amber mutations (UAG) with up to 22% efficiency (Glass, R.E. (1982) Gene Function, E. coli and its heritable elements (Glass, R.E., Ed.) Croom Helm, London), which can result in expression of some active chimeric polypeptide even when such a mutation is present in the target polynucleotide. For example, in the case where the detector polynucleotide is derived from an Mtase, read-through of a stop codon-introducing mutation can result in partial Mtase activity in spite of the mutation. This sort of partial inactivation was observed in carrying out some of the Examples provided below. Nevertheless, such strains may still be used in the present invention, so long as the difference in activity between constructs containing a stop codon-introducing mutation in the linker and non-mutants is detectable. One skilled in the art will recognize that any problems that arise due to excessive read-through of stop codon-introducing mutations can be addressed by employing a host strain that is sup0. Suggested sup0 E. coli strains include W3110 (mcrA+ hsd+ mcrBC+) (described at http://gib.genes.nig.ac.jp/Ec/ ) and an mcrBC- derivative (NM679) with a large deletion that removes hsd and mcrBC (King and Murray (1995) Mol. Microbiol. 16:769-77). In some embodiments of the invention such strains are the preferred host.
The present invention can be used to analyze a target nucleic acid present in a sample nucleic acid. The sample may be from any appropriate nucleic acid source, and may represent all or part of the source. Pooled samples may be used for further analysis if required. The sample may derive from cDNA, genomic DNA (all or part of an intron or exon thereof), amplified portions of DNA (e.g., from a PCR reaction), libraries, artificial chromosomes, etc., or fragments of any of these (e.g., from a restriction digest). The sample nucleic acid may be provided isolated and/or purified from its natural environment, in substantially pure or homogeneous form, or free or substantially free of other nucleic acids, e.g., a selected DNA fragment separated from a mixture of DNA fragments. The use of DNA chromatography (HPLC) to predetermine the size of fragments to be cloned can enhance the efficiency of, and selectivity for, cloning large fragments, as described infra.
In a particularly preferred embodiment of the invention, the target nucleotide sequence is derived from mRNA rather than genomic DNA. An advantage of looking for mutations at the mRNA level is that it allows for the detection of mutations that cause an alteration in RNA splicing. This embodiment of the invention is preferably accomplished by isolating mRNA from a relevant sample and amplifying a cDNA copy by RT-PCR, which is then treated with the appropriate restriction enzymes and inserted into an appropriate detector nucleic acid. The entire gene, or a fragment thereof, can be amplified and checked for a mutation as described herein. RT-PCR is described, e.g., in Myers & Gelfand (1991) Biochemistry 30:7661; Young et al. (1993) J. Clin. Microbiol. 31:882; Mallet, F. et al. (1995) Biotechniques 18:678.
The present invention is well suited for the general detection of mutations, particularly deleterious mutations, in any nucleic acid sequence of interest. In a preferred embodiment, the target sequence is a gene the mutation of which is known or suspected of being involved in genetic disease or a genetic predisposition to disease, or a relevant portion of such a gene. The target sequence can be derived from a gene for which a mutation can lead to the development of cancer, or wherein a mutation can be used as an indicator of a particular type of cancer. In a particularly preferred embodiment, the target sequence is derived from the human BRCA1 or BRCA2 genes. Since about 90% of abnormalities in the BRCA1 gene are nonsense mutations, this gene is particularly suited for use in the present invention. Another preferred target is the p53 gene. Other appropriate target nucleotide sequences will be known or apparent to one of skill in the art. Indeed, lists of genes associated with disease are readily available (see, e.g., McKusick, V.A., "Mendelian Inheritance in Man. Catalogs of Human Genes and Genetic Disorders". Baltimore: Johns Hopkins University Press (1998, 12th edition); Online Mendelian Inheritance in Man, OMIM (TM). McKusick-Nathans Institute for Genetic Medicine, Johns Hopkins University (Baltimore, MD) and National Center for Biotechnology Information, National Library of Medicine (Bethesda, MD), 2000 (World Wide Web 'ncbi.nlm.nih.gov/omim/'); Krawczak M, Cooper DN (1997) "The Human Gene Mutation Database". Trends Genet. 13:121-122; Nucleic Acids Res. 26:285-287 (1998); and Human Mutation 15:45-51 (2000)).
To test a gene for the presence of a mutation, either the entire gene, or a segment of the gene (in the context of the invention both cases are referred to as the "target polynucleotide") is inserted into the linker region of a detector polynucleotide. The target polynucleotide should be flanked by restriction sites that are ligation-compatible with sites in the linker region of the detector polynucleotide. Cleavage of those sites with the corresponding restriction enzymes will allow for the targeted introduction of the target sequence into the linker region of the detector polynucleotide by means of standard DNA ligation techniques. One skilled in the art will recognize that in some instances, the use of the same enzymes to cleave both target and linker region will not be necessary, so long as the ends are compatible and amenable to ligation. It is important that the polylinker region be of a known reading register, so that target polynucleotides can be selected wherein the normal (i.e., wild-type) form does not alter the reading frame downstream from the insertion event.
An important aspect of the invention concerns the length of the target sequence introduced. In order to maintain the correct reading frame for coding sequence downstream of the site of insertion, the insertion must not change the reading frame of the downstream sequence, i.e., the length of the inserted sequence must be some multiple of three bases. Therefore, in designing an experiment and selecting an appropriate target sequence, it is normally critical that the non-mutated target sequence results in the insertion of a multiple of three bases. Of course, if a particular target sequence being tested contains an insertion/deletion mutation that alters the length of the insertion such that it is no longer a multiple of three, the resulting frameshift will lead to a detectable loss of activity that flags the presence of the mutation.
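The reading-frame requirement described above lends itself to a simple computational check of a candidate wild-type target sequence before it is cloned. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the insert is assumed to be supplied as the coding strand, and its first base is assumed to fall at the first position of a codon in the linker reading frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def check_insert(seq):
    """Return a list of problems that would inactivate the downstream detector domain."""
    seq = seq.upper()
    problems = []
    # A length that is not a multiple of three shifts the downstream reading frame.
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of three (frameshift)")
    # Scan codons in the assumed linker reading frame for a premature stop.
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            problems.append(f"in-frame stop codon {codon} at position {i}")
            break
    return problems

print(check_insert("ATGGCCAAA"))   # []
print(check_insert("ATGTAGAAA"))   # ['in-frame stop codon TAG at position 3']
```

A wild-type target that returns an empty list is a suitable candidate for insertion; a patient-derived sequence that returns one or more problems corresponds to the class of mutation the assay is designed to flag.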
In a preferred embodiment of the invention, different portions of the gene under examination are tested separately so that any detected mutation can be localised to a region within a longer gene. This can be achieved, for instance, using a multiplex PCR reaction (i.e., the simultaneous amplification of different target DNA sequences in a single PCR reaction). Multiplex PCR is described, for example, in Edwards et al., "Multiplex PCR: Advantages, Development and Applications", PCR Methods and Applications, 3:S65-S75 (1994) and Shuber et al., "A Simplified Procedure for Developing Multiplex PCRs", Genome Research, 5:488-498 (1995). Alternatively, standard non-multiplex PCR can be employed to generate the appropriate target nucleotide sequence or target nucleotide sequences. Appropriate PCR conditions can be determined without undue experimentation by one of skill in the art. For example, a representative amplification reaction could be prepared using 50 μL PCR reaction buffer, 1.5-6 mM MgCl2, 1-2.5 units polymerase, 200 nM forward and reverse primer and 200 μM dNTPs, thermocycled 40 times at 95°C for 1 minute, 55°C for 1 minute, and 72°C for n minutes, where n is dependent upon fragment size and polymerase activity. As a guide, for a standard Taq polymerase n is normally set at 1 minute per kbp. If a proofreading polymerase is to be used it is recommended that n be adjusted to 2 minutes per kbp. The particular amplification conditions can vary substantially, depending upon the nature of the primers and template used, as well as their concentrations. During a standard PCR reaction annealing is usually performed at approximately 5°C below the lowest Tm of the primer pair. Primers should be matched to within approximately 5°C of each other's Tm in order to avoid non-specific priming resulting from an annealing temperature too far below the Tm. Further guidance in determining an optimal amplification protocol can be found, for example, in Gelfand et al., PCR Protocols: A Guide to Methods and Applications, Academic Press (1990; ISBN: 0123721814) and Innis et al., PCR Applications: Protocols for Functional Genomics, Academic Press (1990; ISBN: 0123721857). Moreover, any of a variety of nucleic acid amplification and other molecular biology techniques known to the skilled artisan can be used to generate the desired DNA segments for mutation detection pursuant to the present invention.
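The rules of thumb given above for the extension time and the annealing temperature can be expressed as a short calculation. The following is a minimal sketch; the function names and the example values are hypothetical, and the 5°C figures are the approximate guide values from the text rather than fixed requirements.

```python
def extension_minutes(fragment_bp, proofreading=False):
    """Extension time n: 1 min per kbp for Taq, 2 min per kbp for a proofreading enzyme."""
    minutes_per_kbp = 2.0 if proofreading else 1.0
    return minutes_per_kbp * fragment_bp / 1000.0

def annealing_temperature(tm_forward, tm_reverse, offset=5.0, max_mismatch=5.0):
    """Anneal about 5 degrees C below the lower primer Tm; reject poorly matched pairs."""
    if abs(tm_forward - tm_reverse) > max_mismatch:
        raise ValueError("primer Tm values differ by more than the allowed mismatch")
    return min(tm_forward, tm_reverse) - offset

# Hypothetical 800 bp amplimer with primers of Tm 62 and 60 degrees C.
print(extension_minutes(800))             # 0.8
print(annealing_temperature(62.0, 60.0))  # 55.0
```

For the hypothetical primer pair shown, the computed annealing temperature coincides with the 55°C step of the representative thermocycling profile.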
In a preferred embodiment of the invention, regions of preferably 30 to 5001 bp, more preferably 99 to 2001 bp, and still more preferably 501 to 999 bp are tested from patient sample DNA against a control, using primers having common 5'-3' ends that introduce restriction sites into the amplification products suitable for cloning into the linker region of the detector polynucleotide, preferably a multiple cloning site. For a typical gene under examination by the invention, current estimates place the preferred number of multiplex PCR products at about 10-12. In a preferred embodiment of the invention, the multiplex amplification is designed such that the products are of diverse lengths. The difference in size of the resulting target polynucleotides will in some cases facilitate their separation from one another prior to cloning into a detector nucleic acid, e.g., when separation is by means of HPLC. An example of an appropriate size series of amplimers would comprise sequences of 250 bases, 300 bases, etc., up to about 800 bases (the size need generally not be greater than 1000 bases).
Appropriate PCR primers can be designed using principles known to one of skill in the art, as described, e.g., in Gelfand et al. and Innis et al., supra. Mutations located within the region of primer annealing will not be detected, so if the forward (sense) primer is designed to anneal at the start codon of a gene of interest the primer should preferably include a minimum of sequence derived from the region 3' of the start codon. This will avoid the possibility of inadvertently removing a mutation immediately adjacent to the start codon during the amplification procedure. The primer should also preferably be designed to include a suitable restriction site compatible with the polylinker of pSPRX in the region 5' of the start codon. See Tables 1 and 2 below for guidance on the compatibility of different enzymes in double digest reactions and on the position of restriction sites relative to the end of DNA fragments, respectively.
Table 1. Compatibility of restriction enzymes in double digest reactions. The activity of the least active enzyme is given in each case. NC = Not Compatible
(Table 1 is provided as an image in the original document: imgf000041_0001.)
Table 2. Minimum number of bases required between the restriction site and the end of a DNA fragment to allow effective digestion. Suggested distances in this table do not guarantee 100% enzyme activity.
Enzyme         BglII   PstI   XhoI   SmaI   NheI   NdeI   EcoRI
No. of Bases   2       3      3      3      3      7      2
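The flanking distances of Table 2 can be applied mechanically when adding a restriction site to the 5' end of a primer. The following is a minimal sketch; the function name, the use of G as a padding base and the example annealing sequence are assumptions made for illustration. The recognition sequences are the standard ones for the listed enzymes. The sketch does not check the reading frame, which must still be verified separately for the complete construct.

```python
# Recognition sequences of the enzymes listed in Table 2.
SITES = {"BglII": "AGATCT", "PstI": "CTGCAG", "XhoI": "CTCGAG", "SmaI": "CCCGGG",
         "NheI": "GCTAGC", "NdeI": "CATATG", "EcoRI": "GAATTC"}

# Minimum number of bases between the site and the fragment end, from Table 2.
MIN_FLANK = {"BglII": 2, "PstI": 3, "XhoI": 3, "SmaI": 3,
             "NheI": 3, "NdeI": 7, "EcoRI": 2}

def tailed_primer(enzyme, annealing_seq, pad_base="G"):
    """Prepend padding bases and a restriction site to the gene-specific part of a primer."""
    return pad_base * MIN_FLANK[enzyme] + SITES[enzyme] + annealing_seq.upper()

# Hypothetical gene-specific sequence beginning at a start codon.
print(tailed_primer("EcoRI", "atggat"))  # GGGAATTCATGGAT
```

Because the number of padding and site bases differs between enzymes, the choice of enzyme changes the register of the insert relative to the polylinker, which is one reason the frame check described in the following paragraph is needed.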
Restriction sites should be incorporated in such a way as to maintain the reading frame of the gene following its insertion in pSPRX. If the forward (sense) primer is directed at a region other than the start codon, then the primer should be designed to anneal immediately upstream of the region of interest. All primers should be carefully checked to ensure that the reading frame of the target gene will be preserved after ligation to pSPRX and that no stop codons have been inadvertently introduced into the primers. If the reverse (antisense) primer is intended to anneal at the stop codon of the gene of interest then the primer should preferably be designed to remove or exclude the stop codon. Incorporation of a stop codon in the reverse primer would cause premature termination of translation of the M.SPRX gene, leading to impaired methylation activity and the potentially false diagnosis of the presence of a nonsense mutation. Care should be taken in designing the primer to anneal to the region immediately downstream of the stop codon since repetition of stop codons is a common feature of some genes. In general, the presence of a single base mismatch in the middle of a primer sequence will not significantly impair the ability of the primer to initiate DNA replication. If the mismatched base is close to the 3' end of the primer, however, it may not be possible for synthesis to occur. Thus it may be necessary to extend the primer sequence to include between 10 and 20 bases 3' of the stop codon in the reverse primer. Mutations occurring in these bases would not be detected. If the reverse (antisense) primers are directed to regions other than the stop codon, then the primer should be designed to anneal immediately downstream of the region of interest. Mutations located within the region of primer annealing will not be detected. Restriction sites compatible with the polylinker of pSPRX should be added at the 5' end of the reverse primer, taking care to ensure that the reading frame of M.SPRX will be preserved with a wild type insert. See Tables 1 and 2 for details of the compatibility of restriction enzymes in double digests and efficiency of digestion close to the ends of DNA fragments.
In a preferred embodiment of the invention an appropriate selection of target nucleotide sequences are prepared from a nucleic acid sample by multiplex PCR, and the amplimers are then separated from each other by means of HPLC. In a particularly preferred embodiment of the invention ion pairing reverse-phase HPLC (IP-RP-HPLC) is used to separate the target nucleotide sequences from one another. IP-RP-HPLC is a form of chromatography particularly suited to the analysis of DNA, and is characterized by the use of a reverse phase (i.e., hydrophobic) stationary phase and a mobile phase that includes an alkylated cation (e.g., triethylammonium) that is believed to form a bridging interaction between the negatively charged DNA and the non-polar stationary phase. The alkylated cation-mediated interaction of DNA and stationary phase can be modulated by the polarity of the mobile phase, conveniently adjusted by means of a solvent that is less polar than water, e.g., acetonitrile. It has been shown that under non-denaturing conditions the retention time of a double-stranded DNA fragment is dictated by the size of the fragment; the base composition or sequence of the fragment does not appreciably affect the separation. The most preferred method of amplimer separation is by means of Matched Ion Polynucleotide Chromatography (MIPC), a superior form of IP-RP-HPLC described in U.S. Patent Nos. 5,585,236, 6,066,258 and 6,056,877 and PCT Application Nos. WO98/48913, WO98/48914, WO98/56797 and WO98/56798, incorporated herein by reference in their entirety. In the practice of the invention, a preferred system for performing MIPC separations is that provided by Transgenomic, Inc. under the trademark WAVE®. Thus, for instance, a selected nucleic acid target sequence for use in the methods of the present invention may be provided as follows: (a) providing a mixture of DNA fragments; and (b) separating the mixture of DNA fragments using reverse phase HPLC (preferably MIPC) on the basis of fragment length. When chromatographically separating multiplex PCR products, it is desirable that the amount of each amplimer be on the order of tens of nanograms, with a maximum of approximately 50 ng per amplimer, so that the total mass of multiplex PCR product loaded onto the column is less than 1 microgram.
In principle, the methods herein can be based on any activity of the detector gene which distinguishes (qualitatively or quantitatively) the insertion of a sequence containing a mutation from the insertion of a sequence not containing a mutation. In a preferred embodiment of the invention, the mutation is a stop codon-introducing mutation. Alternatively, the mutation can be one that results in a shift in reading frame downstream of the mutation, without necessarily resulting in premature truncation of the gene product.
In this assay, mutant genes, when inserted into the modified methyltransferase, generate a partially active enzyme and, as a consequence, these plasmids will not be fully methylated in the host cell. This attenuation of activity can be detected by isolating the plasmid (e.g., by mini-prep), exposing it to a restriction enzyme that targets the sequence recognized by the functional form of the Mtase (e.g., HaeIII if the functional Mtase has M.HaeIII activity), and determining the extent to which the plasmid is degraded (e.g., by gel electrophoresis, capillary electrophoresis or IP-RP-HPLC). Increased levels of degradation are correlated with decreased methylation levels and hence attenuated Mtase activity.
In a preferred embodiment, plasmid degradation is evaluated by determining the amount and size of resulting DNA fragments upon separation by agarose gel electrophoresis using 1% or 1.7% agarose (ultrapure, Boehringer Mannheim) gels run in 1x Tris-acetate buffer (TAE) (40 mM Tris-acetate; 1 mM EDTA) (for a review see Serwer (1983) Electrophoresis 4:375-82). Following electrophoresis, visualisation of DNA in agarose gels is achieved by staining with the fluorescent dye ethidium bromide. The sizes of DNA fragments are conveniently estimated by comparison to known molecular weight standards run on the same gel. 1 μg of 1 kb DNA ladder was normally used, which gives a range between 250 bp and 10 kb. The intensity of the bands can be quantified by scanning and digital analysis.
In a particularly preferred embodiment of the invention, plasmid degradation is evaluated by separating and quantifying DNA fragments by IP-RP-HPLC, particularly as described in U.S. Patent No. 6,066,258. Typical conditions would include 0.1 M triethylammonium acetate (TEAA) as a counterion and fragment elution by acetonitrile gradient, with the column temperature set such that the DNA fragments are not denatured.
In a preferred embodiment of the invention, the plasmid or other detector nucleic acid will be checked by treatment with the appropriate restriction enzyme or enzymes to confirm that ligation of the insert has indeed taken place. This can be accomplished by using gel electrophoresis or IP-RP-HPLC to check for an insertion fragment of the correct length. This confirmation step is applied in Example 2 below.
The combination of two or more mutation analysis methods described herein may be desirable in certain instances, e.g., to increase the certainty of the assessment.
In one aspect of the invention the activity of the chimeric polypeptide is such that it is detrimental to its bacterial host unless the foreign DNA within it contains a mutation that attenuates Mtase activity. In a preferred embodiment, the chimeric polypeptide is lethal without the insertion of a polynucleotide including such a mutation. That is, if a fragment containing a stop codon is inserted into an open reading frame, a host cell (e.g., an mcr+ strain of E. coli) containing that gene can proliferate. However, in the absence of a mutation, the Mtase activity results in DNA methylation and subsequent mcr-mediated restriction and death.
Thus, in one embodiment, the method is performed by ligating the foreign target sequence into a cloning vector including a detector nucleotide sequence which is a lethal gene. A "lethal gene" is a gene that encodes a protein that can be lethal in a particular host background, e.g., a methyltransferase in a mcr+ strain of E. coli, and need not be universally lethal in all host cells. A host is then transformed with the vector, and the activity of the chimeric polypeptide is determined by assessing the viability of the transformed host.
The engineered Mtase gene may be in the form of a positive selection vector in which the Mtase can accommodate extra sequence (as described in US Provisional Application No. 60/157,072). In such cases a turbidimetry measurement may be used to establish viability of the host (i.e., proliferation in the broth medium). Turbidimetry involves measuring cell density by determining the extent of light scatter, i.e., the OD, at a defined wavelength, typically 600 nm. This may be performed, for instance, by placing the transformed cells with broth in a clear-bottomed 96 well plate (Polyfiltronics) maintained at 37°C. The plate can be read using a 96-well plate reader, with an output of OD600 vs. time. This type of assay is normally best suited for the detection of a homozygous mutation.
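The OD600 vs. time output of such a plate reader can be reduced to a simple viability call for each well. The following is a minimal sketch; the function name, the 0.2 OD threshold and the example readings are assumptions chosen for illustration and would need to be calibrated against control wells in practice.

```python
def host_proliferated(od600_readings, min_increase=0.2):
    """Return True if OD600 rose by at least min_increase above the first reading."""
    if len(od600_readings) < 2:
        raise ValueError("need at least two readings")
    return max(od600_readings) - od600_readings[0] >= min_increase

# Hypothetical wells of an mcr+ host: growth implies an attenuated (mutant) Mtase.
print(host_proliferated([0.05, 0.20, 0.60]))  # True
print(host_proliferated([0.05, 0.06, 0.05]))  # False
```

In the lethal-gene format described above, a well that shows proliferation in an mcr+ host is the one that suggests a mutation in the inserted target sequence.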
To improve the sensitivity of analysis, it may be preferable that the host cells containing the vector are plated out on an appropriate solid medium. Such medium will preferably be prepared such that it will only support the growth of cells containing the detector nucleic acid (e.g., the Mtase encoding plasmid), typically by the inclusion of one or more antibiotics or other selection reagent.
Such methods may employ two bacterial hosts, one (mcr+) which will not tolerate methylated DNA and one (mcr-) which will. Non-limiting examples of such cells include the mcrBC+ and mcrBC- host strains DH5α and DH5αMCR, respectively. The insertion of mutant DNA into the linker region of the modified methyltransferase generates a partially inactive enzyme and, as a consequence, colonies will appear on both plates. The insertion of wild-type DNA does not attenuate Mtase activity, and thus colonies will only appear on the mcr- plates. However, even in these cases a quantitative assessment may have to be made. Therefore in certain contexts it may be desirable to use other detection methods.
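The interpretation of the two-host plating experiment can be summarised as a small decision rule. The following is a minimal sketch; the function name and the wording of the returned messages are hypothetical, and the rule is qualitative only, so it does not replace the quantitative assessment mentioned above.

```python
def interpret_plating(colonies_on_mcr_plus, colonies_on_mcr_minus):
    """Interpret growth on mcr+ and mcr- hosts transformed with the same construct."""
    # No growth on the permissive host means the transformation itself failed.
    if not colonies_on_mcr_minus:
        return "uninformative: no transformants on the permissive (mcr-) host"
    # Growth on the restrictive host implies the plasmid is not fully methylated.
    if colonies_on_mcr_plus:
        return "Mtase attenuated: insert likely carries a mutation"
    return "Mtase active: insert consistent with wild type"

print(interpret_plating(colonies_on_mcr_plus=True, colonies_on_mcr_minus=True))
print(interpret_plating(colonies_on_mcr_plus=False, colonies_on_mcr_minus=True))
```

The mcr- plate acts as the transformation control: only when colonies appear there can the presence or absence of colonies on the mcr+ plate be interpreted.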
A preferred detection method is based on quantitative analysis of a substrate for the detector gene, from which changes in activity (or relative activity) can be assessed or inferred.
For instance, as described herein and taught in the Examples, insertion of a target sequence into the linker region of pSPRX in frame with the M.SPRX gene does not destroy the TRD H (HaeIII) activity of the encoded Mtase. On the other hand, if the insert has a deletion or an insertion that changes the reading frame of the M.SPRX gene, or a stop codon that terminates translation prematurely, the result is at least partial attenuation of the TRD H activity.
In order to detect or confirm the presence of a mutation, plasmids from the resulting mcrBC- transformants can be analyzed by HaeIII digestion. If the insert is wild type, i.e., does not introduce a stop codon or disrupt the reading frame of downstream sequence, the gene product retains TRD H activity and therefore the plasmid is resistant to HaeIII cleavage. If the sample contains a homozygous mutant, the TRD H activity is attenuated, either partially or wholly, and as a result the plasmid will be deficiently methylated and subject to restriction by HaeIII. Degradation can be partial, so long as it is distinguishable from the case where TRD H activity is fully functional. Finally, if the sample contains a heterozygous mutation, it is predicted that approximately half of the colonies will contain active M.SPRX. The other colonies will possess attenuated TRD H activity. The extent of HaeIII restriction can be assayed by any of a variety of techniques capable of determining the length of DNA fragments. Gel electrophoresis, as described, for example, by Ausubel et al., is a standard method for determining the length of DNA restriction fragments and can be used to assess the extent and nature of fragmentation. Alternatively, fragmentation can be assessed by HPLC analysis of the treated plasmid. In a preferred embodiment, the plasmid is analyzed by MIPC, as described above.
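The expected colony-level outcomes for wild type, heterozygous and homozygous samples can be turned into a simple classification of per-colony HaeIII results. The following is a minimal sketch; the function name and the 20% tolerance band are assumptions chosen for illustration, since the text specifies only that approximately half of the colonies are expected to be affected in the heterozygous case.

```python
def call_genotype(colony_cut_by_haeiii, tolerance=0.2):
    """Classify a sample from per-colony HaeIII results (True = plasmid was cleaved)."""
    if not colony_cut_by_haeiii:
        raise ValueError("no colonies scored")
    fraction_cut = sum(colony_cut_by_haeiii) / len(colony_cut_by_haeiii)
    if fraction_cut <= tolerance:
        return "wild type"            # plasmids protected: TRD H activity retained
    if fraction_cut >= 1.0 - tolerance:
        return "homozygous mutant"    # all plasmids cleaved: activity attenuated
    return "heterozygous mutant"      # mixed population of colonies

print(call_genotype([False] * 10))              # wild type
print(call_genotype([True] * 5 + [False] * 5))  # heterozygous mutant
print(call_genotype([True] * 10))               # homozygous mutant
```

Scoring more colonies narrows the uncertainty of the heterozygous call, because the observed fraction then lies closer to the expected one half.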
If further confirmation is desired, the chimeric (fusion) polypeptide including detector and target sequence can be expressed, purified and analysed for truncation, for instance, by comparison with the full length fusion by SDS-PAGE. This can entail inducing expression (if required), obtaining a cell pellet from culture, lysing the cells, purifying the chimeric detection polypeptide from the supernatant, and carrying out size-based separation of the products e.g., by SDS-PAGE. Mutations can also be confirmed and/or more precisely characterized by direct DNA sequencing.
The use of suitable controls is recommended to assure the reliability of the test. In a preferred embodiment of the invention, two control DNA fragments encoding a full length and a truncated protein, respectively, will be used. In addition, it is recommended that a known WT sequence be amplified in parallel with a target polynucleotide sequence of the same gene using the same primers and PCR conditions. The user may also wish to construct a sequence containing the target sequence that has been engineered to include a mutation.
FIG. 2 provides an overview of a protocol embodying one aspect of the present invention, involving the analysis of a heterozygous mutant. Mutation detection is achieved by means of a detector nucleotide sequence derived from M.SPRI, determination of Mtase activity by HaeIII digestion, and confirmation by direct sequencing and/or SDS-PAGE analysis of purified chimeric polypeptide. Other features of the invention will become apparent in the course of the following descriptions of exemplary embodiments which are given for illustration of the invention and are not intended to be limiting thereof.
Procedures described in the past tense in the examples below have been carried out in the laboratory. Procedures described in the present tense have not yet been carried out in the laboratory, and are constructively reduced to practice with the filing of this application. All references referred to herein, including any patent, patent application or non-patent publication, are hereby incorporated by reference in their entirety.
EXAMPLE 1
Construction of pSPRX, an exemplary detector nucleic acid
In the present example the gene encoding the bacteriophage multi-specific DNA methyltransferase (M.SPRI) has been used as the basis for the construction of a detector polynucleotide, including a polylinker region comprising multiple unique restriction sites. The targeted modification of M.SPRI included the removal of a region encoding one of the TRDs, specifically TRD(M), which targets the MspI recognition sequence CCGG. The product of the modified gene, referred to as M.SPRX, was shown to retain the ability to methylate at CC(A/T)GG and GGCC, the sequences targeted by the two remaining TRDs, TRD(E) and TRD(H), respectively. In addition, the sequence which has been deleted can be replaced by any DNA sequence that is flanked by restriction sites uniquely represented in the polylinker region, e.g., EcoRI and XhoI. It is further shown that stretches of sequence from the BRCA1 gene can be introduced into the M.SPRI gene without adversely affecting the enzymatic activity of the gene product. If, however, a stop (e.g., nonsense) codon arises aberrantly, protein synthesis will be terminated prematurely in vivo and a competent DNA methyltransferase will no longer be produced.
The wild-type M.SPRI gene (GenBank Accession No. K02124, X01670, GI=216170; Posfai et al. (1984) Nucleic Acids Res. 12:9039-49; Buhk et al. (1984) Gene 29:51-61) was originally obtained in the recombinant plasmid pMS119EH (Walter et al. (1992) EMBO J. 11:4445-50). The nucleotide and amino acid (GenBank Accession No. AAA32604) sequences of M.SPRI are provided herein as SEQ ID NOS: 1 and 2. E. coli strain DH5αMCR (mcrA- mcrBC-) was used as host cell throughout all steps of this example.
To facilitate expression and subsequent purification, the M.SPRI gene was subcloned into pGEX-KG, a glutathione-S-transferase (GST) fusion expression vector (Guan and Dixon (1991) Analytical Biochemistry 192:262-67). The M.SPRI gene was fused downstream of the gene encoding GST using BamHI and XhoI restriction sites to produce pGEXKG-SPR. Such fusion proteins, when expressed, can be purified by a single affinity chromatography step (Smith and Johnson (1988) Gene 67:31-40). In order to construct pKG-SPR, the PstI site was removed from pGEXKG-SPR. To do so, the vector was digested with AatII and AlwNI and the 4879 bp fragment was ligated with the fragment containing the ampicillin resistance gene (1398 bp) from pUC18, which had been cleaved with the same restriction enzymes. The structure of the resulting construct, which is named pKG-SPR, was verified by the loss of the PstI site.
In order to evaluate the tolerance of the gene product to polypeptide insertions, TRD(M) was replaced with a polylinker region comprising a multiple cloning site. The nucleotide and amino acid sequences of the modified gene, named M.SPRX, are provided herein as SEQ ID NOS: 3 and 4, respectively. The nucleotide sequence of M.SPRX is also presented in FIG. 3, with the polylinker region underlined. The resulting plasmid was named pSPRX. FIG. 4(A) is a map of the plasmid pSPRX, and FIG. 4(B) depicts the nucleotide sequence of the plasmid. The final construct was the result of six cloning steps, described below, and involved the construction of the following intermediary plasmids: (1) pSPR-XbaI, (2) pSPR-XbaIStuI, (3) pSPR-XbaIStuIKan, (4) pSPR-Linker, and (5) pSPR-Linker-dR.
Standard molecular biology protocols were used for all reactions and analysis, e.g., restriction digestions, PCR amplifications, ligations, electrophoresis, etc. Such techniques are known to those working in the field and are widely available in a variety of publications, including, for example, Sambrook et al. and Ausubel et al.
In PCR experiments a negative control was always carried out to exclude the possibility of contamination. A pair of primers of 20 nucleotides or longer, which were not self-complementary or capable of forming intramolecular secondary structures, was used in each reaction. The following reagents were used for PCR: 10 ng (plasmid) or 100-500 ng (genomic) DNA template; 200 nM forward and reverse primers; 200 μM dNTPs; 1.5-6 mM MgCl2 (typically 5 mM); and PCR buffer to a total volume of 50 μL. After assembling the reagents, either a normal or a hot start reaction was carried out using 1 to 2.5 units of the DNA polymerase (Taq, Pfu, or Vent polymerase). In hot start the mixture was heated to 95°C for 5 minutes and centrifuged briefly in a microcentrifuge to collect condensed water. 1 μL of diluted DNA polymerase was added and the contents of the tube were mixed by gentle agitation. A mineral oil overlay was not required since the GeneE thermal cycler used in all PCR amplifications possesses a heated lid. Reactions were placed in the thermal cycler and a pre-designed program was initiated. Unless otherwise stated, the cycling program was generally as follows: 95°C for 1 min, 55°C for 1 min and 72°C for n min, where n depends on the size of the product and polymerase activity. Primers were matched to within approximately 5°C with regard to Tm, and annealing was performed at 5°C below the lower Tm of the primer pair.
The optimal polymerisation temperature was usually 72°C. If Taq polymerase was used, 1 minute per kilobase of expected product size was allowed; for Vent and Pfu polymerases, 2 min per kilobase was allowed. Normally 30 s or 1 min extra time was used for polymerisation. When the reaction was complete, amplified DNA was purified by agarose gel electrophoresis and GeneClean® or using a Sephacryl cartridge (MoBiTec).
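The timing rules described above can be expressed as a short calculation. The following Python sketch is illustrative only: the function names, the default of 0.5 min extra extension time, and the example Tm values are assumptions introduced here and are not part of the protocol as filed.

```python
# Sketch of the PCR timing rules described in the text (names are hypothetical).

# Extension rates stated in the text: Taq 1 min/kb; Vent and Pfu 2 min/kb.
MIN_PER_KB = {"taq": 1.0, "vent": 2.0, "pfu": 2.0}


def extension_minutes(product_bp, polymerase, extra_min=0.5):
    """Extension time at 72 degrees C: rate x product size, plus 0.5-1 min extra."""
    return product_bp / 1000.0 * MIN_PER_KB[polymerase.lower()] + extra_min


def annealing_temp_c(tm_forward, tm_reverse, max_mismatch=5.0, offset=5.0):
    """Anneal 5 degrees C below the lower primer Tm; primers matched within ~5 degrees C."""
    if abs(tm_forward - tm_reverse) > max_mismatch:
        raise ValueError("primer Tm values differ by more than the allowed mismatch")
    return min(tm_forward, tm_reverse) - offset


if __name__ == "__main__":
    # 1341 bp is the size of the full-length M.SPRI PCR product given in the text.
    print(extension_minutes(1341, "Taq"))
    print(extension_minutes(1341, "Pfu"))
    # The Tm values below are made-up inputs for illustration.
    print(annealing_temp_c(62.0, 60.0))
```

The same functions can be reused for the BRCA1 amplicons of Example 2 by substituting the expected product size.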
DNA fragments or oligodeoxynucleotides (linkers) were ligated into plasmid vectors (containing compatible cohesive termini or both containing blunt ends) using the reaction catalysed by bacteriophage T4 DNA ligase (as described in Sambrook et al., 1989). Ligation reactions were performed in 10 or 20 μl volumes using an approximate molar ratio of between 1:3 and 1:10 (plasmid DNA:insert DNA). Ligation buffer was added to 1X (as supplied by the manufacturer) and 1 or 4 Weiss units of T4 DNA ligase were used for ligation of cohesive or blunt ends, respectively. The reaction was incubated at room temperature for 2 hours or at 16°C overnight, depending on the nature of the DNA ends.
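A plasmid:insert molar ratio translates into a mass of insert through the relative lengths of the two molecules. The sketch below applies that standard proportionality; the function name and the example quantities (50 ng of vector, a 300 bp insert) are assumptions chosen for illustration, while the 6210 bp fragment size is taken from the construction steps described in this example.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_per_vector):
    """Mass of insert giving the requested insert:vector molar ratio.

    Moles are proportional to mass / length for double-stranded DNA, so
    insert_ng = vector_ng * (insert_bp / vector_bp) * molar ratio.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    return vector_ng * (insert_bp / vector_bp) * insert_per_vector


if __name__ == "__main__":
    # Hypothetical reaction: 50 ng of a 6210 bp vector fragment, 300 bp insert.
    for ratio in (3, 10):  # the 1:3 and 1:10 plasmid:insert ratios from the text
        print(ratio, round(insert_mass_ng(50, 6210, 300, ratio), 2), "ng")
```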
In order to remove the DNA encoding TRD(M) from the M.SPRI gene, two new restriction sites were introduced to facilitate restriction-mediated excision of the desired sequence.
First, an XbaI site was introduced into TRD(M) using two complementary mutagenic oligodeoxynucleotides, SPRXbaA (GCAGTTGAGTACTCTAGAAAAAGCGGGCTTG) and SPRXbaB (CAAGCCCGCTTTTTCTAGAGTACTCAACTGC) (SEQ ID NOS. 5 and 6, respectively), using the QuikChange™ Site-Directed Mutagenesis Kit (Stratagene®, La Jolla CA), following the manufacturer's instructions (all sequences are represented herein in the 5' to 3' orientation as read from left to right). The resulting construct, called pSPR-XbaI, contains another XbaI site, which overlaps with a dam modification site and would therefore not be cleaved in a plasmid that has been prepared from a dam+ strain. The presence of the newly introduced restriction site was confirmed by XbaI digestion of mutant plasmid that had been isolated from E. coli DH5αMCR (dam+).
Next, a StuI restriction site was introduced 3' to the previously introduced XbaI site and within the region encoding TRD(M), using the QuikChange™ method and the complementary mutagenic oligodeoxynucleotides SPRStuA (GCTTCTGACTGGAGGGCCTAGAATAGGAACCAAAAACAAAATGC) and SPRStuB (GCATTTTGTTTTTGGTTCCTATTCAGGCCTCTCCAGTCAGAAGC) (SEQ ID NOS. 7 and 8, respectively). The resulting construct was called pSPR-XbaIStuI.
In order to assess the outcome of the mutagenesis in the newly constructed plasmids, it would have been useful to perform a StuI restriction digestion. However, the M.HaeIII activity of M.SPRI is still functional in the new construct, and the new StuI site in TRD(M) is overlaid by a HaeIII site. Thus any StuI restriction digestion would be blocked by the resulting methylation of the HaeIII site by M.SPRI. To overcome this problem, the entire M.SPRI gene was PCR amplified using the AWSPR1 (CGCGGGATCCTTGGGTAAACTACGTGTAATGAGTC) and AWSPR2 (CGCGACTCGAGTTATTCAGATTCTTTATTAACGTATG) primers (SEQ ID NOS. 9 and 10, respectively). The resulting 1341 bp PCR product (which is not methylated) was tested for restriction by StuI, which confirmed the presence of the new StuI site in pSPR-XbaIStuI.
Next, in order to remove the region encoding TRD(M) from the M.SPRI gene in pSPR-XbaIStuI, the restriction fragment was to be excised with XbaI and StuI. However, as mentioned before, StuI restriction is blocked by HaeIII methylation. To overcome this problem, pSPR-XbaIStuI was cleaved with XbaI and a kanamycin resistance cassette from pUC4KM (which had been cut with XbaI) was introduced into the XbaI site. pUC4KM (accession no. X06404) is a derivative of pUC4K containing the kanamycin resistance cassette, in which the following duplicate restriction sites have been added: XbaI, SmaI, SstI, KpnI, PvuII (Taylor and Rose (1988) Nucleic Acids Res. 16:358). The resulting construct was named pSPR-XbaIStuIKan. Insertion of the kanamycin resistance cassette inactivates TRD(HaeIII) and therefore StuI digestion becomes possible.
The aim of the next step was to insert a polylinker including a number of unique restriction sites in place of the TRD(M) coding sequence. pSPR-XS-Kan was cleaved with XbaI and StuI and the 6210 bp fragment, which had lost the kanamycin resistance gene, was ligated with an annealed pair of synthetic oligodeoxynucleotides coding for the "SPR polylinker" (CTAGATCTCTGCAGCTCGAGCCCGGGGCTAGCCATATGGAATTCAGAGG and CCTCTGAATTCCATATGGCTAGCCCCGGGCTCGAGCTGCAGAGAT, SEQ ID NOS. 11 and 12, respectively). The SPR polylinker contains XbaI, BglII, PstI, XhoI, SmaI, NheI, NdeI, EcoRI and StuI restriction sites. Insertion of the polylinker did not change the reading frame of the M.SPRI gene.
The resulting plasmid was called pSPR-Linker. The methylation capacity of the mutant M.SPRI encoded by this plasmid, referred to as M.SPRX, was assessed by restriction analysis of the pSPR-Linker plasmid. In other words, the ability of the pSPR-Linker plasmid to cause its own methylation in an mcr- cell (DH5αMCR) containing the plasmid, by producing an active M.SPRI methyltransferase, was measured.
By definition, 1 unit of restriction endonuclease activity is the amount of enzyme required to completely digest 1 μg of substrate DNA in a 50 μl reaction volume in 60 minutes at its optimum activity temperature. This endonuclease:DNA:reaction volume ratio was used as a guide when designing a reaction mixture. Glycerol concentration was kept at less than 5% in a reaction by diluting the restriction endonuclease, which was supplied in 50% glycerol, with the 1x enzyme reaction buffer for immediate use. Bovine serum albumin (BSA) was added to the reaction at a final concentration of 100 μg/ml for optimal activity and as a restriction enzyme stabilizer. Normally 0.2-2 μg of DNA was used in a total volume of 30 μl with 5-10 units of restriction endonuclease(s), in the manufacturer's recommended buffer. Incubation times of 2-16 hours were used, although the exact time was not critical. Where possible, reactions were stopped by heat inactivation at 65°C for 20 minutes. All restriction enzymes were obtained from MBI Fermentas or New England Biolabs, Inc. (NEB).
It was found that the mutant methyltransferase M.SPRX retained the capacity to methylate the sequence recognized by HaeIII (GGCC), as shown by the HaeIII digestion experiments in FIG. 5. However, it was found that M.SPRX had selectively lost the ability of wild-type M.SPRI to methylate a CCGG sequence, i.e., the MspI sequence.
pSPR-Linker contains two XbaI, two XhoI and two EcoRI sites, and since these sites should be unique in the polylinker (to facilitate insertion mutagenesis), the redundant sites had to be removed from the vector. Removal of the XbaI, SacI and EcoRI sites from the region encoding motif VIII of the M.SPRI gene was accomplished in one step by site-directed mutagenesis using the QuikChange™ method and the oligodeoxynucleotides SPR ΔEcoRIA (GGGTACAGAATTGACCTAGAGCTGCTTAATTCAAAATTCTTTAATG) and SPR ΔEcoRIB (CATTAAAGAATTTTGAATTAAGCAGCTCTAGGTCAATTCTGTACCC). The sequences of the mutagenic oligonucleotides are provided as SEQ ID NOS: 13 and 14, respectively. The resultant plasmid, named pSPR-Linker-dR, was shown to be refractory to HaeIII restriction, and therefore codes for an active M.SPRI methyltransferase.
In the final step, the XhoI site at the 3'-end of the M.SPRI gene was replaced with a SacI site in order to make the XhoI site in the polylinker sequence of pSPR-Linker-dR unique. This was accomplished by site-directed mutagenesis using a pair of complementary oligodeoxynucleotides, SPR ΔXhoIA (GAATCTGAATAACTGGAGCTCAAGCTTATTCATCG) and SPR ΔXhoIB (CGATGAATAAGCTTGAGCTCCAGTTATTCAGATTC), and the QuikChange™ method. The sequences of the mutagenic oligonucleotides are provided as SEQ ID NOS: 15 and 16, respectively. The resulting construct, pSPRX, was shown to be refractory to HaeIII restriction after isolation from a host cell, and therefore codes for an active methyltransferase.
EXAMPLE 2
Screening for mutations using pSPRX and PCR products derived from the human BRCA1 gene
To demonstrate that pSPRX could be used to screen for mutations, in particular truncating mutations, regions of exon 11 of the human BRCA1 gene (GenBank Accession No. U14680) were amplified and tested for the presence of a mutation by insertion into pSPRX. In this case the amplification products serve as target polynucleotides and pSPRX is the detector nucleic acid. A series of target polynucleotides of varying size were tested to determine the length of insertion into the polylinker region that will be tolerated without disrupting Mtase activity. In addition, a number of target polynucleotides containing stop codons and/or frameshift mutations were inserted to verify that the method is capable of detecting such mutations.
Insertion of PCR product-(1) into pSPRX. The first experiment involved the amplification of 300 bp from within exon 11 of the BRCA1 gene, using primers B1 (CGCGCGCGTCGACTTAACGAAACTGGACTCATTACTCCAAAT) and B3 (CGCGAATTCATTAATACTGGAGCCCACTTCATT), as shown in SEQ ID NOS: 17 and 18, respectively. B1 corresponds to the part of the BRCA1 gene starting at base pair 3000, and B3 starts at base pair 3299 of the BRCA1 coding sequence. These oligodeoxynucleotides introduce SalI and EcoRI sites into the 5'- and 3'-ends of the PCR products, respectively.
The amplification product, which is called PCR product-(1), was cleaved with SalI and EcoRI and then inserted into pSPRX that had been pre-cut with XhoI and EcoRI. In the case of simultaneous cleavage of DNA with two different restriction endonucleases (double digest), the best buffer was selected to provide reaction conditions that were amenable to both restriction endonucleases. Depending on an enzyme's activity rating in a non-optimal buffer, the number of units of enzyme or the incubation time of the reaction was adjusted to compensate for the slower rate of cleavage. If no single buffer could be found to satisfy the buffer requirements of both enzymes, the reactions were carried out sequentially. An equal amount of each buffer was normally used in the reaction, regardless of the optimal buffer for each enzyme. If the reaction did not work, phenol-chloroform extraction or GeneClean was carried out after the first digestion to purify the DNA prior to the second cleavage. This insertion shifts the reading frame of the M.SPRX gene.
The activity of TRD(H) in the resultant construct, pSPRX-(1), was checked by digesting the plasmid with HaeIII. This was accomplished by isolating the plasmid, using the DNA Wizard miniprep system (Promega) according to the manufacturer's instructions, from a DH5αMCR colony hosting the plasmid. The isolated plasmid was subjected to restriction digestion with the desired restriction enzyme and the extent of cleavage was determined by analysis on an agarose gel.
Insertion of PCR product-(2) into pSPRX. PCR product-(2) was generated by amplification of the BRCA1 gene with primers B2 and B3. Oligodeoxynucleotide B2 (CGCGCGCTCGAGAACGAAACTGGACTCATTACTCCA), shown in SEQ ID NO: 19, corresponds to the part of the BRCA1 gene starting at base pair 3000 and introduces an XhoI site at the 5'-end of the PCR product. After cleavage of the PCR product with XhoI and EcoRI, it was inserted into pSPRX that had been cut with the same enzymes. Checking the activity of TRD(H) in the recombinant plasmid, pSPRX-(2), by HaeIII digestion again revealed that even with a 100 amino acid insertion in the linker of M.SPRX, the enzyme retains methyltransferase activity.
Insertion of PCR product-(4) into pSPRX. In order to determine whether the presence of a stop codon (nonsense mutation) in the PCR product would affect the activity of M.SPRX, an ochre stop codon was introduced into PCR product-(4) using a synthetic reverse primer B5 (CGCGAATTCTTAATTAATACTGGAGCCCACTTCATT), shown in SEQ ID NO: 20. PCR product-(4), which encodes a stop codon at its 3'-end, was generated by amplification of the BRCA1 gene with oligodeoxynucleotides B2 and B5. After XhoI-EcoRI restriction digestion, the PCR product was inserted into pSPRX that had been cleaved with the same enzymes. The only difference between PCR product-(2) and PCR product-(4) is the presence of the TAA stop codon in the latter. The TRD(H) activity of the encoded enzyme was assayed by determining the susceptibility of the plasmid to HaeIII cleavage. It was found that the plasmid was partially susceptible to HaeIII cleavage, indicating that, as a result of the introduction of the nonsense mutation, the enzyme retained only partial TRD(H) activity. Thus, truncation of the enzyme results in loss of the TRD(H) domain and a partial loss of the TRD(H) activity.
Insertion of PCR product-(5) into pSPRX. In the next experiment, a patient sample containing a stop codon-introducing mutation was used as the source of target DNA in order to assess the ability of the method to identify a naturally occurring mutation. At the end of exon 11 in the BRCA1 gene, a number of breast cancer families have been found to harbour a 4 bp deletion: AA[T CAA] GAA GAG CAA AGC ATG GAT TCA AAC TTA ggt att gga acc agg ttt ttg tgt (where the brackets indicate deleted sequence, and the lower-case nucleotides indicate intronic sequence). This mutation, called "4184 del 4" (4184del4 (TCAA) in exon 11, codon 1357), introduces a stop codon, has been associated with both ovarian and prostatic cancer, and leads to the expression of a truncated BRCA1 protein (Simard et al., Nature Genet. 8:392-398 (1994)). In the MIM entry of BRC1_HUMAN (Primary accession number P38398) this mutation is described as "0015 BREAST-OVARIAN CANCER [BRCA1, 4-BP DEL, 4185TCAA]". The mutation results in the introduction of a stop codon at the intron/exon border (TAg), and thus the insertion of an amplified segment of the gene that includes this region will result in a stop codon in the linker region.
A patient sample containing the 4184 del 4 mutation (a generous gift from Dr. Ann Dalton, Sheffield Human Genetics Laboratory) was used as a template for amplification with primers B6 (CGCGCGCTCGAGGTAATATTGGCAAAGGCATCTCAG) and B7 (CGCGAATTCCAAACACAAAAACCTGGTTCCAAT), provided as SEQ ID NOS: 21 and 22, respectively, to produce the 300 bp PCR product-(5)1, which includes the intron/exon border and stop codon. B6 starts from base 3939 of the BRCA1 coding sequence and introduces an XhoI site into the 5'-end of the PCR product. Reverse primer B7 starts in intron 12, after exon 11, and runs until the 3'-end of the exon (base 4214 of the BRCA1 coding sequence), introducing an EcoRI site into the products.
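The way the 4 bp deletion creates a stop codon at the exon/intron border can be traced by reading the sequence quoted above in codons. The sketch below is a minimal illustration under two assumptions: the reading frame is the codon grouping printed in the text, and the intron bases are taken as ggt att gga acc agg ttt ttg tgt, which is consistent with the reverse complement of primer B7. The function and variable names are introduced here for illustration only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# End of BRCA1 exon 11 (upper case) and start of the intron (lower case),
# as quoted in the text; frame 0 follows the printed codon grouping.
EXON_END = "AATCAAGAAGAGCAAAGCATGGATTCAAACTTA"
INTRON_START = "ggtattggaaccaggtttttgtgt"


def first_stop_codon(seq, frame=0):
    """Return (0-based index, codon) of the first in-frame stop, or None."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            return i, codon
    return None


wild_type = EXON_END + INTRON_START
# 4184del4: remove the bracketed bases "T CAA" (string positions 2 to 5).
mutant = wild_type[:2] + wild_type[6:]

if __name__ == "__main__":
    print("deleted bases:", wild_type[2:6])
    print("wild type:", first_stop_codon(wild_type))
    print("mutant:   ", first_stop_codon(mutant))
```

In the wild-type reading frame no stop codon is encountered across the quoted region, whereas after the deletion the last two exon bases (TA) and the first intron base (g) form an in-frame TAG, matching the "TAg" described in the text.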
The PCR product-(5)1 was cut with XhoI and EcoRI and inserted into pSPRX which had been cleaved with the same enzymes. Recombinant plasmids were prepared from E. coli DH5αMCR colonies and subjected to digestion by HaeIII. The results showed that the plasmids derived from some colonies were completely resistant to cleavage, while the plasmids from other colonies were only partially resistant. This indicates that the Mtase's TRD(H) functionality was active in some clones, while in the others it was only partially active.
As a control experiment, PCR product-(5)2 was generated with the same primers but was derived from a template encoding the normal BRCA1 gene. After insertion of the PCR product into pSPRX, as explained above, recombinant plasmids purified from the resulting DH5αMCR colonies were tested by HaeIII digestion, where it was determined that all plasmids were resistant to cleavage, i.e., the encoded Mtase retained full TRD(H) activity.
Insertion of PCR product-(6) into pSPRX. In order to test the tolerance of M.SPRX to a larger insert, a 400 bp PCR product-(6) was generated from a region in exon 11 of the BRCA1 gene, using the primers B3 and B8 (CGCGCGCTCGAGGGCTTTCCTGTGGTTGGTCAGAAA), the latter provided herein as SEQ ID NO: 23. Primer B8 starts at base pair 2782 of the BRCA1 coding sequence. After cleavage of PCR product-(6) with XhoI and EcoRI, it was inserted into pSPRX, which had been cut with the same enzymes. The primers were designed in such a way that the insertion event would not change the reading frame of the M.SPRX gene. HaeIII restriction digestion was employed to assess TRD(H) functionality. It was found that the resulting construct, pSPRX-(6), was resistant to HaeIII cleavage.
Insertion of PCR product-(7) into pSPRX. In order to assess the ability of the method to detect a stop codon in a longer insert, a 400 base PCR product containing a stop codon was generated (PCR product-(7)). It was amplified from a region of exon 11 of the BRCA1 gene using primers B8 and B5, described supra. Primer B5 introduces a TAA stop codon in the PCR product. pSPRX was first cleaved with XhoI and EcoRI and was then ligated with PCR product-(7) that had been cut with the same enzymes. TRD(H) in the resulting construct, referred to as pSPRX-(7), was determined to be partially active as shown by the HaeIII cleavage pattern of the plasmid.
Insertion of PCR product-(8) into pSPRX. In order to test the efficiency of this system in diagnostic testing, an unknown sample provided by Dr. Ann Dalton (Sheffield Human Genetics Laboratory) was used to check whether the patient was normal with respect to BRCA1 or carried a mutation (a 4 bp (TCAA) deletion at the end of exon 11) in the BRCA1 gene. PCR product-(8)1 was generated using oligodeoxynucleotides B6 and B7 as primers and the anonymous patient sample as template. The PCR product was inserted into pSPRX, as described above, via XhoI and EcoRI sites. The plasmids purified from DH5αMCR transformants were checked for the presence of the insert and the activity of M.SPRX. As shown in FIG. 5, the appearance of 7 degraded (D) and 5 undegraded (U) plasmids taken from twelve bacterial colonies of a transformation plate presents a striking demonstration that the patient is heterozygous for a stop codon-introducing mutation in BRCA1. In each lane preceding the HaeIII restriction digestion, double digestion with XhoI and EcoRI has been performed to check the size of the insert, which should be 300 bp. These data show that the sample belongs to a heterozygous patient, as determined using the SSCP method followed by sequencing.
As a control, PCR product-(8)2 was made from the end of exon 11 of the normal BRCA1 gene and inserted into pSPRX in parallel experiments. Analysing the activity of TRD(H) by HaeIII cleavage of the recombinants showed that they all expressed M.SPRX with an active TRD(H).
Insertion of PCR product-(9) into pSPRX. A 400 bp PCR product (PCR product-(9)) was produced by amplification of the end of exon 11 of a normal BRCA1 gene, using primers B9 (36 bases) and B7. Primer B9 (CGCGCGCTCGAGTCTACTAGGCATAGCACCGTTGCT), provided as SEQ ID NO: 24, starts at base 3719 of the BRCA1 coding sequence. After insertion of this PCR product into pSPRX, the encoded Mtase was shown to retain TRD(H) activity.
Insertion of PCR product-(10) into pSPRX. A 500 bp PCR product (PCR product-(10)2) was amplified from the end of exon 11 of the BRCA1 gene, using primers B10 and B7. Primer B10 (CGCGCGCTCGAGAAGAAATTAGAGTCCTCAGAAGAG), provided herein as SEQ ID NO: 25, starts at base 3619 of the BRCA1 coding sequence. The PCR product, which is in frame with the M.SPRX gene, was inserted into pSPRX that had been cleaved with XhoI and EcoRI as explained above. Analysis of the TRD(H) activity in the resulting construct, pSPRX-(10)2, by HaeIII restriction digestion showed that it remains active even after insertion of this 500 bp fragment.
Insertion of PCR product-(11) into pSPRX. In order to establish whether insertion of a larger fragment into pSPRX would still support M.SPRX activity, another PCR product was inserted into the gene. Amplification of the end of exon 11 in the BRCA1 gene was carried out using a pair of synthetic primers, B11 (CGCGCGCTCGAGCATGCATCTCAGGTTTGTTCTGAG), provided herein as SEQ ID NO: 26, and B7. Primer B11 starts at base 3421 of the BRCA1 coding sequence, and the resulting amplification product (PCR product-(11)) is 700 bases. Insertion mutagenesis was carried out as before. The resultant plasmid, designated pSPRX-(11), was shown to be resistant to HaeIII restriction digestion, thereby demonstrating that TRD(H) has remained active.
Insertion of PCR product-(12) into pSPRX. A 1 kb fragment was generated by amplification of the end of exon 11 of the BRCA1 gene, using primers B12 (CGCGCGCTCGAGTCAAGCAATATTAATGAAGTAGGT), provided herein as SEQ ID NO: 27, and B7. Primer B12 starts at position 3121 of the BRCA1 coding sequence. Insertion mutagenesis was carried out as explained above, and analysis of the activity of M.SPRX in the mutant construct again demonstrated that TRD(H) retained activity.
Discussion of Example 2 Results.
FIG. 5 shows a typical analysis of the susceptibility of an insert-containing pSPRX plasmid to restriction digestion. In particular, this is the result of analysis of transformants obtained after insertion of PCR product-(8) into pSPRX, as described above. Plasmids isolated following transformation of an mcr- host (DH5αMCR) were treated with HaeIII under standard digestion conditions and analysed for cleavage by gel electrophoresis. The gel includes two lanes for each of 14 different transformant colonies, flanked by lanes containing molecular weight markers. For each transformant, the first lane (reading from left to right) contains a double digestion of the plasmid with XhoI and EcoRI to check the size of the insert, which should be 300 bp. The second lane contains the result of HaeIII digestion. The three samples labeled N are seen not to contain the 300 bp insert, and were not analyzed further. Of the remaining 12 transformants that contain the proper size insert, 7 were degraded by HaeIII treatment (labeled D), while the other 5 were undegraded (labeled U). Thus, 7 of the inserts contain mutations that attenuate the TRD(H) activity of the encoded Mtase, which is therefore not able to methylate the plasmid sufficiently to protect against HaeIII cleavage. The 5 undegraded samples indicate that the corresponding inserts did not inactivate the TRD(H) functionality of the Mtase, i.e., no stop codon-introducing mutation was introduced. This is a good example of the results expected for a patient heterozygous for a stop codon-introducing mutation in BRCA1. In the case of a homozygous mutation, all of the resulting colonies are expected to express partially active enzyme and hence all plasmids containing the insert will be degraded.
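The interpretation of the colony counts described above follows a simple rule, which the sketch below writes out. It is an illustrative reading of the discussion, not a validated diagnostic procedure: the function name and the wording of the returned labels are assumptions, and with small numbers of colonies the absence of one class could also arise by chance.

```python
def interpret_colonies(degraded, undegraded):
    """Interpret HaeIII results for insert-containing colonies of one sample.

    degraded   -- colonies whose plasmid was cleaved by HaeIII (labeled D)
    undegraded -- colonies whose plasmid resisted HaeIII (labeled U)
    """
    if degraded < 0 or undegraded < 0 or degraded + undegraded == 0:
        raise ValueError("need at least one insert-containing colony")
    if degraded == 0:
        return "no truncating mutation detected"
    if undegraded == 0:
        return "consistent with a homozygous truncating mutation"
    return "consistent with a heterozygous truncating mutation"


if __name__ == "__main__":
    # Counts reported for PCR product-(8): 7 degraded, 5 undegraded.
    print(interpret_colonies(7, 5))
    # A normal control, in which all plasmids resisted cleavage.
    print(interpret_colonies(0, 12))
```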
EXAMPLE 3
Protein purification and analysis by SDS-PAGE
In order to confirm the results obtained by checking insert-containing plasmids for susceptibility to HaeIII cleavage, the proteins encoded by the M.SPRX gene (including insert) were purified and their lengths determined by SDS-PAGE. The M.SPRX-GST fusion products encoded by constructs containing PCR product-(8)1 and PCR product-(8)2 inserts were expressed in E. coli and purified to greater than 90% purity in one step by affinity chromatography on Glutathione Sepharose® 4B (Pharmacia, Inc.) (Smith and Johnson (1988) Gene 67:31-40).
To express and purify the mutants of M.SPRI it was found that E. coli GM2163 (obtained from New England Biolabs) was the best host to use. A colony was grown overnight at 37°C in 6 ml LB media (containing 100 μg/ml ampicillin). An uninduced control was carried out by inoculating 5 ml of LB containing ampicillin with 400 μl of the overnight culture and then incubating the culture at 25°C overnight. For large-scale induction, 5 ml of the overnight culture was inoculated into 100 ml of media containing ampicillin. The cultures were incubated at 25°C for about 4 to 5 hours until the OD600 was 0.8 unit. Following this, induction of protein expression was achieved by the addition of IPTG (De Boer et al. (1983) Proc. Natl. Acad. Sci. USA-Biological Sciences 80:21-25) to a final concentration of 0.5 mM. Incubation was continued overnight at 25°C with continuous shaking.
Purification from 100 ml induced cultures was as follows. Sonicated cell extract containing the soluble fusion protein in PBS containing protease inhibitors (PMSF (200 μM), benzamidine (5 mM) and EDTA (1 mM); as well as lysozyme (1 mg/ml), which enhances cell lysis) was gently shaken at 4°C for 30 minutes with 1 ml of glutathione agarose beads (pre-washed with three 10 ml volumes of PBS). The tubes were centrifuged for 5 minutes at 3000 rpm to collect the agarose beads and protein. The supernatant was removed and the beads were washed three times with 10 ml PBS containing 1 mM EDTA, to remove non-specifically bound proteins. After the third wash, beads were resuspended in 3-5 ml of PBS containing EDTA and applied to a 2.5 ml disposable plastic column (Pierce and Warriner). When the column was set, the PBS was removed by opening the bottom until there was about 100 μl of the buffer on the beads. Bound protein was eluted at 4°C with 2 bed volumes of elution buffer: 50 mM Tris-HCl, pH 8.0, 10 mM reduced glutathione. After 10 minutes the protein was eluted in four 0.5 ml fractions. Aliquots of each elution were analyzed by SDS-PAGE to ascertain purity.
SDS-PAGE analysis yielded apparent molecular weights of 84,000 and 69,000 Da for normal and mutant samples, respectively, consistent with truncation of the mutant sample owing to introduction of a stop codon. The gels are shown in FIG. 6. The results are consistent with the sensitivity towards HaeIII cleavage of plasmids in which mutant DNA has been inserted being the result of truncation of the Mtase protein.
Example 4
Confirmation of mutations by direct sequencing
The inserts of pSPRX-(8)1 and pSPRX-(8)2 were directly sequenced by standard dideoxy sequencing methods using primer B6. Template DNA was isolated and purified from E. coli cells using a Wizard Plus Minipreps kit (Promega). 200-500 ng double-stranded DNA template and 3.2 pmol of oligodeoxynucleotide primer were used for each reaction, using the Taq Dyedeoxy or BigDye Terminator Cycle Sequencing kit as described in the manufacturer's protocol. Extension products were purified using EtOH precipitation procedure as described in the manufacturer's protocol. After the final 70% ethanol wash, samples were dried under vacuum and taken to the Biomolecular Synthesis Service, University of Sheffield. DNA sequencing was performed on an Applied Biosystems Model 373A DNA sequencing system. The results confirmed that the pSPRX-(8)1 insert (i.e., patient-derived) contains a TCAA deletion relative to the normal sequence in pSPRX-(8)2.
Example 5
Detecting mutations by analyzing for rescue from a lethal phenotype
In this example target DNA sequences were screened for the presence of a mutation by determining whether insertion of the sequence into pSPRX rendered the plasmid viable in an mcr+ host cell by inactivation of the gene product's methylation activity. The E. coli strains DH5α (mcr+) and DH5αMCR (mcr-) were transformed with pSPRX-(8)1 and pSPRX-(8)2. When the normal gene fragment is inserted into the plasmid encoding an active M.SPRI (M.SPRN), the expressed enzyme is methyltransferase proficient and the plasmid is viable in the mcr- strain alone (see a and b). However, when a gene fragment harbouring a stop codon is introduced into the same plasmid, the synthesis of the M.SPRI polypeptide is truncated and the plasmid is viable in both mcr+ (see e and g) and mcr- hosts. The data are also summarized in the following table, where M.SPRI = modified methyltransferase with EcoRII and HaeIII specificities; M.SPRI+ = M.SPRI containing extra polypeptide sequence from BRCA1; and M.SPRIN = M.SPRI containing a BRCA1 nonsense mutation.
Figure imgf000073_0001
As predicted, only plasmids containing a mutation are viable in the mcr+ cells. The mcr- host cells, on the other hand, tolerate both mutant and non-mutant inserts. The bacterial transformations, shown in FIG. 7, illustrate the simplicity of the stop codon-introducing mutation test.
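The readout of this viability test can be summarized as a small decision rule. The following Python sketch is illustrative only: the function name `interpret_viability` and the "inconclusive" outcome for a plasmid that fails to grow even in the mcr- host are assumptions added here, not part of the described procedure.

```python
def interpret_viability(grows_in_mcr_plus: bool, grows_in_mcr_minus: bool) -> str:
    """Interpret colony growth of a pSPRX construct in mcr+ and mcr- hosts."""
    if not grows_in_mcr_minus:
        # Assumption of this sketch: the mcr- host tolerates every construct,
        # so no growth there suggests a failed transformation, not a result.
        return "inconclusive"
    if grows_in_mcr_plus:
        # Growth in the mcr+ host implies the methyltransferase is inactive,
        # consistent with a truncating mutation in the insert.
        return "truncating mutation likely"
    # Growth in the mcr- host only implies an active methyltransferase.
    return "no truncating mutation detected"


print(interpret_viability(grows_in_mcr_plus=True, grows_in_mcr_minus=True))
print(interpret_viability(grows_in_mcr_plus=False, grows_in_mcr_minus=True))
```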
Example 6
Analysis of DNA restriction digestion by IP-RP-HPLC
In Example 2, the HaeIII-induced degradation of pSPRX-(8)1 and pSPRX-(8)2 isolated from DH5αMCR was assessed by gel electrophoresis (see FIG. 5). In the present example, IP-RP-HPLC was used instead of gel electrophoresis, thereby simplifying, speeding up and automating the process.
Separation was achieved on an analytical size (inner dimensions 50 x 4.6 mm) DNASep™ column (Transgenomic, Inc.) using a WAVE nucleic acid analysis system (Transgenomic, Inc.). The stationary phase of the DNASep™ column comprises octadecyl modified, nonporous poly(ethylvinylbenzene-divinylbenzene) beads, as described in U.S. Patent No. 6,066,268.
The separation was conducted under the following conditions: Eluent A: 0.1 M TEAA, pH 7.0; Eluent B: 0.1 M TEAA, 25% acetonitrile; Gradient:
Figure imgf000075_0001
The flow rate was 0.75 mL/min, detection UV at 260 nm, column temp. 50°C. The pH was 7.0.
10 μl of each HaeIII digestion reaction was injected. The pSPRX-(8)2 chromatogram is represented by a solid line, the pSPRX-(8)1 chromatogram by a dashed line. The fully methylated plasmid (pSPRX-(8)2) is protected from digestion and appears as a single peak at approximately 18 minutes. Significantly, the chromatogram between 8 minutes and 16 minutes is flat, indicating that no digestion of the plasmid has occurred. This trace indicates that no truncating mutation is present.
The partially methylated plasmid (pSPRX-(8)1) has been partially digested and shows a smaller peak at 18 minutes possibly with one or more shoulders or partially resolved peaks as seen here. The chromatogram between 8 minutes and 16 minutes shows a higher absorbance than seen with pSPRX-(8)2 and contains multiple small peaks indicative of DNA fragments resulting from the partial digestion of the plasmid. This chromatogram clearly indicates the presence of a truncating mutation.
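The visual comparison of the two chromatograms could, in principle, be automated. The Python sketch below is one hedged way to do so: it checks whether the 8 to 16 minute window of a trace is flat. The function name `window_is_flat`, the tolerance value, and both example traces are hypothetical and are not data from this example.

```python
def window_is_flat(trace, start=8.0, end=16.0, tolerance=0.5):
    """Return True if absorbance varies by no more than `tolerance`
    (arbitrary units, an assumed threshold) between `start` and `end` minutes.

    `trace` is a list of (time_in_minutes, absorbance) pairs.
    """
    values = [absorbance for time, absorbance in trace if start <= time <= end]
    if not values:
        raise ValueError("trace has no data points inside the window")
    return max(values) - min(values) <= tolerance


# Hypothetical traces for illustration; not measurements from the patent.
protected = [(8.0, 1.0), (10.0, 1.1), (12.0, 1.0), (14.0, 1.1), (16.0, 1.0), (18.0, 40.0)]
digested = [(8.0, 1.0), (10.0, 4.0), (12.0, 2.5), (14.0, 5.0), (16.0, 1.5), (18.0, 15.0)]

print(window_is_flat(protected))  # flat window: no digestion fragments
print(window_is_flat(digested))   # multiple small peaks: partial digestion
```

A flat window corresponds to the protected, fully methylated plasmid; a non-flat window corresponds to partial digestion and hence to a possible truncating mutation. Any real threshold would need to be calibrated against control digests.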
While the foregoing has presented specific embodiments of the present invention, it is to be understood that these embodiments have been presented by way of example only. It is expected that others will perceive and practice variations which, though differing from the foregoing, do not depart from the spirit and scope of the invention as described and claimed herein.

Claims

The invention claimed is:
1. A method for detecting a mutation in a target nucleotide sequence present in a sample nucleic acid, the method comprising:
(a) providing a detector nucleic acid which includes a detector nucleotide sequence, wherein the detector nucleotide sequence encodes a detector polypeptide having a detectable activity,
(b) inserting the target sequence within an open reading frame of the detector nucleotide sequence to give a chimeric sequence,
(c) expressing the chimeric sequence to give a chimeric polypeptide,
(d) determining the activity of the encoded chimeric polypeptide, and
(e) correlating the activity of the encoded chimeric polypeptide with the presence or absence of a mutation in the target nucleotide sequence.
2. The method of Claim 1, wherein said detector polypeptide is an enzyme comprising two catalytically essential domains separated by a linker region, wherein additional polypeptide sequence can be inserted into the linker region to produce a chimeric enzyme which possesses a detectable catalytic activity so long as said catalytically essential domains remain linked, but wherein a decrease in said catalytic activity occurs if the insertion results in the loss of one of said catalytically essential domains from said detector polypeptide.
3. The method of Claim 2, wherein said loss of one of said catalytically essential domains is the result of truncation.
4. The method of Claim 2, wherein said loss of one of said catalytically essential domains is the result of a frameshift in the linker region disrupting the reading frame of one of said catalytically essential domains.
5. The method of Claim 1 wherein said target nucleotide sequence comprises BRCA1 or BRCA2.
6. The method of Claim 1 wherein said target sequence has been obtained by IP-RP-HPLC separation of a mixture of DNA fragments.
7. The method of Claim 1 wherein said detector nucleotide sequence comprises engineered bacteriophage multi-specific DNA methyltransferase M.SPRI.
8. The method of Claim 1, wherein said detector nucleic acid comprises a cloning vector including a detector nucleotide sequence which is a lethal gene and wherein step (b) includes ligating said target sequence into said detector nucleotide sequence.
9. The method of claim 8, wherein the activity of the chimeric polypeptide is determined by determining the viability of a host transformed with said chimeric sequence.
10. The method of Claim 1, wherein the detector nucleotide sequence encodes a DNA methyltransferase.
11. The method of Claim 10, wherein the detector nucleotide sequence encodes a modified bacterial cytosine (C-5)-specific DNA methyltransferase.
12. The method of Claim 10, wherein the methyltransferase is a modified form of a precursor methyltransferase originally containing a plurality of target recognition domains, and wherein said modification comprises the elimination of one of said target recognition domains.
13. The method of Claim 12, wherein nucleotide sequence that originally encoded the eliminated target recognition domain has been replaced with a linker containing a plurality of restriction sites.
14. The method of Claim 13, wherein the detector nucleotide sequence comprises M.SPRX or a derivative thereof sharing at least 70% identity therewith.
15. The method of Claim 1, wherein step (b) includes amplifying the target sequence under examination, using PCR primers containing 5' and 3' ends adapted for ligation into a vector, step (b) includes ligating the PCR products into an engineered MTase gene, step (c) includes transforming a host cell which has the property of being killed by active Mtase, and step (d) includes proliferating said cell to indicate the presence of a mutation within the sequence under examination.
16. The method of claim 10, wherein the activity of the chimeric peptide is determined by assessing the methylation state of a DNA molecule derived from a host cell expressing the chimeric sequence.
17. The method of Claim 16, wherein the said DNA molecule is the detector nucleic acid, and where the methylation state is assessed by treating said detector nucleic acid with a restriction endonuclease and determining the extent to which said detector nucleic acid is degraded by the endonuclease.
18. The method of Claim 17, wherein said detector nucleic acid is a vector, and wherein the extent of degradation is determined by electrophoresis.
19. The method of Claim 17, wherein said detector nucleic acid is a vector, and wherein the extent of degradation is determined by IP-RP-HPLC.
20. The method of claim 1, wherein said detector nucleic acid comprises pSPRX, wherein step (c) includes transforming a mcrBC- host, and wherein steps (d) and (e) include analyzing plasmids from mcrBC- transformants by HaeIII digestion, wherein if the insert is wild type, the plasmid is resistant to HaeIII cleavage, and wherein if said insert contains a mutation, the plasmid is at least partially degraded after HaeIII restriction digestion.
21. The method of claim 1 further including purifying the chimeric polypeptide and carrying out size-based separation of the chimeric polypeptide.
22. The method of claim 15 further including obtaining plasmids from said transformed host cells and sequencing said plasmids.
23. A method for diagnosing, in an individual, a disease which is associated with a mutation in a target nucleotide sequence, which method includes performing a method as described in claim 1 using a sample of nucleic acid from that individual.
24. A detector nucleic acid adapted for use in detecting a mutation in a target nucleotide sequence present in a sample nucleic acid, the detector nucleic acid comprising: a detector nucleotide sequence which encodes a detector polypeptide having a detectable activity, wherein said detector polypeptide is an enzyme comprising two catalytically essential domains separated by a linker region, wherein additional polypeptide sequence can be inserted into the linker region to produce a chimeric enzyme which possesses a detectable catalytic activity so long as said catalytically essential domains remain linked, but wherein a decrease in said catalytic activity occurs if the insertion results in the loss of one of said catalytically essential domains from said detector polypeptide, said detector sequence comprising an insertion site wherein a target nucleotide sequence can be inserted to yield a chimeric sequence encoding said chimeric enzyme.
25. The detector nucleic acid of Claim 24, wherein said detector polypeptide is derived from a multi-component enzyme which can accommodate extra sequence within it with little or no loss of activity.
26. The detector nucleic acid of Claim 25, wherein said detector polypeptide comprises an Mtase.
27. The detector nucleic acid of Claim 26, wherein said Mtase is selected from the group consisting of M.SPRI, M.AquI and M.MSPI.
28. The detector nucleic acid of Claim 27, wherein said detector nucleotide sequence comprises M.SPRX.
29. The detector nucleic acid of Claim 28, wherein said detector nucleic acid is pSPRX.
30. The detector nucleic acid sequence of Claim 24, wherein said detector nucleotide sequence comprises a target sequence.
31. The detector nucleic acid sequence of Claim 30, wherein said target sequence comprises all or a diagnostic part of a eukaryotic gene correlated with a genetic disease.
32. The detector nucleic acid sequence of Claim 31, wherein said eukaryotic gene comprises a gene in which stop codon-introducing mutations have been characterized and correlated with disease states.
33. A method for diagnosing, in an individual, a disease which is associated with a mutation in a target nucleotide sequence, the method comprising: providing a detector nucleic acid which includes a detector nucleotide sequence, wherein the detector nucleotide sequence encodes a detector polypeptide having a detectable activity, inserting the target sequence within an open reading frame of the detector nucleotide sequence to give a chimeric sequence, expressing the chimeric sequence to give a chimeric polypeptide, determining the activity of the encoded chimeric polypeptide, and correlating the activity of the encoded chimeric polypeptide with the presence or absence of a mutation in the target nucleotide sequence.
34. The detector nucleic acid of Claim 24, wherein said detector nucleic acid resides in a host cell.
35. A diagnostic kit comprising the detector nucleic acid of Claim 24.
36. The method of Claim 1, wherein said target nucleotide sequence is derived from genomic DNA.
37. The method of Claim 37, wherein said target nucleotide sequence is the product of PCR amplification.
38. The method of Claim 1, wherein said target nucleotide sequence is derived from mRNA.
39. The method of Claim 38, wherein said target nucleotide sequence is the product of RT-PCR.
PCT/US2001/010672 2000-07-13 2001-04-02 Method for detection of truncated proteins WO2002006527A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001253092A AU2001253092A1 (en) 2000-07-13 2001-04-02 Method for detection of truncated proteins

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
US21836700P 2000-07-13 2000-07-13
US60/218,367 2000-07-13
US23548300P 2000-09-26 2000-09-26
US60/235,483 2000-09-26
US67889100A 2000-10-04 2000-10-04
US09/678,891 2000-10-04
US71659600A 2000-11-20 2000-11-20
US09/716,596 2000-11-20

Publications (2)

Publication Number Publication Date
WO2002006527A2 true WO2002006527A2 (en) 2002-01-24
WO2002006527A3 WO2002006527A3 (en) 2003-10-02

Family

ID=27499111

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/010672 WO2002006527A2 (en) 2000-07-13 2001-04-02 Method for detection of truncated proteins

Country Status (2)

Country Link
AU (1) AU2001253092A1 (en)
WO (1) WO2002006527A2 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997001639A2 (en) * 1995-06-27 1997-01-16 The University Of Sheffield Selection of recombinant molecules
WO1997011972A1 (en) * 1995-09-28 1997-04-03 The Trustees Of Columbia University In The City Of New York Chimeric dna-binding/dna methyltransferase nucleic acid and polypeptide and uses thereof
EP0872560A1 (en) * 1996-10-09 1998-10-21 Srl, Inc. Method for detecting nonsense mutations and frameshift mutations
WO1999038961A1 (en) * 1998-01-30 1999-08-05 Sepracor Inc. Gene regulator fusion proteins and methods of using the same for determining resistance of a protein to a drug targeted thereagainst

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KUKLIN A ET AL: "A novel technique for rapid automated genotyping of DNA polymorphisms in the mouse" MOLECULAR AND CELLULAR PROBES, ACADEMIC PRESS, LONDON, GB, vol. 13, no. 3, June 1999 (1999-06), pages 239-242, XP002159014 ISSN: 0890-8508 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009037266A2 (en) * 2007-09-17 2009-03-26 Universite Louis Pasteur Method for detecting or quantifying a truncating mutation
WO2009037266A3 (en) * 2007-09-17 2009-05-07 Univ Pasteur Method for detecting or quantifying a truncating mutation
EP2228456A1 (en) * 2007-11-02 2010-09-15 Geneart Ag Selection of encoding nucleic acid constructs for absence of frameshift mutations
US9115390B2 (en) 2007-11-02 2015-08-25 Geneart Ag Method for determining frameshift mutations in coding nucleic acids
RU2506315C2 (en) * 2012-03-23 2014-02-10 Федеральное государственное бюджетное учреждение "Научно-исследовательский институт молекулярной биологии и биофизики" Сибирского отделения Российской академии медицинских наук (ФГБУ "НИИМББ" СО РАМН) Plasmid vector and method of detecting nonsense mutations and frameshift mutations in brca1 gene

Also Published As

Publication number Publication date
AU2001253092A1 (en) 2002-01-30
WO2002006527A3 (en) 2003-10-02

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP