CA2219933A1 - Methods for the identification of genetic modification of DNA involving DNA sequencing and positional cloning

Info

Publication number
CA2219933A1
Authority
CA
Canada
Prior art keywords
dna
sample
heteroduplex
sequence
mismatch
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA 2219933
Other languages
French (fr)
Inventor
Anthony P. Shuber
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Genzyme Corp
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 08/488,013 (US5707806A)
Priority claimed from US 08/487,986 (US5571676A)
Application filed by Individual
Publication of CA2219933A1
Legal status: Abandoned

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813: Hybridisation assays
    • C12Q1/6827: Hybridisation assays for detection of mutation or polymorphism
    • C12Q1/6869: Methods for sequencing

Abstract

The present invention provides methods for determining the precise location and sequence of genetic alterations and mutations present in a gene of interest. The present invention further provides methods for positional cloning and sequence determination of a gene of interest.

Description

METHODS FOR THE IDENTIFICATION OF GENETIC MODIFICATION OF DNA INVOLVING DNA SEQUENCING AND POSITIONAL CLONING

Field of the Invention

This invention pertains to high-throughput methodology that directly identifies previously unidentified sequence alterations in DNA, including specific disease-causing DNA sequences in mammals. The methods of the present invention can be used to identify genetic polymorphisms, to determine the molecular basis for genetic diseases, and to provide carrier and prenatal diagnosis for genetic counseling.

WO 96/41002 PCT/US96/08806

Background of the Invention

The ability to detect alterations in DNA sequences (e.g., mutations and polymorphisms) is central to the diagnosis of genetic diseases and to the identification of clinically significant variants of disease-causing microorganisms. One method for the molecular analysis of genetic variation involves the detection of restriction fragment length polymorphisms (RFLPs) using the Southern blotting technique (Southern, E.M., J. Mol. Biol., 98:503-517, 1975). Since this approach is relatively cumbersome, new methods have been developed, some of which are based on the polymerase chain reaction (PCR). These include: RFLP analysis using PCR (Chehab et al., Nature, 329:293-294, 1987; Rommens et al., Am. J. Hum. Genet., 46:395-396, 1990), the creation of artificial RFLPs using primer-specified restriction-site modification (Haliassos et al., Nuc. Acids Res., 17:3606, 1989), allele-specific amplification (ASA) (Newton CR et al., Nuc. Acids Res., 17:2503-2516, 1989), oligonucleotide ligation assay (OLA) (Landegren U et al., Science 241:1077-1080, 1988), primer extension (Sokolov BP, Nuc. Acids Res., 18:3671, 1989), artificial introduction of restriction sites (AIRS) (Cohen LB et al., Nature 334:119-121, 1988), allele-specific oligonucleotide hybridization (ASO) (Wallace RB et al., Nuc. Acids Res., 9:879-895, 1981) and their variants. Together with robotics, these techniques for direct mutation analysis have helped in reducing cost and increasing throughput when only a limited number of mutations need to be analyzed for efficient diagnostic analysis.

These methods are, however, limited in their applicability to complex mutational analysis. For example, in cystic fibrosis, a recessive disorder affecting 1 in 2000-2500 live births in the United States, more than 225 presumed disease-causing mutations have been identified. Furthermore, multiple mutations may be present in a single affected individual, and may be spaced within a few base pairs of each other. These phenomena present unique difficulties in designing clinical screening methods that can accommodate large numbers of sample DNAs.

Shuber et al., Hum. Mol. Gen., 2:153-158, 1993, disclose a method that allows the simultaneous hybridization of multiple oligonucleotide probes to a single target DNA sample. By including in the hybridization reaction an agent that eliminates the disparities in melting temperatures of hybrids formed between synthetic oligonucleotides and target DNA, it is possible in a single test to screen a DNA sample for the presence of different mutations. Typically, more than 100 ASOs can be pooled and hybridized to target DNA; in a second step, ASOs from a pool giving a positive result are individually hybridized to the same DNA. Shuber et al., Genome Res. 5:488-93, 1995, disclose a method for multiple allele-specific disease analysis in which multiple ASOs are first hybridized to a target DNA, followed by elution and sequencing of ASOs that hybridize. This method allows the identification of a mutation without the need for many individual hybridizations involving single ASOs, but it requires prior knowledge of the relevant mutations.

To achieve adequate detection frequencies for rare mutations using the above methods, however, large numbers of mutations must be screened. To identify previously unknown mutations within a gene, other methodologies have been developed, including: single-strand conformational polymorphisms (SSCP) (Orita M et al., Proc. Natl. Acad. Sci. USA 86:2766-2770, 1989), denaturing gradient gel electrophoresis (DGGE) (Meyers RM et al., Nature 313:495-498, 1985), heteroduplex analysis (HET) (Keen J. et al., Trends Genet. 7:5, 1991), chemical cleavage analysis (CCM) (Cotton RGH et al., Proc. Natl. Acad. Sci. USA 85:4397-4401, 1988), and complete sequencing of the target sample (Maxam AM et al., Methods Enzymol. 65:499-560, 1980; Sanger F. et al., Proc. Natl. Acad. Sci. USA 74:5463-5467, 1977). All of these procedures, however, with the exception of direct sequencing, are merely screening methodologies. That is, they merely indicate that a mutation exists, but cannot specify the exact sequence and location of the mutation. Therefore, identification of the mutation ultimately requires complete sequencing of the DNA sample. For this reason, these methods are incompatible with high-throughput and low-cost routine diagnostic methods.

Thus, there is a need in the art for a relatively low cost method that allows the efficient analysis of large numbers of DNA samples for the presence of previously unidentified mutations or sequence alterations.

Summary of the Invention

The present invention encompasses high-throughput methods for identifying one or more genetic alterations in a target sequence present in a first DNA sample. The method is carried out by the steps of:
a) hybridizing the first sample with a second DNA sample not containing genetic alterations to form heteroduplex DNA containing a mismatch region at the site of a genetic alteration(s);
b) cleaving one DNA strand of the heteroduplex in the target sequence to form a single-stranded gap across the site of the alteration;
c) treating the cleaved heteroduplex with a DNA polymerase in the presence of dideoxynucleotides to determine the sequence across the gap; and
d) comparing the nucleotide sequence across the gap with a predetermined cognate wild-type sequence to identify the genetic alteration(s).
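
The comparison in step d) can be illustrated with a short Python sketch. It is only an illustration under simplifying assumptions, not the procedure of the invention: the gap read and the cognate wild-type sequence are assumed to be already aligned and of equal length, so only substitutions are reported. The function name `find_substitutions`, the `offset` parameter, and the example sequences are hypothetical.

```python
def find_substitutions(gap_read, wild_type, offset=0):
    """Return (position, wild_type_base, observed_base) for each mismatch.

    Assumes both strings are aligned and equal in length (substitutions only);
    additions or deletions would need a real alignment step first.
    """
    if len(gap_read) != len(wild_type):
        raise ValueError("sequences must be the same length")
    return [
        (offset + i, wt, obs)
        for i, (wt, obs) in enumerate(zip(wild_type.upper(), gap_read.upper()))
        if wt != obs
    ]


# Hypothetical sequences: the read differs from wild type at one base.
wild_type = "ATCATCTTTGGTGTT"
gap_read = "ATCATCTTAGGTGTT"
print(find_substitutions(gap_read, wild_type, offset=100))
```

The `offset` argument maps positions within the gap back to coordinates in the target sequence, so the reported alteration can be located as well as identified.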

In practicing the above-described methods, the first DNA sample containing the target sequence is hybridized under stringent conditions with a second DNA sample not containing the alteration. The hybrids that form contain mismatch regions, which are recognized and endonucleolytically cleaved on one or both sides of the mismatch region by mismatch recognition protein-based systems. When a single endonucleolytic cleavage occurs on only one side of the mismatch region, one or more exonucleases are used to form the single-stranded gap. When endonucleolytic cleavage occurs on both sides of the mismatch region, the single-stranded fragment is released by the action of a helicase to form the single-stranded gap. Determination of the sequence across the gap is achieved in a single step by an enzymatic DNA sequencing reaction using dideoxynucleotides and DNA polymerase I, DNA polymerase III, T4 DNA polymerase, or T7 DNA polymerase.

In an alternate embodiment, the present invention encompasses high-throughput methods for identifying one or more genetic alterations in a target sequence present in a first DNA sample. This method is carried out by the steps of:
a) hybridizing the first sample with a second DNA sample not containing genetic alterations to form heteroduplex DNA having free ends and containing a mismatch region at the site of a genetic alteration(s);
b) cleaving the DNA at or in the vicinity of the alteration, forming new ends;
c) ligating an oligonucleotide of predetermined sequence to the new ends;
d) determining the nucleotide sequence adjacent to the ligated oligonucleotide; and
e) comparing the nucleotide sequence determined in d) with a predetermined cognate wild-type sequence to identify the genetic alteration(s).

Specific cleavage at or near the alteration is achieved by hybridizing the first DNA sample containing the target sequence with a second DNA sample not containing the alteration, so that heteroduplexes are formed that contain mismatch regions, which can be recognized and cleaved by mismatch recognition systems.

Typically, the first DNA sample comprises genomic DNA from a patient suffering from a genetic disease whose genome does not contain any of the known mutations that cause that disease, and the target sequence comprises a known disease-causing gene. The genetic alterations identified by these methods include additions, deletions, or substitutions of one or more nucleotides.

Mismatch recognition, cleavage, and excision systems useful in practicing the invention include without limitation bacteriophage resolvases, mismatch repair proteins, nucleotide excision repair proteins, chemical modification of mismatched bases followed by excision repair proteins, chemical modification and cleavage, and combinations thereof, with or without supplementation with exonucleases as required.

The present invention finds application in high-throughput methods for multiplex identification of new mutations or previously unidentified polymorphisms, in which DNAs obtained from a multiplicity of patients are immobilized on a single solid support, followed by one or more of the following steps: hybridization, mismatch recognition, excision, cleavage, ligation, sequencing, and sequence comparison steps as set forth above. Furthermore, multiple specific target sequences can be analyzed simultaneously by amplifying the target sequences prior to immobilization, followed by the steps as set forth above.

In an alternate embodiment, the present invention provides methods for positional cloning of a disease-causing gene. Invention methods are carried out using the following steps:
a) hybridizing a first DNA sample derived from an individual suffering from the disease with a second DNA sample derived from a multiplicity of individuals not suffering from the disease, to form hybrids containing mismatch regions at sites at which the sequence of the first DNA sample diverges from the sequence of the second DNA sample;
b) cleaving one DNA strand in the hybrids to form a single-stranded gap across the site of the alteration;
c) determining the nucleotide sequence across the gap;
d) preparing a synthetic oligonucleotide comprising all or part of the nucleotide sequence determined in c); and
e) identifying a DNA clone derived from a cosmid or a P1 library containing the sequence of the synthetic oligonucleotide prepared in d).

In practicing the present invention, mismatch regions are recognized and endonucleolytically cleaved on one or both sides of the mismatch region by mismatch recognition protein-based systems. When a single endonucleolytic cleavage occurs on only one side of the mismatch region, one or more exonucleases are used to form the single-stranded gap. When endonucleolytic cleavage occurs on both sides of the mismatch region, the single-stranded fragment is released by the action of a helicase to form the single-stranded gap. Determination of the sequence across the gap is achieved in a single step by an enzymatic DNA sequencing reaction using dideoxynucleotides and DNA polymerase I, DNA polymerase III, T4 DNA polymerase, or T7 DNA polymerase.

The present invention further provides alternative methods for positional cloning of a gene of interest. These methods are carried out by:
a) hybridizing a first DNA sample derived from an individual displaying a given phenotype with a second DNA sample derived from one or more individuals not displaying the phenotype, to form heteroduplex DNA having free ends and containing a mismatch region at sites at which the sequence of the first DNA sample diverges from the sequence of the second DNA sample;
b) blocking the free ends on the hybrids formed in a);
c) cleaving one or both DNA strands within or adjacent to the mismatch regions to form new ends;
d) ligating a single-stranded oligonucleotide of predetermined sequence to the new ends formed in c);
e) determining the nucleotide sequence adjacent to the ligated predetermined sequence;
f) preparing a synthetic oligonucleotide comprising all or part of the nucleotide sequence determined in e); and
g) identifying a DNA clone derived from a cosmid or a P1 library containing the sequence of the synthetic oligonucleotide prepared in f).

As used herein, positional cloning refers to a process by which a previously unknown disease-causing gene is localized and identified.

The genetic alterations identified by invention methods include additions, deletions, or substitutions of one or more nucleotides. Mismatch recognition, cleavage, and excision systems useful in practicing the invention include without limitation mismatch repair proteins, nucleotide excision repair proteins, bacteriophage resolvases, chemical modification of mismatched bases followed by excision repair proteins, and combinations thereof, with or without supplementation with exonucleases as required.

Detailed Description of the Invention

The present invention encompasses high-throughput methods for identifying specific target sequences in DNA isolated from a patient. As used herein, the term high-throughput refers to a system for rapidly assaying large numbers of DNA samples at the same time. The methods are applicable when one or more genes or genetic loci are targets of interest. The specific sequences typically contain one or more sequence alterations relative to wild-type DNA, including additions, deletions, or substitutions of one or more nucleotides.

In practicing the methods of the present invention, the first DNA sample containing the target sequence is hybridized with a second sample of DNA (or a pool of DNA samples) containing one or more wild-type versions of the targeted gene. The methods of the present invention take advantage of the physico-chemical properties of DNA hybrids between almost-identical (but not completely identical) DNA strands (i.e., heteroduplexes). When a sequence alteration is present, the heteroduplexes contain a mismatch region that is embedded in an otherwise perfectly matched hybrid. According to the present invention, mismatch regions are formed under controlled conditions and are chemically and/or enzymatically modified; the sequences adjacent to, and including, the mismatch are then determined. Depending upon the mismatch recognition method used, the mismatch region may comprise any number of bases, preferably from 1 to about 1000 bases.

The methods of the invention can be employed to identify specific disease-causing mutations in individual patients (when the gene or genes responsible for the disease are known) or previously unidentified polymorphisms and for positional cloning to identify new genes.

In a preferred embodiment, the specific DNA sequence comprises a portion of a particular gene or genetic locus in the patient's genomic DNA known to be involved in a pathological condition or syndrome. Non-limiting examples of genetic syndromes include cystic fibrosis, sickle-cell anemia, thalassemias, Gaucher's disease, adenosine deaminase deficiency, alpha1-antitrypsin deficiency, Duchenne muscular dystrophy, familial hypercholesterolemia, fragile X syndrome, glucose-6-phosphate dehydrogenase deficiency, hemophilia A, Huntington disease, myotonic dystrophy, neurofibromatosis type 1, osteogenesis imperfecta, phenylketonuria, retinoblastoma, Tay-Sachs disease, and Wilms tumor (Thompson and Thompson, Genetics in Medicine, 5th Ed.).

In another embodiment, the specific DNA sequence comprises part of a particular gene or genetic locus that may not be known to be linked to a particular disease, but in which polymorphism is known or suspected. For example, obesity may be linked with variations in the apolipoprotein B gene, hypertension may be due to genetic variations in sodium or other transport systems, aortic aneurysms may be linked to variations in α-haptoglobin and cholesterol ester transfer protein, and alcoholism may be related to variant forms of alcohol dehydrogenase and mitochondrial aldehyde dehydrogenase. Furthermore, an individual's response to medicaments may be affected by variations in drug modification systems such as cytochrome P450s, and susceptibility to particular infectious diseases may also be influenced by genetic status. Finally, the methods of the present invention can be applied to HLA analysis for identity testing.

In yet another embodiment, the specific DNA sequence comprises part of a foreign genetic sequence, e.g., the genome of an invading microorganism. Non-limiting examples include bacteria and their phages, viruses, fungi, protozoa, and the like. The present methods are particularly applicable when it is desired to distinguish between different variants or strains of a microorganism in order to choose appropriate therapeutic interventions.

1. PREPARATION OF HETERODUPLEXES

In accordance with the present invention, the target sequence is contained within a sample of DNA isolated from an animal or human patient. This DNA may be obtained from any cell source or body fluid. Non-limiting examples of cell sources available in clinical practice include blood cells, buccal cells, cervicovaginal cells, epithelial cells from urine, fetal cells, or any cells present in tissue obtained by biopsy. Body fluids include blood, urine, cerebrospinal fluid, and tissue exudates at the site of infection or inflammation. DNA is extracted from the cell source or body fluid using any of the numerous methods that are standard in the art. It will be understood that the particular method used to extract DNA will depend on the nature of the source. The preferred amount of DNA to be extracted for analysis of human genomic DNA is at least 5 pg (corresponding to about 1 cell equivalent of a genome size of 4 x 10^9 base pairs). In some applications, such as, for example, detection of sequence alterations in the genome of a microorganism, variable amounts of DNA may be extracted.
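
The stated equivalence between about 5 pg of DNA and one genome of 4 x 10^9 base pairs can be checked with a short calculation. The sketch below is illustrative only; the average mass of about 650 g/mol per base pair is a commonly used approximation and is an assumption not taken from this specification, and the function name `genome_mass_pg` is hypothetical.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # assumed average mass of one base pair (not from the source)


def genome_mass_pg(genome_size_bp):
    """Approximate mass, in picograms, of one genome copy of the given size."""
    grams = genome_size_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e12  # grams to picograms


mass = genome_mass_pg(4e9)
print(f"{mass:.1f} pg per genome copy")
```

Under these assumptions one genome copy weighs roughly 4.3 pg, which is consistent with treating 5 pg as about one cell equivalent.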

Once extracted, the sample DNA containing the target sequence may be employed in the present invention without further manipulation. Preferably, one or more specific regions present in the sample DNA may be amplified. In this case, the amplified regions are specified by the choice of particular flanking sequences for use as primers. Amplification at this step provides the advantage of increasing the concentration of specific sequences within the sample DNA population. The length of DNA sequence that can be amplified ranges from 80 bp up to 30 kbp (Saiki et al., 1988, Science, 239:487). Furthermore, the use of amplification primers that are modified by, e.g., biotinylation, allows the selective incorporation of the modification into the amplified DNA.

In one embodiment, the first DNA containing the target sequence, with or without prior amplification of particular sequences, is bound to a solid-phase matrix. This allows the simultaneous processing and screening of a large number of patient or first DNA samples. Non-limiting examples of matrices suitable for use in the present invention include nitrocellulose or nylon filters, glass beads, magnetic beads coated with agents for affinity capture, treated or untreated microtiter plates, and the like. It will be understood by a skilled practitioner that the method by which the DNA is bound to the matrix will depend on the particular matrix used. For example, binding to nitrocellulose can be achieved by simple adsorption of DNA to the filter, followed by baking the filter at 75-80°C under vacuum for 15 min to 2 h. Alternatively, charged nylon membranes can be used that do not require any further treatment of the bound DNA. Beads and microtiter plates that are coated with avidin can be used to bind DNA that has had biotin attached (via, e.g., the use of biotin-conjugated primers). In addition, antibodies can be used to attach DNA to any of the above solid supports by coating the surfaces with the antibodies and incorporating an antibody-specific hapten into the DNA. In a preferred embodiment, DNA that has been amplified using biotinylated primers is bound to streptavidin-coated beads (Dynal, Inc., Milwaukee, WI).

In practicing the present invention, the untreated or amplified first DNA, preferably bound to a solid-phase matrix, is hybridized with a second DNA sample under conditions that favor the formation of mismatch loops. The second DNA sample preferably comprises one or more "wild-type" version(s) of the target sequence. As used herein, a "wild-type" version of a gene is one prevalent in the general population that is not associated with disease (or with any discernible phenotype) and is thus carried by "normal" individuals. In the general population, wild-type genes may include multiple prevalent versions, which contain alterations in sequence relative to each other that cause no discernible pathological effect; these variations are designated "polymorphisms" or "allelic variants". Most preferably, a mixture of DNAs from "normal" individuals is used for the second DNA sample, thus providing a mixture of the most common polymorphisms. This ensures that, statistically, hybrids formed between the first and second DNA sample will be perfectly matched except in the region of the mutation, where discrete mismatch regions will form. In some applications, it is desired to detect polymorphisms; in these cases, appropriate sources for the second DNA sample will be selected accordingly. Depending upon what method is used subsequently to detect mismatches, the wild-type DNA may also be chemically or enzymatically modified, e.g., to remove or add methyl groups.

Hybridization reactions according to the present invention are performed in solutions ranging from about 10 mM NaCl to about 600 mM NaCl, at temperatures ranging from about 37°C to about 65°C. It will be understood that the stringency of a hybridization reaction is determined by both the salt concentration and the temperature; thus, a hybridization performed in 10 mM salt at 37°C may be of similar stringency to one performed in 500 mM salt at 65°C. For the purposes of the present invention, any hybridization conditions may be used that form perfect hybrids between precisely complementary sequences and mismatch loops between non-complementary sequences in the same molecules. Preferably, hybridizations are performed in 600 mM NaCl at 65°C. Following the hybridization step, DNA molecules that have not hybridized to the first DNA sample are removed by washing under stringent conditions, e.g., 0.1X SSC at 65°C.
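
The trade-off between salt concentration and temperature described above can be illustrated numerically. The sketch below is not part of the specification: it assumes a widely used empirical estimate for long DNA duplexes, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 675/N, and treats the distance between the estimated Tm and the incubation temperature as a rough measure of stringency. The GC content (50%), the duplex length (500 bp), and the function names are hypothetical choices, and the estimate is only approximate near the upper end of the salt range.

```python
import math


def melting_temp_c(na_molar, gc_percent, length_bp):
    """Empirical Tm estimate (deg C) for a long DNA duplex; an assumed formula."""
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 675.0 / length_bp


def stringency_margin_c(na_molar, temp_c, gc_percent=50.0, length_bp=500):
    """Degrees below the estimated Tm; a smaller margin means higher stringency."""
    return melting_temp_c(na_molar, gc_percent, length_bp) - temp_c


low_salt = stringency_margin_c(0.010, 37.0)   # 10 mM salt at 37 deg C
high_salt = stringency_margin_c(0.500, 65.0)  # 500 mM salt at 65 deg C
print(f"margin at 10 mM / 37 C:  {low_salt:.1f} C")
print(f"margin at 500 mM / 65 C: {high_salt:.1f} C")
```

Under these assumptions the two conditions sit at nearly the same distance below the estimated Tm, which is one way to read the statement that they are of similar stringency.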

The hybrids formed by the hybridization reaction may then be treated to block any free ends so that they cannot serve as substrates for further enzymatic modification such as, e.g., by RNA ligase. Suitable blocking methods include without limitation removal of 5' phosphate groups, homopolymeric tailing of 3' ends with dideoxynucleotides, and ligation of modified double-stranded oligonucleotides to the ends of the duplex.

2. MISMATCH RECOGNITION AND CLEAVAGE

In the next step, the hybrids are treated so that one or both DNA strands are cleaved within, or in the vicinity of, the mismatch region. Depending on the method used for mismatch recognition and cleavage (see below), cleavage may occur at some predetermined distance from either boundary of the mismatch region, and may occur on the wild-type or mutant strand. The "vicinity" of the mismatch as used herein thus encompasses from 1 to 2000 bases from the borders of the mismatch. Non-limiting examples of mismatch recognition and cleavage systems suitable for use in the present invention include mismatch repair proteins, nucleotide excision repair proteins, bacteriophage resolvases, chemical modification, and combinations thereof. These embodiments are described below.

In general, the mismatch recognition and/or modification proteins necessary for each embodiment described below are isolated using methods that are well known to those skilled in the art. Preferably, when the sequence of a protein is known, the protein-coding region of the relevant gene is isolated from the source organism by subjecting genomic DNA of the organism to amplification using appropriate primers. The isolated protein-coding DNA sequence is cloned into commercially available expression vectors that, e.g., insert an amino acid "purification tag" at either the amino- or carboxy-terminus of the recombinant protein. The recombinant expression vector is then introduced into an appropriate host cell (e.g., E. coli), and the protein is recovered from the cell lysate by affinity chromatography that recognizes the "tag". For example, the bacterial expression vector pQiexl2 is used to express proteins with a polyhistidine tag, allowing purification of the recombinant product by a single step of chromatography on Ni-Sepharose (QiaGen, Chatsworth, CA).

Other methods involve the expression of recombinant proteins carrying glutathione-S-transferase sequences as tags, allowing purification of the recombinant products on glutathione affinity columns (Pharmacia Biotech, Uppsala, Sweden). If necessary, proteins containing purification tags are then treated so as to remove the tag sequences. Alternatively, the protein may be isolated from its cell of origin using standard protein purification techniques well known in the art, including, e.g., molecular sieve, ion-exchange, and hydrophobic chromatography; and isoelectric focusing. "Isolation" as used herein denotes purification of the protein to the extent that it can carry out its function in the context of the present invention without interference from extraneous proteins or other contaminants derived from the source cells.

The mismatch recognition and modification proteins used in practicing the present invention may be derived from any species, from E. coli to humans, or mixtures thereof. Typically, functional homologues for a given protein exist across phylogeny. A "functional homologue" of a given protein as used herein is another protein that can functionally substitute for the first protein, either in vivo or in a cell-free reaction.

Mismatch repair proteins:

A number of different enzyme systems exist across phylogeny to repair mismatches that form during DNA replication. In E. coli, one system involves the MutY gene product, which recognizes A/G mismatches and cleaves the A-containing strand (Tsai-Wu et al., J. Bacteriol. 173:1902, 1991). Another system in E. coli utilizes the coordinated action of the MutS, MutL, and MutH proteins to recognize errors in newly synthesized DNA strands specifically by virtue of their transient state of undermethylation (prior to their being acted upon by dam methylase in the normal course of replication). Cleavage typically occurs at a hemimethylated GATC site within 1-2 kb of the mismatch, followed by exonucleolytic cleavage of the strand in either a 3'-5' or 5'-3' direction from the nick to the mismatch. In vivo, this is followed by re-synthesis involving DNA polymerase III holoenzyme and other factors (Cleaver, Cell, 76:1-4, 1994).
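
As a rough illustration of where such a cleavage could fall, the following Python sketch locates the GATC site nearest to a mismatch position within a chosen window. It is a simplification made for illustration only: the methylation state of the sites is ignored, a single strand written 5' to 3' is searched, and the function name `nearest_gatc`, the 2000-base default window, and the example sequence are hypothetical.

```python
def nearest_gatc(strand, mismatch_pos, max_distance=2000):
    """Return the 0-based start of the GATC site closest to mismatch_pos, or None.

    Methylation is not modelled; every GATC occurrence is treated as a candidate.
    """
    best = None
    start = strand.find("GATC")
    while start != -1:
        distance = abs(start - mismatch_pos)
        if distance <= max_distance and (
            best is None or distance < abs(best - mismatch_pos)
        ):
            best = start
        start = strand.find("GATC", start + 1)
    return best


# Hypothetical strand with GATC sites starting at indices 2 and 17.
strand = "AA" + "GATC" + "A" * 10 + "G" + "GATC" + "TT"
print(nearest_gatc(strand, mismatch_pos=12))
```

The distance between the returned site and the mismatch gives the length of strand that the exonucleolytic step would need to remove to expose the mismatch.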

Mismatch repair proteins for use in the present invention may be derived from E. coli (as described above) or from any organism containing mismatch repair proteins with appropriate functional properties. Non-limiting examples of useful proteins include those derived from Salmonella typhimurium (MutS, MutL); Streptococcus pneumoniae (HexA, HexB); Saccharomyces cerevisiae ("all-type", MSH2, MLH1, MSH3); Schizosaccharomyces pombe (SWI4); mouse (rep1, rep3); and human ("all-type", hMSH2, hMLH1, hPMS1, hPMS2, duc1). Preferably, the "all-type" mismatch repair system from human or yeast cells is used (Chang et al., Nuc. Acids Res. 19:4761, 1991; Yang et al., J. Biol. Chem. 266:6480, 1991). In a preferred embodiment, heteroduplexes formed between patients' DNA and wild-type DNA as described above are incubated with human "all-type" mismatch repair activity that is purified essentially as described in International Patent Application WO/93/20233.

Incubations are performed in, e.g., 10mM Tris-HCl pH 7.6, 10mM ZnCl2, 1mM dithiothreitol, 1mM EDTA and 2.9% glycerol at 37°C for 1-3 hours. In another embodiment, purified MutS, MutL, and MutH are used to cleave mismatch regions (Su et al., Proc. Natl. Acad. Sci. USA 83:5057, 1986; Grilley et al., J. Biol. Chem. 264:1000, 1989).

Nucleotide excision repair proteins:
In E. coli, four proteins, designated UvrA, UvrB, UvrC, and UvrD, interact to repair nucleotides that are damaged by UV light or otherwise chemically modified (Sancar, Science 266:1954, 1994), and also to repair mismatches (Huang et al., Proc. Natl. Acad. Sci. USA 91:12213, 1994). UvrA, an ATPase, makes an A2B1 complex with UvrB, binds to the site of the lesion, unwinds and kinks the DNA, and causes a conformational change in UvrB that allows it to bind tightly to the lesion site. UvrA then dissociates from the complex, allowing UvrC to bind. UvrB catalyzes an endonucleolytic cleavage at the fifth phosphodiester bond 3' from the lesion; UvrC then catalyzes a similar cleavage at the eighth phosphodiester bond 5' from the lesion. Finally, UvrD (helicase II) releases the excised oligomer. In vivo, DNA polymerase I displaces UvrB and fills in the excision gap, and the patch is ligated.
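
By way of illustration only, the incision positions recited above can be turned into coordinates for the excised oligomer. The following is a minimal sketch under a stated assumption: bond 1 on each side is taken to be the phosphodiester bond directly flanking the lesion, a counting convention the text does not spell out. The function name `uvr_excised_fragment` and the toy strand are hypothetical.

```python
def uvr_excised_fragment(strand, lesion_start, lesion_end=None,
                         bonds_3prime=5, bonds_5prime=8):
    """Return (start, end, fragment) for the oligomer released by dual incision.

    `strand` is written 5'->3'; indices are 0-based and inclusive.
    ASSUMPTION: bond 1 on either side is the phosphodiester bond that
    directly flanks the lesion, so cutting at bond k leaves k-1 undamaged
    nucleotides on that side of the lesion in the excised fragment.
    """
    if lesion_end is None:
        lesion_end = lesion_start          # single-nucleotide lesion or mismatch
    start = lesion_start - (bonds_5prime - 1)
    end = lesion_end + (bonds_3prime - 1)
    if start < 0 or end >= len(strand):
        raise ValueError("incision site falls outside the strand")
    return start, end, strand[start:end + 1]


if __name__ == "__main__":
    toy_strand = "ACGT" * 10               # hypothetical 40-nt strand
    start, end, fragment = uvr_excised_fragment(toy_strand, lesion_start=20)
    print(start, end, len(fragment), fragment)
```

Under this convention a single mispaired nucleotide yields a 12-nucleotide fragment and a two-nucleotide lesion yields 13; a different counting convention would shift these values by one.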

In one embodiment of the present invention, heteroduplexes formed between patients' DNA and wild-type DNA are treated with a mixture of UvrA, UvrB, and UvrC, with or without UvrD. As described above, the proteins may be purified from wild-type E. coli, or from E. coli or other appropriate host cells containing recombinant genes encoding the proteins, and are formulated in compatible buffers and concentrations. The final product is a heteroduplex containing a single-stranded gap covering the site of the mismatch.

Excision repair proteins for use in the present invention may be derived from E. coli (as described above) or from any organism containing appropriate functional homologues. Non-limiting examples of useful homologues include those derived from S. cerevisiae (RAD1, 2, 3, 4, 10, 14, and 25) and humans (XPF, XPG, XPD, XPC, XPA, ERCC1, and XPB) (Sancar, Science 266:1954, 1994). When the human homologues are used, the excised patch comprises an oligonucleotide extending 5 nucleotides from the 3' end of the lesion and 24 nucleotides from the 5' end of the lesion. Aboussekhra et al. (Cell 80:859, 1995) disclose a reconstituted in vitro system for nucleotide excision repair using purified components derived from human cells.

Chemical Mismatch Recognition:

Heteroduplexes formed between patients' DNA and wild-type DNA according to the present invention may be chemically modified by treatment with osmium tetroxide (for mispaired thymidines) and hydroxylamine (for mispaired cytosines), using procedures that are well known in the art (see, e.g., Grompe, Nature Genetics 5:111, 1993; and Saleeba et al., Meth. Enzymol. 217:288, 1993). In one embodiment, the chemically modified DNA is contacted with excision repair proteins (as described above). The hydroxylamine- or osmium-modified bases are recognized as damaged bases in need of repair, one of the DNA strands is selectively cleaved, and the product is a gapped heteroduplex as above.
Resolvases:

Resolvases are enzymes that catalyze the resolution of branched DNA intermediates that form during recombination events (including Holliday structures, cruciforms, and loops) via recognition of bends, kinks, or DNA deviations (Youil et al., Proc. Natl. Acad. Sci. USA 92:87, 1995). For example, endonuclease VII derived from bacteriophage T4 (T4E7) recognizes mismatch regions of from one to about 50 bases and produces double-stranded breaks within six nucleotides from the 3' border of the mismatch region. T4E7 may be isolated from, e.g., a recombinant E. coli that overexpresses gene 49 of T4 phage (Kosak et al., Eur. J. Biochem. 194:779, 1990). Another suitable resolvase for use in the present invention is endonuclease I of bacteriophage T7 (T7E1), which can be isolated using a polyhistidine purification tag sequence (Mashal et al., Nature Genetics 9:177, 1995).

In a preferred embodiment, heteroduplexes formed between patients' DNA and wild-type DNA as described above are incubated in a 50 µl reaction with 100-3000 units of T4E7 for 1 hour at 37°C.
3. SEQUENCE DETERMINATION
In practicing the present invention, immobilized DNA from a patient is hybridized to wild-type DNA to form mismatch regions and then treated with mismatch repair proteins, excision repair proteins, resolvases, chemical modification and cleavage reagents, or combinations of such agents, to introduce single- or double-stranded breaks at some predetermined location relative to the site of the mismatch regions.

In one embodiment, the introduction of single-stranded breaks at predetermined locations on one or both sides of a mismatch region causes the selective excision of a single-stranded fragment covering the mismatch region.
The resulting structure is a gapped heteroduplex in which the gap may be from about 5 to about 2000 bases in length, depending on the mismatch recognition system used.

To determine the nucleotide sequence of the excised region (including the mismatch), the heteroduplexes are incubated with an appropriate DNA polymerase enzyme in the presence of dideoxynucleotides. Suitable enzymes for use in this step include without limitation DNA polymerase I, DNA polymerase III holoenzyme, T4 DNA polymerase, and T7 DNA polymerase. The only requirement is that the enzyme be capable of accurate DNA synthesis using the gapped heteroduplex as a substrate. The presence of dideoxynucleotides, as in a Sanger sequencing reaction, ensures that a nested set of premature termination products will be produced, and that resolution of these products by, e.g., gel electrophoresis will display the DNA sequence across the gap.
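
As an illustrative sketch only, the way in which a nested set of termination products displays the sequence across the gap may be modelled as follows. The function names and the toy template are assumptions introduced for illustration; the model idealizes the reaction by assuming that termination occurs at every position and that every product length is resolved.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def synthesized_strand(template_3to5):
    """Bases added 5'->3' by the polymerase while copying a template read 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)


def termination_ladder(template_3to5):
    """For each ddNTP lane, list the lengths (nucleotides added) of its terminated products."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized_strand(template_3to5), start=1):
        lanes[base].append(length)     # a ddNTP of this base stops synthesis here
    return lanes


def read_ladder(lanes):
    """Read the four lanes from shortest to longest product to recover the sequence."""
    base_by_length = {n: base for base, lengths in lanes.items() for n in lengths}
    return "".join(base_by_length[n] for n in sorted(base_by_length))


if __name__ == "__main__":
    gap_template = "TACGGA"            # hypothetical template exposed by the gap, 3'->5'
    lanes = termination_ladder(gap_template)
    print(lanes)
    print(read_ladder(lanes))
```

Reading the lanes in order of increasing product length returns the newly synthesized strand, which is the complement of the template strand spanning the gap.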

In some circumstances, the sequence obtained using this method will correspond to the wild-type strand and not to the patient's DNA in which the mutation is sought. This result is easily accommodated by a second round of sequencing, with or without prior amplification of the relevant DNA region. In this case, the sequence of the mutation is determined using as a template the patient's unmodified DNA in conjunction with sequencing primers derived from the sequence determined in the first round.

In an alternative embodiment of sequence determination, the hybrids formed between the wild-type DNA and the patient's DNA are then dissociated by denaturation, and the wild-type DNA and any cleavage products of the target DNA are removed by washing. The immobilized remaining target DNA is then ligated to a synthetic single-stranded oligonucleotide of predetermined sequence, designated a "ligation oligonucleotide", that serves as a primer for enzymatic DNA sequencing. The oligonucleotide may be from about 15 to about 25 nucleotides in length. A preferred ligation oligonucleotide has the sequence 5'-CAGTAGTACAACTGACCCTTTTGGGACCGC-3'. Ligation is achieved using, e.g., RNA ligase (Pharmacia Biotech, Uppsala, Sweden). A typical ligation reaction is performed at 37°C for 15 min in a 20 µl reaction containing 50mM Tris-HCl, pH 7.5, 10mM MgCl2, 20mM dithiothreitol, 1mM ATP, 100 µg/ml bovine serum albumin, at least 1 µg immobilized target DNA, a 10-fold molar excess of the ligation oligonucleotide, and 0.1-5.0 units/ml T4 RNA ligase. Following the ligation, unligated oligonucleotides are removed by washing.
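
The 10-fold molar excess recited above depends on the length of the immobilized target, which the text leaves open. The following is a rough worked sketch, assuming an average mass of about 330 g/mol per nucleotide of single-stranded DNA (a common approximation, not a value from the text) and a hypothetical 500-nucleotide target; the function names are likewise illustrative.

```python
AVG_NT_MASS_G_PER_MOL = 330.0   # ASSUMPTION: approximate average mass per ssDNA nucleotide


def ssdna_pmol(mass_ug, length_nt):
    """Convert a mass of single-stranded DNA (µg) to picomoles of molecules."""
    return mass_ug * 1e6 / (length_nt * AVG_NT_MASS_G_PER_MOL)


def ligation_oligo_needed(target_ug, target_len_nt, oligo_len_nt=30, fold_excess=10):
    """Return (pmol, µg) of ligation oligonucleotide for the requested molar excess."""
    oligo_pmol = fold_excess * ssdna_pmol(target_ug, target_len_nt)
    oligo_ug = oligo_pmol * oligo_len_nt * AVG_NT_MASS_G_PER_MOL / 1e6
    return oligo_pmol, oligo_ug


if __name__ == "__main__":
    # 1 µg of a hypothetical 500-nt target and the 30-nt preferred oligonucleotide
    pmol, ug = ligation_oligo_needed(target_ug=1.0, target_len_nt=500)
    print(f"{pmol:.1f} pmol, {ug:.2f} µg of ligation oligonucleotide")
```

Because the same per-nucleotide mass is assumed for target and oligonucleotide, the required mass depends only on the length ratio and the fold excess.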

The sequence of DNA immediately adjacent to the ligated oligonucleotide is then determined by any method known in the art. In one embodiment, enzymatic sequencing is performed according to the dideoxy Sanger technique, using as a sequencing primer a second oligonucleotide of predetermined sequence that is complementary to the ligation oligonucleotide (Sanger et al., Proc. Natl. Acad. Sci. USA 74:5463, 1977). Each microsequencing reaction is then resolved by techniques well-known in the art, including without limitation gel electrophoresis, and the sequence is determined.
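
For illustration, an oligonucleotide complementary to the preferred ligation oligonucleotide can be derived by taking its reverse complement. This is a minimal sketch; the text does not specify the length of the second oligonucleotide, so using the full-length complement as the primer is an assumption, as are the names used below.

```python
LIGATION_OLIGO = "CAGTAGTACAACTGACCCTTTTGGGACCGC"   # 5'->3', as given in the text

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq_5to3):
    """Return the complementary strand, also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq_5to3))


if __name__ == "__main__":
    # ASSUMPTION: the sequencing primer is the full-length complement
    primer = reverse_complement(LIGATION_OLIGO)
    print(len(LIGATION_OLIGO), "nt ligation oligonucleotide")
    print("5'-" + primer + "-3'")
```

A shorter primer could be taken as any contiguous portion of this complement.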

In another embodiment, an oligonucleotide complementary to the ligated oligonucleotide is used to prime DNA synthesis using DNA polymerase I in the presence of all four nucleoside triphosphates. The newly synthesized strand is then analyzed using hybridization to oligonucleotide arrays as described in Pease et al., Proc. Natl. Acad. Sci. USA 91:5022, 1994.

Identification of a sequence alteration according to the present invention is preferably achieved in a single round of mismatch recognition and cleavage, oligonucleotide ligation, and DNA sequencing. This occurs when the ligated oligonucleotide becomes covalently attached a) to the immobilized truncated target DNA that contains the alteration and b) within 10-500 bp of either boundary of the mismatch region. If either of these conditions is not fulfilled, further rounds of sequencing may be required to localize and identify the sequence alteration. It will be understood by those of ordinary skill in the art that sequencing primers for one or more further rounds of sequencing will be dictated by the sequence obtained in the first round (either the same or complementary strands). Without wishing to be bound by theory, it is contemplated that one or two sequencing rounds will reveal the divergence between a known wild-type sequence and that contained within the DNA of a particular patient (see below).
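
Once a patient sequence has been determined, revealing its divergence from the known wild-type sequence is a comparison step. The following is a minimal sketch using Python's general-purpose `difflib.SequenceMatcher` as a stand-in for a dedicated sequence aligner; the function name and the toy sequences are hypothetical, and the placement of insertions or deletions within repeated bases may be ambiguous.

```python
from difflib import SequenceMatcher

# Map difflib's edit tags onto the alteration types named in the text
ALTERATION_NAMES = {"replace": "substitution", "delete": "deletion", "insert": "addition"}


def find_alterations(wild_type, patient):
    """List (type, wild-type position, wild-type bases, patient bases) for each difference.

    Positions are 0-based offsets into the wild-type sequence.
    """
    matcher = SequenceMatcher(None, wild_type, patient, autojunk=False)
    alterations = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            alterations.append((ALTERATION_NAMES[tag], i1, wild_type[i1:i2], patient[j1:j2]))
    return alterations


if __name__ == "__main__":
    wild_type = "GGTCAACGAGCAAG"   # hypothetical wild-type read
    patient = "GGTCAATGAGCAAG"     # hypothetical patient read with one substitution
    for alteration in find_alterations(wild_type, patient):
        print(alteration)
```

Identical sequences yield an empty list, which corresponds to the case in which a further round of sequencing may be needed to reach the alteration.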

High-Throughput Applications

The methods of the present invention are particularly suitable for high-throughput analysis of DNA, i.e., the rapid and simultaneous processing of DNA samples derived from a large number of patients. Furthermore, in contrast to other methods for de novo mutation detection, the methods of the present invention are suitable for the simultaneous analysis of a large number of genetic loci in a single reaction; this is designated "multiplex" analysis. Therefore, for any one sample or for a multiplicity of samples, the present invention allows the analysis of both intragenic loci (several regions within a single gene) and intergenic loci (several regions within different genes) in a single reaction mixture. The manipulations involved in practicing the methods of the present invention lend themselves to automation, e.g., using multiwell microtiter dishes as a solid support or as a receptacle for, e.g., beads; robotics to perform sequential incubations and washes; and, finally, automated sequencing using commercially available automated DNA sequencers. It is contemplated that, in a clinical context, 500 patient DNA samples can be analyzed within 1-2 days in a cost-effective manner.

Positional Cloning

The methods of the present invention are also suitable for positional cloning of unknown genes that cause pathological conditions or other detectable phenotypes in any organism. "Positional cloning" as used herein denotes a process by which a previously unknown disease-causing gene is localized and identified. For example, identification of multiplex families in which several members exhibit signs of a genetically-based syndrome often occurs even when the particular gene responsible for the syndrome has not been identified. Typically, the search for the unknown gene involves one or more of the following time- and labor-intensive steps: 1) cytogenetic localization of the gene to a relatively large segment of a particular chromosome; 2) assembly of overlapping cosmid or P1 clones that collectively cover several hundred thousand nucleotides corresponding to the identified chromosomal region; 3) sequencing the clones; and 4) transcript mapping to identify expressed protein-encoding regions of the gene.

The present invention offers an alternative, cost-effective method for localizing a disease-causing gene. Briefly, DNA from affected individuals is hybridized with normal DNA as described above to form mismatch regions at the site of the mutation. Preferably, large regions of DNA corresponding to the chromosomal location are amplified from the patient's genomic DNA prior to inclusion in the hybridization reaction. The hybrids are then treated by any of the methods described above so that mismatch regions are recognized and cleaved, forming gapped heteroduplexes across the mismatch region. Finally, the sequence in the vicinity of the mismatch region is determined.

In this embodiment, determination of even a short sequence in the vicinity of the mismatch facilitates definitive identification of the disease-causing gene. The short sequence that is determined in the first round of sequencing can be used to design oligonucleotide probes for use in screening genomic or cDNA libraries. Other methods in which the primary sequence information can be used, either alone or in conjunction with library screening, include identification of tissue specific expression, reverse transcription-amplification of mRNA, and screening of an affected population for genotype/phenotype association. Thus, without wishing to be bound by theory, it is contemplated that a previously unknown gene that causes a disease or other phenotype can be quickly and efficiently identified by these methods.

The following examples are intended to illustrate the present invention without limitation.

Example 1: Preparation of Target DNA

A) Preparation of Sample DNA from Blood

Whole blood samples collected in high glucose ACD Vacutainers (yellow top) were centrifuged and the buffy coat collected. The white cells were lysed with two washes of a 10:1 (v/v) mixture of 14mM NH4Cl and 1mM NaHCO3, their nuclei were resuspended in nuclei-lysis buffer (10mM Tris, pH 8.0, 0.4M NaCl, 2mM EDTA, 0.5% SDS, 500 µg/ml proteinase K) and incubated overnight at 37°C. Samples were then extracted with a one-fourth volume of saturated NaCl and the DNA was precipitated in ethanol. The DNA was then washed with 70% ethanol, dried, and dissolved in TE buffer (10mM Tris-HCl, pH 7.5, 1mM EDTA).

B) Preparation of Sample DNA from Buccal Cells

Buccal cells were collected on a sterile cytology brush (Scientific Products) or female dacron swab (Medical Packaging Corp.) by twirling the brush or swab on the inner cheek for 30 seconds. DNA was prepared as follows, immediately or after storage at room temperature or at 4°C. The brush or swab was immersed in 600 µl of 50mM NaOH contained in a polypropylene microcentrifuge tube and vortexed. The tube, still containing the brush or swab, was heated at 95°C for 5 min, after which the brush or swab was carefully removed. The solution containing DNA was then neutralized with 60 µl of 1M Tris, pH 8.0, and vortexed again (Mayall et al., J. Med. Genet. 27:658, 1990). The DNA was stored at 4°C.

C) Amplification of Target DNA Prior to Hybridization

DNA from patients with CF was amplified by PCR in a Perkin-Elmer Cetus 9600 Thermocycler. Five primer sets were used to simultaneously amplify relevant regions of exons 4, 10, 20, and 21 of the cystic fibrosis transmembrane conductance regulator (CFTR) gene (Richards et al., Hum. Mol. Gen. 2:159, 1993). The 50 µl PCR reaction mix contained the following components: 0.2-1 µg CF patient DNA, 10mM Tris pH 8.3, 50mM KCl, 1.5mM MgCl2, 0.01% (w/v) gelatin, 200µM of each deoxynucleotide triphosphate, 0.4µM of each amplification primer, and 2.5 units of Taq polymerase. An initial denaturation was performed by incubation at 94°C for 20 seconds, followed by 28 cycles of amplification, each consisting of 10 seconds at 94°C, 10 seconds at 55°C, and 10 seconds at 74°C, and then a final soak at 74°C for 5 min. Following amplification, 8 µl of the PCR products were electrophoresed in a 2% agarose gel to verify the presence of all five products.
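
The cycling conditions and reagent concentrations above lend themselves to simple bookkeeping. The following is an illustrative sketch only: the `Step` class and function names are hypothetical, and the computed time counts programmed holds only, ignoring the ramp times of the thermocycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int
    seconds: int


# Values taken from the amplification protocol described in the text
INITIAL_DENATURATION = Step(94, 20)
CYCLE = (Step(94, 10), Step(55, 10), Step(74, 10))
N_CYCLES = 28
FINAL_SOAK = Step(74, 5 * 60)
REACTION_VOLUME_UL = 50


def programmed_hold_seconds():
    """Sum of all programmed hold times (ramp times between temperatures are ignored)."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL_DENATURATION.seconds + N_CYCLES * per_cycle + FINAL_SOAK.seconds


def amount_pmol(conc_um, volume_ul=REACTION_VOLUME_UL):
    """A concentration in µM multiplied by a volume in µl gives an amount in pmol."""
    return conc_um * volume_ul


if __name__ == "__main__":
    total = programmed_hold_seconds()
    print(f"programmed hold time: {total} s ({total / 60:.1f} min)")
    print(f"each primer: {amount_pmol(0.4):.0f} pmol per reaction")
    print(f"each dNTP: {amount_pmol(200) / 1000:.0f} nmol per reaction")
```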

D) Binding of DNA to a Solid Matrix:

For binding of amplified DNA to a solid support, the amplification reactions described above are performed in the presence of biotinylated primers. The biotinylated products are then incubated with Dynabeads M-280 Streptavidin (Dynal) in a solution containing 10 mM Tris HCl, pH 7.5, 1 mM EDTA, 2M NaCl, and 0.1% Tween-20 for 15-30 minutes at 48°C.

Example 2: Hybridization of target DNA and wild-type DNA

A) Preparation of wild-type DNA:

DNA is prepared from blood or buccal cells of healthy individuals as described in Example 1. A representative "wild-type" DNA sample is prepared by combining and thoroughly mixing DNA samples derived from 10-200 individuals.
B) Hybridization Reaction:

Hybridizations are carried out in microtiter dishes containing bead-immobilized DNA prepared as in Example 1D above. The hybridization solution contains approximately 500 µg/ml wild-type DNA (prepared as in Example 2A above) and approximately 50 µg/ml amplified immobilized target DNA (prepared as in Example 1) in 10 mM Tris HCl, pH 7.5, 650 mM NaCl. The reaction mixtures are heated at 90°C for 3 minutes, after which hybridizations are allowed to proceed for 1 hour at 65°C. The hybridization solution is then removed and the beads are washed three times in 0.1x SSC at 65°C.

C) Blocking of free ends:

The beads containing DNA:DNA hybrids prepared as described above are treated so that free ends become blocked and no longer accessible to modification by, e.g., RNA ligase. The wells are incubated in 100 µl of a solution containing 0.4M potassium cacodylate, 50 mM Tris HCl, pH 6.9, 4 mM dithiothreitol, 1 mM CoCl2, 2mM ddGTP, 500 µg/ml bovine serum albumin, and 2 units of terminal transferase for 15 minutes at 37°C.

Example 3: Mismatch recognition, cleavage, and sequencing

A) In one embodiment of the present invention, four identical reaction mixtures, each containing 50 µl beads to which DNA hybrids prepared as described in Example 2 are bound, are incubated with 2 µl of a 10X T4 Polymerase buffer (50 mM NaCl, 10 mM Tris-HCl, pH 7.9, 10 mM MgCl2, 1mM dithiothreitol, and 1 mg/ml bovine serum albumin); 16 µl water; 1 µl T4 endonuclease 7 (250-3000 units, obtained as described in Kosak et al., Eur. J. Biochem. 194:779, 1990); and 1 µl T7 DNA polymerase (3 units). The reaction is allowed to proceed for 1-10 minutes at 37°C.

9 µl of a "termination mix" is then added to each reaction. "Termination mix" contains 8 µM of a single ddNTP (i.e., ddGTP, ddATP, ddTTP, or ddCTP) and 80 µM of all four dNTPs, one of which is labelled with a radioactive or fluorescent label. In addition, 1 µl of 10X T4 polymerase buffer is added, and the reaction is allowed to proceed for 5 minutes at 37°C.
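
The volumes recited in this example can be checked with simple dilution arithmetic. The sketch below assumes, as the text does not state, that the bead-bound DNA contributes no liquid volume of its own (i.e., the beads are pelleted before the reagents are added); the dictionary and function names are hypothetical.

```python
# Liquid additions in µl, taken from the text.
# ASSUMPTION: the beads themselves contribute no liquid volume.
STAGE1_UL = {"10X buffer": 2, "water": 16, "T4 endonuclease 7": 1, "T7 DNA polymerase": 1}
STAGE2_UL = {"termination mix": 9, "10X buffer": 1}


def dilute(stock_conc, stock_volume_ul, total_volume_ul):
    """Final concentration after a stock is brought to the given total volume."""
    return stock_conc * stock_volume_ul / total_volume_ul


def reaction_summary():
    v1 = sum(STAGE1_UL.values())            # cleavage and extension stage
    v2 = v1 + sum(STAGE2_UL.values())       # after adding termination mix and extra buffer
    buffer_ul = STAGE1_UL["10X buffer"] + STAGE2_UL["10X buffer"]
    return {
        "stage1_volume_ul": v1,
        "stage2_volume_ul": v2,
        "buffer_x_stage1": dilute(10, STAGE1_UL["10X buffer"], v1),
        "buffer_x_stage2": dilute(10, buffer_ul, v2),
        "ddNTP_uM_final": dilute(8, STAGE2_UL["termination mix"], v2),
        "dNTP_uM_final": dilute(80, STAGE2_UL["termination mix"], v2),
    }


if __name__ == "__main__":
    for name, value in reaction_summary().items():
        print(f"{name}: {value}")
```

Under this assumption, the extra 1 µl of 10X buffer keeps the buffer at 1X after the termination mix is added, and the dNTP to ddNTP ratio remains 10:1 regardless of the final volume.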

The reaction mix is removed and the beads are washed three times with 100 µl TE (10 mM Tris-HCl, pH 7.5, 1 mM EDTA). Finally, the beads are resuspended in 6 µl gel loading buffer (95% formamide, 20 mM EDTA, 0.05% bromphenol blue, 0.05% Xylene Cyanol FF). The buffer is removed from the beads and loaded on a 6% denaturing polyacrylamide DNA sequencing gel.

B) Alternatively, 50 µl beads containing DNA hybrids prepared as described in Example 2 are incubated with 500 units of T4 endonuclease 7 in a solution containing 50 mM Tris-HCl, pH 8.0, 10 mM MgCl2, and 1 mM dithiothreitol for 30 minutes at 37°C. T4 endonuclease 7 is obtained as described in Kosak et al., Eur. J. Biochem. 194:779, 1990.

After the incubation, the beads are heated to 90°C for three minutes, after which the solution is quickly removed and replaced with prewarmed TE, and the beads are washed three times with TE at room temperature. This procedure effectively denatures DNA:DNA hybrids and removes wild-type DNA strands.

Example 4: Mismatch recognition and cleavage using chemical mismatch cleavage

In one embodiment of the present invention, microtiter wells prepared as described in Examples 1 and 2 above are treated sequentially with hydroxylamine and osmium tetroxide.

A) Hydroxylamine treatment:

Hydroxylamine (obtained from Aldrich, Milwaukee, WI) is dissolved in distilled water, and the pH is adjusted to 6.0 with diethylamine (Aldrich) so that the final concentration is about 2.5 M. 200 µl of the solution are incubated within the wells at 37°C for 2 hours. The reaction is stopped by replacing the hydroxylamine solution with an ice-cold solution containing 0.3 M sodium acetate, 0.1mM EDTA, pH 5.2, and 25 µg/ml yeast tRNA (Sigma, St. Louis, MO). The wells are then washed in an ice-cold solution of 10mM Tris-HCl, pH 7.7, 1mM EDTA prior to osmium tetroxide treatment.

B) Osmium tetroxide treatment:

Osmium tetroxide (Aldrich) is dissolved in 10mM Tris-HCl, pH 7.7, 1mM EDTA, and 1.5% (v/v) pyridine to a concentration of 4% (w/v). The wells are incubated with this solution for 2 hours at 37°C. The reaction is stopped by replacing the osmium tetroxide solution with an ice-cold solution containing 0.3 M sodium acetate, 0.1mM EDTA, pH 5.2, and 25 µg/ml yeast tRNA.

C) Piperidine cleavage:
Chemical cleavage of the C and T bases that react with hydroxylamine or osmium tetroxide is achieved by incubating the dishes with 1M piperidine at 90°C for 30 min. The wells are then washed extensively with distilled water.

Example 5: Sequencing of mismatch regions

Immobilized DNAs prepared as described in Examples 1 and 2 above and subjected to mismatch recognition and cleavage (as described in Examples 3B or 4 above or by other methods) are incubated with a single-stranded oligonucleotide having the sequence 5'-CAGTAGTACAACTGACCCTTTTGGGACCGC-3' under conditions in which efficient ligation of the oligonucleotide to free 5' ends is achieved. The oligonucleotide and immobilized DNA are combined in a solution containing 50 mM Tris HCl, pH 7.5, 10 mM MgCl2, 20 mM dithiothreitol, 1 mM ATP, and 100 µg/ml bovine serum albumin, after which RNA ligase (Pharmacia Biotech, Uppsala, Sweden) is added to the solution to achieve a final enzyme concentration of 0.1-5.0 U/ml. The reaction is allowed to proceed at 37°C for 15 min. Following the ligation reaction, the solution is removed, and the wells are washed with distilled water. DNA sequencing is then performed using the Sanger method (Sanger et al., Proc. Natl. Acad. Sci. USA 74:5463, 1977).

Example 6: Positional cloning of a disease-causing gene

The experiments described below are performed to rapidly localize and sequence a genomic region corresponding to a disease-causing gene.

A multiplex family in which a genetic disease is expressed is identified using standard clinical indicators. DNA samples are obtained from affected and unaffected individuals as described in Example 1 above; if by patterns of transmission the disease appears to be an autosomal recessive syndrome, DNA samples are obtained from those individuals presumptively heterozygous for the disease gene.

In one embodiment, all DNA samples are subjected to mismatch analysis by hybridization to wild-type DNA as described in Example 2 above. The hybrids are then treated with mismatch repair proteins to form a gapped heteroduplex, and the sequence across the gap is determined as described in Example 3A above.

In an alternative embodiment, all DNA samples are subjected to mismatch analysis by hybridization to wild-type DNA as described in Example 2 above. The hybrids are then treated with T4 endonuclease 7 as described in Example 3B above. Finally, an oligonucleotide having the sequence 5'-CAGTAGTACAACTGACCCTTTTGGGACCGC-3' is ligated to the cleaved hybrids using RNA ligase, and the products are subjected to enzymatic DNA sequencing as described in Example 5 above.

The sequences obtained from unaffected, affected, and presumptively heterozygous family members are compared with each other and with available sequence databases, using, for example, Sequencher (Gene Codes, Ann Arbor, MI) and Assembly Lign (Kodak, New Haven, CT). The sequences also serve as the basis for design of oligonucleotide probes, which are chemically synthesized and used to probe human genomic DNA libraries.


Claims (36)

What is claimed is:
1. A method for identifying one or more genetic alterations in a target sequence present in a first genomic DNA sample, which comprises:
a) hybridizing said DNA sample with a second DNA sample, wherein said second sample does not contain the alteration(s), to form heteroduplex DNA containing a mismatch region at the site of an alteration(s);
b) cleaving one strand of said heteroduplex in the target sequence to form a single-stranded gap across the site of said alteration(s);
c) treating said cleaved heteroduplex with a DNA polymerase in the presence of dideoxynucleotides to determine the nucleotide sequence across said gap; and
d) comparing said nucleotide sequence with a predetermined cognate wild-type sequence to identify said genetic alteration(s).
2. The method of claim 1, 16, 24 or 32, wherein the alterations are selected from the group consisting of additions, deletions, and substitutions of one or more nucleotides and combinations thereof.
3. The method of claim 1, 16, 24 or 32, wherein said target sequence is amplified prior to the hybridizing step.
4. The method of claim 1, 16 or 24, wherein the first DNA sample is immobilized on a solid support prior to the hybridizing step.
5. The method of claim 4, wherein the solid support is selected from the group consisting of nitrocellulose filter, nylon filter, glass beads, and plastic.
6. The method of claim 1, 21, 24 or 35, wherein said cleaving step comprises exposing said heteroduplex DNA to one or more resolvase proteins under conditions appropriate for mismatch recognition and cleavage.
7. The method of claim 6, wherein the resolvases are selected from the group consisting of T4 endonuclease 7 and T7 endonuclease 1.
8. The method of claim 1 or 21, wherein said DNA polymerase is selected from the group consisting of DNA polymerase I, DNA polymerase III, T7 DNA polymerase, and T4 DNA polymerase.
9. The method of claim 1, 21, 24 or 35, wherein said cleaving step comprises exposing said heteroduplex DNA to one or more mismatch repair proteins under conditions appropriate for mismatch recognition, cleavage, and excision.
10. The method of claim 9, wherein the one or more mismatch repair proteins comprise Escherichia coli proteins MutS, MutL, MutH, and MutU, or functional homologues thereof.
11. The method of claim 10, wherein the functional homologues are derived from species selected from the group consisting of Salmonella typhimurium, Streptococcus pneumoniae, Saccharomyces cerevisiae, Schizosaccharomyces pombe, mouse and human.
12. The method of claim 1, 21, 24 or 35, wherein said cleaving step comprises exposing said heteroduplex DNA to a mixture of nucleotide excision repair proteins under conditions appropriate for mismatch recognition, cleavage, and excision.
13. The method of claim 12, wherein the mixture comprises E. coli proteins UvrA, UvrB, UvrC, and UvrD, or functional homologues thereof.
14. The method of claim 13, wherein the functional homologues are derived from species selected from the group consisting of Saccharomyces cerevisiae and human.
15. The method of claim 1 or 24, further comprising determining the complement of said nucleotide sequence using said first DNA as a template.
16. A method for identifying one or more genetic alterations in a target sequence present in a first genomic DNA sample, which comprises:
a) hybridizing the first DNA sample with a second DNA sample, wherein said second sample does not contain the alteration(s), to form heteroduplex DNA containing a mismatch region at the site of an alteration(s);
b) treating said heteroduplex DNA with a mixture of T4 endonuclease 7 and DNA polymerase I in the presence of dideoxynucleotides to form premature termination products;
c) resolving said termination products to determine the nucleotide sequence in the vicinity of the mismatch region; and
d) comparing said nucleotide sequence with a predetermined cognate wild-type sequence to identify said alteration(s).
17. A method for multiplex identification of one or more mutation(s) in a DNA, the method comprising:

a) immobilizing one or more first DNA samples on a solid support;
b) hybridizing said immobilized sample(s) with a second DNA sample, wherein said second sample does not contain the mutation(s), to form heteroduplex DNA containing a mismatch region at the site of a mutation(s);
c) cleaving one or both strands of said heteroduplex adjacent to said mismatch region to form a gap at the site of said mutation(s);
d) treating said cleaved heteroduplex with a DNA polymerase in the presence of dideoxynucleotides to determine the nucleotide sequence across said gap using enzymatic DNA sequencing; and
e) comparing said nucleotide sequence(s) with one or more predetermined cognate wild-type sequences to identify said mutation(s).
18. The method of claim 1, 16, 17, 24, 33 or 34, wherein the DNA samples are denatured prior to hybridization.
19. The method of claim 17, 33 or 34, wherein the first DNA sample is amplified prior to immobilization.
20. A method for identifying one or more genetic alterations in a target sequence present in a genomic DNA sample, which comprises:
a) denaturing said DNA;
b) reannealing said DNA to form heteroduplex DNA containing a mismatch region at the site of an alteration(s);
c) cleaving one strand of said heteroduplex in said target sequence to form a single-stranded gap across the site of said alteration(s);
d) treating said cleaved heteroduplex with a DNA polymerase in the presence of dideoxynucleotides to determine the nucleotide sequence across said gap; and
e) comparing said nucleotide sequence with a predetermined cognate wild-type sequence to identify said alteration(s).
21. A method for positional cloning of a gene of interest, the method comprising:
a) hybridizing a first DNA sample derived from an individual displaying a given phenotype with a second DNA sample, wherein said second DNA sample is derived from one or more individual(s) not displaying said phenotype, to form heteroduplex DNA containing a mismatch region at the site(s) at which the sequence of said first DNA diverges from the sequence of said second DNA;
b) cleaving one strand of said heteroduplex DNA to form a single-stranded gap across said mismatch region;
c) treating said cleaved heteroduplex with a DNA polymerase in the presence of dideoxynucleotides to determine the nucleotide sequence across said gap;
d) preparing a synthetic oligonucleotide comprising all or part of said nucleotide sequence; and
e) identifying a DNA clone that hybridizes to said oligonucleotide.
22. The method of claim 21 or 35, wherein the mismatch region is caused by one or more modifications in the gene of interest selected from the group consisting of additions, deletions, and substitutions of one or more nucleotides and combinations thereof.
23. The method of claim 21 or 35, wherein said nucleotide sequence is determined by enzymatic DNA sequencing.
24. A method for identifying one or more genetic alterations in a target sequence present in a first DNA sample, which comprises:
a) hybridizing said first DNA sample with a second DNA sample, wherein said second sample does not contain the alteration(s), to form heteroduplex DNA having free ends and containing a mismatch region at the site of an alteration(s);
b) cleaving said heteroduplex DNA at or in the vicinity of the alteration, forming new ends;
c) ligating a single-stranded oligonucleotide of predetermined sequence to said new ends;
d) determining the nucleotide sequence of said DNA sample adjacent to said ligated oligonucleotide; and
e) comparing said nucleotide sequence with a predetermined cognate wild-type sequence to identify said genetic alteration(s).
25. The method of claim 24 or 35 further comprising blocking said free ends on said heteroduplex DNA prior to the cleaving step.
26. The method of claim 25, wherein the blocking step comprises a method selected from the group consisting of removal of 5' phosphate groups, homopolymeric tailing of 3' ends with dideoxynucleotides, and ligation of modified double-stranded oligonucleotides.
27. The method of claim 24 or 35, wherein said cleaving step comprises the steps of:
a) exposing said heteroduplex DNA to one or more non-protein chemical reagents under conditions appropriate for mismatch recognition and modification; and
b) cleaving one strand of said heteroduplex DNA in the vicinity of the modification.
28. The method of claim 27, wherein the chemical reagent is selected from the group consisting of hydroxylamine and osmium tetroxide.
29. The method of claim 24 or 35, wherein the single-stranded oligonucleotide is from about 15 to about 35 nucleotides in length.
30. The method of claim 24 or 35, wherein the ligating step is achieved using RNA ligase.
31. The method of claim 24 or 35, wherein the determining step is achieved using hybridization to oligonucleotide arrays.
32. A method for identifying one or more genetic alterations in a target sequence present in a first genomic DNA sample, the method comprising:
a) immobilizing said first DNA sample on a solid support;
b) hybridizing said immobilized sample with a second DNA sample, wherein said second sample does not contain the alteration, to form heteroduplex DNA having free ends and containing a mismatch region at the site of the alteration(s);
c) chemically blocking said free ends with a terminal transferase in the presence of a dideoxynucleotide;
d) cleaving one strand of said heteroduplex DNA adjacent to said mismatch region with bacteriophage T4 endonuclease 7 to form new ends;
e) ligating a single-stranded oligonucleotide having the sequence 5'-CAGTAGTACAACTGACCCTTTTGGGACCGC-3' to said new ends;
f) determining the nucleotide sequence adjacent to said ligated oligonucleotide using enzymatic DNA sequencing; and
g) comparing said nucleotide sequence with a predetermined cognate wild-type sequence to identify the mutation(s).
33. A method for identifying one or more mutation(s) in a DNA, the method comprising:
a) immobilizing said DNA sample on a solid support;
b) hybridizing said immobilized sample with a second DNA, wherein said second sample does not contain the mutation(s), to form heteroduplex DNA having free ends and containing a mismatch region at the site of a mutation(s);
c) chemically blocking said free ends;
d) cleaving one or both strands of said heteroduplex within or adjacent to said mismatch region to form new ends;
e) ligating a single-stranded oligonucleotide of predetermined sequence to said new ends;
f) determining the nucleotide sequence adjacent to said ligated oligonucleotide; and
g) comparing said nucleotide sequence with one or more predetermined cognate wild-type sequences to identify said mutation(s).
34. A method for multiplex identification of one or more mutation(s) in a first DNA, the method comprising:
a) immobilizing one or more first DNA samples on a solid support;
b) hybridizing said immobilized sample(s) with a second DNA sample, wherein said second sample does not contain the mutation(s), to form heteroduplex DNA having free ends and containing a mismatch region at the site of a mutation(s);
c) chemically blocking said free ends;
d) cleaving one or both strands of said heteroduplex within or adjacent to said mismatch region, to form new ends;
e) ligating a single-stranded oligonucleotide of predetermined sequence to said new ends;
f) determining the nucleotide sequence adjacent to said ligated oligonucleotide; and
g) comparing said nucleotide sequence with one or more predetermined cognate wild-type sequences to identify said mutation(s).
35. A method for positional cloning of a gene of interest, the method comprising:
a) hybridizing a first DNA sample derived from an individual displaying a given phenotype with a second DNA sample, wherein said second sample is derived from one or more individual(s) not displaying said phenotype, to form heteroduplex DNA having free ends and containing a mismatch region at the site at which the sequence of said first DNA sample diverges from the sequence of said second DNA sample;
b) cleaving one or both strands of said heteroduplex DNA within or adjacent to the mismatch region to form new ends;
c) ligating a single-stranded oligonucleotide of predetermined sequence to said new ends;
d) determining the nucleotide sequence adjacent to said ligated oligonucleotide;
e) preparing a synthetic oligonucleotide comprising all or part of said nucleotide sequence; and
f) identifying a DNA clone that hybridizes to said oligonucleotide.
36. The method of claim 21 or 35, wherein the identifying step is achieved using a method selected from the group consisting of colony hybridization, identification of tissue specific expression, reverse transcription-amplification of mRNA, and screening of an affected population for genotype/phenotype association.
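The final comparing step recited in several of the claims above (comparing the determined nucleotide sequence with a predetermined cognate wild-type sequence) can be illustrated in software. The following is a minimal sketch only and is not part of the claimed methods: the function name find_substitutions and the example sequences are hypothetical, and the sketch assumes the two sequences are already aligned and of equal length, so it reports substitutions but not additions or deletions.

```python
def find_substitutions(sample: str, wild_type: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, wild-type base, sample base) for each mismatch.

    Assumption: both sequences are pre-aligned and the same length, so only
    substitutions are detected; insertions and deletions need a real aligner.
    """
    if len(sample) != len(wild_type):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        (pos, wt, obs)
        for pos, (wt, obs) in enumerate(zip(wild_type.upper(), sample.upper()), start=1)
        if wt != obs
    ]


# Hypothetical example: a single T-to-A substitution at position 4.
wild_type = "ACGTACGT"
sample = "ACGAACGT"
print(find_substitutions(sample, wild_type))  # [(4, 'T', 'A')]
```

A position-by-position comparison is used here because it is deterministic and easy to check by hand; detecting additions and deletions would require a proper alignment step, which is outside the scope of this sketch.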
CA 2219933 1995-06-07 1996-06-06 Methods for the identification of genetic modification of dna involving dna sequencing and positional cloning Abandoned CA2219933A1 (en)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
US48798795A 1995-06-07 1995-06-07
US48801295A 1995-06-07 1995-06-07
US08/488,013 US5707806A (en) 1995-06-07 1995-06-07 Direct sequence identification of mutations by cleavage- and ligation-associated mutation-specific sequencing
US08/488,013 1995-06-07
US08/487,986 US5571676A (en) 1995-06-07 1995-06-07 Method for mismatch-directed in vitro DNA sequencing
US08/487,986 1995-06-07
US08/487,987 1995-06-07
US08/488,012 1995-06-07

Publications (1)

Publication Number Publication Date
CA2219933A1 (en) 1996-12-19

Family

ID=27504308

Family Applications (1)

Application Number Title Priority Date Filing Date
CA 2219933 Abandoned CA2219933A1 (en) 1995-06-07 1996-06-06 Methods for the identification of genetic modification of dna involving dna sequencing and positional cloning

Country Status (5)

Country Link
EP (1) EP0832283A2 (en)
JP (1) JPH11506937A (en)
AU (1) AU705257B2 (en)
CA (1) CA2219933A1 (en)
WO (1) WO1996041002A2 (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB9724480D0 (en) * 1997-11-19 1998-01-14 Hexagen Technology Limited Screening process
US6297010B1 (en) * 1998-01-30 2001-10-02 Genzyme Corporation Method for detecting and identifying mutations
US6573053B1 (en) 1999-02-05 2003-06-03 Amersham Biosciences Uk Limited Analysis method
DE19911130A1 (en) * 1999-03-12 2000-09-21 Hager Joerg Methods for identifying chromosomal regions and genes
US7166432B2 (en) 1999-03-12 2007-01-23 Integragen Compositions and methods for genetic analysis
WO2001009384A2 (en) * 1999-07-29 2001-02-08 Genzyme Corporation Serial analysis of genetic alterations
US6524794B1 (en) * 1999-10-29 2003-02-25 Decode Genetics Ehf. Identical-by-descent fragment enrichment
US6235483B1 (en) 2000-01-31 2001-05-22 Agilent Technologies, Inc. Methods and kits for indirect labeling of nucleic acids
DE10030452A1 (en) * 2000-06-21 2002-01-24 Max Planck Gesellschaft In vitro detection of DNA damage, useful e.g. for determining genotoxicity of chemicals, comprises measuring specific interaction between DNA and repair proteins
JP2004509628A (en) * 2000-09-21 2004-04-02 メルク エンド カムパニー インコーポレーテッド Method for producing recombinant polynucleotide
AU2003210259A1 (en) * 2002-02-21 2003-09-09 Nanogen Recognomics Gmbh Method for detecting single nucleotide polymorphisms
ES2292270B1 (en) 2004-04-14 2009-02-16 Oryzon Genomics, S.A. Procedure for selectively detecting nucleic acids with atypical structures convertible into nicks.
US20200407776A1 (en) * 2019-06-26 2020-12-31 Integrated Dna Technologies, Inc. Compositions and methods for improved detection of genomic editing events

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5556750A (en) * 1989-05-12 1996-09-17 Duke University Methods and kits for fractionating a population of DNA molecules based on the presence or absence of a base-pair mismatch utilizing mismatch repair systems
AU3919293A (en) * 1992-03-27 1993-11-08 University Of Maryland At Baltimore Detection of gene mutations with mismatch repair enzymes
US5750335A (en) * 1992-04-24 1998-05-12 Massachusetts Institute Of Technology Screening for genetic variation
US5376526A (en) * 1992-05-06 1994-12-27 The Board Of Trustees Of The Leland Stanford Junior University Genomic mismatch scanning
FR2709761B1 (en) * 1993-09-10 1995-11-24 Pasteur Institut Method for detecting molecules containing nucleotide mismatches and for locating these mismatches, and application to the detection of base substitutions or deletions.
WO1996035809A1 (en) * 1995-05-11 1996-11-14 Avitech Diagnostics Inc Detection of mismatches by resolvase cleavage on a solid support
US5707806A (en) * 1995-06-07 1998-01-13 Genzyme Corporation Direct sequence identification of mutations by cleavage- and ligation-associated mutation-specific sequencing
US5571676A (en) * 1995-06-07 1996-11-05 Ig Laboratories, Inc. Method for mismatch-directed in vitro DNA sequencing

Also Published As

Publication number Publication date
WO1996041002A3 (en) 1997-02-20
AU705257B2 (en) 1999-05-20
EP0832283A2 (en) 1998-04-01
JPH11506937A (en) 1999-06-22
AU5981396A (en) 1996-12-30
WO1996041002A2 (en) 1996-12-19

Similar Documents

Publication Publication Date Title
US5707806A (en) Direct sequence identification of mutations by cleavage- and ligation-associated mutation-specific sequencing
US5571676A (en) Method for mismatch-directed in vitro DNA sequencing
US6297010B1 (en) Method for detecting and identifying mutations
US20210189460A1 (en) Sequential paired-end sequencing
EP0777750B1 (en) High throughput screening method for sequences or genetic alterations in nucleic acids
US5589330A (en) High-throughput screening method for sequence or genetic alterations in nucleic acids using elution and sequencing of complementary oligonucleotides
US9677127B2 (en) Method of detecting gene mutation
US5834181A (en) High throughput screening method for sequences or genetic alterations in nucleic acids
FI111554B (en) Reagent composition and kit to identify a nucleotide base at a specific position
US20080090733A1 (en) Method for selectively isolating a nucleic acid
WO1996003529A9 (en) High throughput screening method for sequences or genetic alterations in nucleic acids
US20030044794A1 (en) 5'-thio phosphate directed ligation of oligonucleotides and use in detection of single nucleotide polymorphisms
KR20010005544A (en) Extraction and utilisation of VNTR alleles
AU705257B2 (en) Methods for the identification of genetic modification of DNA involving DNA sequencing and positional cloning
CA2318980C (en) Method for detecting and identifying mutations
US7189512B2 (en) Methods for variation detection
JP2761159B2 (en) Method for detecting complementary nucleic acid sequence and kit therefor
El-Hashemite et al. A technique for eliminating allele specific amplification failure during DNA amplification of heterozygous cells for preimplantation diagnosis.
EP4060050B1 (en) Highly sensitive methods for accurate parallel quantification of nucleic acids
US20100285970A1 (en) Methods of sequencing nucleic acids
AU785211B2 (en) Method for selectively isolating a nucleic acid
EP4215619A1 (en) Methods for sensitive and accurate parallel quantification of nucleic acids
KR20240032631A (en) Highly sensitive methods for accurate parallel quantification of variant nucleic acids
JP2002209584A (en) Method for detecting nucleotide polymorphism
CA2205234A1 (en) High throughput screening method for sequences or genetic alterations in nucleic acids

Legal Events

Date Code Title Description
FZDE Dead