WO1991017269A1 - A method for mapping a eukaryotic chromosome - Google Patents
A method for mapping a eukaryotic chromosome
- Publication number
- WO1991017269A1 (PCT/US1991/003006)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- dna
- library
- chromosome
- genomic dna
- restriction enzyme
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/70—Vectors or expression systems specially adapted for E. coli
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6813—Hybridisation assays
- C12Q1/6841—In situ hybridisation
Definitions
- the human genome consists of a DNA sequence of some 3 billion base pairs carried on 46 chromosomes. This genetic blueprint provides all of the information
- a current estimate of the number of evenly spaced markers required to provide markers 20cM apart is 300-700.
- To obtain such a collection of probes with a spacing of approximately 10 cM or less requires cloning, analyzing and mapping a substantial portion of the genome. This is true of any random mapping method because there is a high probability that newly defined markers will map genetically too close to an existing marker to be useful as an additional marker.
- the subject invention relates, in one aspect, to a method for ordering a set of discrete DNA sequences complementary to a eukaryotic chromosome for physical and genetic mapping.
- the method involves providing a set of discrete DNA sequences, each discrete DNA sequence being complementary to a region of the eukaryotic chromosome.
- the order of the discrete sequences on the chromosome is determined by in situ hybridization. Discrete DNA sequences which contain a restriction enzyme recognition sequence containing the dinucleotide CpG and a polymorphic DNA sequence are then identified.
- the invention in another aspect, relates to a method for isolating a gene from a eukaryotic organism of interest.
- a DNA library of genomic DNA clones containing insert DNA from the eukaryotic organism of interest is provided.
- DNA is purified from individual genomic DNA clones contained within the genomic DNA library and digested with at least one restriction enzyme which recognizes and cleaves a nucleotide sequence which contains the dinucleotide CpG.
- the products of the restriction enzyme digestion are displayed on a gel and the genomic DNA clones having insert DNA which is recognized and cleaved by the restriction enzyme are identified.
- the method for isolating a gene from a eukaryotic organism is particularly useful when a gene of interest having a known genetic map position has been identified.
- individual genomic DNA clones having insert DNA which is recognized and cleaved by the restriction enzyme which recognizes and cleaves a nucleotide sequence which contains the dinucleotide CpG are labeled and the map position of the complementary chromosomal region for each clone is determined physically by in situ hybridization.
- Candidate clones are identified as those having a physical map position near the location of the gene of interest as determined by genetic mapping.
- Figure 1 is a diagrammatic representation of the vector cHC1.
- Figures 2A and 2B are diagrammatic representations showing restriction enzyme cleavage sites for 6 NotI-containing cosmids.
- the subject invention is based on the discovery of a simple and convenient method for ordering a set of discrete DNA sequences, each discrete DNA sequence being complementary to a region of a eukaryotic chromosome.
- the ordering of the discrete DNA sequences is useful for physical and genetic mapping.
- the term "ordering" means to establish the linear relationship of the discrete DNA sequences relative to one another on the chromosome, or a portion of the chromosome.
- a set of discrete DNA sequences are provided, each discrete DNA sequence being complementary to a region of the chromosome.
- the set of discrete DNA sequences is provided as a chromosome specific genomic DNA library.
- the chromosome specific genomic DNA library can be obtained from any source, or it can be constructed using known techniques. Of particular interest are chromosome specific libraries of human origin, although the methods described herein are applicable to all eukaryotes.
- Somatic cell hybrids (e.g. hamster/human hybrids) are available which contain a single human chromosome.
- Such libraries are available for purchase from the American Type Culture Collection (Rockville, Md).
- Total DNA isolated from such hybrid cells can be fragmented, for example by restriction enzyme digestion, and inserted into an appropriate vector.
- Genomic clones carrying human inserts are identified by probing the library with human specific probe DNA (e.g. Alu repeat sequences).
- the desired human chromosome can be isolated from other human chromosomes by the well known flow sorting technique and used to construct the chromosome specific library.
- the vector used for construction of the genomic DNA library accommodates inserts of greater than 20 kb.
- Such vectors include, for example, cosmid vectors, bacteriophage vectors, and yeast artificial chromosomes (YACs).
- Cosmids are cloning vectors which contain bacteriophage lambda cos signals for in vitro packaging, and allow the cloning of DNA fragments ranging from 30 to 50 kb.
- Bacteriophage vectors (e.g., lambda and P1 bacteriophage vectors) can accommodate from 20 to 100 kb of DNA.
- a general description of large capacity bacteriophage cloning vectors is provided by Sternberg et al. (Proc. Natl. Acad. Sci. USA 87:103-107 (1990)).
- a general description of the YAC cloning system is provided, for example, by Burke et al. (Science 236:806-812 (1987)).
- the YAC cloning vectors can accommodate from 100 to 600 kb of DNA.
- the order of the discrete DNA sequences on the human chromosome can be determined, for example, by the in situ suppression hybridization method.
- This technique, which has been described by Lichter et al. (Hum. Genet. 80:224-234 (1988)) and is the subject of co-pending application serial no. 07/271,609, permits high resolution physical mapping with fluorescently labeled probe sequences hybridized to interphase or metaphase chromosome preparations. In the Exemplification described below, this method is used to order a set of cosmid clones derived from human chromosome 16.
- the discrete DNA sequences are then analyzed for the presence of a restriction enzyme recognition sequence which contains the dinucleotide CpG.
- This analysis can be conducted by digesting cosmid clones with a rare cutting restriction endonuclease and displaying the products on a gel.
- sequences are known to be rare in the human genome and, therefore, they are frequently referred to as rare cutter sequences (enzymes which recognize such sequences are referred to herein as rare cutting restriction endonucleases).
- one such sequence, which is recognized by the restriction enzyme Not I occurs at a frequency of approximately
- This analysis is conducted, for example, by purifying insert DNA from the chromosome specific genomic DNA library. This purified DNA is then subjected to digestion with a rare cutting restriction endonuclease.
- the enzyme's recognition sequence comprises a sequence of at least 6 base pairs.
- the restriction enzyme Not I which is discussed in the
- the products of this digestion are then displayed on a gel. Electrophoretic methods are well known to those skilled in the art.
- the preferred gel matrix is agarose; the percentage of agarose can be varied within known ranges to optimize resolution.
- An appropriate control sample for example, is a molecular weight marker or markers having a predetermined electrophoretic mobility. This type of restriction enzyme mapping is a fundamental technique which is well known to those of skill in the art.
- the vector itself contains two recognition sequences for a rare cutting restriction endonuclease. These sites flank the DNA insertion site.
- a vector is shown in Figure 1 and described in the Exemplification. Interpretation is facilitated in this case because if, for example, the insert DNA contains no such site, the digestion product, when displayed on a gel, will include a relatively large band of insert DNA, and a faster migrating band representing the linear vector DNA. If, on the other hand, the insert does contain such a site, the
- electrophoretic display will include 3 or more distinct bands.
- cosmid clones containing a polymorphic DNA sequence are identified.
- the presence of a polymorphic sequence can be identified, for example, by identifying a restriction fragment length polymorphism (RFLP).
- the clones of this invention therefore, have two essential characteristics: 1) they span regions of the chromosome containing a recognition sequence for a rare cutting restriction endonuclease, and 2) they contain a DNA sequence polymorphism.
- a clone which satisfies characteristic 1) can be referred to as a linking clone because it would hybridize to (or link) two adjacent fragments from a total digest of chromosome specific human DNA with the restriction enzyme which recognizes the CpG containing sequence.
- the clones should be spaced from one another along the chromosome at a distance of less than 10 cM, and optimally 2-5 cM.
- Clones containing such sequences are, therefore, enriched in DNA sequences corresponding to the 5' ends of genes thereby offering a convenient method for isolating and cloning a eukaryotic gene or genes.
- the method involves providing a DNA library of genomic DNA clones containing insert DNA from the
- a eukaryotic organism of interest can be constructed, for example, by isolating DNA and cloning fragments of the DNA into an appropriate vector.
- an important consideration is the size of the DNA insert.
- Eukaryotic genes can contain multiple introns which do not encode any portion of the protein encoded by the gene, but rather, are excised from the transcribed mRNA prior to translation.
- the factor VIII gene in the human which encodes the blood- clotting factor deficient in hemophilia A, has been reported to span at least 190 kb (Gitschier et al., Nature 312:326 (1986)). Recent reports indicate that the defective gene responsible for Duchenne's muscular dystrophy may span more than
- the insert DNA is preferably greater than 20 kb in length. Any cloning vector which can accommodate insert DNA of 20 kb or greater is useful for the construction of a genomic library to be screened by the method of this invention.
- a preferred vector for the construction of the genomic library is a cosmid vector which can accommodate DNA fragments ranging from 30 to 50 kb.
- bacteriophage vectors or YAC cloning vectors are also useful for this purpose.
- DNA from individual genomic DNA clones contained within the genomic DNA library is purified using known techniques.
- the purification of cosmid DNA is relatively straightforward and well known in the art.
- the purification of a yeast artificial chromosome is more
- a YAC can be purified in any way in which a YAC can be purified.
- This electrophoretic technique enables the resolution of very large DNA molecules.
- the YAC is then isolated from the other DNA bands in the gel and purified from the gel material using known techniques.
- the purified DNA is then subjected to digestion by a rare cutting restriction enzyme.
- Many rare cutting restriction endonucleases are known to those skilled in the art.
- the enzyme's recognition sequence comprises a sequence of at least 6 bases. Especially preferred is the restriction enzyme NotI.
- the insert must be further characterized to identify the portion of interest containing the gene. This can be done, for example, by restriction enzyme mapping and DNA
- sequence of the coding region can then be compared with sequences recorded in gene bank data bases. Using this approach, a genetic locus can be assigned to a gene whose sequence, or a portion thereof, has been determined previously.
- This method for isolating a gene is particularly useful when attempting to isolate a gene of interest for which no portion of the nucleotide sequence is known and the identity of the encoded protein is unknown. This is the case, for example, in the study of many human genetic disorders. In the case of such human genetic disorders, a gene is known to be responsible for a disease
- Those clones hybridizing near the location of the gene of interest as determined by genetic mapping represent candidate clones which are analyzed to determine whether they, in fact, encode the gene responsible for the disease phenotype.
- a variety of strategies can be used to determine whether a candidate clone is, in fact, the gene responsible for the disease phenotype.
- the clone, or a portion thereof, can be labeled with a reporter group and used to study tissue distribution of complementary mRNA.
- the gene responsible for autosomal dominant polycystic kidney disease (ADPKD) is known to be manifested in kidney cells. Clones which hybridize to mRNA specifically expressed in kidney cells can be selected for further analysis.
- such a clone can be used to probe cDNA libraries generated from two sources; individuals having autosomal dominant polycystic kidney disease and individuals not having the disease. Both of the genes isolated from these sources can then be sequenced, and the nucleotide change (s) responsible for the disease phenotype can then be determined.
- chromosome-specific cosmid clones that: 1) span the chromosome; 2) contain the recognition sequence of the restriction enzyme Not I; and 3) identify Mendelian
- EBV- transformed lymphoblastoid cell lines were grown in Iscove's Modified Dulbecco's Medium supplemented with 10% horse serum.
- the hybrid cell lines were grown in F12 medium supplemented with 10% fetal calf serum, 5 x 10^-5 M adenine, and 4 µg/ml azaserine.
- Mouse cell line A9 was grown in a similar manner.
- DNA was isolated from cultured cell lines by
- the cosmid cloning vector cHC1 was derived from the high-copy number, double cos vector c2XBHC (Bates and Swift, Gene 26:137-146 (1983); Bates, P.F., Methods in Enzymology 153:82-94 (1987)) and the "walking easy" vector pWE15 (Stratagene, La Jolla, CA).
- the small NotI fragment of pWE15, which contains the T3 and T7 promoters and the BamHI cloning site, was enzymatically inserted into a derivative of c2XBHC (gift of Dr. Paul Bates) encoding a single NotI site.
- the NotI site was created at the single BamHI site of the c2XBHC vector by linker ligation.
- a cosmid library was constructed as described by Swift and Bates (Gene 26:137-146 (1983)) using cell line CY18 (Callen, D.F., Ann. Genet. 29:235-239 (1986)) and vector cHC1.
- High molecular weight DNA was isolated from CY18 cells by proteinase K digestion and very gentle phenol/chloroform extraction followed by dialysis. The resultant DNA was greater than 150 kb as judged by pulsed field gel electrophoresis.
- the vector was digested with SmaI, dephosphorylated with calf intestinal phosphatase (Boehringer Mannheim), and digested with BamHI. The insert was partially digested with Sau3A and similarly dephosphorylated. Ligation was carried out using 1 µg of vector arms and 1.5 µg of target DNA in a final volume of 5 µl. Reactions were incubated with 200 units of T4 DNA Ligase (New England Biolabs) at room temperature for 4 hours. The DNA was packaged using Gigapack Plus I
- Colony filters were probed with nick-translated 32P-labeled human DNA (0.5-1 x 10^6 cpm/ml). Hybridizations and washes were performed as described for Southern blots
- RNA probes were transcribed from the bacteriophage T3 and T7 RNA promoters present in the vector (ref) and hybridized at 1-10 x 10^6 cpm/ml. Total torula RNA (0.2 mg/ml) was added as competitor.
- Cosmid clones were mapped for the rare cutting enzymes BssHII, MluI, NotI, NruI, PvuI and SacII. All enzymes were obtained from New England Biolabs (Beverly, MA) and digestions were performed according to manufacturer's recommendations. Mapping was performed by the single and double enzyme digestion method or partial digestion method of Smith and Birnstiel (Nucl. Acids.
- Blotting was done bi-directionally to allow for accurate comparison between hybridizations.
- Five µg of genomic DNA were digested with excess restriction enzyme and fractionated on a 0.8% agarose gel in 1X TBE.
- the DNA in the gel was nicked by partial depurination in 0.25 M HCl, denatured in 0.5 M NaOH/1.5 M NaCl, and transferred to nylon membrane (e.g., Magnagraph from MSI, Westboro, MA or SureBlot from Oncor, Gaithersburg, MD). After transfer, the membrane was rinsed in 2X SSC, air-dried, and baked in vacuo for 2 hours at 80°C.
- RFLP panels contained digests of DNAs isolated from lymphoblastoid cell lines of six unrelated individuals.
- the initial enzyme set included BglII, BclI, EcoRI,
- Each 100 µl block contained the DNA from 2 x 10^6 IG138 cells or TIL-1 cells, approximately 10 µg. Restriction digests were carried out as described by
- the gel composition was 0.7% FastLane agarose (FMC Bioproducts,
- Electrophoresis was carried out in 0.5XTBE buffer for 16 hours at 180V with a switching interval of 60 seconds. The temperature was maintained at 15°C. Size markers were chromosomes of S. cerevisiae and lambda ladders purchased from Bio-Rad
- Metaphase spreads were prepared from normal cultured lymphocytes (46, XY) by standard procedures of colcemid arrest, hypotonic treatment, and acetic acid-methanol fixation.
- Cosmid probes were prepared by direct nick translation with biotinylated nucleotides. To facilitate probe penetration and to optimize reannealing, the size of the probe DNA was adjusted empirically to a length of 150-250 nucleotides by varying the DNAse concentration in the nick translation reaction.
- Cosmid vector cHC1, shown in Figure 1, is 6 kb in size and has a cloning capacity of 35 to 50 kb.
- the T3 and T7 promoters flanking the BamHI cloning site allow synthesis of end-specific RNA probes for chromosome walking and mapping.
- the NotI sites flanking the cloning site allow excision of the cloned DNA insert.
- the 20 cosmid clones were biotinylated and used as probes in fluorescent in situ hybridization analysis of human metaphase chromosomes. Hybridization was carried out under conditions that suppress signal from repetitive DNA sequences.
- Chromosome 16 was identified by hybridization with a chromosome 16-specific alpha satellite DNA clone (Oncor) and by its DAPI-staining pattern. Each clone hybridized exclusively to chromosome 16. The results of this analysis show that the 20 clones are not randomly distributed over the chromosome: 9 map to the long arm and 11 to the short arm. Of the 11 short-arm probes, 7 map to 16p13.3 to 16pter.
- Figure 2 shows the restriction enzyme maps for the set of 6 NotI-containing cosmids and overlapping clones isolated by chromosome walking (designated by "W" in the clone name).
- the maps place the sites for the rare cutting restriction enzymes BssHII, MluI, NotI, NruI, PvuI, and SacII.
- the clustering of sites for these enzymes is indicative of CpG-rich HTF islands.
- the NotI sites present in cosmids 16-4N, 16-30N and 16-129N are in close proximity to 2 or more rare cutting restriction enzyme sites, and are most likely island-related. This is not the case for the NotI sites present in the other cosmids.
- One possible explanation is that these NotI sites are situated in CpG-rich regions which do not encode sites for these or other rare cutting restriction enzymes.
- HTF island-like regions lacking NotI sites are present in cosmids 30N and 132N.
- the 6 cosmids examined define 6 different loci containing 8 NotI sites.
- the NotI sites in the loci defined by 16-30N and 16-38N are unmethylated and digest to completion.
- the NotI sites in the loci defined by 16-4N and 16-132N are not cleaved and are, therefore, probably methylated at this site in both cell lines.
- Intermediate extents of methylation are observed at the NotI sites present in the loci defined by cosmids 16-14N and 16-129N.
- the extent of methylation of the specific NotI sites is the same for IG138 and TIL-1 DNA.
- 16-129N can be effectively used to link NotI restriction fragments for long-range physical mapping.
- Cosmid 16-30N contains 2 NotI sites which lie 25 kb apart and are unmethylated in genomic DNA. We expect 16-30N to anneal to 3 genomic NotI fragments, but in this experiment, we see only 2 NotI fragments (25 kb and 105 kb). In the case of 16-38N, the whole cosmid hybridizes to a single, resolvable NotI fragment of 150 kb. The location of the NotI site in cosmid 38N should have allowed for the detection of both the leftward and rightward NotI fragments. We conclude that either the leftward fragment ran off the CHEF gel or that the missing NotI fragment is 1600 kb or greater and, consequently, unresolved using these electrophoretic conditions. Cosmid 16-129N encodes a single NotI site which is substantially unmethylated (70% cleavage) in the DNA of IG138 and TIL-1 cells.
- Cosmid 16-132N encodes a NotI site which is fully methylated in genomic DNA and hybridizes to a fragment which is not resolved using these electrophoretic conditions.
- Cosmid 16-4N hybridizes to a single NotI fragment consistent with the fully methylated state of the NotI site encoded by this cosmid. The two loci defined by these two cosmids form an interesting
- cosmid 16-4N NotI site appears to be in an HTF-island while the NotI site encoded by cosmid 132N exists as an isolated rare cutting restriction enzyme site.
- Cosmid clones 16-30N and 16-38N failed to identify polymorphisms with the 14 restriction enzymes used in this study.
- a cosmid derived from a 34 kb walk from the 16-30N locus (16-30NW6) also failed to identify any
- Clone 16-38N hybridizes with the oligonucleotide (CA)10 and, thus, may identify a microsatellite polymorphism. Clone 16-30N does not hybridize with this
- Clone 16-14N maps to 16p13.3-16p13.13 and thus represents a candidate PKD-1 clone.
- a variety of strategies can be adopted to determine whether this candidate clone is, in fact, the gene responsible for the disease phenotype.
- clone 16-14N, or a portion thereof can be labeled with a reporter group and used to study tissue distribution of complementary mRNA.
- the gene responsible for ADPKD is manifested in kidney cells.
- clones which hybridize to mRNA specifically expressed in kidney cells can be selected for further analysis.
- a clone (or a portion thereof) can be used to probe cDNA libraries generated from two sources: individuals having autosomal dominant polycystic kidney disease and individuals not having the disease. Both of the genes isolated from these sources can then be sequenced, and the nucleotide change(s) responsible for the disease phenotype are determined.
- the other clones discussed above which do not map to the ADPKD locus can be analyzed by determining their DNA sequence and comparing that sequence with the sequences recorded in gene bank data bases. Using this approach, a genetic locus can be assigned to proteins of known sequence, as well as those of unknown sequence.
Abstract
A method for mapping discrete DNA sequences complementary to regions of a eukaryotic chromosome, and a method for isolating a gene from a eukaryotic organism of interest. The methods are useful for the efficient identification of genetic markers spaced throughout the human genome. Those genetic markers identified by the mapping method of this invention tend to be associated with the 5' ends of genes. The method, therefore, is useful for the isolation of eukaryotic genes.
Description
A METHOD FOR MAPPING A EUKARYOTIC CHROMOSOME
Background of the Invention
The human genome consists of a DNA sequence of some 3 billion base pairs carried on 46 chromosomes. This genetic blueprint provides all of the information necessary for the growth, differentiation and maintenance of the vast array of human cells.
The United States has recently announced, as a national objective, the mapping and sequencing of the human genome. The director of this project, James D. Watson, has recently summarized the importance of this project by stating that the interpretation of the data gained through this work "will not only help us to understand how we function as healthy human beings, but will also explain, at the chemical level, the role of genetic factors in a multitude of diseases, such as cancer, Alzheimer's disease, and schizophrenia, that diminish the individual lives of so many millions of people."
The construction of genetic linkage maps using restriction fragment length polymorphisms (RFLPs) was first described by Botstein et al. (Am. J. Hum. Genet. 32:314 (1980)). Mapping of a genetic locus by linkage analysis could be performed with high efficiency if polymorphic DNA probes were identified at a spacing of approximately every 10 centimorgans (1 cM = 1-2 Mbp) throughout the human genome. A current estimate of the number of evenly spaced markers required to provide markers 20 cM apart is 300-700. To obtain such a collection of probes with a spacing of approximately 10 cM or less requires cloning, analyzing and mapping a substantial portion of the genome. This is true of any random mapping method because there is a high probability that newly defined markers will map genetically too close to an existing marker to be useful as an additional marker.
A need exists for an efficient method for identifying genetic markers spaced at intervals of 10 cM or less throughout the human genome.
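The inefficiency of random mapping described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the 100 cM chromosome length, the uniform placement of markers, and the function names `largest_gap` and `random_markers_needed` are hypothetical and are not taken from the specification. It counts how many randomly placed markers are needed before no interval exceeds 10 cM, and compares that with the nine evenly spaced markers that would suffice.

```python
import random


def largest_gap(positions, length_cm):
    """Largest interval between adjacent markers, counting both chromosome ends."""
    points = [0.0] + sorted(positions) + [length_cm]
    return max(b - a for a, b in zip(points, points[1:]))


def random_markers_needed(length_cm, max_gap_cm, rng):
    """Add markers at uniformly random positions until no gap exceeds max_gap_cm."""
    positions = []
    while largest_gap(positions, length_cm) > max_gap_cm:
        positions.append(rng.uniform(0.0, length_cm))
    return len(positions)


# Hypothetical 100 cM chromosome with a 10 cM target spacing.
rng = random.Random(0)
trials = [random_markers_needed(100.0, 10.0, rng) for _ in range(200)]

# Nine markers at 10, 20, ..., 90 cM leave no gap above 10 cM.
evenly_spaced_minimum = 9
print("fewest random markers in any trial:", min(trials))
print("mean random markers per trial:", sum(trials) / len(trials))
print("evenly spaced minimum:", evenly_spaced_minimum)
```

Under these assumptions, every random trial needs at least as many markers as the evenly spaced case, because most new random markers fall close to an existing one.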
Summary of the Invention
The subject invention relates, in one aspect, to a method for ordering a set of discrete DNA sequences complementary to a eukaryotic chromosome for physical and genetic mapping. The method involves providing a set of discrete DNA sequences, each discrete DNA sequence being complementary to a region of the eukaryotic chromosome. The order of the discrete sequences on the chromosome is determined by in situ hybridization. Discrete DNA sequences which contain a restriction enzyme recognition sequence containing the dinucleotide CpG and a polymorphic DNA sequence are then identified.
By determining the order of the discrete DNA sequences (usually genomic clones) on the chromosome prior to further characterization, problems associated with random mapping are eliminated. For example, the number of clones which must be analyzed to define markers of appropriate spacing is reduced significantly as compared to the number necessary when using the classical random mapping approach. This is true as a result of the initial mapping information provided by in situ hybridization. This allows the usual selection of desirably spaced clones. If selecting anonymous clones (as is done in the classical mapping approach), the analysis of a great many clones is necessary before it would be possible to define a set of markers spanning the chromosome at a desired spacing.
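Once clones have been ordered, picking a desirably spaced subset is a simple pass along the chromosome. The following is a minimal sketch, not the procedure of the specification: the clone names, the fractional position units, and the function `select_spaced_clones` are hypothetical, and a greedy left-to-right rule is assumed for illustration.

```python
def select_spaced_clones(clones, min_spacing):
    """Pick clones in map order, keeping each one that lies at least
    min_spacing beyond the previously kept clone.

    clones: list of (name, position) pairs, where position is the in situ
        map position as a fraction of chromosome length (hypothetical units).
    """
    ordered = sorted(clones, key=lambda clone: clone[1])
    selected = []
    for name, position in ordered:
        if not selected or position - selected[-1][1] >= min_spacing:
            selected.append((name, position))
    return selected


# Hypothetical clones with positions determined by in situ hybridization.
mapped = [("a", 0.30), ("b", 0.05), ("c", 0.08), ("d", 0.52), ("e", 0.33)]
for name, position in select_spaced_clones(mapped, 0.2):
    print(name, position)
```

Only the selected clones would then need further characterization, which is the saving over analyzing anonymous clones.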
In another aspect, the invention relates to a method for isolating a gene from a eukaryotic organism of interest. A DNA library of genomic DNA clones containing insert DNA from the eukaryotic organism of interest is provided. DNA is purified from individual genomic DNA clones contained within the genomic DNA library and digested with at least one restriction enzyme which recognizes and cleaves a nucleotide sequence which contains the dinucleotide CpG. The products of the restriction enzyme digestion are displayed on a gel and the genomic DNA clones having insert DNA which is recognized and cleaved by the restriction enzyme are identified.
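The gel readout can be mimicked computationally when the insert sequence is available. The sketch below assumes a vector that carries one NotI site on each side of the insertion site, so that complete digestion releases the vector as one band and the insert as one band per internal fragment. The function names are hypothetical, and the count is an idealized upper bound, since fragments of similar size may co-migrate on a real gel. The NotI recognition sequence, GCGGCCGC, contains the dinucleotide CpG.

```python
NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence; contains the dinucleotide CpG


def find_sites(sequence, site=NOTI_SITE):
    """Return the 0-based start positions of every occurrence of site."""
    seq = sequence.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def expected_band_count(insert_sequence, site=NOTI_SITE):
    """Idealized number of bands after complete digestion, assuming the
    vector has one site flanking each side of the insert: one vector band
    plus (internal sites + 1) insert fragments."""
    internal_sites = len(find_sites(insert_sequence, site))
    return 1 + (internal_sites + 1)


def is_cleaved_clone(insert_sequence, site=NOTI_SITE):
    """True when the insert itself contains at least one recognition site."""
    return expected_band_count(insert_sequence, site) >= 3


print(expected_band_count("ATATATATATAT"))   # insert without a site
print(expected_band_count("AAGCGGCCGCTT"))   # insert with one site
```

Two bands indicate an insert lacking the site; three or more indicate a clone of interest.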
The method for isolating a gene from a eukaryotic organism is particularly useful when a gene of interest having a known genetic map position has been identified. In this case, individual genomic DNA clones having insert DNA which is recognized and cleaved by the restriction enzyme which recognizes and cleaves a nucleotide sequence which contains the dinucleotide CpG are labeled and the map position of the complementary chromosomal region for each clone is determined physically by in situ hybridization. Candidate clones are identified as those having a physical map position near the location of the gene of interest as determined by genetic mapping.
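Candidate selection then reduces to comparing each clone's physical position with the interval assigned to the gene by genetic mapping. A minimal sketch follows. The clone names, the positions, and the choice of a fractional distance from the short-arm terminus as the unit are all assumptions made for illustration.

```python
def candidate_clones(clone_positions, gene_interval):
    """Return the names of clones whose physical map position falls inside
    the interval assigned to the gene of interest by genetic mapping.

    clone_positions: dict of clone name -> fractional distance from the
        short-arm terminus (0.0) to the long-arm terminus (1.0).
    gene_interval: (low, high) bounds in the same hypothetical units.
    """
    low, high = gene_interval
    return sorted(
        name for name, position in clone_positions.items()
        if low <= position <= high
    )


# Hypothetical clones and a gene mapped genetically near the short-arm terminus.
clones = {"clone_A": 0.04, "clone_B": 0.12, "clone_C": 0.47, "clone_D": 0.91}
print(candidate_clones(clones, (0.0, 0.15)))
```

In practice the comparison would be made between cytogenetic bands rather than fractions, but the filtering logic is the same.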
Brief Description of the Figures
Figure 1 is a diagrammatic representation of the vector cHC1.
Figures 2A and 2B are diagrammatic representations showing restriction enzyme cleavage sites for 6 NotI-containing cosmids.
Detailed Description of the Invention
The subject invention is based on the discovery of a simple and convenient method for ordering a set of discrete DNA sequences, each discrete DNA sequences being complementary to a region of a eukaryotic chromosome. The ordering of the discrete DNA sequences is useful for physical and genetic mapping. As used herein, the term "ordering" means to establish the linear relationship of the discrete DNA sequences relative to one another on the chromosome, or a portion of the chromosome.
As an initial step, a set of discrete DNA sequences is provided, each discrete DNA sequence being complementary to a region of the chromosome. In a preferred embodiment, the set of discrete DNA sequences is provided as a chromosome specific genomic DNA library. The chromosome specific genomic DNA library can be obtained from any source, or it can be constructed using known techniques. Of particular interest are chromosome specific libraries of human origin, although the methods described herein are applicable to all eukaryotes.
Somatic cell hybrids (e.g. hamster/human hybrids) are available which contain a single human chromosome. Such libraries are available for purchase from the American Type Culture Collection (Rockville, Md). Total DNA isolated from such hybrid cells can be fragmented, for example by restriction enzyme digestion, and inserted into an appropriate vector. Genomic clones carrying human inserts (as opposed to inserts of hamster origin) are identified by probing the library with human specific probe DNA (e.g. Alu repeat sequences). Alternatively, the desired human chromosome can be isolated from other human chromosomes by the well known flow sorting technique and used to construct the chromosome specific library.
Preferably, the vector used for construction of the genomic DNA library accommodates inserts of greater than 20 kb. Such vectors include, for example, cosmid vectors, bacteriophage vectors, and yeast artificial chromosomes (YACs). Cosmids are cloning vectors which contain bacteriophage lambda cos signals for in vitro packaging, and allow the cloning of DNA fragments ranging from 30 to 50 kb. Bacteriophage vectors (e.g., lambda and P1 bacteriophage vectors) can accommodate from 20 to 100 kb of DNA. A general description of large capacity bacteriophage cloning vectors is provided by Sternberg et al. (Proc. Natl. Acad. Sci. USA 87:103-107 (1990)). A general description of the YAC cloning system is provided, for example, by Burke et al. (Science 236:806-812 (1987)). The YAC cloning vectors can accommodate from 100 to 600 kb of DNA.
The order of the discrete DNA sequences on the human chromosome can be determined, for example, by the in situ suppression hybridization method. This technique, which has been described by Lichter et al. (Hum. Genet. 80:224-234 (1988)) and is the subject of co-pending application serial no. 07/271,609, permits high resolution physical mapping with fluorescently labeled probe sequences hybridized to interphase or metaphase chromosome preparations. In the Exemplification described below, this method is used to order a set of cosmid clones derived from human chromosome 16.
The discrete DNA sequences are then analyzed for the presence of a restriction enzyme recognition sequence which contains the dinucleotide CpG. This analysis can be conducted by digesting cosmid clones with a rare cutting restriction endonuclease and displaying the products on a gel. Such sequences are known to be rare in the human genome and, therefore, they are frequently referred to as rare cutter sequences (enzymes which recognize such sequences are referred to herein as rare cutting restriction endonucleases). For example, one such sequence, which is recognized by the restriction enzyme Not I, occurs at a frequency of approximately
1/500,000 base pairs. This frequency is convenient for mapping purposes because the theoretical spacing corresponds roughly to resolution limits of the in situ suppression hybridization method described above.
This analysis is conducted, for example, by purifying insert DNA from the chromosome specific genomic DNA
library. This purified DNA is then subjected to digestion with a rare cutting restriction endonuclease.
Preferably the enzyme's recognition sequence comprises a sequence of at least 6 base pairs. For example, the restriction enzyme Not I, which is discussed in the
Exemplification, is particularly useful for this purpose.
The products of this digestion are then displayed on a gel. Electrophoretic methods are well known to those skilled in the art. The preferred gel matrix is agarose; the percentage of agarose can be varied within known ranges to optimize resolution.
Those individual genomic DNA clones having insert DNA which is recognized and cleaved by the rare cutting restriction enzyme are identified by comparing the products of the restriction enzyme digest displayed on a gel with the pattern or expected pattern of an
appropriate control sample. An appropriate control sample, for example, is a molecular weight marker or markers having a predetermined electrophoretic mobility. This type of restriction enzyme mapping is a fundamental technique which is well known to those of skill in the art.
In a preferred embodiment, the vector itself contains two recognition sequences for a rare cutting restriction endonuclease. These sites flank the DNA insertion site. Such a vector is shown in Figure 1 and described in the Exemplification. Interpretation is facilitated in this case because if, for example, the insert DNA contains no such site, the digestion product, when displayed on a gel, will include a relatively large band of insert DNA, and a faster migrating band
representing the linear vector DNA. If, on the other hand, the insert does contain such a site, the
electrophoretic display will include 3 or more distinct bands.
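The band-count rule described above can be illustrated with a minimal sketch. It assumes a complete digest, treats the vector band as equal to the full vector length (the flanking sites are taken to lie immediately next to the cloning site), and uses hypothetical function names and sizes chosen only for illustration.

```python
from typing import List


def flanked_digest_fragments(vector_len: int, insert_len: int,
                             internal_sites: List[int]) -> List[int]:
    """Fragment sizes (bp) expected from a complete digest of a clone whose
    cloning site is flanked by two vector-borne rare cutter sites.

    internal_sites holds positions (bp) measured from the left end of the insert.
    Assumption: the flanking sites sit directly beside the insert, so the
    vector band is approximated by vector_len.
    """
    for pos in internal_sites:
        if not 0 < pos < insert_len:
            raise ValueError(f"site position {pos} lies outside the insert")
    # Cutting at both flanking sites releases the insert; internal sites split it.
    cuts = [0] + sorted(set(internal_sites)) + [insert_len]
    insert_fragments = [b - a for a, b in zip(cuts, cuts[1:])]
    return [vector_len] + insert_fragments


def has_internal_site(fragments: List[int]) -> bool:
    """Two bands: no internal site. Three or more bands: at least one internal site."""
    return len(fragments) >= 3


if __name__ == "__main__":
    no_site = flanked_digest_fragments(6000, 41000, [])
    one_site = flanked_digest_fragments(6000, 41000, [10000])
    print(no_site, has_internal_site(no_site))
    print(one_site, has_internal_site(one_site))
```

In practice, fragments of similar size may co-migrate on a gel, so the observed number of bands can be lower than the number this sketch predicts.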
In another screening step, cosmid clones containing a polymorphic DNA sequence are identified. The presence of a polymorphic sequence can be identified, for example, by identifying a restriction fragment length polymorphism (RFLP). One way in which this can be done is by
digesting DNA from several unrelated individuals with a variety of different restriction enzymes. This DNA is electrophoretically fractionated, and then transferred to a solid support (e.g. nitrocellulose paper or a nylon filter). This DNA is then screened using labeled cosmid clones. In some individuals, a polymorphism is
identified as a restriction fragment having a length differing from the corresponding sequence in another individual. When an RFLP is identified, family members of the individual from which the RFLP was identified are analyzed in a similar manner to determine whether or not the RFLP is inherited according to Mendelian principles. An RFLP which is inherited according to Mendelian
principles provides a useful marker for genetic mapping.
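The RFLP screen and the subsequent family analysis described above can be sketched as follows. This is an illustrative sketch only: the data layout, the 0.1 kb sizing resolution, and all function names are assumptions introduced here, and the inheritance check is limited to a single child with both parents typed.

```python
from typing import Dict, List, Tuple


def band_pattern(sizes_kb: List[float], resolution_kb: float = 0.1) -> Tuple[int, ...]:
    """Reduce a lane to a sorted tuple of size bins so that small sizing
    differences between lanes are not scored as polymorphisms."""
    return tuple(sorted({round(s / resolution_kb) for s in sizes_kb}))


def group_by_pattern(panel: Dict[str, List[float]],
                     resolution_kb: float = 0.1) -> Dict[Tuple[int, ...], List[str]]:
    """Group individuals that show the same hybridizing fragments."""
    groups: Dict[Tuple[int, ...], List[str]] = {}
    for individual, sizes in panel.items():
        groups.setdefault(band_pattern(sizes, resolution_kb), []).append(individual)
    return groups


def is_polymorphic(panel: Dict[str, List[float]], resolution_kb: float = 0.1) -> bool:
    """A probe/enzyme combination is a candidate RFLP when unrelated
    individuals show more than one band pattern."""
    return len(group_by_pattern(panel, resolution_kb)) > 1


def mendelian_consistent(child: Tuple[str, str], mother: Tuple[str, str],
                         father: Tuple[str, str]) -> bool:
    """True when the child's two alleles can be explained by one allele
    from each parent."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


if __name__ == "__main__":
    # Hypothetical panel: fragment sizes (kb) detected by one probe in one digest.
    panel = {"ind1": [4.0], "ind2": [4.0, 2.4], "ind3": [2.4]}
    print(group_by_pattern(panel))
    print(is_polymorphic(panel))
    print(mendelian_consistent(("A1", "A2"), ("A1", "A1"), ("A2", "A3")))
```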
The clones of this invention, therefore, have two essential characteristics: 1) they span regions of the chromosome containing a recognition sequence for a rare cutting restriction endonuclease, and 2) they contain a DNA sequence polymorphism. A clone which satisfies characteristic 1) can be referred to as a linking clone because it would hybridize to (or link) two adjacent fragments from a total digest of chromosome specific
human DNA with the restriction enzyme which recognizes the CpG containing sequence. To be maximally useful for mapping purposes, the clones should be spaced from one another along the chromosome at a distance of less than 10 cM, and optimally 2-5 cM.
The isolation of clones containing recognition sequences for rare cutting restriction endonucleases offers another advantage. It has been reported that the dinucleotide CpG tends to appear in clusters associated with the 5' ends of eukaryotic genes. These clusters, often referred to as HTF islands (an abbreviation for Hpa II tiny fragment islands), are discussed in two review articles by Bird (Nature 321:209-213 (1986); TIG 3:342-347 (1987)). Clones containing such sequences are, therefore, enriched in DNA sequences corresponding to the 5' ends of genes, thereby offering a convenient method for isolating and cloning a eukaryotic gene or genes.
The method involves providing a DNA library of genomic DNA clones containing insert DNA from the
eukaryotic organism of interest. As described previously, such a library can be constructed, for example, by isolating DNA and cloning fragments of the DNA into an appropriate vector. When preparing such a library for the purpose of isolating a eukaryotic gene, an important consideration is the size of the DNA insert. Eukaryotic genes can contain multiple introns which do not encode any portion of the protein encoded by the gene, but rather, are excised from the transcribed mRNA prior to translation.
For example, the factor VIII gene in the human, which encodes the blood-clotting factor deficient in hemophilia A, has been reported to span at least 190 kb (Gitschier et al., Nature 312:326 (1986)). Recent reports indicate that the defective gene responsible for Duchenne's muscular dystrophy may span more than 1,000,000 base pairs (Monaco et al., Nature 323:646 (1986)). Therefore, in order to minimize the number of clones which must be screened in order to isolate the gene of interest, the insert DNA is preferably greater than 20 kb in length. Any cloning vector which can accommodate insert DNA of 20 kb or greater is useful for the construction of a genomic library to be screened by the method of this invention.
A preferred vector for the construction of the genomic library is a cosmid vector which can accommodate DNA fragments ranging from 30 to 50 kb. As discussed above, bacteriophage vectors or YAC cloning vectors are also useful for this purpose.
DNA from individual genomic DNA clones contained within the genomic DNA library is purified using known techniques. The purification of cosmid DNA is relatively straightforward and well known in the art. The purification of a yeast artificial chromosome is more
technically demanding. One way in which a YAC can be purified is to run a DNA sample containing all yeast chromosomes on a low melting agarose gel by pulsed field agarose gel electrophoresis. This electrophoretic technique enables the resolution of very large DNA molecules. The YAC is then isolated from the other DNA bands in the gel and purified from the gel material using known techniques.
The purified DNA is then subjected to digestion by a rare cutting restriction enzyme. Many rare cutting restriction endonucleases are known to those skilled in the art. Preferably the enzyme's recognition sequence comprises a sequence of at least 6 bases. Especially preferred is the restriction enzyme NotI.
Once a clone containing a rare cutting restriction endonuclease recognition sequence is isolated, the insert must be further characterized to identify the portion of interest containing the gene. This can be done, for example, by restriction enzyme mapping and DNA
sequencing. The sequence of the coding region can then be compared with sequences recorded in gene bank data bases. Using this approach, a genetic locus can be assigned to a gene whose sequence, or a portion thereof, has been determined previously.
This method for isolating a gene is particularly useful when attempting to isolate a gene of interest for which no portion of the nucleotide sequence is known and the identity of the encoded protein is unknown. This is the case, for example, in the study of many human genetic disorders. In the case of such human genetic disorders, a gene is known to be responsible for a disease
phenotype, but the identity of the defective protein is unknown. In these situations, unless there are
associated gross changes in the chromosomal architecture (e.g. deletion, translocation or inversion) which can be detected by cytogenetic methods, efforts toward
localization are limited to studies of genetic linkage in families. This type of analysis typically yields a map resolution only to within several million base pairs. By
providing a genomic library which is specific for the chromosome known to carry the defect, it is possible to reduce the number of clones which theoretically must be screened in order to expect to have a high probability of identifying the clone carrying the gene responsible for the defect.
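The reduction in screening effort can be made concrete with the standard library-coverage estimate commonly attributed to Clarke and Carbon, N = ln(1 - P) / ln(1 - f), where P is the desired probability of recovering a given sequence and f is the ratio of insert size to target size. The sketch below is illustrative only: the formula is not stated in this description, it assumes randomly generated and equally represented clones, and the genome and chromosome sizes in the example are hypothetical round numbers.

```python
import math


def clones_needed(prob: float, insert_bp: float, target_bp: float) -> int:
    """Number of random clones to screen so that a given sequence is present
    with probability `prob`, for inserts of `insert_bp` drawn from a target
    of `target_bp` (a whole genome or a single chromosome)."""
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    if not 0.0 < insert_bp < target_bp:
        raise ValueError("insert must be smaller than the target")
    f = insert_bp / target_bp
    # log1p keeps precision when f is very small.
    return math.ceil(math.log(1.0 - prob) / math.log1p(-f))


if __name__ == "__main__":
    # Hypothetical sizes: a 3 x 10^9 bp genome versus a 1 x 10^8 bp chromosome.
    for label, size in (("whole genome", 3e9), ("single chromosome", 1e8)):
        print(label, clones_needed(0.99, 40_000, size))
```

Under these assumptions the number of clones scales roughly with the size of the target, which is why a chromosome specific library, or larger inserts, reduces the screening burden.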
When the genetic map position of a gene of interest is known, the clones identified which contain recognition sequences for rare cutting enzymes can be labeled and hybridized to human chromosomes in situ as described, for example, by Lichter et al. (Hum. Genet. 80:224-234 (1988)). Those clones hybridizing near the location of the gene of interest as determined by genetic mapping represent candidate clones which are analyzed to determine whether they, in fact, encode the gene responsible for the disease phenotype. A variety of strategies can be used to determine whether a candidate clone is, in fact, the gene responsible for the disease phenotype. For example, the clone, or a portion thereof, can be labeled with a reporter group and used to study tissue distribution of complementary mRNA. As discussed in the Exemplification which follows, the gene responsible for autosomal dominant polycystic kidney disease (ADPKD) is known to be manifested in kidney cells. Clones which hybridize to mRNA specifically expressed in kidney cells can be selected for further analysis. For example, such a clone can be used to probe cDNA libraries generated from two sources: individuals having autosomal dominant polycystic kidney disease and individuals not having the disease. Both of the genes isolated from these sources can then be sequenced, and the nucleotide change(s) responsible for the disease phenotype can then be determined.
EXEMPLIFICATION
The Exemplification which follows sets forth a strategy for the isolation and mapping of an ordered set of discrete DNA sequences for the physical and genetic mapping of individual human chromosomes. The goal of this work is to generate a collection of chromosome-specific cosmid clones that: 1) span the chromosome; 2) contain the recognition sequence of the restriction enzyme Not I; and 3) identify Mendelian RFLPs. By providing both genetic and physical mapping data, this ordered set of discrete DNA sequences will serve to integrate existing physical and genetic chromosome maps.
The overall efficiency of this strategy depends on: 1) the distribution of Not I sites on a given chromosome; 2) the extent to which the corresponding genomic Not I site is cleavable by that enzyme; and 3) the degree to which the Not I containing clones are genetically polymorphic. To address these questions, a collection of cosmid clones spanning Not I sites on human chromosome 16 has been established. Presented below are the results of the initial molecular and genetic characterization of six of these clones.
The ordered set of discrete DNA sequences derived from this model study of chromosome 16 is useful for the study of chromosomal abnormalities. For example, autosomal dominant polycystic kidney disease (ADPKD) is a common genetic disorder with a frequency of 1 per 1,000 in populations of European origin. It is an important cause of chronic renal failure in Europe and the United States, accounting for approximately 10% of all long-term kidney dialysis and transplantation.
It is known that the genetic defect which is responsible for the ADPKD phenotype is linked with both α-globin and phosphoglycolate phosphatase (PGP), thus assigning the locus for the disease to the short arm of chromosome 16 (16p). Physical mapping studies have further refined the localization of this gene to a 600 kb region of 16p13.3. In the Exemplification which follows, a strategy for cloning the ADPKD gene is presented along with results which validate the strategy.
Methods and Materials
Cell Lines and DNAs
Human Epstein-Barr virus (EBV)-transformed lymphoblastoid cell lines were from our laboratory collection. The hypomethylated human lymphoblastoid cell line TIL-1 was the gift of Dr. Susan Lindsay (Lindsay et al., Hum. Genet. 81:252-256 (1989)). The mouse-human chromosome 16 somatic cell hybrid lines have been described previously (Callen, D.F., Ann. Genet. 29:235-239 (1986); Callen et al., Genomics 2:144-153 (1988); Callen et al., Genomics 4:348-354 (1989)). Table 1 shows the source of the translocation, the original karyotype, and the laboratory name of these hybrid cell lines.
EBV-transformed lymphoblastoid cell lines were grown in Iscove's Modified Dulbecco's Medium supplemented with 10% horse serum. The hybrid cell lines were grown in F12 medium supplemented with 10% fetal calf serum, 5 x 10^-5 M adenine, and 4 μg/ml azaserine. Mouse cell line A9 was grown in a similar manner.
DNA was isolated from cultured cell lines by
standard phenol/chloroform extraction following proteinase K digestion. High molecular weight DNA for cosmid library construction or pulsed field gel electrophoresis was prepared as described below.
* The breakpoint of CY7 is distal to the breakpoint of CY8. Both these breakpoints are in 16q13.
** The identification of additional human material was not possible because of the presence of unidentifiable human marker chromosomes and translocations between mouse and human chromosomes.
TABLE 1 (Continued)
1. Callen, D.F., A mouse/human hybrid cell panel for mapping human chromosome 16. Ann. Genet. 29:235-239
(1986).
2. Callen, D.F., Hyland, V.J., Baker, E.H., Fratini, A., Simmers, R.N., Mulley, J.C., Sutherland, G.R., Fine mapping of gene probes and anonymous DNA fragments to the long arm of chromosome 16, Genomics 2:144-153 (1988).
3. Callen, D.F. (unpublished).
4. Koeffler, H.P., Sparkes, R.S., Stang, H., Mohandas, T., Regional assignment of genes for human alpha-globin and phosphoglycolate phosphatase to the short arm of chromosome 16. Proc. Natl. Acad. Sci. 78:7015-7018 (1981).
5. Derived from cell line of Breuning, M.H., Medan, K., Verjaal, M., Wijnen, J.T., Meera Klou, P., Pearson, P.L., Human globin maps to pter-p13.3 in chromosome 16 distal to PGP, Hum. Genet. 76:287-289 (1987).
6. Callen, D.F. et al., Mapping the Short Arm of Human Chromosome 16, Genomics 4:348-354 (1989).
Vector and Cosmid Library Construction
The cosmid cloning vector cHC1 was derived from the high-copy number, double cos vector c2XBHC (Bates and Swift, Gene 26:137-146 (1983); Bates, P.F., Methods in Enzymology 153:82-94 (1987)) and the "walking easy" vector pWE15 (Stratagene, La Jolla, CA). The small NotI fragment of pWE15, which contains the T3 and T7 promoters and the BamHI cloning site, was enzymatically inserted into a derivative of c2XBHC (gift of Dr. Paul Bates) encoding a single NotI site. The NotI site was created at the single BamHI site of the c2XBHC vector by linker ligation.
A cosmid library was constructed as described by Bates and Swift (Gene 26:137-146 (1983)) using cell line CY18 (Callen, D.F., Ann. Genet. 29:235-239 (1986)) and vector cHC1. High molecular weight DNA was isolated from CY18 cells by proteinase K digestion and very gentle phenol/chloroform extraction followed by dialysis. The resultant DNA was greater than 150 kb as judged by pulsed field gel electrophoresis. The vector was digested with SmaI, dephosphorylated with calf intestinal phosphatase (Boehringer Mannheim), and digested with BamHI. The insert was partially digested with Sau3A and similarly dephosphorylated. Ligation was carried out using 1 μg of vector arms and 1.5 μg of target DNA in a final volume of 5 μl. Reactions were incubated with 200 units of T4 DNA Ligase (New England Biolabs) at room temperature for 4 hours. The DNA was packaged using Gigapack Plus I (Stratagene, La Jolla, CA), titered on the host 1046 (Cami et al., Nucl. Acids. Res. 5:2381-2390 (1978)), and plated on LB agar plates containing 50 μg/ml ampicillin.
Library Screening
Colonies were plated at low density (1,000 colonies/150 mm plate) onto LB agar containing 50 μg/ml ampicillin. The colonies were transferred onto nylon membrane disks in duplicate and processed as described by DiLella and Woo (Meth. Enzymol. 152:199-212 (1987)).
Colony filters were probed with nick-translated 32P-labeled human DNA (0.5-1 x 10^6 cpm/ml). Hybridizations and washes were performed as described for Southern blots (see below). Autoradiography was done with Kodak XAR-5 film and an intensifying screen overnight at -70°C. DNA was prepared from the human clones using the rapid boiling miniprep procedure (Holmes and Quigley, Anal. Biochem. 114:193-197 (1981)), digested with NotI, and electrophoresed on a 0.7% agarose gel in 1X TBE buffer.
For chromosome walking, colony filters were prepared as described above. Radiolabeled RNA probes were transcribed from the bacteriophage T3 and T7 RNA promoters present in the vector (ref) and hybridized at 1-10 x 10^6 cpm/ml. Total torula RNA (0.2 mg/ml) was added as competitor.
Restriction Mapping
Cosmid clones were mapped for the rare cutting enzymes BssHII, MluI, NotI, NruI, PvuI and SacII. All enzymes were obtained from New England Biolabs (Beverly, MA) and digestions were performed according to manufacturer's recommendations. Mapping was performed by the single and double enzyme digestion method or the partial digestion method of Smith and Birnstiel (Nucl. Acids. Res. 3:2387-2398 (1976)), using labeled oligonucleotides to the T3 and T7 RNA promoter sequences bordering the insert. Digests were run on two types of gels to optimize sizing: 0.7% agarose in 1X TBE overnight at 60V or 0.4% agarose in 1X TAE at 40V for 16-40 hours.
Blotting was done bi-directionally to allow for accurate comparison between hybridizations.
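The logic of partial digestion mapping with end-specific probes can be sketched as follows. Each fragment detected by a probe for one end of the insert runs from that end to a cleavage site, so its length gives the site position directly; the ladder seen from the opposite end gives the same positions measured from the other side. The function name, the 0.5 kb merging tolerance, and the example sizes are hypothetical and are introduced only for illustration.

```python
from typing import List


def sites_from_partial_digest(insert_len_kb: float,
                              from_t3_kb: List[float],
                              from_t7_kb: List[float],
                              tol_kb: float = 0.5) -> List[float]:
    """Estimate site positions (kb from the T3 end) from two partial-digest
    ladders, one detected with a T3-end probe and one with a T7-end probe.

    Fragments equal to the full insert length (uncut insert) are ignored.
    Estimates from the two ends that agree within tol_kb are averaged.
    """
    candidates = list(from_t3_kb) + [insert_len_kb - s for s in from_t7_kb]
    candidates = sorted(p for p in candidates if 0.0 < p < insert_len_kb)

    merged: List[float] = []
    cluster: List[float] = []
    for pos in candidates:
        # Start a new cluster when the gap to the previous estimate is too large.
        if cluster and pos - cluster[-1] > tol_kb:
            merged.append(sum(cluster) / len(cluster))
            cluster = []
        cluster.append(pos)
    if cluster:
        merged.append(sum(cluster) / len(cluster))
    return merged


if __name__ == "__main__":
    # Hypothetical 40 kb insert with ladders read from each end.
    print(sites_from_partial_digest(40.0, [10.0, 25.0, 40.0], [15.2, 30.0, 40.0]))
```

Reading the ladder from both ends gives two independent estimates of each site, which helps most for sites far from one end, where large fragments are sized less accurately.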
Southern Blot Analysis
Five μg of genomic DNA were digested with excess restriction enzyme and fractionated on a 0.8% agarose gel in 1X TBE. The DNA in the gel was nicked by partial depurination in 0.25 M HCl, denatured in 0.5 M NaOH/1.5 M NaCl, and transferred to nylon membrane (e.g., Magnagraph from MSI, Westboro, MA or SureBlot from Oncor, Gaithersburg, MD). After transfer, the membrane was rinsed in 2X SSC, air-dried, and baked in vacuo for 2 hours at 80°C.
32P-labeled probes were prepared either by nick translation (Rigby et al., J. Mol. Biol. 113:237-251 (1977)) or random priming (Feinberg and Vogelstein, Anal. Biochem. 132:6-13 (1983); Feinberg and Vogelstein, Addendum, 137:266-277 (1984)). Repetitive elements present in some probes were competed out by pre-annealing of the labeled probe with excess sonicated human placental DNA (Scambler et al., Nucl. Acids Res. 15:3639-3651 (1987)). Hybridizations were carried out as described by Church and Gilbert (Proc. Natl. Acad. Sci. USA 81:1991-1995 (1984)). Filters were washed at high stringency (0.1X SSC/0.1% SDS at 65°C) unless otherwise stated in the text. Autoradiography was carried out at -70°C with Kodak XAR-5 film and two intensifying screens.
RFLP Analysis
RFLP panels contained digests of DNAs isolated from lymphoblastoid cell lines of six unrelated individuals. The initial enzyme set included BglII, BclI, EcoRI, HindIII, MspI, PstI, PvuII, SacI, and TaqI. When a clone failed to identify an RFLP with these initial nine enzymes, the analysis was extended to include EcoRV, HhaI, RsaI, StuI, and XbaI. Southern blot analysis was carried out as described above. Mendelian inheritance was confirmed in seven 2-generation Caucasian families.
Pulsed Field Gel Electrophoresis
High molecular weight cellular DNA was encased in agarose blocks as described by Hermann et al. (Cell 48:813-825 (1987)). Each 100 μl block contained the DNA from 2 x 10^6 IG138 cells or TIL-1 cells, approximately 10 μg. Restriction digests were carried out as described by Anand, R. (TIG Nov. 278-283 (1986)). Half-block samples were analyzed by contour clamped homogeneous electric field (CHEF) electrophoresis (Chu et al., Science 234:1582-1585 (1986)) using a custom-made apparatus (OWL Scientific Plastics, Inc., Cambridge, MA). The gel composition was 0.7% FastLane agarose (FMC Bioproducts, Rockland, ME) in 0.5X TBE buffer. Electrophoresis was carried out in 0.5X TBE buffer for 16 hours at 180V with a switching interval of 60 seconds. The temperature was maintained at 15°C. Size markers were chromosomes of S. cerevisiae and lambda ladders purchased from Bio-Rad (Burlingame, CA). Southern blot analysis was carried out as described above.
In Situ Hybridization
Fluorescent in situ hybridization was carried out as detailed by Lichter et al. (Science 247:64-69 (1990)).
Metaphase spreads were prepared from normal cultured lymphocytes (46, XY) by standard procedures of colcemid arrest, hypotonic treatment, and acetic acid-methanol fixation. Cosmid probes were prepared by direct nick translation with biotinylated nucleotides. To facilitate probe penetration and to optimize reannealing, the size of the probe DNA was adjusted empirically to a length of 150-250 nucleotides by varying the DNase concentration in the nick translation reaction. Slide preparations were routinely counterstained with 200 ng/ml 4',6-diamidino-2-phenylindole dihydrochloride (DAPI) in 2X SSC for 5 minutes at room temperature and mounted in 20 mM Tris-HCl (pH 8.0)/90% glycerol containing 2.3% of the antifade agent 1,4-diazabicyclo[2.2.2]octane. Preparations were visualized on a Zeiss photomicroscope equipped for DAPI and FITC epifluorescence optics, as well as conventional bright field microscopy. Photographs were taken with Kodak Ektachrome 400 (color) film.
RESULTS
Library Construction and Screening
Cosmid vector cHC1, shown in Figure 1, is 6 kb in size and has a cloning capacity of 35 to 50 kb. As in its pWE parent, the T3 and T7 promoters flanking the BamHI cloning site allow synthesis of end-specific RNA probes for chromosome walking and mapping. The NotI sites flanking the cloning site allow excision of the cloned DNA insert. Starting with 1.5 μg of CY18 genomic DNA, we used cHC1 to construct a cosmid library of the cell line CY18. CY18 is a mouse-human somatic cell hybrid containing human chromosome 16 as its only human component. The library construction yielded 1 x 10^6 independent colonies with an average insert size of 41.3 kb.
Approximately 1% (94/10,000) of the clones were identified as human by hybridization to radiolabeled total human DNA. Miniprep DNA was prepared from the positive colonies and digested with NotI. NotI cuts out the insert to yield two fragments in those clones without internal NotI sites and 3 or more fragments in those clones with internal NotI sites. Out of 94 human clones, 20 had internal NotI sites: 15 had a single site; 2 had two sites; 2 had three sites; and 1 had 4 sites.
Restriction analysis verified that all 20 clones are independent isolates.
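The screening figures reported above can be tallied in a short sketch. The counts are taken from the text; the per-site spacing is only a rough estimate, since it assumes that the library-wide average insert size of 41.3 kb also applies to the 94 human clones. The function and key names are hypothetical.

```python
from typing import Dict


def summarize_screen(human_clones: int, clones_by_site_count: Dict[int, int],
                     mean_insert_kb: float) -> Dict[str, float]:
    """Summarize a rare cutter screen.

    clones_by_site_count maps the number of internal sites to the number of
    clones observed with that many sites.
    """
    clones_with_sites = sum(clones_by_site_count.values())
    total_sites = sum(n * count for n, count in clones_by_site_count.items())
    screened_kb = human_clones * mean_insert_kb  # assumes the mean applies to human clones
    return {
        "clones_with_sites": clones_with_sites,
        "total_sites": total_sites,
        "fraction_with_sites": clones_with_sites / human_clones,
        "kb_screened_per_site": screened_kb / total_sites,
    }


if __name__ == "__main__":
    # Counts reported for the chromosome 16 library screen.
    summary = summarize_screen(94, {1: 15, 2: 2, 3: 2, 4: 1}, 41.3)
    for key, value in summary.items():
        print(key, round(value, 3))
```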
Regional Localization
To verify the chromosomal origin of the clones and to gain initial mapping information, the 20 cosmid clones were biotinylated and used as probes in fluorescent in situ hybridization analysis of human metaphase chromosomes. Hybridization was carried out under conditions that suppress signal from repetitive DNA sequences.
Chromosome 16 was identified by hybridization with a chromosome 16-specific alpha satellite DNA clone (Oncor) and by its DAPI-staining pattern. Each clone hybridized exclusively to chromosome 16. The results of this analysis show that the 20 clones are not randomly distributed over the chromosome: 9 map to the long arm and 11 to the short arm. Of the 11 short-arm probes, 7 map to 16p13.3 to 16pter. From this collection of 20 clones, we chose six clones from six distinct chromosomal regions for the studies described below: 16-4N (D16S268), 16-14N (D16S273), 16-30N (D16S271), 16-38N (D16S270), 16-129N (D16S272), and 16-132N (D16S269).
These six clones were also localized with respect to their position on chromosome 16 by hybridization to the somatic cell hybrid mapping panel described in Table 1. Single copy fragments or fragments containing low levels of repetitive sequence elements were isolated from each clone and used as probes. The results of this analysis are summarized in Table 2. As expected, all six clones hybridized to the parental cell line CY18 and at least one other hybrid cell line. None of the probes hybridized to mouse DNA (cell line A9).
Rare Cutter Restriction Map
Figure 2 shows the restriction enzyme maps for the set of 6 NotI-containing cosmids and overlapping clones isolated by chromosome walking (designated by "W" in the clone name). The maps place the sites for the rare cutting restriction enzymes BssHII, MluI, NotI, NruI, PvuI, and SacII. The clustering of sites for these enzymes is indicative of CpG-rich HTF islands. The NotI sites present in cosmids 16-4N, 16-30N and 16-129N are in close proximity to 2 or more rare cutting restriction enzyme sites, and are most likely island-related. This is not the case for the NotI sites present in the other cosmids. One possible explanation is that these NotI sites are situated in CpG-rich regions which do not encode sites for these or other rare cutting restriction enzymes. HTF island-like regions lacking NotI sites are present in cosmids 30N and 132N.
Linking Clones
To establish the methylation status of each cloned NotI site in genomic DNA and, thus, to identify NotI linking clones, we hybridized the cosmids or cosmid-derived probes to Southern blots containing EcoRI and EcoRI/NotI digests of genomic DNA. The source of DNA for these experiments was either IG138, a human lymphoblast cell line, or TIL-1, a hypomethylated cell line recently described by Lindsay et al. (Hum. Genet. 81:252-256 (1989)). The results from this study are summarized in Table 3.
The 6 cosmids examined define 6 different loci containing 8 NotI sites. The NotI sites in the loci defined by 16-30N and 16-38N are unmethylated and digest to completion. In contrast, the NotI sites in the loci defined by 16-4N and 16-132N are not cleaved and are, therefore, probably methylated in both cell lines. Intermediate extents of methylation are observed at the NotI sites present in the loci defined by cosmids 16-14N and 16-129N. With the exception of the NotI sites in 16-14N, the extent of methylation of the specific NotI sites is the same for IG138 and TIL-1 DNA. In summary, four of the six clones (16-14N, 16-30N, 16-38N, and 16-129N) can be effectively used to link NotI restriction fragments for long-range physical mapping.
To demonstrate this, we used whole cosmids as probes against Southern blots of NotI-digested genomic DNA fractionated by CHEF gel electrophoresis. With optimal resolution and sufficient single-copy sequence content, this analysis should allow us to identify the NotI fragments on both sides of the NotI site. Figure 3 shows the results of hybridizing the six cosmid clones to CHEF blots of NotI-digested DNA.
Cosmid 16-30N contains 2 NotI sites which lie 25 kb apart and are unmethylated in genomic DNA. We expect 16-30N to anneal to 3 genomic NotI fragments, but in this experiment we see only 2 NotI fragments (25 kb and 105 kb). In the case of 16-38N, the whole cosmid hybridizes to a single, resolvable NotI fragment of 150 kb. The location of the NotI site in cosmid 38N should have allowed for the detection of both the leftward and rightward NotI fragments. We conclude that either the leftward fragment ran off the CHEF gel or the missing NotI fragment is 1600 kb or greater and, consequently, unresolved using these electrophoretic conditions. Cosmid 16-129N encodes a single NotI site which is substantially unmethylated (70% cleavage) in the DNA of IG138 and TIL-1 cells.
Whole cosmid hybridization to CHEF blots reveals 2 resolvable NotI fragments of 735 and 200 kb. These fragments may be contiguous or represent partial digestion products.
Cosmid 16-132N encodes a NotI site which is fully methylated in genomic DNA and hybridizes to a fragment which is not resolved using these electrophoretic conditions. Cosmid 16-4N hybridizes to a single NotI fragment, consistent with the fully methylated state of the NotI site encoded by this cosmid. The two loci defined by these two cosmids form an interesting comparison, since neither of their genomic NotI sites is cleavable, yet the cosmid 16-4N NotI site appears to be in an HTF island while the NotI site encoded by cosmid 132N exists as an isolated rare cutting restriction enzyme site. Cosmid 16-14N hybridizes to a 160 kb NotI fragment in TIL-1 DNA but to an unresolved fragment in IG138 DNA. This example probably reflects local methylation differences between the two cell lines at this locus.
RFLP Analysis
Four of the six clones recognize RFLPs. Mendelian inheritance was demonstrated in 7 two-generation families. This information is summarized in Table 4. 4-15, a 2.4 kb fragment of cosmid 16-4N, recognizes SacI and XbaI RFLPs. Although cosmid 16-4N was not used as probe in this study, cosmid 16-4NW1, isolated by chromosome walking from this locus, recognizes EcoRI and BglII RFLPs. All four RFLPs at the 4N locus identify a single haplotype in the 7 families studied. 14-3, a 1.5 kb fragment of cosmid 16-14N, recognizes a PstI polymorphism with at least 8 alleles. Because of its location with respect to the PKD-1 gene, this marker may prove useful in testing for autosomal dominant polycystic kidney disease (ADPKD). 129-16, a 5 kb fragment of cosmid 16-129N, recognizes a PvuII RFLP. Cosmid 16-132N recognizes an EcoRI RFLP when the whole cosmid is used as probe.
Cosmid clones 16-30N and 16-38N failed to identify polymorphisms with the 14 restriction enzymes used in this study. A cosmid derived from a 34 kb walk from the 16-30N locus (16-30NW6) also failed to identify any RFLPs, as did fragments isolated from these clones.
Clone 16-38N hybridizes with the oligonucleotide (CA)10 and, thus, may identify a microsatellite polymorphism. Clone 16-30N does not hybridize with this oligonucleotide.
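Where sequence for a clone is available, an in silico analogue of the (CA)10 hybridization screen is to search for runs of the CA dinucleotide. The sketch below is an illustration under stated assumptions rather than the procedure used here: the function name, the threshold of 10 repeat units, and the example sequence are all hypothetical. Because a (CA)n tract reads as (TG)n on the opposite strand, both motifs are searched.

```python
import re


def find_ca_repeats(sequence: str, min_units: int = 10):
    """Return (start, end, units) for each (CA)n or (TG)n run with n >= min_units."""
    pattern = re.compile(
        r"(?:CA){%d,}|(?:TG){%d,}" % (min_units, min_units)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        units = (match.end() - match.start()) // 2
        hits.append((match.start(), match.end(), units))
    return hits


# Hypothetical test sequence: 12 CA units flanked by unrelated bases.
example = "GGATCC" + "CA" * 12 + "TTAGC"
print(find_ca_repeats(example))
```

A hit only marks a candidate microsatellite; whether the tract is polymorphic still has to be established by typing it in families.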
An intermediate goal of this project was to identify Not I linking clones that could serve both as physical and genetic markers for the mapping of a chromosome. Of the 6 cosmid clones selected for characterization, 4 were determined to be Not I linking clones, and of these 4 linking clones, two detect RFLPs. Thus, one-third of the clones yield all of the desired information. To increase the efficiency of the screening process, a linking library can be constructed which contains only Not I sites that are cleavable in genomic DNA. Together, the two libraries should produce a sufficiently large pool of linking clones in which to search for genetic markers.
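The yield figures in the preceding paragraph can be checked with a few lines of arithmetic. This is a minimal sketch; the variable names and the helper `clones_to_screen` are illustrative assumptions, and extrapolating from six clones to a larger screen assumes the same proportions hold, which the text does not establish.

```python
import math
from fractions import Fraction

clones_characterized = 6
linking_clones = 4     # clones determined to be Not I linking clones
linking_with_rflp = 2  # linking clones that also detect RFLPs

linking_yield = Fraction(linking_clones, clones_characterized)
dual_marker_yield = Fraction(linking_with_rflp, clones_characterized)

print("linking clones:", linking_yield)
print("physical + genetic markers:", dual_marker_yield)


def clones_to_screen(markers_wanted: int, marker_yield: Fraction) -> int:
    """Clones to characterize to expect a given number of dual-purpose markers."""
    return math.ceil(markers_wanted / marker_yield)


print(clones_to_screen(10, dual_marker_yield))
```

Raising the fraction of linking clones, as the proposed library of genomically cleavable Not I sites is meant to do, lowers the number of clones that must be characterized per usable marker.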
Clone 16-14N (D16S272) maps to 16p13.3-16p13.13 and thus represents a candidate PKD-1 clone. A variety of strategies can be adopted to determine whether this candidate clone is, in fact, the gene responsible for the disease phenotype. For example, clone 16-14N, or a portion thereof, can be labeled with a reporter group and used to study tissue distribution of complementary mRNA. The gene responsible for ADPKD is manifested in kidney cells, so clones which hybridize to mRNA specifically expressed in kidney cells can be selected for further analysis. Such a clone (or a portion thereof) can then be used to probe cDNA libraries generated from two sources: individuals having autosomal dominant polycystic kidney disease and individuals not having the disease. The genes isolated from both sources can then be sequenced, and the nucleotide change(s) responsible for the disease phenotype determined.
The other clones discussed above which do not map to the ADPKD locus can be analyzed by determining their DNA sequence and comparing that sequence with the sequences recorded in gene bank data bases. Using this approach, a genetic locus can be assigned to proteins of known sequence, as well as those of unknown sequence.
Claims
1. A method for ordering a set of discrete DNA sequences for physical and genetic mapping, the method comprising:
a) providing a set of discrete DNA sequences, each discrete DNA sequence being complementary to a region of a eukaryotic chromosome;
b) determining the order of the discrete sequences on the chromosome by in situ hybridization; and
c) identifying discrete DNA sequences which contain a restriction enzyme recognition sequence containing the dinucleotide CpG and a polymorphic DNA sequence.
2. A method of Claim 1 wherein the set of discrete DNA sequences complementary to a chromosome is a chromosome specific genomic DNA library.
3. A method of Claim 2 wherein the chromosome specific genomic DNA library is a cosmid library.
4. A method of Claim 2 wherein the chromosome specific genomic DNA library is constructed in a bacteriophage vector.
5. A method of Claim 2 wherein the chromosome specific genomic DNA library is constructed in a yeast artificial chromosome.
6. A method of Claim 2 wherein the chromosome specific genomic DNA library is a cosmid library containing inserts from human chromosome 16.
7. A method of Claim 1 wherein the restriction enzyme recognition sequence is recognized by the restriction enzyme Not I.
8. A method of Claim 1 wherein the DNA sequence polymorphism is detectable as an RFLP.
9. A cosmid clone containing:
a) a DNA sequence which contains a recognition sequence for a restriction endonuclease which contains the dinucleotide CpG; and
b) a DNA sequence polymorphism.
10. A method for isolating a gene from a eukaryotic organism of interest, the method comprising:
a) providing a DNA library of genomic DNA clones containing insert DNA from the eukaryotic organism of interest;
b) purifying DNA from individual genomic DNA clones contained within the DNA library and digesting the purified DNA with at least one restriction enzyme which recognizes and cleaves a nucleotide sequence which contains the dinucleotide CpG;
c) displaying the products of the restriction enzyme digestion reaction on a gel; and
d) identifying genomic DNA clones having insert DNA which is recognized and cleaved by the restriction enzyme of step b).
11. A method of Claim 8 wherein the genomic DNA library is constructed within a yeast artificial chromosome.
12. A method of Claim 8 wherein the chromosome specific genomic DNA library is constructed in a bacteriophage vector.
13. A method of Claim 8 wherein the genomic DNA library is constructed in a cosmid vector.
14. A method of Claim 13 wherein the genomic DNA library is constructed within the cosmid vector of Figure 1.
15. A method of Claim 8 wherein the restriction enzyme recognizes and cleaves a DNA sequence of 6 or more base pairs.
16. A method of Claim 8 wherein the restriction enzyme is Not I.
17. A method of Claim 8 wherein the eukaryotic organism of interest is a human.
18. A method of Claim 17 wherein the genomic library is chromosome specific.
19. A method for isolating a gene of interest having a known genetic map position from a eukaryotic organism of interest, the method comprising:
a) providing a DNA library of genomic DNA clones containing insert DNA from the eukaryotic organism of interest;
b) purifying DNA from individual genomic DNA clones contained within the DNA library and digesting the purified DNA with at least one restriction enzyme which recognizes and cleaves a nucleotide sequence which contains the dinucleotide CpG;
c) displaying the products of step b) on a gel;
d) identifying individual genomic DNA clones having insert DNA which is recognized and cleaved by the restriction enzyme of step b);
e) labeling clones identified in step d) with a reporter group and determining the map position of the complementary chromosomal region for each clone by in situ hybridization; and
f) identifying a candidate clone as one which hybridizes near the location of the gene of interest as determined by genetic mapping.
20. A method of Claim 19 wherein the genomic DNA library is constructed within a yeast artificial chromosome.
21. A method of Claim 19 wherein the genomic DNA library is constructed in a cosmid vector.
22. A method of Claim 19 wherein the genomic DNA library is constructed in a bacteriophage vector.
23. A method of Claim 21 wherein the genomic DNA library is constructed within the cosmid vector of Figure 1.
24. A method of Claim 19 wherein the restriction enzyme recognizes and cleaves a DNA sequence of 6 or more base pairs.
25. A method of Claim 19 wherein the restriction enzyme is Not I.
26. A method of Claim 19 wherein the eukaryotic organism of interest is a human.
27. A method of Claim 26 wherein the genomic library is chromosome specific.
28. A method of Claim 19 wherein the eukaryotic gene is responsible for a disease causing genetic disorder which results from the production of a defective protein and the identity of the defective protein is unknown.
29. A method of Claim 28 wherein the disease is autosomal dominant polycystic kidney disease.
30. A cosmid vector containing a site for insertion of DNA from a eukaryotic organism of interest, the insertion site being flanked by the nucleotide sequence CGGCCG.
31. The cosmid vector of Figure 1.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US51815690A | 1990-05-03 | 1990-05-03 | |
US518,156 | 1990-05-03 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1991017269A1 (en) | 1991-11-14 |
Family
ID=24062802
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1991/003006 WO1991017269A1 (en) | 1990-05-03 | 1991-05-02 | A method for mapping a eukaryotic chromosome |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO1991017269A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0518583A1 (en) * | 1991-06-13 | 1992-12-16 | Zeneca Limited | Nucleotide sequences |
EP0707660A1 (en) * | 1993-06-15 | 1996-04-24 | The Salk Institute For Biological Studies | Method for generation of sequence sampled maps of complex genomes |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1990012891A1 (en) * | 1989-04-14 | 1990-11-01 | Genmap, Inc. | Method of physically mapping genetic material |
Non-Patent Citations (9)
Title |
---|
Gene, vol. 79, 1989, Elsevier Science Publishers B.V., (BE), G.A. Evans et al.: "High efficiency vectors for cosmid microcloning and genomic analysis", pages 9-20, see the whole article; especially abstract; page 10, column 1, line 5 - page 10, column 2, line 10; page 13, column 1, line 7 - page 14, line 20 * |
Genomics, vol. 2, 1988, Academic Press, Inc., (San Diego, US), D.F. Callen et al.: "Fine mapping of gene probes and anonymous DNA fragments to the long arm of chromosome 16", pages 144-153, see abstract; page 144, column 1, line 32 - page 145, column 1, line 5; page 145, column 2, lines 3-21; page 150, column 2, line 14 - page 152, column 2, line 25 * |
Human Genetics, vol. 72, no. 1, January 1986, Springer-Verlag, (Berlin, DE), N.E. Buroker et al.: "Four restriction fragment length polymorphisms revealed by probes from a single cosmid map to human chromosome 12q", pages 86-94, see abstract; page 87, column 1, line 5 - page 90, column 1, line 6; page 93, column 1, line 51 - column 2, line 16 * |
Human Genetics, vol. 74, no. 4, 1986, Springer-Verlag, (Berlin, DE), L. Bufton et al.: "A highly polymorphic locus on chromosome 16q revealed by a probe from a chromosome-specific cosmid library", pages 425-431, see abstract; page 425, column 1, line 1 - column 2, line 31; page 427, column 1, line 8 - column 2, line 9 * |
Nature, vol. 325, 22 January 1987, MacMillan, (London, GB), A. Poustka et al.: "Construction and use of human chromosome jumping libraries from NotI-digested DNA", pages 353-355, see the whole article * |
Nucleic Acids Research, vol. 18, no. 23, (Oxford, GB), G.A.J. Gillespie et al.: "Cosmid walking and chromosome jumping in the region of PKD1 reveal a locus duplication and three CpG islands", pages 7071-7075, see the whole article; especially abstract * |
Science, vol. 247, 5 January 1990, P. Lichter et al.: "High-resolution mapping of human chromosome 11 by in situ hybridization with cosmid clones", pages 64-69, see the whole article; especially abstract; page 66, column 2, line 17 - page 67, column 2, line 4 * |
The EMBO Journal, vol. 6, no. 7, 1987, IRL Press Ltd, (Oxford, GB), G.A. Rappold et al.: "Identification of a testis-specific gene from the mouse t-complex next to a CpG-rich island", pages 1975-1980, see the whole article; especially abstract; page 1975, column 2, lines 1-29; page 1977, column 1, line 4 - column 2, line 9; page 1979, column 1, lines 7-39 * |
Trends in Genetics, vol. 2, July 1986, Elsevier Science Publishers B.V., (Amsterdam, NL), A. Poustka et al.: "Jumping libraries and linking libraries: the next generation of molecular tools in mammalian genetics", pages 174-178, see the whole article * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0518583A1 (en) * | 1991-06-13 | 1992-12-16 | Zeneca Limited | Nucleotide sequences |
EP0707660A1 (en) * | 1993-06-15 | 1996-04-24 | The Salk Institute For Biological Studies | Method for generation of sequence sampled maps of complex genomes |
EP0707660A4 (en) * | 1993-06-15 | 1999-11-17 | Salk Inst For Biological Studi | Method for generation of sequence sampled maps of complex genomes |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Kunkel et al. | Regional localization on the human X of DNA segments cloned from flow sorted chromosomes | |
JP3152917B2 (en) | Cancer evaluation method | |
Rommens et al. | Identification of the cystic fibrosis gene: chromosome walking and jumping | |
USH2191H1 (en) | Identification and mapping of single nucleotide polymorphisms in the human genome | |
Bruns et al. | Human apolipoprotein AI--C-III gene complex is located on chromosome 11. | |
Walter et al. | A method for constructing radiation hybrid maps of whole genomes | |
USH2220H1 (en) | Identification and mapping of single nucleotide polymorphisms in the human genome | |
Miniou et al. | Abnormal methylation pattern in constitutive and facultative (X inactive chromosome) heterochromatin of ICF patients | |
De Martinville et al. | Localization of DNA sequences in region Xp21 of the human X chromosome: search for molecular markers close to the Duchenne muscular dystrophy locus | |
JP3535159B2 (en) | Selective approach to DNA analysis | |
US5206137A (en) | Compositions and methods useful for genetic analysis | |
US20020198371A1 (en) | Identification and mapping of single nucleotide polymorphisms in the human genome | |
Giacalone et al. | A novel GC–rich human macrosatellite VNTR in Xq24 is differentially methylated on active and inactive X chromosomes | |
O'Connell et al. | Fine structure DNA mapping studies of the chromosomal region harboring the genetic defect in neurofibromatosis type 1 | |
Bates et al. | Microdissection of and microcloning from the short arm of human chromosome 2 | |
Sherlock et al. | Homologies between human and marmoset (Callithrix jacchus) chromosomes revealed by comparative chromosome painting | |
JPH08500723A (en) | Genome improper scanning | |
US4963663A (en) | Genetic identification employing DNA probes of variable number tandem repeat loci | |
EP0402400B1 (en) | Genetic identification employing dna probes of variable number tandem repeat loci | |
Green et al. | Integration of physical, genetic and cytogenetic maps of human chromosome 7: isolation and analysis of yeast artificial chromosome clones for 117 mapped genetic markers | |
JP2002537855A (en) | Compositions and methods for genetic analysis | |
US5939255A (en) | Yeast artificial chromosomes containing DNA encoding the cystic fibrosis (CFTR) gene | |
Wallace et al. | Direct construction of a chromosome-specific Not I linking library from flow-sorted chromosomes | |
Fearon et al. | c-Ha-ras-1 oncogene lies between beta-globin and insulin loci on human chromosome 11p. | |
Rogner et al. | A YAC clone map spanning 7.5 megabases of human chromosome band Xq28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): CA JP |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): AT BE CH DE DK ES FR GB GR IT LU NL SE |
|
NENP | Non-entry into the national phase |
Ref country code: CA |