WO1999045112A2 - Chromosome 11-linked coronary heart disease susceptibility gene CHD1

Info

Publication number
WO1999045112A2
Authority
WO
WIPO (PCT)
Prior art keywords
chd1
polypeptide
dna
nucleic acid
polypeptides
Prior art date
Application number
PCT/US1999/004682
Other languages
English (en)
Other versions
WO1999045112A3 (fr
Inventor
Dennis G. Ballinger
Wei Ding
Susanne Wagner
Mark A. Hess
Original Assignee
Myriad Genetics, Inc.
Application filed by Myriad Genetics, Inc.
Priority to AU30680/99A (AU3068099A)
Publication of WO1999045112A2
Publication of WO1999045112A3

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6893 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids related to diseases not provided for elsewhere
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/32 Cardiovascular disorders
    • G01N2800/324 Coronary artery diseases, e.g. angina pectoris, myocardial infarction
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Definitions

  • The present invention relates generally to the field of human genetics.
  • The present invention specifically relates to a human coronary heart disease susceptibility gene (CHD1), some alleles of which are related to susceptibility to coronary heart disease. More specifically, the present invention relates to germline mutations in the CHD1 gene and their use in the diagnosis of predisposition to coronary heart disease and to metabolic disorders, including hypoalphalipoproteinemia, familial combined hyperlipidemia, insulin resistant syndrome X or multiple metabolic disorder, obesity, diabetes and dyslipidemic hypertension.
  • The invention also relates to presymptomatic therapy of individuals who carry deleterious alleles of the CHD1 gene (including gene therapy, protein replacement therapy, and administration of protein mimetics and inhibitors).
  • The screening of drugs for coronary heart disease or metabolic disorder therapy is also within the scope of this invention.
  • The invention further relates to the screening in patients of the CHD1 gene for mutations; such screening is useful for diagnosing the predisposition to coronary heart disease and to metabolic disorders, including hypoalphalipoproteinemia, familial combined hyperlipidemia, insulin resistant syndrome X or multiple metabolic disorder, obesity, diabetes and dyslipidemic hypertension.
  • Binding assays utilizing the proteins of the invention.
  • Antibodies directed against protein products encoded by the CHD1 gene, hybridomas secreting the antibodies, and diagnostic kits comprising those antibodies.
  • Coronary heart disease (CHD) or coronary artery disease (CAD) is one of the major causes of death in the United States, accounting for about a third of all mortality.
  • Dyslipidemic phenotypes and their frequency in cases of early familial coronary disease include: familial hypercholesterolemia (high LDL cholesterol), 3-4%; Type III hyperlipidemia (Apo E2/E2 genotype), 0.5 to 3%; low HDL-cholesterol (HDL-C, also called hypobetalipoproteinemia), 20 to 30%; familial combined hyperlipidemia (FCH: high LDL-cholesterol and/or high triglycerides and/or high VLDL-cholesterol), 20-36%; familial hypertriglyceridemia, about 20%; high Lp(a), 16 to 19%; high homocysteine, 15 to 30%; or no known concordant risk factors, 10 to 20% (Williams, et al., 1990).
  • A familial history of CHD is a risk factor independent of known physiological abnormalities (Hopkins, et al., 1988; Jorde, et al., 1990).
  • Several metabolic disorders are associated with increased risk of CHD. These include familial dyslipidemic hypertension (Williams, et al., 1990; Williams, et al., 1993), insulin-dependent diabetes mellitus (IDDM), non-insulin-dependent diabetes mellitus (NIDDM), maturity onset diabetes of the young (MODY), insulin resistant syndrome X (Castro-Cabezas, et al., 1993; Kawamoto, et al., 1996; Landsberg, 1996; Hjermann, 1992; Vague and Raccah, 1992), hyperthyroidism and hypothyroidism (de Bruin, et al.
  • The disease-causing alleles may have low penetrance.
  • The diseases also develop over a large number of years, so that a relatively minor alteration in the function of the predisposing gene(s) can, over a lifetime, have severe metabolic and phenotypic consequences.
  • The disease-causing alleles may not be obviously deleterious to gene function.
  • Many metabolic diseases show significant co-morbidity, raising the possibility that multiple phenotypes might be associated with a single gene.
  • The penetrances of the individual disorders may be influenced by different alleles of the gene or by environmental or genetic background effects, and may differ between or within families segregating mutations in the predisposing gene(s).
  • Lp(a) protein levels are strongly correlated with CHD. Greater than 95% of the variation in Lp(a) protein levels is associated with the gene itself, and is mostly related to the number of Kringle repeats in the gene (DeMeester, et al., 1995).
  • The role of the LDL receptor in lipid metabolism and CHD is another example.
  • The familial hypercholesterolemia (FH) syndrome is a rare syndrome (affecting about 1 in 500 individuals) characterized by very high low-density lipoprotein (LDL)-cholesterol and very early CHD, usually manifest in the 20s or 30s.
  • Another complexity of the dyslipidemias is illustrated by the LPL gene. Heterozygotes for some LPL mutations show higher triglycerides, lower HDL-C, no elevation in LDL-C, and high systolic blood pressure when compared with control individuals (Sprecher, et al., 1996; Deeb, et al., 1996). However, there is significant variation in the extent of these abnormalities when different mutations are compared (Sprecher, et al., 1996).
  • LPL mutations are found in individuals with a more classic familial combined hyperlipidemia (FCH), having high LDL-C as well as high TG and low HDL-C (Yang, et al., 1996), and some with insulin-resistant syndrome X
  • MODY genes (Maturity Onset Diabetes of the Young).
  • The MODY genes account for about 130 of every 10,000 diabetics.
  • Positional cloning and candidate gene mutation screening have identified causal mutations in four transcription factors regulating pancreatic gene expression (HNF-1,
  • The present invention solves the problems referred to above by providing means to diagnose, prevent and treat coronary heart disease and metabolic disorders, including hypoalphalipoproteinemia, familial combined hyperlipidemia, insulin resistant syndrome X or multiple metabolic disorder, obesity, diabetes and dyslipidemic hypertension.
  • This invention provides a human coronary heart disease susceptibility gene (CHD1), some alleles of which are related to susceptibility to coronary heart disease and to metabolic disorders related to lipid metabolism.
  • This invention also relates to germline mutations in the CHD1 gene, and methods and systems for using the germline mutations of the CHD1 gene in the diagnosis of predisposition to metabolic disorders.
  • The present invention also provides the means necessary for production of gene-based therapies directed at coronary heart disease or metabolic disorders.
  • These therapeutic agents may take the form of polynucleotides comprising all or a portion of the CHD1 locus placed in appropriate vectors or delivered to target cells in direct ways such that the function of the CHD1 protein is interfered with or reconstituted.
  • The invention further comprises the use of polypeptides of the invention for the treatment or prevention of CHD.
  • Therapeutic agents may also take the form of polypeptides based on either a portion of or the entire protein sequence of CHD1; such isolated polypeptides as well as pharmaceutical compositions comprising them are also provided by this invention. These may functionally replace the activity of CHD1 in vivo, or interfere with normal CHD1 function.
  • The present invention also provides isolated antibodies (e.g., monoclonal antibodies) that specifically bind to epitopes of an isolated polypeptide encoded by the CHD1 locus.
  • Such methods may further comprise the step of amplifying a portion of the CHD1 locus, and may further include a step of providing a set of polynucleotides that are primers for amplification of said portion of the CHD1 locus.
  • Such methods may also include a step of providing the complete set of short polynucleotides defined by the sequence of CHD1 or discrete subsets of that sequence, all single-base substitutions of that sequence or discrete subsets of that sequence, all 1-, 2-, 3-, or 4-base deletions of that sequence or discrete subsets of that sequence, and all 1-, 2-, 3-, or 4-base insertions in that sequence or discrete subsets of that sequence.
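The variant set described above (single-base substitutions, 1- to 4-base deletions, and 1- to 4-base insertions of a reference sequence) can be enumerated mechanically. The following is a minimal sketch rather than the method of the invention: the function names are hypothetical, the alphabet is assumed to be unambiguous DNA (A, C, G, T), and the example reference is an arbitrary 8-mer, not a CHD1 sequence.

```python
from itertools import product

BASES = "ACGT"  # assumption: unambiguous DNA alphabet


def substitutions(seq: str) -> set[str]:
    """Every sequence that differs from seq at exactly one position."""
    return {
        seq[:i] + base + seq[i + 1:]
        for i, ref in enumerate(seq)
        for base in BASES
        if base != ref
    }


def deletions(seq: str, max_len: int = 4) -> set[str]:
    """Every sequence obtained by removing one run of 1..max_len bases."""
    return {
        seq[:i] + seq[i + n:]
        for n in range(1, max_len + 1)
        for i in range(len(seq) - n + 1)
    }


def insertions(seq: str, max_len: int = 4) -> set[str]:
    """Every sequence obtained by inserting one run of 1..max_len bases."""
    return {
        seq[:i] + "".join(ins) + seq[i:]
        for n in range(1, max_len + 1)
        for ins in product(BASES, repeat=n)
        for i in range(len(seq) + 1)
    }


if __name__ == "__main__":
    reference = "ACGTTGCA"  # hypothetical 8-mer, not a CHD1 sequence
    print(len(substitutions(reference)), "substitution variants")
    print(len(deletions(reference)), "deletion variants")
    print(len(insertions(reference)), "insertion variants")
```

Sets are used because different edits can yield the same sequence (for example, deleting either base of a repeated pair), so duplicates collapse automatically.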
  • This invention also provides methods for using and kits comprising the above-mentioned antibodies to identify mutant forms of CHD1 polypeptides or to detect aberrant levels of expression of CHD1 polypeptides in biological samples. Such methods are useful for identifying mutations for use in either diagnosis of the predisposition to coronary heart disease or the diagnosis or prognosis of metabolic disorders.
  • This invention further provides isolated polynucleotides comprising all or a portion of the CHD1 locus or comprising a mutated CHD1 locus, preferably at least eight bases and not more than about 300 kilobases (kb) in length. Such polynucleotides may also be antisense polynucleotides.
  • The present invention also provides a recombinant construct comprising such an isolated polynucleotide, for example, a recombinant construct suitable for expression of a polypeptide comprising a CHD1 wild-type or mutant polypeptide, or a portion of either, in a transformed host cell.
  • Such methods may further comprise the step of amplifying the portion of the CHD1 locus, and may further include a step of providing a set of polynucleotides that are primers for amplification of said portion of the CHD1 locus.
  • The method is useful for either diagnosis of the predisposition to coronary heart disease or the diagnosis or prognosis of metabolic disorders.
  • Kits for detecting in an analyte a polynucleotide comprising a portion of the CHD1 locus, the kits comprising a polynucleotide complementary to the portion of the CHD1 locus packaged in a suitable container, and instructions for their use.
  • Methods of preparing a polynucleotide comprising polymerizing nucleotides to yield a sequence comprised of at least eight consecutive nucleotides of the CHD1 locus, and methods of preparing a polypeptide comprising polymerizing amino acids to yield a sequence comprising at least five amino acids encoded within the CHD1 locus.
  • drugs e.g. binding assays
  • The CHD1 locus is a gene encoding multiple CHD1 proteins, some of which have been found to have sequence motifs characteristic of the zinc finger category of transcription factors, and the KRAB motif, implicated in protein-protein interactions. This gene is termed CHD1 herein.
  • Mutations in the CHD1 locus in the germline are indicative of a predisposition to coronary heart disease or to metabolic disorders related to lipid metabolism.
  • The mutational events of the CHD1 locus can involve deletions, insertions and point mutations within the coding sequence and the non-coding sequence, as well as within the regulatory sequence.
  • The CHD1 protein is a sequence-specific DNA binding protein that binds and may regulate the expression of genes involved in lipid metabolism or implicated in CHD and metabolic disorders.
  • FIG. 1 Recombinant map of CHD1.
  • The recombinants are shown with solid lines representing the region shared with the haplotype segregating with disease in the family, the arrowhead at the recombinant marker, and the region between the last recombinant marker and the first non-recombinant marker stippled.
  • The width of the lines represents the relative confidence of linkage in the family in which the recombinant is found, with the thicker lines representing more likely linkage to 11q23.
  • The kindred and individual carrying the recombinant chromosome are listed to the right of each
  • FIG. 2 CHD1 region transcript map. A diagram of the CHD1 region, showing the location of BACs and PACs, identified transcripts and the location of CHD1. Genomic DNA is represented by the top line, with the positions of some genetic markers (coded as in Table 3).
  • P254 PAC 254) that form a genomic contig across the region spanning markers 1 to 14 are shown below the genomic DNA. Below these, the candidate genes that were screened for mutations in CHD families are shown in their approximate locations. A set of almost 40 olfactory receptor (Olf-R) genes in the middle of the CHD1 region is also shown. CHD1 is located on PAC 254.
  • FIG. 3 CHD1 alternative transcripts.
  • cDNA1 to cDNA4 indicate four alternative splices between exons A and F that affect the protein coding capacity of CHD1. The five observed 5' alternative splices are also shown; these may occur in any combination with cDNAs 1 to 4. The approximate locations and functions of conserved sequence motifs of CHD1 proteins are also shown.
  • FIG. 4 A human KRAB domain consensus sequence (SEQ ID NO: 199) is listed on the top line, with the most highly conserved amino acids in upper case.
  • The middle line (SEQ ID NO: 200) shows particular amino acids contained in at least 15% of human KRAB domains.
  • The bottom line (SEQ ID NO: 201) gives the sequence of the CHD1 KRAB domain; amino acids conserved with the consensus are in upper case.
  • The arrow indicates the position of a mutation (K872E) found in a DNA sample from an obese diabetic who has low HDL (SEQ ID NO: 202) (see text).
  • FIG. 5A Mobility shift of gene promoter fragments by CHD1.ZnF3-8 protein.
  • Promoter regions amplified by PCR were end labeled with 32P and incubated with purified CG7 GST-fusion protein (zinc fingers 3 through 8). No d(I:C) competitor was used.
  • The probes spanned: -573 to -165 (apolipoprotein AIV), -743 to -366 (apolipoprotein CIII; Kardassis, et al., 1996), -532 to -187 (lipoprotein lipase).
  • The molar protein:probe ratio is indicated above each lane. 100X protein corresponds to approximately 140 nM in the binding reaction.
  • Open arrowheads indicate free probe. Filled arrowheads indicate the principal shifted species.
  • FIG. 5B Mobility shift of apolipoprotein AIV gene promoter subfragments by CHD1.ZnF3-8 protein. The promoter fragment (-573 to -165) shifted by CHD1 GST-fusion protein was trisected by PCR amplification into 3 adjacent, non-overlapping regions (2 are shown: S1, -573 to -447 and S3, -328 to -165). 32P end-labeled products were incubated with purified CHD1 GST-fusion protein (zinc fingers 3 through 8). The molar protein:probe ratio and addition of non-specific competitor are indicated above each lane. 100X protein corresponds to approximately 140 nM in the binding reaction. Open arrowheads indicate free probe. Filled arrowheads indicate the principal shifted species. Note that the weak shift of fragment S1 is fully competed by d(I:C), indicating that it is due to non-specific binding.
  • FIG. 5C Mobility shift of gene promoter fragments by CHD1.ZnF3-8 protein. PCR-amplified promoter regions were end labeled with 32P and incubated with purified CHD1 GST-fusion protein (zinc fingers 3 through 8). Poly d(I:C) competitor was used as indicated. Relative to the start of transcription, the probes spanned: -573 to -165 (apolipoprotein AIV, Apo AIV), -1304 to -968 (lecithin:cholesterol acetyltransferase, LCAT), -324 to +16 (apolipoprotein E, Apo E). The molar protein:probe ratio in the binding reaction was 100X (GST, 250X); protein concentration was approximately 140 nM (GST, 340 nM).
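The legends give protein concentrations together with molar protein:probe ratios, from which the implied probe concentration follows by simple division. The sketch below makes that arithmetic explicit; it assumes that "100X" denotes a 100:1 molar ratio of protein to probe, and the function name is hypothetical. The probe concentration itself is not stated in the text.

```python
def probe_concentration_nm(protein_nm: float, molar_ratio: float) -> float:
    """Probe concentration (nM) implied by a protein concentration (nM)
    and a protein:probe molar ratio."""
    if molar_ratio <= 0:
        raise ValueError("molar_ratio must be positive")
    return protein_nm / molar_ratio


if __name__ == "__main__":
    # Values quoted in the legends: 100X at about 140 nM (fusion protein)
    # and 250X at about 340 nM (GST control).
    for label, conc, ratio in [("CHD1 fusion", 140.0, 100.0), ("GST", 340.0, 250.0)]:
        print(f"{label}: probe ~ {probe_concentration_nm(conc, ratio):.2f} nM")
```

Under that reading, both quoted pairs imply a probe concentration of roughly 1.4 nM, which suggests the same probe input was used for the fusion protein and the GST control.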
  • FIGS. 6A-6C Diagrammatic summary of gel shift assay results for fragments of the apolipoprotein AIV promoter (Figure 6A), the apolipoprotein CIII enhancer (Figure 6B) and the lipoprotein lipase (LPL) promoter (Figure 6C).
  • Fragments marked with "B" bind to CHD1.ZnF3-8 as detected by a probe mobility shift on polyacrylamide gels; those marked "-" were not detectably shifted under the same conditions; and those marked "-/B" bound very weakly.
  • GSA probes indicate promoter fragments tested. Principal GnT region gives the sequence of each fragment with the highest degree of conservation with the GGGGT consensus (see text).
  • The stippled boxes indicate these consensus sequences, and some of the defined protein binding sites in the Apo AIV promoter (Kardassis, et al., 1996).
  • The sequence ttggtGGGGTGGGGGTGGGGGTg in Figure 6A is SEQ ID NO: 203.
  • The sequence GGGTGGGGGCGGGTGGGGGG in Figure 6B is SEQ ID NO: 204.
  • The sequence GGGGGTGGGGATGGGGTGCGGGGT in Figure 6C is SEQ ID NO: 205.
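The three listed sequences can be checked against the GGGGT consensus with a simple exact-match scan. This is an illustrative sketch only: the function name is hypothetical, matching is exact and case-insensitive on the strand as printed, and the sequences are copied from the text above. Exact matching is stricter than the "highest degree of conservation" criterion in the legend, so a fragment containing only near matches such as GGGT returns no hit.

```python
CONSENSUS = "GGGGT"

# Sequences as printed in the legend text (SEQ ID NOs: 203-205).
FRAGMENTS = {
    "SEQ ID NO: 203": "ttggtGGGGTGGGGGTGGGGGTg",
    "SEQ ID NO: 204": "GGGTGGGGGCGGGTGGGGGG",
    "SEQ ID NO: 205": "GGGGGTGGGGATGGGGTGCGGGGT",
}


def motif_positions(seq: str, motif: str = CONSENSUS) -> list[int]:
    """0-based start positions of exact, case-insensitive motif matches."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    for name, seq in FRAGMENTS.items():
        print(name, motif_positions(seq))
```

A tolerant scan (for example, allowing one mismatch or a variable-length G run) would be closer to the "GnT" description, but the exact scan keeps the example unambiguous.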
  • FIG. 7 The regulatory regions of the ApoAIV gene, the ApoCIII enhancer, and the ApoE gene, and fragments that bind CHD1.ZnF3-8. Diagrams of promoter fragments from these genes, adapted from Kardassis, et al., 1996, showing regions that bind proteins from nuclear extracts and which are important for regulation of the respective genes. The ovals indicate transcription factors that bind to particular motifs; UF, unknown factor; LDNF, ligand-dependent nuclear factor (e.g., HNF-4). Below each promoter diagram are shown the following: Promoter fragment that binds CHD1.ZnF3-8 (solid line); CHD1 consensus binding sequence block
  • This invention relates to wild-type and mutant CHD1 polypeptides and DNA sequences encoding them, antibodies directed against those polypeptides, compositions comprising the polypeptides, DNA sequences or antibodies, methods for identifying additional CHD1 mutant polypeptides and antibodies, and methods for the detection, treatment and prevention of human coronary heart disease and related metabolic disorders related to lipid metabolism, including hypoalphalipoproteinemia, familial combined hyperlipidemia, insulin resistant syndrome X or multiple metabolic disorder, obesity, diabetes and dyslipidemic hypertension.
  • "Metabolic disorders" refers to one or more conditions afflicting a human patient, either present individually or in combination, associated with a susceptibility to CHD.
  • The term includes any dyslipidemia wherein the serum level of lipid is in the bottom 10% or top 90% of the population, based on age and sex corrected population values reported by the LRC.
  • An individual can be classified as dyslipidemic if any of the following values fall within the above defined ranges: total serum cholesterol, LDL-cholesterol, VLDL-cholesterol, HDL-cholesterol or triglycerides.
  • "Metabolic disorders" also includes other syndromes that can accompany alterations in serum lipid levels. These syndromes include: insulin-dependent diabetes mellitus (IDDM), non-insulin-dependent diabetes mellitus (NIDDM), hyperthyroidism, hypothyroidism, dyslipidemic hypertension, obesity, insulin resistance or multiple metabolic syndrome (or insulin resistant syndrome X). These conditions may be present in a particular individual or family independently or in any combination.
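The dyslipidemia criterion given above (any one of five serum lipid values falling in the extreme ranges of the population distribution) amounts to a per-analyte range check. The sketch below is one reading of it, not a clinical rule: it interprets the ranges as "below the 10th percentile or above the 90th percentile", the function names are hypothetical, and the cutoff numbers are invented placeholders. The age- and sex-corrected LRC values the text refers to are not reproduced here.

```python
# HYPOTHETICAL cutoffs in mg/dL as (10th percentile, 90th percentile).
# These numbers are placeholders for illustration only; real use would
# substitute age- and sex-corrected population values.
HYPOTHETICAL_CUTOFFS = {
    "total_cholesterol": (150.0, 250.0),
    "ldl_cholesterol": (80.0, 190.0),
    "vldl_cholesterol": (5.0, 40.0),
    "hdl_cholesterol": (30.0, 70.0),
    "triglycerides": (50.0, 250.0),
}


def out_of_range(values: dict[str, float],
                 cutoffs: dict[str, tuple[float, float]]) -> list[str]:
    """Names of lipid measures below the low cutoff or above the high cutoff."""
    flagged = []
    for name, value in values.items():
        low, high = cutoffs[name]
        if value < low or value > high:
            flagged.append(name)
    return flagged


def is_dyslipidemic(values: dict[str, float],
                    cutoffs: dict[str, tuple[float, float]]) -> bool:
    """True if any supplied measure falls outside its cutoff range."""
    return bool(out_of_range(values, cutoffs))


if __name__ == "__main__":
    sample = {"total_cholesterol": 200.0, "hdl_cholesterol": 25.0}
    print(out_of_range(sample, HYPOTHETICAL_CUTOFFS))
```

Returning the list of flagged measures, rather than only a boolean, keeps visible which analyte triggered the classification.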
  • "Amplification of polynucleotides" refers to methods such as the polymerase chain reaction (PCR), ligation amplification (or ligase chain reaction, LCR) and amplification methods based on the use of Q-beta replicase for the purpose of amplifying polynucleotides. Also useful for this purpose are, without limitation, strand displacement amplification (SDA) and nucleic acid sequence based amplification (NASBA). These methods are well known and widely practiced in the art. See, e.g., U.S. Patents 4,683,195 and 4,683,202 and Innis et al.
  • Primers useful for amplifying sequences from the CHD1 region are preferably complementary to, and hybridize specifically to, sequences in the CHD1 region or in regions that flank a target region therein.
  • CHD1 sequences generated by amplification may be sequenced directly.
  • The amplified sequence(s) may be cloned prior to sequence analysis.
  • One method for the direct cloning and sequence analysis of enzymatically amplified genomic segments has been described by Scharf, 1986.
  • "To encode" refers to the following: a polynucleotide is said to "encode" a polypeptide if, in its native state or when manipulated by methods well known to those skilled in the art, it can be transcribed and/or translated to produce the mRNA for and/or the polypeptide or a fragment thereof.
  • The anti-sense strand is the complement of such a nucleic acid, and the encoding sequence can be deduced therefrom.
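Because the anti-sense strand is the complement of the encoding strand, either strand can be derived from the other by reverse complementation. The following minimal sketch illustrates this; it assumes an unambiguous DNA alphabet written 5' to 3', and the function name and example sequence are hypothetical.

```python
# Translation table for Watson-Crick complements; case is preserved.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    sense = "ATGGCCAAG"  # hypothetical sequence, not taken from CHD1
    antisense = reverse_complement(sense)
    # Applying the operation twice recovers the encoding strand.
    print(antisense, reverse_complement(antisense) == sense)
```

The same operation gives the sequence a primer would need in order to hybridize to a chosen stretch of the opposite strand.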
  • An "isolated" or "substantially pure" nucleic acid or polynucleotide is one that is substantially separated from other cellular components that naturally accompany a native human sequence or protein, e.g., ribosomes, polymerases, many other human genome sequences and proteins.
  • The term embraces a nucleic acid sequence that has been removed from its naturally occurring environment, and includes recombinant or cloned DNA isolates and chemically synthesized analogs or analogs that are biologically synthesized by heterologous systems.
  • "CHD1 alleles" refers to normal alleles (also referred to as wild-type alleles) of the CHD1 locus as well as alleles carrying variations that predispose individuals to develop coronary heart disease or metabolic disorders. Such predisposing alleles are also called "CHD1 susceptibility alleles" or "CHD1 mutant alleles".
  • "CHD1 locus", "CHD1 gene", "CHD1 nucleic acids" or "CHD1 polynucleotide" each refer to polynucleotides, which are in the CHD1 region.
  • Some of these DNAs are likely to direct the expression, in normal or abnormal tissues, of CHD1 wild-type and mutant alleles, where said mutant alleles predispose an individual to develop coronary heart disease or metabolic disorders.
  • The locus is indicated in part by mutations that predispose individuals to develop coronary heart disease or metabolic disorders. These mutations fall within the CHD1 region described infra.
  • The CHD1 locus is intended to include CHD1 coding sequences, intervening sequences and regulatory elements controlling transcription and/or translation.
  • The CHD1 locus is intended to include all allelic variations of the DNA sequence.
  • "CHD1 nucleic acids" or "CHD1 polynucleotides" is also extended to refer to nucleic acids that encode a CHD1 polypeptide, CHD1 polypeptide fragment, homologs and variants of CHD1, protein fusions and deletions of any of the above. These nucleic acids comprise a sequence which is either derived from, or substantially similar to, a natural CHD1-encoding gene or one having substantial homology with a natural CHD1-encoding gene or a portion thereof.
  • The polynucleotide compositions of this invention include RNA, cDNA, genomic DNA, synthetic forms, and mixed polymers, both sense and antisense strands, and may be chemically or biochemically modified or may contain non-natural or derivatized nucleotide bases, as will be readily appreciated by those skilled in the art.
  • Such modifications include, for example, labels, methylation, substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide modifications such as uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, etc.), charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), pendent moieties (e.g., polypeptides), intercalators (e.g., acridine, psoralen, etc.), chelators, alkylators, and modified linkages (e.g., alpha anomeric nucleic acids, etc.). Also included are synthetic molecules that mimic polynucleotides in their ability to bind to a designated sequence via hydrogen bonding and other chemical interactions.
  • "CHD1 region" refers to a portion of human chromosome 11 bounded by the markers D11S924 to D11S912. This region contains the CHD1 locus, including the CHD1 gene.
  • The terms "CHD1 locus", "CHD1 allele" and "CHD1 region" all refer to the double-stranded DNA comprising the locus, allele, or region, as well as either of the single-stranded DNAs comprising the locus, allele or region.
  • A "portion" of the CHD1 locus or region or allele is defined as having a minimal size of at least about eight nucleotides, or preferably about 15 nucleotides, or more preferably at least about 25 nucleotides, and may have a minimal size of at least about 40 nucleotides. This definition includes all sizes in the range of 8-40 nucleotides as well as greater than 40 nucleotides.
  • "Regulatory sequences" refers to those sequences which affect the expression of the gene; they are normally within 100 kilobases (kb) of the coding region of a locus, but may also be more distant from the coding region. Such regulation of expression comprises transcription of the gene, and translation, splicing, and stability of the messenger RNA.
  • "Operably linked" refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. For instance, a promoter is operably linked to a coding sequence if the promoter affects its transcription or expression.
  • "Operably linked" may refer to functional linkage between a nucleic acid expression control sequence (e.g., a promoter, enhancer, or array of transcription factor binding sites) and a second nucleic acid sequence, wherein the expression control sequence directs transcription of the nucleic acid corresponding to the second sequence.
  • "Vector" or "recombinant DNA cloning vehicle" refers to a specifically designed nucleic acid, polynucleotide or DNA molecule capable of autonomous existence and replication in an appropriate host cell.
  • The vector comprises a member selected from the group comprising a plasmid, a bacteriophage, and an artificial chromosome construct.
  • This vehicle or vector may "carry" inserted DNA; said inserted DNA may comprise a CHD1 polynucleotide or nucleic acid.
  • The vector may allow expression of one or more genes carried on the inserted DNA in an appropriate host cell.
  • The expressed gene product may be a polypeptide or RNA.
  • The vector may allow expression of the antisense RNA of a gene.
  • The vector may allow for production of antisense RNA or DNA of the inserted gene or genes in a cell-free system by methods well known in the art (Maniatis et al., 1982; Sambrook et al., 1989; Ausubel et al., 1992).
  • The vector may exist in a host cell or in substantially pure form.
  • The vector may exist in a host cell as an autonomous replicating unit or an autonomous unit, or alternatively it may integrate into the genome of the host cell.
  • The "vector" may be a recombinant DNA or polynucleotide molecule comprising all or part of the CHD1 region.
  • The recombinant construct may be capable of replicating autonomously in a host cell. Alternatively, the recombinant construct may become integrated into the chromosomal DNA of the host cell.
  • Such a recombinant polynucleotide comprises a polynucleotide of genomic DNA, cDNA, semi-synthetic, or synthetic origin which, by virtue of its origin or manipulation, 1) is not associated with all or a portion of a polynucleotide with which it is associated in nature; 2) is linked to a polynucleotide other than that to which it is linked in nature; or 3) does not occur in nature. Therefore, recombinant polynucleotides comprising sequences otherwise not naturally occurring are provided by this invention. Although the wild-type sequence may be employed, it will often be altered, e.g., by deletion, substitution or insertion.
  • Genomic DNA or cDNA libraries of various types may be screened as natural sources of the nucleic acids of the present invention, or such nucleic acids may be provided by amplification of sequences resident in genomic DNA or other natural sources, e.g., by PCR.
  • the choice of cDNA libraries normally corresponds to a tissue source which is abundant in mRNA for the desired proteins. Phage libraries are normally preferred, but other types of libraries may be used. Clones of a library are spread onto plates, transferred to a substrate for screening, denatured and probed for the presence of desired sequences.
  • the DNA sequences used in this invention will usually comprise at least about five codons (15 nucleotides), more usually at least about 7-15 codons, and most preferably, at least about 35 codons. One or more introns may also be present. This number of nucleotides is usually about the minimal length required for a successful probe that would hybridize specifically with a CHDl-encoding sequence.
  • nucleic acid manipulation is described generally, for example, in Sambrook et al., 1989 or Ausubel et al., 1992.
  • Reagents useful in applying such techniques, such as restriction enzymes and the like, are widely known in the art and commercially available from such vendors as New England BioLabs, Boehringer Mannheim, Amersham, Promega Biotech, U.S. Biochemicals, New England Nuclear, and a number of other sources.
  • the recombinant nucleic acid sequences used to produce fusion proteins of the present invention may be derived from natural or synthetic sequences. Many natural gene sequences are obtainable from various cDNA or from genomic DNA libraries using appropriate probes. See, GenBank, National Institutes of Health.
  • nucleic acid refers to a nucleic acid molecule which is not naturally occurring, or which is made by the artificial combination of two otherwise separated segments of sequence. This artificial combination is often accomplished by either chemical synthesis means, or by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques. Such is usually done to replace a codon with a redundant codon encoding the same or a conservative amino acid, while typically introducing or removing a sequence recognition site, for example, for a restriction endonuclease. Alternatively, it is performed to join together nucleic acid segments of desired functions to generate a desired combination of functions.
  • probes refer to polynucleotide probes for the purpose of detecting polynucleotide polymorphisms associated with CHDl alleles which predispose to coronary heart disease or metabolic disorders, or are associated with coronary heart disease or metabolic disorders by hybridization.
  • Each probe is designed to form a stable hybrid with that of the target sequence, under highly stringent to moderately stringent hybridization and wash conditions. If it is expected that a probe will be perfectly complementary to the target sequence, high stringency conditions will be used. Hybridization stringency may be lessened if some mismatching is expected, for example, if variants are expected with the result that the probe will not be completely complementary.
  • Probes for CHDl alleles may be derived from the sequences of the CHDl region or its cDNAs.
  • the probes may be of any suitable length, which span all or a portion of the CHDl region, and which allow specific hybridization to the CHDl region. If the target sequence contains a sequence identical to that of the probe, the probes may be short, e.g., in the range of about 8-30 base pairs, since the hybrid will be relatively stable under even highly stringent conditions. If some degree of mismatch is expected with the probe, i.e., if it is suspected that the probe will hybridize to a variant region, a longer probe may be employed which hybridizes to the target sequence with the requisite specificity.
  • the probes will include an isolated polynucleotide attached to a label or reporter molecule and may be used to isolate other polynucleotide sequences having sequence similarity by standard methods.
  • techniques for preparing and labeling probes see, e.g., Sambrook et al., 1989 or Ausubel et al., 1992.
  • Other similar polynucleotides may be selected by using homologous polynucleotides.
  • polynucleotides encoding these or similar polypeptides may be synthesized or selected by use of the redundancy in the genetic code.
  • codon substitutions may be introduced, e.g., by silent changes (thereby producing various restriction enzyme recognition sites) or to optimize expression for a particular system. Mutations may be introduced to modify the properties of the polypeptide, perhaps to change ligand-binding affinities, interchain or intermolecular affinities, or the polypeptide degradation or turnover rate.
  • Probes comprising synthetic oligonucleotides or other polynucleotides of the present invention may be derived from naturally occurring or recombinant single- or double-stranded polynucleotides, or be chemically synthesized. Probes may also be labeled by nick translation, Klenow fill-in reaction, or other methods known in the art.
  • Portions of the polynucleotide sequence having at least about eight nucleotides, usually at least about 15 nucleotides, and fewer than about 6 kilobases (kb), usually fewer than about 1.0 kb, from a polynucleotide sequence encoding CHDl are preferred as probes.
  • This definition therefore includes probes of sizes 8 nucleotides through 6000 nucleotides. The probes may also be used to determine whether mRNA encoding CHDl is present in a cell or tissue.
  • nucleic acid or fragment thereof indicates that when optimally aligned (with appropriate nucleotide insertions or deletions) with the other nucleic acid (or its complementary strand), there is nucleotide sequence identity in at least about 60% of the nucleotide bases, usually at least about 70%, more usually at least about 80%, preferably at least about 90%, and more preferably at least about 95-98% of the nucleotide bases.
  • substantial homology or similarity exists when a nucleic acid or fragment thereof hybridizes to another nucleic acid (or a complementary strand thereof) under selective hybridization conditions, to a strand, or to its complement.
  • Selectivity of hybridization exists when hybridization which is substantially more selective than total lack of specificity occurs.
  • selective hybridization will occur when there is at least about 55% homology over a stretch of at least about 14 nucleotides, preferably at least about 65%, more preferably at least about 75%, and most preferably at least about 90%. See, Kanehisa, 1984.
  • the length of homology comparison may be over longer stretches, and in certain embodiments will often be over a stretch of at least about nine nucleotides, usually at least about 20 nucleotides, more usually at least about 24 nucleotides, typically at least about 28 nucleotides, more typically at least about 32 nucleotides, and preferably at least about 36 or more nucleotides.
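The identity thresholds above can be illustrated with a short calculation. The following is a minimal sketch, not part of the specification: it assumes the two sequences have already been aligned (gaps written as "-"), counts a gap opposite a base as a mismatch, and uses the hypothetical function names `percent_identity` and `meets_selective_threshold`. The defaults of 55% identity and 14 nucleotides are taken from the lowest threshold stated in the text.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same base.

    Assumption: inputs are already aligned and of equal length; a gap ("-")
    opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns that are gaps in both sequences; they carry no information.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_selective_threshold(aligned_a: str, aligned_b: str,
                              min_identity: float = 55.0,
                              min_length: int = 14) -> bool:
    """True if the aligned stretch is long enough and identical enough."""
    if len(aligned_a) < min_length:
        return False
    return percent_identity(aligned_a, aligned_b) >= min_identity
```

Raising `min_identity` to 65, 75 or 90 reproduces the progressively preferred thresholds; real hybridization behaviour also depends on the conditions discussed in the following paragraphs.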
  • Nucleic acid hybridization will be affected by such conditions as salt concentration, temperature, or organic solvents, in addition to the base composition, length of the complementary strands, and the number of nucleotide base mismatches between the hybridizing nucleic acids, as will be readily appreciated by those skilled in the art.
  • Stringent temperature conditions will generally include temperatures in excess of 30°C, typically in excess of 37°C, and preferably in excess of 45°C.
  • Stringent salt conditions will ordinarily be less than 1000 mM, typically less than 500 mM, and preferably less than 200 mM. However, the combination of parameters is much more important than the measure of any single parameter. See, e.g., Wetmur and Davidson, 1968.
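The temperature and salt ranges above can be written as a simple lookup. This is only an illustrative sketch: the tier labels ("general", "typical", "preferred") and the function names are hypothetical, and, as the text itself notes, the combination of parameters matters more than any single value, so the two functions should not be read as a complete stringency test.

```python
# Thresholds copied from the text; labels are illustrative assumptions.
TEMP_TIERS_C = [(45.0, "preferred"), (37.0, "typical"), (30.0, "general")]
SALT_TIERS_MM = [(200.0, "preferred"), (500.0, "typical"), (1000.0, "general")]


def temperature_tier(temp_c: float):
    """Return the strictest tier whose temperature threshold is exceeded."""
    for threshold, label in TEMP_TIERS_C:
        if temp_c > threshold:
            return label
    return None  # not in excess of 30 degrees C


def salt_tier(salt_mm: float):
    """Return the strictest tier whose salt ceiling is not reached."""
    for threshold, label in SALT_TIERS_MM:
        if salt_mm < threshold:
            return label
    return None  # 1000 mM or more
```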
  • Probe sequences may also hybridize specifically to duplex DNA under certain conditions to form triplex or other higher order DNA complexes. The preparation of such probes and suitable hybridization conditions are well known in the art.
  • the term "target region" refers to a region of the nucleic acid which is amplified and/or detected.
  • target sequence refers to a sequence with which a probe or primer will form a stable hybrid under desired conditions.
  • analyte polynucleotide refers to a single- or double-stranded polynucleotide which is suspected of containing a target sequence, and which may be present in a variety of types of samples, including biological samples.
  • CHDl protein or "CHDl polypeptide" refers to a protein or polypeptide encoded by the CHDl locus, variants or fragments thereof.
  • polypeptide refers to a polymer of amino acids and its equivalent and does not refer to a specific length of the product; thus, peptides, oligopeptides and proteins are included within the definition of a polypeptide. This term also does not refer to, or exclude modifications of the polypeptide, for example, glycosylations, acetylations, phosphorylations, and the like.
  • polypeptides containing one or more analogs of an amino acid including, for example, unnatural amino acids, etc.
  • polypeptides with substituted linkages as well as other modifications known in the art, both naturally and non-naturally occurring.
  • polypeptides will be at least about 50% homologous to the native CHDl sequence, preferably in excess of about 90%, and more preferably at least about 95% homologous.
  • proteins encoded by DNA which hybridize under high or low stringency conditions, to CHDl-encoding nucleic acids and closely related polypeptides or proteins retrieved by antisera to the CHDl protein(s).
  • the length of polypeptide sequences compared for homology will generally be at least about 16 amino acids, usually at least about 20 residues, more usually at least about 24 residues, typically at least about 28 residues, and preferably more than about 35 residues.
  • protein modifications or fragments refers to CHDl polypeptides or fragments thereof that are substantially homologous to primary structural sequence but which include, e.g., in vivo or in vitro chemical and biochemical modifications or which incorporate unusual amino acids. Such modifications include, for example, acetylation, carboxylation, phosphorylation, glycosylation, ubiquitination, labeling, e.g., with radionuclides, and various enzymatic modifications, as will be readily appreciated by those well skilled in the art.
  • a variety of methods for labeling polypeptides and of substituents or labels useful for such purposes are well known in the art, and include radioactive isotopes such as ³²P, ligands which bind to labeled antiligands (e.g., antibodies), fluorophores, chemiluminescent agents, enzymes, and antiligands which can serve as specific binding pair members for a labeled ligand.
  • the choice of label depends on the sensitivity required, ease of conjugation with the primer, stability requirements, and available instrumentation.
  • Methods of labeling polypeptides are well known in the art. See Sambrook et al., 1989 or
  • the present invention provides for biologically active fragments of the polypeptides.
  • Significant biological activities include ligand-binding, immunological activity and other biological activities characteristic of CHDl polypeptides.
  • Immunological activities include both immunogenic function in a target immune system, as well as sharing of immunological epitopes for binding, serving as either a competitor or substitute antigen for an epitope of the CHDl protein.
  • epitope refers to an antigenic determinant of a polypeptide.
  • An epitope could comprise three amino acids in a spatial conformation which is unique to the epitope. Generally, an epitope consists of at least five such amino acids, and more usually consists of at least 8-10 such amino acids. Methods of determining the spatial conformation of such amino acids are known in the art.
  • tandem-repeat polypeptide segments may be used as immunogens, thereby producing highly antigenic proteins.
  • polypeptides will serve as highly efficient competitors for specific binding. Production of antibodies specific for CHDl polypeptides or fragments thereof is described below.
  • fusion protein refers to fusion polypeptides comprising CHDl polypeptides and fragments.
  • Homologous polypeptide fusions may be between two or more CHDl polypeptide sequences or between the sequences of CHDl and a related protein.
  • Heterologous fusions may be constructed which would exhibit a combination of properties or activities of the derivative polypeptides. For example, ligand-binding or other domains may be "swapped" between CHDl and other polypeptides or polypeptide fragments.
  • a fusion protein may have the DNA binding domain of CHDl and the transcription activation domain of another protein (for example, the transcriptional activation domain of the yeast GAL4 protein (Ma and Ptashne, 1987)).
  • Such homologous or heterologous fusion polypeptides may display altered strength or specificity of binding.
  • a heterologous polypeptide or polypeptide fragment may confer a new activity on CHDl.
  • a fusion between CHDl or a portion of CHDl to the Schistosoma japonicum glutathione-S-transferase (GST) may be made. This fusion protein can bind to glutathione sepharose or agarose beads, whereas CHDl cannot.
  • Fusion partners include, inter alia, GST, immunoglobulins, bacterial beta-galactosidase, trpE, protein A, β-lactamase, amylase, maltose binding protein, alcohol dehydrogenase, polyhistidine (for example, six histidine at the amino and/or carboxyl terminus of the polypeptide), green fluorescent protein, yeast α mating factor, GAL4 transcription activation or DNA binding domain, and luciferase. See Godowski et al., 1988. Fusion proteins will typically be made by either recombinant nucleic acid methods, as described above, or may be chemically synthesized.
  • polypeptides are described, for example, in Merrifield, 1963.
  • protein purification refers to various methods for the isolation of the CHDl polypeptides or fusion polypeptides comprising CHDl polypeptides from other biological material, such as from cells transformed with recombinant nucleic acids encoding CHDl, and are well known in the art.
  • polypeptides may be purified by immunoaffinity chromatography employing, for instance, the antibodies provided by the present invention.
  • Various methods of protein purification are well known in the art, and include those described in Deutscher, 1990 and Scopes, 1982.
  • isolated is substantially pure when at least about 60 to 75% of a sample exhibits a single polypeptide sequence.
  • a substantially pure protein will typically comprise about 60 to 90% W/W of a protein sample, more usually about 95%, and preferably will be over 99% pure.
  • Protein purity or homogeneity may be indicated by a number of means well known in the art, such as polyacrylamide gel electrophoresis of a protein sample, followed by visualizing a single polypeptide band upon staining the gel with a stain well known in the art. For certain purposes, higher resolution may be provided by using HPLC or other means well known in the art for purification.
  • a CHDl protein is substantially free of naturally associated components when it is separated from the native contaminants that accompany it in its natural state.
  • a polypeptide that is chemically synthesized or synthesized in a cellular system different from the cell from which it naturally originates will be substantially free from its naturally associated components.
  • a protein may also be rendered substantially free of naturally associated components by isolation, using protein purification techniques well known in the art.
  • a polypeptide produced as an expression product of an isolated and manipulated genetic sequence is an "isolated polypeptide," as used herein, even if expressed in a homologous cell type. Synthetically made forms or molecules expressed by heterologous cells are inherently isolated molecules.
  • substantially homology when referring to polypeptides, indicate that the polypeptide or protein in question exhibits at least about 30% identity with an entire naturally-occurring protein or a portion thereof, usually at least about 70% identity, and preferably at least about 95% identity.
  • substantially similar function refers to the function of a modified nucleic acid or a modified polypeptide (or protein) with reference to the wild-type CHDl nucleic acid or wild-type CHDl polypeptide.
  • the modified polypeptide will be substantially homologous to the wild-type CHDl polypeptide and will have substantially the same function.
  • the modified polypeptide may have an altered amino acid sequence and/or may contain modified amino acids.
  • the modified polypeptide may have other useful properties, such as a longer half-life.
  • the similarity of function (activity) of the modified polypeptide may be substantially the same as the activity of the wild-type CHDl polypeptide.
  • the similarity of function (activity) of the modified polypeptide may be higher than the activity of the wild-type CHDl polypeptide.
  • the modified polypeptide is synthesized using conventional techniques, or is encoded by a modified nucleic acid and produced using conventional techniques.
  • the modified nucleic acid is prepared by conventional techniques.
  • a nucleic acid with a function substantially similar to the wild-type CHDl gene function produces the modified protein described above.
  • homology for polypeptides is typically measured using sequence analysis software. See, e.g., the Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 910 University Avenue, Madison, Wisconsin 53705. Protein analysis software matches similar sequences using measures of homology assigned to various substitutions, deletions and other modifications. Conservative substitutions typically include substitutions within the following groups: glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid; asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine.
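The conservative substitution groups listed above translate directly into a lookup. The sketch below is illustrative only; it uses the standard one-letter amino acid codes, and the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical. It does not reproduce the scoring used by any particular sequence analysis package.

```python
# Groups from the text, in one-letter codes:
# Gly/Ala; Val/Ile/Leu; Asp/Glu; Asn/Gln; Ser/Thr; Lys/Arg; Phe/Tyr.
CONSERVATIVE_GROUPS = [
    set("GA"), set("VIL"), set("DE"), set("NQ"),
    set("ST"), set("KR"), set("FY"),
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` stays within one group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```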
  • polypeptide fragment refers to a stretch of amino acid residues of at least about five to seven contiguous amino acids, often at least about seven to nine contiguous amino acids, typically at least about nine to 13 contiguous amino acids and, most preferably, at least about 20 to 30 or more contiguous amino acids.
  • polypeptides of the present invention may be coupled to a solid-phase support, e.g., nitrocellulose, nylon, column packing materials (e.g., Sepharose beads), magnetic beads, glass wool, plastic, metal, polymer gels, cells, or other substrates.
  • Such supports may take the form, for example, of beads, wells, dipsticks, or membranes.
  • antibodies refers to polyclonal and/or monoclonal antibodies and fragments thereof, and immunologic binding equivalents thereof, which are capable of specifically binding to the CHDl polypeptides and fragments thereof or to polynucleotide sequences from the CHDl region, particularly from the CHDl locus or a portion thereof.
  • antibodies is used both to refer to a homogeneous molecular entity, or a mixture such as a serum product made up of a plurality of different molecular entities.
  • Polypeptides may be prepared synthetically in a peptide synthesizer or as fusion proteins as described above and coupled to a carrier molecule (e.g., keyhole limpet hemocyanin) and injected over several months into rabbits, mice, goats, etc. Sera are tested for immunoreactivity to the CHDl polypeptide or fragment.
  • Monoclonal antibodies may be made by injecting mice with the protein polypeptides, fusion proteins or fragments thereof. Monoclonal antibodies are screened by ELISA and tested for specific immunoreactivity with CHDl polypeptide or fragments thereof. See, Harlow and Lane, 1988. These antibodies will be useful in assays and as pharmaceuticals.
  • antibodies specific for binding may be either polyclonal or monoclonal, and may be produced by in vitro or in vivo techniques well known in the art.
  • an appropriate target immune system typically mouse or rabbit
  • Substantially purified antigen is presented to the immune system in a fashion determined by methods appropriate for the animal and by other parameters well known to immunologists.
  • the injections are performed in footpads, intramuscularly, intraperitoneally, or intradermally. Of course, other species may be substituted for mouse or rabbit.
  • Polyclonal antibodies are then purified using techniques known in the art, adjusted for the desired specificity.
  • An immunological response is usually assayed with an immunoassay.
  • immunoassays involve some purification of a source of antigen, for example, that produced by the same cells and in the same fashion as the antigen.
  • a variety of immunoassay methods are well known in the art. See, e.g., Harlow and Lane, 1988, or Goding, 1986.
  • Monoclonal antibodies with affinities of 10⁻⁸ M⁻¹ or preferably 10⁻⁹ to 10⁻¹⁰ M⁻¹ or stronger are typically made by standard procedures as described, e.g., in Harlow and Lane, 1988 or Goding, 1986. Briefly, appropriate animals are selected and the desired immunization protocol followed. After the appropriate period of time, the spleens of such animals are excised and individual spleen cells fused, typically, to immortalized myeloma cells under appropriate selection conditions. Thereafter, the cells are clonally separated and the supernatants of each clone tested for their production of an appropriate antibody specific for the desired region of the antigen.
  • recombinant immunoglobulins may be produced (see U.S. Patent 4,816,567).
  • epitope refers to a region of a polypeptide that provokes a response by an antibody. This region need not comprise consecutive amino acids.
  • epitope is also known in the art as "antigenic determinant".
  • a biological sample refers to a sample of tissue or fluid suspected of containing an analyte polynucleotide or polypeptide from an individual including, but not limited to, e.g., plasma, serum, spinal fluid, lymph fluid, the external sections of the skin, respiratory, intestinal, and genitourinary tracts, tears, saliva, blood cells, tumors, organs, tissue and samples of in vitro cell culture constituents.
  • diagnosis and "prognosing," as used in the context of coronary heart disease or metabolic disorders, are used to indicate
  • the practice of the present invention employs, unless otherwise indicated, conventional techniques of chemistry, molecular biology, microbiology, recombinant DNA, genetics, and immunology. See, e.g., Maniatis et al., 1982; Sambrook et al., 1989; Ausubel et al., 1992; Glover, 1985; Anand, 1992; Guthrie and Fink, 1991.
  • CHDl a genetic locus which causes susceptibility to coronary heart disease and metabolic disorders
  • the region containing the CHDl locus was identified using a variety of genetic techniques. Genetic mapping techniques initially defined the CHDl region in terms of recombination with genetic markers. Based upon studies of large extended families ("kindreds") with multiple cases of coronary heart disease and metabolic disorders, a chromosomal region has been pinpointed that contains the CHDl gene as well as putative susceptibility alleles in the CHDl locus. Two meiotic breakpoints have been discovered on the distal side of the CHDl locus which are expressed as recombinants between genetic markers and the disease locus, and two recombinants on the proximal side of the CHDl locus. Thus, a region which contains the CHDl locus is physically bounded by these markers. Figure 1 shows the order of these markers.
  • markers are essential for linking a disease to a region of a chromosome.
  • markers include restriction fragment length polymorphisms (RFLPs) (Botstein et al., 1980), markers with a variable number of tandem repeats (VNTRs) (Jeffreys et al., 1985; Nakamura et al., 1987), and an abundant class of DNA polymorphisms based on short tandem repeats (STRs), especially repeats of CpA (Weber and May, 1989; Litt et al., 1989).
  • RFLPs restriction fragment length polymorphisms
  • VNTRs variable number of tandem repeats
  • STRs short tandem repeats
  • Genetic markers useful in searching for a genetic locus associated with a disease can be selected on an ad hoc basis, by densely covering a specific chromosome, or by detailed analysis of a specific region of a chromosome.
  • a preferred method for selecting genetic markers linked with a disease involves evaluating the degree of informativeness of kindreds to determine the ideal distance between genetic markers of a given degree of polymorphism, then selecting markers from known genetic maps, which are ideally spaced for maximal efficiency. Informativeness of kindreds is measured by the probability that the markers will be heterozygous in unrelated individuals.
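The probability that a marker is heterozygous in an unrelated individual is commonly estimated from its allele frequencies as H = 1 − Σ pᵢ². The sketch below uses that standard population-genetics formula; it is an illustration rather than the procedure of the specification, it assumes random mating (Hardy-Weinberg proportions), and the function name `expected_heterozygosity` is hypothetical.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity H = 1 - sum(p_i ** 2) for one marker.

    Assumption: allele frequencies sum to 1 and genotypes are in
    Hardy-Weinberg proportions.
    """
    freqs = list(allele_freqs)
    if any(p < 0 for p in freqs):
        raise ValueError("allele frequencies must be non-negative")
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)
```

Under this measure a marker with many alleles of similar frequency, as is typical of short tandem repeats, scores higher than a two-allele marker.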
  • STR markers which are detected by amplification of the target nucleic acid sequence using PCR; such markers are highly informative, easy to assay (Weber and May, 1989), and can be assayed simultaneously using multiplexing strategies (Skolnick and Wallace, 1988), greatly reducing the number of experiments required. This linkage analysis is described in Example 2.
  • markers that flank the disease locus i.e., one or more markers proximal to the disease locus, and one or more markers distal to the disease locus.
  • candidate markers can be selected from a known genetic map.
  • new markers can be identified by the STR technique, as shown in Example 2.
  • Genetic mapping is usually an iterative process. In the present invention, it began by defining flanking genetic markers around the CHDl locus, then replacing these flanking markers with other markers that were successively closer to the CHDl locus. As an initial step, recombination events, defined by large extended kindreds, helped specifically to localize the CHDl locus as either distal or proximal to regionally localized specific genetic markers.
  • one useful gene finding strategy is to generate an almost complete genomic sequence of that interval. Random genomic clone sublibraries can be prepared from each BAC or PAC clone in the minimum tiling path. Individual sublibrary clones sufficient in number to generate an, on average, 6x redundant sequence of each BAC or PAC can then be end-sequenced with vector primers. These sequences can be assembled into sequence contigs, and these contigs placed in a local genomic sequence database.
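The number of sequencing reads needed for a given average redundancy follows from coverage = (reads × read length) / clone length. The sketch below illustrates that arithmetic for the 6x figure in the text; the clone size and read length in the usage lines are purely illustrative assumptions, not values from the specification, and `reads_for_coverage` is a hypothetical name.

```python
import math


def reads_for_coverage(clone_length_bp: int, read_length_bp: int,
                       coverage: float = 6.0) -> int:
    """Smallest number of reads giving at least `coverage`-fold redundancy."""
    if clone_length_bp <= 0 or read_length_bp <= 0 or coverage <= 0:
        raise ValueError("lengths and coverage must be positive")
    return math.ceil(coverage * clone_length_bp / read_length_bp)


# Illustrative values only: a 150 kb clone sequenced with 500-base reads.
example_reads = reads_for_coverage(150_000, 500)
```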
  • Genomic sequencing is described in Example 4.
  • cDNA clones cognate to the minimum tiling path BACs and PACs.
  • One preferred cDNA cloning strategy is hybrid selection.
  • cDNA can be prepared from a number of human tissues and human cell lines in such a manner that the cDNA molecules have PCR primer binding sites (anchors) at each end. This cDNA can be affinity captured with the minimum tiling path BACs and PACs. Captured cDNA can then be amplified by PCR using the anchor primers and then cloned. Individual clones can then be end-sequenced with vector primers.
  • sequences of these cDNA clones can be analyzed for similarity to genomic sequence contigs generated from BACs and PACs on the minimum tiling path. One can then identify individual exons of genes in the genetically defined interval by parsing the sequences of true-positive hybrid selected clones across these genomic sequence contigs. Hybrid selection is described in Example 5.
  • RACE Rapid amplification of cDNA ends
  • cDNA can be prepared from a number of human tissues in a manner such that the cDNA molecules have PCR primer binding sites (anchors) at their 5' ends, 3' ends, or both.
  • anchors PCR primer binding sites
  • PCR amplification with 3' end anchor primers and gene specific forward primers can generate 3' RACE products.
  • cDNA cloning techniques can also miss exons that lie between already known exons of a gene; for instance, this can easily occur if a particular exon is only included in a relatively rare splice variant of a transcript.
  • Combinatorial inter-exon PCR is an effective strategy for detecting these exons.
  • One can design a forward primer based on sequences from the first known exon of the gene and a set of reverse primers, one based on the sequence of each of the downstream exons (or any subset thereof) of the gene.
  • PCR products that differ in length from the expected product can be purified. In either RACE or combinatorial inter-exon PCRs, the PCR products can either be purified and then sequenced directly or first cloned and then sequenced. RACE and inter-exon PCR are described in Example 5.
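The combinatorial inter-exon PCR design described above pairs one forward primer in the first known exon with a reverse primer in each downstream exon, and then looks for products whose length differs from expectation. The sketch below illustrates that bookkeeping under simplifying assumptions that are not in the specification: primers sit at the outer ends of their exons, all known exons between them are spliced in, and a size tolerance of 5 bp is arbitrary. Function names are hypothetical.

```python
def expected_products(exon_lengths):
    """Map each downstream exon number (2, 3, ...) to the expected product
    length for the pair (forward primer in exon 1, reverse primer in that exon).
    """
    products = {}
    running = exon_lengths[0]
    for exon_number, length in enumerate(exon_lengths[1:], start=2):
        running += length
        products[exon_number] = running
    return products


def unexpected_products(exon_lengths, observed, tolerance=5):
    """Return {exon_number: size difference} for observed products that
    deviate from the expected length by more than `tolerance` bp."""
    expected = expected_products(exon_lengths)
    return {
        exon_number: size - expected[exon_number]
        for exon_number, size in observed.items()
        if abs(size - expected[exon_number]) > tolerance
    }
```

A positive difference would be consistent with an additional, previously unknown exon between the primers; a negative one with a skipped exon. Either would still need to be confirmed by sequencing the product.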
  • Another useful strategy for finding new 5', 3', or internal sequences is cDNA library screening.
  • CHDl susceptibility alleles will co-segregate with the disease in large kindreds. They will also be present at a much higher frequency in non-kindred individuals with coronary heart disease or metabolic disorders than in individuals in the general population. Whether one is comparing CHDl sequences from coronary heart disease or dyslipidemic cases to those from unaffected individuals, the key is to find mutations that are serious enough to cause obvious disruption to the normal function of the gene product. These mutations can take a number of forms.
  • alteration of the wild-type CHDl locus is detected.
  • the method can be performed by detecting the wild-type CHDl locus and confirming the lack of a predisposition to metabolic disorders at the CHDl locus.
  • "Alteration of a wild-type gene" encompasses all forms of mutations including deletions, insertions and point mutations in the coding and noncoding regions. Deletions may be of the entire gene or of only a portion of the gene. Point mutations may result in stop codons, frameshift mutations or amino acid substitutions. Such mutations may be present in individuals either with or without symptoms of coronary heart disease or metabolic disorders.
  • Point mutations or deletions may alter the protein produced by CHDl, impairing its function. Point mutations or deletions may occur in regulatory regions, such as in the promoter of the gene, leading to loss or diminution of expression of the mRNA. Point mutations or deletions may also abolish proper RNA processing, leading to reduction or loss of expression of the CHDl gene product, expression of an altered CHDl gene product, or to a decrease in mRNA stability or translation efficiency.
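The effect of a single-base change within a coding codon can be read from the standard genetic code. The sketch below is an illustration only: it covers base substitutions within one codon (not frameshifts, deletions, or regulatory and splicing mutations), builds the standard DNA codon table, and uses the hypothetical name `classify_codon_change`.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_change(wild_type: str, mutant: str) -> str:
    """Classify a codon substitution by its effect on the encoded residue."""
    wt, mt = wild_type.upper(), mutant.upper()
    if wt not in CODON_TABLE or mt not in CODON_TABLE:
        raise ValueError("codons must be three DNA bases (A, C, G, T)")
    wt_aa, mt_aa = CODON_TABLE[wt], CODON_TABLE[mt]
    if wt_aa == mt_aa:
        return "silent"
    if mt_aa == "*":
        return "stop codon"
    if wt_aa == "*":
        return "stop codon lost"
    return "amino acid substitution"
```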
  • Useful diagnostic techniques include, but are not limited to, fluorescent in situ hybridization
  • FISH fluorescence in situ hybridization
  • Predisposition to coronary heart disease or metabolic disorders can be ascertained by testing any tissue of a human for mutations of the CHDl gene. For example, a person who has inherited a germline CHDl mutation would be prone to develop coronary heart disease or metabolic disorders. This can be determined by testing DNA from any tissue of the person's body. Most simply, blood can be drawn and DNA extracted from the cells of the blood. In addition, prenatal diagnosis can be accomplished by testing fetal cells, placental cells or amniotic cells for mutations of the CHDl gene. Alteration of a wild-type CHDl allele, whether, for example, by point mutation or deletion, can be detected by any of the means discussed herein.
  • Direct DNA sequencing either manual sequencing or automated fluorescent sequencing can detect sequence variation.
  • manual sequencing is very labor-intensive, but under optimal conditions, mutations in the coding sequence of a gene are rarely missed.
  • Another approach is the single-stranded conformation polymorphism assay (SSCA) (Orita et al., 1989).
  • SSCA single-stranded conformation polymorphism assay
  • the fragments with shifted mobility on SSCA gels are then sequenced to determine the exact nature of the DNA sequence variation.
  • Other approaches based on the detection of mismatches between the two complementary DNA strands include clamped denaturing gel electrophoresis (CDGE) (Sheffield et al., 1991), heteroduplex analysis (HA) (White et al., 1992) and chemical mismatch cleavage (CMC) (Grompe et al., 1989). None of the methods described above will detect large deletions, duplications or insertions, nor will they detect a regulatory mutation that affects transcription or translation of the protein.
  • Detection of point mutations may be accomplished by molecular cloning of the CHD1 allele(s) and sequencing the allele(s) using techniques well known in the art.
  • The gene sequences can be amplified directly from a genomic DNA preparation from the tissue, using known techniques, as exemplified in Example 6. The DNA sequence of the amplified sequences can then be determined.
  • SSCA single-stranded conformation analysis
  • DGGE denaturing gradient gel electrophoresis
  • CDGE clamped denaturing gel electrophoresis
  • RFLP restriction fragment length polymorphism
  • SSCA detects a band that migrates differentially because the sequence change causes a difference in single-strand, intramolecular base pairing.
  • RNase protection involves cleavage of the mutant polynucleotide into two or more smaller fragments.
  • DGGE detects differences in migration rates of mutant sequences compared to wild-type sequences, using a denaturing gradient gel.
  • In an allele-specific oligonucleotide assay, an oligonucleotide is designed that detects a specific sequence, and the assay is performed by detecting the presence or absence of a hybridization signal.
  • The protein binds only to sequences that contain a nucleotide mismatch in a heteroduplex between mutant and wild-type sequences.
  • Mismatches are hybridized nucleic acid duplexes in which the two strands are not 100% complementary. Lack of total homology may be due to deletions, insertions, inversions or substitutions. Mismatch detection can be used to detect point mutations in the gene or its mRNA product. While these techniques are less sensitive than sequencing, they are simpler to perform on a large number of samples.
  • An example of a mismatch cleavage technique is the RNase protection method. In the practice of the present invention, the method involves the use of a labeled riboprobe that is complementary to the human wild-type CHD1 gene coding sequence.
  • The riboprobe and either mRNA or DNA isolated from the tumor tissue are annealed (hybridized) together and subsequently digested with the enzyme RNase A, which is able to detect some mismatches in a duplex RNA structure. If a mismatch is detected by RNase A, it cleaves at the site of the mismatch. Thus, when the annealed RNA preparation is separated on an electrophoretic gel matrix, if a mismatch has been detected and cleaved by RNase A, an RNA product will be detected that is smaller than the full length duplex RNA for the riboprobe and the mRNA or DNA.
  • The riboprobe does not need to be the full length of the CHD1 mRNA or gene but can be a segment of either. If the riboprobe comprises only a segment of the CHD1 mRNA or gene, it will be desirable to use a number of these probes to screen the whole mRNA sequence for mismatches.
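The size shift seen on the gel in the RNase protection method can be sketched as follows. This is an idealized Python model, not the patent's protocol: it assumes the sample and wild-type sequences are already aligned over the region covered by the riboprobe, and that every mismatch is cleaved, whereas the text notes that RNase A detects only some mismatches. The function name and the sequences are hypothetical.

```python
def cleavage_fragment_lengths(wild_type, sample):
    """Fragment lengths after idealized cleavage at each mismatch position.

    Both sequences are written on the same strand and must be aligned;
    a position where they differ stands for a mismatch in the duplex.
    """
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    pairs = zip(wild_type.upper(), sample.upper())
    mismatches = [i for i, (w, s) in enumerate(pairs) if w != s]

    fragments = []
    start = 0
    for pos in mismatches:
        if pos > start:  # skip zero-length fragments
            fragments.append(pos - start)
        start = pos
    fragments.append(len(wild_type) - start)
    return fragments


if __name__ == "__main__":
    wild = "ACGTACGTACGT"
    print(cleavage_fragment_lengths(wild, wild))            # fully protected
    print(cleavage_fragment_lengths(wild, "ACGTAGGTACGT"))  # one mismatch
```

A fully matched duplex yields a single full-length fragment, while a mismatch yields smaller products, which is the signal read from the gel.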
  • DNA probes can be used to detect mismatches, through enzymatic or chemical cleavage. See, e.g., Cotton et al., 1988; Shenk et al., 1975; Novack et al., 1986.
  • Mismatches can be detected by shifts in the electrophoretic mobility of mismatched duplexes relative to matched duplexes. See, e.g., Cariello, 1988.
  • The cellular mRNA or DNA that might contain a mutation can be amplified using PCR (see below) before hybridization. Changes in DNA of the CHD1 gene can also be detected using Southern hybridization, especially if the changes are gross rearrangements, such as large deletions and insertions.
  • DNA sequences of the CHD1 gene that have been amplified by the polymerase chain reaction may also be screened using allele-specific probes.
  • These probes are nucleic acid oligomers, each of which contains a region of the CHD1 gene sequence harboring a known mutation.
  • One oligomer may be about 30 nucleotides in length (although shorter and longer oligomers are also usable, as well recognized by those of skill in the art), corresponding to a portion of the CHD1 gene sequence.
  • PCR amplification products can be screened to identify the presence of a previously identified mutation in the CHD1 gene.
  • Hybridization of allele-specific probes with amplified CHD1 sequences can be performed, for example, on a nylon filter. Hybridization to a particular probe under high stringency hybridization conditions indicates the presence of the same mutation in the tumor tissue as in the allele-specific probe.
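The logic of allele-specific probe screening can be sketched in Python. The sketch makes a simplifying assumption that is not in the source: high-stringency hybridization is modeled as a perfect match between the probe and either strand of the amplified product. The sequences and function names are hypothetical and much shorter than the roughly 30-nucleotide oligomers described above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_hybridizes(amplicon, probe):
    """True if the probe pairs perfectly with either strand of the amplicon.

    The amplicon is given as one strand; the other strand is implied.
    """
    amplicon = amplicon.upper()
    probe = probe.upper()
    return probe in amplicon or reverse_complement(probe) in amplicon


if __name__ == "__main__":
    wild_type_amplicon = "ATGGCCAAGCTTGAC"   # hypothetical
    mutant_amplicon = "ATGGCCGAGCTTGAC"      # hypothetical A->G change
    mutant_probe = "GCCGAGCTT"               # spans the changed base
    print(probe_hybridizes(mutant_amplicon, mutant_probe))
    print(probe_hybridizes(wild_type_amplicon, mutant_probe))
```

In this model a signal with the mutant probe indicates that the amplified sample carries the same mutation as the probe, which mirrors the interpretation given in the text.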
  • The newly developed technique of nucleic acid analysis via microchip technology is also applicable to the present invention.
  • In this technique, thousands of distinct oligonucleotide probes are embedded in an array on a silicon or glass chip. Nucleic acid to be analyzed is fluorescently labeled and hybridized to the probes on the chip. It is also possible to study nucleic acid-protein interactions using these nucleic acid microchips. Using this technique one can determine the presence of mutations or even sequence the nucleic acid being analyzed, or one can measure expression levels of a gene of interest.
  • A major advantage of this method is that parallel processing of many, even thousands, of probes at once can be accomplished, thereby increasing the rate of analysis tremendously.
  • Several papers that use this technique have been published (Hacia et al., 1996; Ramsay, 1998; and Schena et al., 1996).
  • The most definitive test for mutations in a candidate locus is to directly compare genomic CHD1 sequences from coronary heart disease or metabolic disorder patients with those from a control population.
  • Mutations from coronary heart disease or metabolic disorder patients falling outside the coding region of CHD1 can be detected by examining the non-coding regions, such as introns and regulatory sequences near or within the CHD1 gene.
  • An early indication that mutations in noncoding regions are important may come from Northern blot experiments that reveal messenger RNA molecules of abnormal size or abundance in coronary heart disease or metabolic disorder patients as compared to control individuals.
  • Alteration of CHD1 mRNA expression can be detected by any techniques known in the art. These include Northern blot analysis, PCR amplification and RNase protection. Diminished mRNA expression indicates an alteration of the wild-type CHD1 gene. Alteration of wild-type CHD1 genes can also be detected by screening for alteration of wild-type CHD1 protein. For example, monoclonal antibodies immunoreactive with CHD1 can be used to screen a tissue. Lack of cognate antigen would indicate a CHD1 mutation. Antibodies specific for products of mutant alleles could also be used to detect mutant CHD1 gene product. Such immunological assays can be done in any convenient formats known in the art.
  • Any means for detecting an altered CHD1 protein can be used to detect alteration of wild-type CHD1 genes.
  • Functional assays, such as protein binding determinations, can be used.
  • Assays can be used that detect CHD1 biochemical function, for instance, DNA binding. Finding a mutant CHD1 gene product indicates presence of a mutant CHD1 allele.
  • Mutant CHD1 genes or gene products can also be detected in other human body samples, such as serum, stool, urine, sputum and buccal swabs. The same techniques discussed above for detection of mutant CHD1 genes or gene products in tissues can be applied to other body samples.
  • The CHD1 gene product itself may be secreted into the extracellular space and found in these body samples even in the absence of cells. By screening such body samples, a simple diagnosis can be achieved.
  • The primer pairs of the present invention are useful for determination of the nucleotide sequence of a particular CHD1 allele using PCR.
  • The pairs of single-stranded DNA primers can be annealed to sequences within or surrounding the CHD1 gene on chromosome 11 in order to amplify the DNA comprising the CHD1 gene itself.
  • A complete set of these primers allows synthesis of all of the nucleotides of the CHD1 gene coding sequences, i.e., the exons.
  • The set of primers preferably allows synthesis of both intron and exon sequences. Allele-specific primers can also be used. Such primers anneal only to particular CHD1 mutant alleles, and thus will only amplify a product in the presence of the mutant allele as a template.
  • Primers may have restriction enzyme recognition sequences appended to their 5' ends.
  • All nucleotides of the primers are derived from CHD1 sequences or sequences adjacent to CHD1, except for the few nucleotides necessary to form a restriction enzyme recognition site.
  • The primers themselves can be synthesized using techniques well known in the art. Generally, the primers can be made using commercially-available oligonucleotide synthesizing machines. Given the sequence of the CHD1 gene shown in SEQ ID NOs: 1, 3, 5, and 7, design of particular primers is well within the skill of the art.
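As a rough illustration of primers that anneal to sequences surrounding a target region and carry restriction sites on their 5' ends, the following Python sketch derives a primer pair from a template. It is a sketch under stated assumptions: the template is hypothetical and is not taken from SEQ ID NOs: 1, 3, 5, or 7, the function name is invented for illustration, and melting temperature, self-complementarity and other practical design criteria are ignored. The recognition sequences listed for EcoRI, BamHI and HindIII are the standard ones.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard recognition sequences, written 5' to 3'.
RESTRICTION_SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primer_pair(template, start, end, length=18,
                       forward_site=None, reverse_site=None):
    """Return (forward, reverse) primers flanking template[start:end].

    Each primer anneals to the `length` bases immediately outside the
    target region; optional restriction sites are appended to the 5' ends.
    """
    template = template.upper()
    if start - length < 0 or end + length > len(template):
        raise ValueError("not enough flanking sequence for primers")
    forward = template[start - length:start]
    reverse = reverse_complement(template[end:end + length])
    if forward_site:
        forward = RESTRICTION_SITES[forward_site] + forward
    if reverse_site:
        reverse = RESTRICTION_SITES[reverse_site] + reverse
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical template: 8-base flanks around a 7-base target.
    template = "ATGCATGC" + "GATTACA" + "CCGGAATT"
    print(design_primer_pair(template, 8, 15, length=8,
                             forward_site="EcoRI", reverse_site="BamHI"))
```

Only the appended site differs from the template sequence, consistent with the statement that all other primer nucleotides derive from the gene or its adjacent sequences.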
  • The nucleic acid probes provided by the present invention are useful for a number of purposes. They can be used in Southern hybridization to genomic DNA and in the RNase protection method for detecting point mutations already discussed above. The probes can be used to detect PCR amplification products. They may also be used to detect mismatches with the CHD1 gene or mRNA using other techniques.
  • The presence of an altered (or a mutant) CHD1 gene that produces a protein having a loss of function, or altered function, may correlate to an increased risk of coronary heart disease or metabolic disorders.
  • A biological sample is prepared and analyzed for a difference between the sequence of the CHD1 allele being analyzed and the sequence of the wild-type CHD1 allele.
  • Mutant CHD1 alleles can be initially identified by any of the techniques described above. The mutant alleles are then sequenced to identify the specific mutation of the particular mutant allele. Alternatively, mutant CHD1 alleles can be initially identified by identifying mutant (altered) CHD1 proteins, using conventional techniques. The mutant alleles are then sequenced to identify the specific mutation for each allele. The mutations, especially those which lead to an altered function of the CHD1 protein, are then used for the diagnostic and prognostic methods of the present invention.
  • CHD1 GENE STRUCTURE. As detailed in Example 7, the CHD1 gene sequence has been determined. Ten exons and about 20 kb of contiguous flanking genomic DNA have been sequenced. Four polypeptides, due to alternative splicing, are predicted to be encoded by this locus based on the sequence data. Transcripts encoding all four proteins have been observed in cDNAs from various sources. More than four alternatively spliced transcripts are predicted and observed to encode these four CHD1 proteins.
  • The CHD1 protein has domains with significant sequence homology to protein domains in the database (Figure 3).
  • One such domain is a set of eight C2H2 zinc-finger motifs. Zinc-finger motifs often serve as nucleic acid binding motifs, and can also serve as protein interaction motifs.
  • A leucine-rich SCAN domain is found near the N-terminus of all of the alternative proteins (amino acids 49-125). This domain is found in at least 10 other putative transcription factors, but its function is currently unknown (Williams et al., 1995; Lee et al., 1997).
  • Figure 8 displays a comparison between the CHDl SCAN domain and a consensus SCAN domain sequence derived from homology analysis of SCAN domain containing zinc-finger proteins in the GenBank database. Yeast-two-hybrid experiments as well as in vitro interaction studies indicate that the SCAN domain acts as a protein-protein interaction surface leading to homo- and/or heterodimerization of two SCAN containing peptides, polypeptides or proteins.
  • The functional form of CHD1 may therefore include a homo- and/or heterodimer of different CHD1 isoforms or CHD1 and other SCAN domain containing zinc-finger proteins.
  • Precedents for transcription factors acting as dimers include members of the bZIP family, bHLH proteins and nuclear receptors (Kouzarides and Ziff, 1988; Fairman et al., 1993; Fawell et al., 1990).
  • A third domain, the KRAB domain (amino acids 235-276 in the protein encoded by cDNA1), is found in many zinc-finger containing transcription factors. It is often a site for protein-protein interaction, and it has been observed as a transcriptional repression domain (Kim et al., 1996,
  • CHD1 serves as a sequence-specific DNA-binding transcription factor.
  • The presence of a KRAB domain raises the possibility that at least one function of CHD1 is that of a repressor: it binds to its cognate binding sites on CHD1 target genes and turns these genes off or reduces the level of transcription of these genes.
  • Two of the alternative cDNAs (-3 and -4) encode small proteins largely identical to the N-terminus of the longer protein products (-1 and -2, respectively). Tagged fusion proteins have identified the subcellular localization of some of these proteins.
  • The protein encoded by cDNA1 is largely localized to the nucleus, whereas the protein encoded by cDNA3 is found to be diffuse throughout the cell. These localizations were monitored by fusing the relevant CHD1 open reading frame to green fluorescent protein under the control of the cytomegalovirus promoter, transfecting these constructs into 293 cells and monitoring expression microscopically.
  • The N-terminus may interact with another protein, call it "protein X", and target protein X to the transcriptional control region of relevant genes.
  • A protein that also binds protein X but lacks a DNA binding motif could regulate the effective concentration of protein X, and the function of the protein complex bound to the regulatory region.
  • Such alternative transcripts retaining only partial function have been described for transcription factors and found to serve as competitive regulators (Chen et al., 1994; Arshura et al., 1995; and Walker et al., 1996).
  • Polynucleotides of the present invention may be produced by replication in a suitable host cell. Natural or synthetic polynucleotide fragments coding for a desired fragment will be incorporated into recombinant polynucleotide constructs, usually DNA constructs, capable of introduction into and replication in a prokaryotic or eukaryotic cell.
  • Polynucleotide constructs will be suitable for replication in a unicellular host, such as yeast or bacteria, but may also be intended for introduction to (with and without integration into the genome) cultured mammalian or plant or other eukaryotic cell lines.
  • The purification of nucleic acids produced by the methods of the present invention is described, e.g., in Sambrook et al., 1989 or Ausubel et al., 1992.
  • The polynucleotides of the present invention may also be produced by chemical synthesis, e.g., by the phosphoramidite method described by Beaucage and Caruthers, 1981 or the triester method according to Matteucci and Caruthers, 1981, and may be performed on commercial, automated oligonucleotide synthesizers.
  • A double-stranded fragment may be obtained from the single-stranded product of chemical synthesis either by synthesizing the complementary strand and annealing the strands together under appropriate conditions or by adding the complementary strand using DNA polymerase with an appropriate primer sequence.
  • Polynucleotide constructs prepared for introduction into a prokaryotic or eukaryotic host may comprise a replication system recognized by the host, and comprise the intended polynucleotide fragment encoding the desired polypeptide, preferably with transcription and translational initiation regulatory sequences operably linked to the polypeptide-encoding polynucleotide segment.
  • Expression vectors may include, for example, an origin of replication or autonomously replicating sequence (ARS) and expression control sequences, a promoter, an enhancer and necessary processing information sites, such as ribosome-binding sites, RNA splice sites, polyadenylation sites, transcriptional termination sequences, and mRNA stabilizing sequences.
  • Secretion signals may also be included where appropriate, whether from a native CHD1 protein or from other receptors or from secreted polypeptides of the same or related species. These secretion signals thereby allow the protein to cross and/or lodge in cell membranes, and thus attain its functional topology, or be secreted from the cell.
  • Such vectors may be prepared by means of standard recombinant techniques well known in the art and discussed, for example, in Sambrook et al., 1989 or Ausubel et al., 1992.
  • An appropriate promoter and other necessary vector sequences will be selected so as to be functional in the host, and may include, when appropriate, those naturally associated with CHD1 genes. Examples of workable combinations of cell lines and expression vectors are described in Sambrook et al., 1989, or Ausubel et al., 1992; see also, e.g., Metzger et al., 1988. Many useful vectors are known in the art and may be obtained from such vendors as Stratagene, New England Biolabs, Promega Biotech, and others. Promoters such as the trp, lac and phage promoters, tRNA promoters and glycolytic enzyme promoters may be used in prokaryotic hosts.
  • Useful yeast promoters include promoter regions for metallothionein, 3-phosphoglycerate kinase or other glycolytic enzymes such as enolase or glyceraldehyde-3-phosphate dehydrogenase, enzymes responsible for maltose and galactose utilization, and others.
  • Vectors and promoters suitable for use in yeast expression are further described in Hitzeman et al., EP 73,675A.
  • Appropriate non-native mammalian promoters might include the early and late promoters from SV40 (Fiers et al.
  • The construct may be joined to an amplifiable gene (e.g., DHFR) so that multiple copies of the gene may be made.
  • While expression vectors may replicate autonomously, they may also replicate by being inserted into the genome of the host cell, by methods well known in the art.
  • Expression and cloning vectors will likely contain a selectable marker, a gene encoding a protein necessary for survival or growth of a host cell transformed with the vector. The presence of this gene ensures growth of only those host cells bearing the cloning vehicle.
  • Typical selection genes encode proteins that a) confer resistance to antibiotics or other toxic substances, e.g. ampicillin, neomycin, methotrexate, etc.; b) complement auxotrophic deficiencies; or c) supply critical nutrients not available from complex media, e.g., the gene encoding D-alanine racemase for Bacilli.
  • The choice of the proper selectable marker will depend on the host cell, and appropriate markers for different hosts are well known in the art.
  • The vectors containing the nucleic acids of interest can be transcribed in vitro, and the resulting RNA introduced into the host cell by well-known methods, e.g., by injection (see Kubo et al., 1988), or the vectors can be introduced directly into host cells by methods well known in the art, which vary depending on the type of cellular host, including electroporation; transfection employing calcium chloride, rubidium chloride, calcium phosphate, DEAE-dextran, or other substances; microprojectile bombardment; lipofection; infection (where the vector is an infectious agent, such as a retroviral genome); and other methods. See generally, Sambrook et al., 1989 and Ausubel et al., 1992.
  • The introduction of the polynucleotides into the host cell by any method known in the art, including, inter alia, those described above, will be referred to herein as "transformation."
  • The cells into which nucleic acids described above have been introduced are meant to also include the progeny of such cells.
  • Nucleic acids and polypeptides of the present invention may be prepared by expressing the CHD1 nucleic acids or portions thereof in vectors or other expression vehicles in compatible prokaryotic or eukaryotic host cells.
  • Suitable prokaryotic hosts include strains of Escherichia coli, although other prokaryotes, such as Bacillus subtilis or Pseudomonas, may also be used.
  • Mammalian or other eukaryotic host cells such as those of yeast, filamentous fungi, plant, insect, amphibian or avian species, may also be useful for production of the proteins of the present invention. Propagation of mammalian cells in culture is per se well known. See, Jakoby and Pastan, 1979. Examples of commonly used mammalian host cell lines are VERO and HeLa cells, Chinese hamster ovary (CHO) cells, and WI38, BHK, and COS cell lines. An example of a commonly used insect cell line is SF9. However, it will be appreciated by the skilled practitioner that other cell lines may be appropriate, e.g., to provide higher expression, desirable glycosylation patterns, or other features.
  • Clones are selected by using markers, the choice of which depends on the mode of the vector construction.
  • The marker may be on the same or a different DNA molecule, preferably the same DNA molecule.
  • The transformant may be selected, e.g., by resistance to ampicillin, tetracycline or other antibiotics. Production of a particular product based on temperature sensitivity may also serve as an appropriate marker.
  • Prokaryotic or eukaryotic cells transformed with the polynucleotides of the present invention will be useful not only for the production of the nucleic acids and polypeptides of the present invention, but also, for example, in studying the characteristics of CHD1 polypeptides.
  • Antisense polynucleotide sequences are useful in preventing or diminishing the expression of the CHD1 locus, as will be appreciated by those skilled in the art.
  • Polynucleotide vectors containing all or a portion of the CHD1 locus or other sequences from the CHD1 region (particularly those flanking the CHD1 locus) may be placed under the control of a promoter in an antisense orientation and introduced into a cell. Expression of such an antisense construct within a cell will interfere with CHD1 transcription and/or translation and/or replication.
  • The probes and primers based on the CHD1 gene sequences disclosed herein are used to identify homologous CHD1 gene sequences and proteins in other species. These CHD1 gene sequences and proteins are used in the diagnostic/prognostic, therapeutic and drug screening methods described herein for the species from which they have been isolated.
  • CHD1 protein consisting of the last six zinc fingers has been purified as a GST-fusion protein.
  • A consensus DNA site (GGGGT) for this protein was selected by in vitro DNA binding studies by a method as described in Morris et al., 1994. This consensus DNA site is found to exist in multiple copies in the regulatory regions upstream of several genes involved in lipid metabolism.
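Counting copies of the GGGGT consensus site in an upstream regulatory region can be sketched in Python as a scan of both strands. The promoter string below is hypothetical and serves only to exercise the function; it is not a sequence from the patent, and exact matching is an assumption (a real binding-site search might tolerate degenerate positions).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_consensus_sites(promoter, site="GGGGT"):
    """Return (position, strand) for every exact occurrence of the site.

    Positions are 0-based on the given strand; '-' marks a site that lies
    on the opposite strand (seen here as the reverse complement).
    """
    seq = promoter.upper()
    site = site.upper()
    site_rc = reverse_complement(site)
    hits = []
    for i in range(len(seq) - len(site) + 1):
        window = seq[i:i + len(site)]
        if window == site:
            hits.append((i, "+"))
        elif window == site_rc:
            hits.append((i, "-"))
    return hits


if __name__ == "__main__":
    hypothetical_promoter = "TTGGGGTACCACCCCAGGGGGT"
    print(find_consensus_sites(hypothetical_promoter))
```

Scanning window by window, rather than using a non-overlapping string search, keeps closely spaced or overlapping sites from being missed.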
  • GST-CHD1 fusion protein binds specifically to promoter fragments of, inter alia, the ApoIV, ApoCIII, ApoE, LCAT, and LPL genes in in vitro DNA binding studies (Figures 5 and 6).
  • A full summary of the promoters shown to bind to CHD1 is provided in Table 12.
  • The genes to whose promoters CHD1 binds can be grouped according to function.
  • The first class is a set of apolipoprotein genes that encode structural components of circulating lipoproteins.
  • The second class is a set of genes encoding enzymes known to influence lipoprotein composition.
  • The third class is a set of genes implicated directly in the etiology of atherosclerosis, angiogenesis, diabetes, obesity and metabolic syndrome X.
  • Six genes whose promoter fragments are bound by CHD1 encode proteins with no known involvement in CHD or metabolic disorders related to lipid metabolism.
  • Example 8 provides a detailed analysis of these three classes of genes.
  • HNF4 hepatic nuclear factor 4
  • Transfection assays indicate that CHD1 represses transcription from this promoter, suggesting that CHD1 may regulate HNF4 expression in vivo.
  • Pathological consequences of CHD1 dysfunction are likely to include deregulation of HNF4 expression that may be counteracted by agonists/antagonists of HNF4.
  • HNF4 is a member of the nuclear receptor superfamily, a class of ligand-activated transcription factors. HNF4 functions as a major regulator of liver-specific gene expression, and is involved in the expression of apolipoproteins AI, AII, AIV, B and CIII (Kardassis et al., 1996).
  • As a ligand-activated nuclear receptor, HNF4 presents an excellent target for drug development.
  • Example 9 describes polymorphisms in CHD1 in CHD1 and CEPH control cases. Particularly notable is a mutation of CHD1 in a diabetic patient. This mutation changes a lysine to a glutamic acid within the KRAB domain. A change in charge in a putative protein-protein interaction domain is highly significant. This mutant protein may be unable to interact with a target protein. The lack of such interaction may have significant consequences for the expression of CHD1 target genes and thus for lipid metabolism.
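The charge change caused by a lysine to glutamic acid substitution can be made concrete with a short Python sketch. It relies on textbook approximations rather than anything stated in the source: lysine and arginine side chains are counted as +1, aspartate and glutamate as -1, and all other residues (including histidine) as neutral near physiological pH. The helper names are hypothetical. The sketch also lists which standard lysine codons can become glutamic acid codons through a single base change.

```python
# Approximate side-chain charges near physiological pH (assumption).
SIDE_CHAIN_CHARGE = {"K": +1, "R": +1, "D": -1, "E": -1}

LYS_CODONS = ("AAA", "AAG")  # standard genetic code
GLU_CODONS = ("GAA", "GAG")


def charge_change(wild_aa, mutant_aa):
    """Net change in side-chain charge caused by a substitution."""
    return SIDE_CHAIN_CHARGE.get(mutant_aa, 0) - SIDE_CHAIN_CHARGE.get(wild_aa, 0)


def single_base_routes(from_codons, to_codons):
    """Codon pairs that differ at exactly one position."""
    return [
        (a, b)
        for a in from_codons
        for b in to_codons
        if sum(x != y for x, y in zip(a, b)) == 1
    ]


if __name__ == "__main__":
    print(charge_change("K", "E"))
    print(single_base_routes(LYS_CODONS, GLU_CODONS))
```

Under these approximations the substitution replaces a positive side chain with a negative one, a net shift of two charge units at that position, and either lysine codon can reach a glutamic acid codon by changing only its first base.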
  • Genomic sequences including exonJ and promoter elements for CHD1 have been identified (Example 7).
  • Five polymorphisms and one insertion were found in CHD and CEPH samples. Their position and frequencies are listed in Table 17.
  • A biological sample such as blood is prepared and analyzed for the presence or absence of susceptibility alleles of CHD1. Results of these tests and interpretive information are returned to the health care provider for communication to the tested individual.
  • Diagnoses may be performed by diagnostic laboratories, or, alternatively, diagnostic kits are manufactured and sold to health care providers or to private individuals for self-diagnosis.
  • The screening method involves amplification of the relevant CHD1 sequences.
  • The screening method involves a non-PCR based strategy.
  • Such screening methods include two-step label amplification methodologies that are well known in the art. Both PCR and non-PCR based screening strategies can detect target sequences with a high level of sensitivity.
  • The most popular method used today is target amplification.
  • The target nucleic acid sequence is amplified with polymerases.
  • One particularly preferred method using polymerase-driven amplification is the polymerase chain reaction (PCR).
  • This preferred method is exemplified in Example 6.
  • The polymerase chain reaction and other polymerase-driven amplification assays can achieve over a million-fold increase in copy number through the use of polymerase-driven amplification cycles.
  • The resulting nucleic acid can be sequenced or used as a substrate for DNA probes, or for incorporation into cloning vectors.
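The "over a million-fold" figure follows from the exponential nature of cycle-based amplification. The Python sketch below assumes an idealized model in which each cycle multiplies the copy number by (1 + efficiency), with efficiency 1.0 meaning perfect doubling; real reactions plateau and rarely sustain perfect efficiency. The function names are hypothetical.

```python
import math


def fold_amplification(cycles, efficiency=1.0):
    """Idealized fold increase in copy number after a number of cycles."""
    return (1.0 + efficiency) ** cycles


def cycles_for_fold(target_fold, efficiency=1.0):
    """Smallest whole number of cycles reaching at least target_fold."""
    return math.ceil(math.log(target_fold) / math.log(1.0 + efficiency))


if __name__ == "__main__":
    print(cycles_for_fold(1e6))          # cycles needed with perfect doubling
    print(fold_amplification(20))        # fold increase after 20 cycles
    print(cycles_for_fold(1e6, 0.9))     # cycles needed at 90% efficiency
```

With perfect doubling, 20 cycles give 2^20 = 1,048,576 copies per starting template, which is where the million-fold figure comes from.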
  • The biological sample to be analyzed, such as blood or serum, may be treated, if desired, to extract the nucleic acids.
  • The sample nucleic acid may be prepared in various ways to facilitate detection of the target sequence, e.g., denaturation, restriction digestion, electrophoresis or dot blotting.
  • The targeted region of the analyte nucleic acid usually must be at least partially single-stranded to form hybrids with the targeting sequence of the probe. If the sequence is naturally single-stranded, denaturation will not be required. However, if the sequence is double-stranded, the sequence will probably need to be denatured. Denaturation can be carried out by various techniques known in the art.
  • Analyte nucleic acid and probe are incubated under conditions that promote stable hybrid formation of the target sequence in the probe with the putative targeted sequence in the analyte.
  • The region of the probes used to bind to the analyte can be made completely complementary to the targeted region of human chromosome 11. Therefore, high stringency conditions are desirable in order to prevent false positives. Conditions of high stringency, however, are used only if the probes are complementary to regions of the chromosome that are unique in the genome.
  • The stringency of hybridization is determined by a number of factors during hybridization and during the washing procedure, including temperature, ionic strength, base composition, probe length, and concentration of formamide. These factors are outlined in, for example, Maniatis et al.
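How base composition and probe length feed into stringency can be illustrated with the Wallace rule, a rough rule of thumb that is not part of the source: for short oligonucleotides (roughly 14 to 20 nucleotides) the melting temperature is estimated as 2 °C per A or T plus 4 °C per G or C. The Python sketch below is an approximation under that assumption only; it ignores ionic strength and formamide, which the text also lists as factors, and it becomes unreliable for longer probes.

```python
def wallace_tm(oligo):
    """Rough melting temperature (deg C) of a short oligo, Wallace rule."""
    seq = oligo.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for oligo in ("ACGTACGTACGTACGT", "GGGGCCCCGGGGCCCC", "ATATATATATATATAT"):
        print(oligo, wallace_tm(oligo))
```

The estimate rises with GC content and with length, so a GC-rich or longer probe tolerates a hotter, more stringent wash than an AT-rich or shorter one.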
  • The formation of higher order hybrids, such as triplexes, quadraplexes, etc., may be desired to provide the means of detecting target sequences.
  • Detection, if any, of the resulting hybrid is usually accomplished by the use of labeled probes.
  • The probe may be unlabeled, but may be detectable by specific binding with a ligand that is labeled, either directly or indirectly. Suitable labels and methods for labeling probes and ligands are known in the art.
  • Examples include radioactive labels that may be incorporated by known methods (e.g., nick translation, random priming or end-labeling by T4 polynucleotide kinase), biotin, fluorescent groups, chemiluminescent groups (e.g., dioxetanes, particularly triggered dioxetanes), enzymes, antibodies and the like. Variations of this basic scheme are known in the art, and include those variations that facilitate separation of the hybrids to be detected from extraneous materials and/or that amplify the signal from the labeled moiety. A number of these variations are reviewed in Matthews and Kricka, 1988; Landegren et al., 1988; Mittlin, 1989; U.S. Patent 4,868,105; and in EPO Publication No. 225,807.
  • Non-PCR based screening assays are also contemplated in this invention.
  • This procedure hybridizes a nucleic acid probe (or an analog such as a methyl phosphonate backbone replacing the normal phosphodiester), to the low level DNA target.
  • This probe may have an enzyme covalently linked to the probe, such that the covalent linkage does not interfere with the specificity of the hybridization.
  • This complex consisting of enzyme, probe, conjugate and target nucleic acid can then be isolated away from the free probe enzyme conjugate.
  • A substrate is then added for enzyme detection. Enzymatic activity is observed as a change in color development or luminescent output, resulting in a 10^3-10^6 increase in sensitivity.
  • The small ligand attached to the nucleic acid probe is specifically recognized by an antibody-enzyme conjugate.
  • Digoxigenin is attached to the nucleic acid probe.
  • Hybridization is detected by an antibody-alkaline phosphatase conjugate that acts on a chemiluminescent substrate.
  • the small ligand is recognized by a second ligand-enzyme conjugate that is capable of specifically complexmg to the first ligand.
  • a well known embodiment of this example is the biotm-avidin type of interactions.
  • biotm-avidin For methods for labeling nucleic acid probes and their use in biotm-avidm based assays see Rigby et al . , 1977 and Nguyen et al . , 1992.
  • tne nucleic acid probe assays employing a cocktail of nucleic acid probes capable of detecting CHDl.
  • a cocktail of nucleic acid probes capable of detecting CHDl.
  • To detect mutations in the CHDl gene sequence in a patient, more than one probe complementary to CHDl is employed, where the cocktail includes probes capable of binding to the allele-specific mutations identified in populations of patients with alterations in CHDl.
  • Any number of probes can be used, and will preferably include probes corresponding to the major gene mutations identified as predisposing an individual to coronary heart disease or metabolic disorders.
  • Some candidate probes contemplated include probes comprising the allele-specific mutations identified in Tables 7 and 8 and those comprising the CHDl regions corresponding to SEQ ID Nos: 1, 3, 5, 7, 9 and 206, both 5' and 3' to the mutation site.
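As an illustration of how a probe spanning a mutation site both 5' and 3' might be derived, the following minimal sketch builds a wild-type and a mutant probe pair around a single-base substitution. The function name, the `flank` parameter and the toy sequence are assumptions made for illustration; they are not taken from the sequence listing, and real probe design would also consider melting temperature and specificity.

```python
def allele_specific_probes(reference, position, alt_base, flank=10):
    """Return (wild_type_probe, mutant_probe) centered on a substitution.

    reference : reference DNA sequence (string)
    position  : 1-based position of the substituted base
    alt_base  : base present in the mutant allele
    flank     : number of bases kept 5' and 3' of the site (hypothetical default)
    """
    idx = position - 1
    if not 0 <= idx < len(reference):
        raise ValueError("position lies outside the reference sequence")
    start = max(0, idx - flank)
    end = min(len(reference), idx + flank + 1)
    wild_type = reference[start:end]
    # Same window, with the single base at the mutation site replaced.
    mutant = reference[start:idx] + alt_base + reference[idx + 1:end]
    return wild_type, mutant


# Toy example, not a CHDl sequence.
wt, mut = allele_specific_probes("AAACCCGGGTTT", position=6, alt_base="T", flank=2)
print(wt, mut)
```

A cocktail would then simply be a collection of such pairs, one per mutation of interest.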
  • The genetic defect underlying CHD or metabolic disease can also be detected on the basis of the alteration of wild-type CHDl polypeptide. Such alterations can be determined by sequence analysis in accordance with conventional techniques. More preferably, antibodies (polyclonal or monoclonal) are used to detect differences in, or the absence of, CHDl peptides.
  • The antibodies may be prepared as defined under "antibodies" (further shown in Examples 11 and 12). Other techniques for raising and purifying antibodies are well known in the art and any such techniques may be chosen to achieve the preparations claimed in this invention. In a preferred embodiment of the invention, antibodies will immunoprecipitate CHDl proteins from solution as well as react with CHDl protein on Western or immunoblots of polyacrylamide gels.
  • Antibodies will detect CHDl proteins in paraffin or frozen tissue sections, using immunocytochemical techniques.
  • Preferred embodiments relating to methods for detecting CHDl or its mutations include enzyme-linked immunosorbent assays (ELISA), radioimmunoassays (RIA), immunoradiometric assays (IRMA) and immunoenzymatic assays (IEMA), including sandwich assays using monoclonal and/or polyclonal antibodies.
  • Exemplary sandwich assays are described by David et al. in U.S. Patent Nos. 4,376,110 and 4,486,530, hereby incorporated by reference, and exemplified in Example 14.
  • This invention is particularly useful for screening compounds by using the CHDl polypeptide or binding fragment thereof in any of a variety of drug screening techniques.
  • The CHDl polypeptide or fragment employed in such a test may either be free in solution, affixed to a solid support (as shown in Example 13), or borne on a cell surface.
  • One method of drug screening utilizes eukaryotic or prokaryotic host cells stably transformed with recombinant polynucleotides expressing the CHDl polypeptide or fragment. Such cells, either in viable or fixed form, can be used for standard binding assays, preferably in competitive binding assays.
  • An example of method (a) is provided in Example 13, wherein the drug candidates are peptides.
  • The present invention provides methods of screening for drugs comprising contacting a drug candidate with a CHDl polypeptide or fragment thereof and assaying (i) for the presence of a complex between the drug candidate and the CHDl polypeptide or fragment, or (ii) for the presence of a complex between the CHDl polypeptide or fragment and a ligand such as a polypeptide or DNA sequence, by methods well known in the art.
  • Free CHDl polypeptide or fragment is separated from that present in a protein:protein or protein:DNA complex, and the amount of free (i.e., uncomplexed) label is a measure of the binding of the drug candidate to CHDl or its interference with CHDl:ligand binding, respectively.
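The relationship between free label and binding can be made concrete with a short calculation. The sketch below is a minimal illustration that assumes the total and free label are measured in the same arbitrary units; the function names and the numbers are hypothetical and are not part of any described assay protocol.

```python
def fraction_bound(free_label, total_label):
    """Fraction of label in complex, given free and total label (same units)."""
    if total_label <= 0 or not 0 <= free_label <= total_label:
        raise ValueError("expected 0 <= free_label <= total_label and total_label > 0")
    return 1.0 - free_label / total_label


def percent_inhibition(bound_with_candidate, bound_control):
    """Percent reduction in CHDl:ligand binding relative to a no-candidate control."""
    if bound_control <= 0:
        raise ValueError("control binding must be positive")
    return 100.0 * (1.0 - bound_with_candidate / bound_control)


# Hypothetical readings: 40 of 100 units free without candidate, 70 of 100 free with it.
control = fraction_bound(40, 100)
treated = fraction_bound(70, 100)
print(round(percent_inhibition(treated, control), 1))
```

A candidate that interferes with CHDl:ligand binding raises the free label and therefore gives a positive percent inhibition.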
  • Another technique for drug screening provides high throughput screening for compounds having suitable binding affinity to the CHDl polypeptides and is described in detail in Geysen, PCT application WO 84/03564, published on September 13, 1984. Briefly, large numbers of different small peptide test compounds are synthesized on a solid substrate, such as plastic pins or some other surface. The peptide test compounds are reacted with CHDl polypeptide and washed. Bound CHDl polypeptide is then detected by methods well known in the art.
  • Purified CHDl can be coated directly onto plates for use in the aforementioned drug screening techniques.
  • Non-neutralizing antibodies to the polypeptide can be used as capture antibodies to immobilize the CHDl polypeptide on the solid phase.
  • This invention also contemplates the use of competitive drug screening assays in which neutralizing antibodies capable of specifically binding to the CHDl polypeptide compete with a test compound for binding to the CHDl polypeptide or fragments thereof. In this manner, the antibodies can be used to detect the presence of any test compound sharing one or more antigenic determinants of the CHDl polypeptide.
  • A further technique for drug screening involves the use of host eukaryotic cell lines or cells (such as described above) that have a nonfunctional CHDl gene. These host cell lines or cells are defective at the CHDl polypeptide level. The host cell lines or cells are grown in the presence of the candidate drug compound. The rate of growth of the host cells is measured to determine if the compound is capable of regulating the growth of CHDl defective cells.
  • A further technique for drug screening involves the use of host prokaryotic or eukaryotic cell lines or cells (such as described above) that have a reporter gene construct under the transcriptional regulation of the CHDl high-affinity DNA recognition sequences (see Example 8), and that express either endogenous or exogenous CHDl polypeptide or fragment.
  • The host cell lines or cells are then exposed to a drug compound.
  • The rate of transcription of the reporter gene is then monitored to determine if the compound is capable of altering its expression.
  • CHDl is a sequence-specific DNA binding protein. It may be possible to use oligonucleotides comprising the CHDl DNA recognition sequence as an inhibitor.
  • The goal of rational drug design is to produce structural analogs of biologically active polypeptides of interest or of small molecules with which they interact (e.g., agonists, antagonists, inhibitors) in order to fashion drugs which are, for example, more active or stable forms of the polypeptide, or which, e.g., enhance or interfere with the function of a polypeptide in vivo. See, e.g., Hodgson, 1991.
  • An example of rational drug design is the development of HIV protease inhibitors (Erickson et al., 1990).
  • Peptides are analyzed by the "alanine scan" approach (Wells, 1991). In this technique, a particular amino acid residue is replaced by Ala, and its effect on the peptide's activity is determined. Each of the amino acid residues of the peptide is analyzed in this manner to determine the functionally important regions of the peptide (i.e., the peptide is alanine "scanned"). It is also possible to isolate a target-specific antibody, selected by a functional assay, and then to solve its crystal structure. In principle, this approach yields a pharmacore upon which subsequent drug design can be based.
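The set of variants tested in an alanine scan can be enumerated mechanically. The following is a minimal sketch using one-letter amino acid codes; the function name and the example peptide are assumptions for illustration only, and positions that are already alanine are skipped because substituting them would change nothing.

```python
def alanine_scan(peptide):
    """List (position, original_residue, variant) for each single Ala substitution.

    Positions are 1-based. Residues that are already 'A' are skipped.
    """
    variants = []
    for i, residue in enumerate(peptide):
        if residue == "A":
            continue
        variant = peptide[:i] + "A" + peptide[i + 1:]
        variants.append((i + 1, residue, variant))
    return variants


# Hypothetical five-residue peptide, not a CHDl fragment.
for position, residue, variant in alanine_scan("MKALD"):
    print(position, residue, variant)
```

Each variant would then be assayed, and positions whose substitution reduces activity mark functionally important residues.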
  • anti-idiotypic antibodies (anti-ids)
  • The binding site of the anti-ids would be expected to be an analog of the original protein.
  • The anti-ids could then be used to identify and isolate peptides from banks of chemically or biologically produced peptides. Selected peptides would then act as the pharmacore.
  • drugs that have, for example, improved CHDl polypeptide activity or stability or that act as inhibitors, agonists, antagonists, etc. of CHDl polypeptide activity.
  • By virtue of the availability of cloned CHDl sequences, sufficient amounts of the CHDl polypeptide may be made available to perform such analytical studies as x-ray crystallography. In addition, the knowledge of the CHDl protein sequence provided herein will guide those employing computer modeling techniques in place of, or in addition to, x-ray crystallography.
  • A method is also provided for supplying wild-type CHDl function to a cell carrying mutant CHDl alleles. Supplying such a function should suppress phenotypic expression of coronary heart disease or metabolic disorders in the recipient cell or in an organism bearing such a cell.
  • The wild-type CHDl gene or a part of the gene may be introduced into the cell in a vector such that the gene remains extrachromosomal, or integrates at a random location into the cellular DNA. In such situations, the gene will be expressed by the cell from the extrachromosomal or chromosomal location, respectively.
  • The gene fragment should encode a part of the CHDl protein that is required for suppression of the coronary heart disease or metabolic disorders phenotype. More preferred is the situation where the wild-type CHDl gene or a part thereof is introduced into the mutant cell in such a way that it recombines with the endogenous mutant CHDl gene present in the cell, especially if the mutant allele is a dominant allele (a cell bearing a dominant CHDl mutant allele is phenotypically mutant even in the presence of a wild-type allele). Such recombination requires a double recombination event that results in the correction of the CHDl gene mutation.
  • Vectors for introduction of genes both for recombination and for extrachromosomal maintenance are known in the art, and any suitable vector may be used.
  • Methods for introducing DNA into cells such as electroporation, calcium phosphate coprecipitation and viral transduction are known in the art, and the choice of method is within the competence of the person of ordinary skill in the art.
  • Cells transformed with the wild-type CHDl gene can be used as model systems to study metabolic disorders and drug treatments that alter cellular metabolism.
  • the CHDl gene or fragment may be employed in gene therapy methods in order to increase the amount of the expression products of such genes.
  • Such gene therapy is particularly appropriate for use in cells in which the level of CHDl polypeptide is absent or diminished compared to normal cells. It may also be useful to increase the level of expression of a given CHDl gene even in those cells in which the mutant gene is expressed at a "normal" level, but the gene product is not fully functional. It may also be useful to increase the levels of CHDl in normal cells, providing the cell or host with a more atheroprotective phenotype. Gene therapy would be carried out according to generally accepted methods, for example, as described by Friedman, 1991. Cells from a patient's tissue of interest (i.e.
  • A virus or plasmid vector (see further details below), containing a copy of the CHDl gene linked to expression control elements and possibly capable of replicating inside the cells, is prepared. Suitable vectors are known, such as disclosed in U.S. Patent 5,252,479 and PCT published application WO 93/07282. The vector is then injected into the patient into the appropriate target tissue or systemically, or used to infect cells in vitro, and the cells then used to repopulate or supplement the patient's tissues. If the transfected gene is not permanently incorporated into the genome of each of the targeted cells, the treatment may have to be repeated periodically.
  • Gene transfer systems known in the art may be useful in the practice of the gene therapy methods of the present invention. These include viral and nonviral transfer methods.
  • Viruses have been used as gene transfer vectors, including papovaviruses, e.g., SV40 (Madzak et al., 1992), adenovirus (Berkner, 1992; Berkner et al., 1988; Gorziglia and Kapikian, 1992; Quantin et al., 1992; Rosenfeld et al., 1992; Wilkinson et al., 1992; Stratford-Perricaudet et al., 1990), vaccinia virus (Moss, 1992), adeno-associated virus (Muzyczka, 1992;
  • herpes viruses including HSV and EBV (Margolskee, 1992; Johnson et al., 1992; Fink et al., 1992; Breakfield and Geller, 1987; Freese et al., 1990), and retroviruses of avian (Brandyopadhyay and Temin, 1984; Petropoulos et al., 1992), murine (Miller, 1992; Miller et al., 1985; Sorge et al., 1984; Mann and Baltimore, 1985; Miller et al., 1988), and human origin (Shimada et al., 1991; Helseth et al., 1990; Page et al., 1990; Buchschacher and Panganiban, 1992). Most human gene therapy protocols have been based on disabled murine retroviruses.
  • Nonviral gene transfer methods known in the art include chemical techniques such as calcium phosphate coprecipitation (Graham and van der Eb, 1973; Pellicer et al., 1980); mechanical techniques, for example microinjection (Anderson et al., 1980; Gordon et al., 1980; Brinster et al., 1981; Constantini and Lacy, 1981); membrane fusion-mediated transfer via liposomes (Felgner et al., 1987; Wang and Huang, 1989; Kaneda et al., 1989; Stewart et al.
  • Viral-mediated gene transfer can be combined with direct in vivo gene transfer using liposome delivery, allowing one to direct the viral vectors to the tumor cells and not into the surrounding nondividing cells.
  • The retroviral vector producer cell line can be injected into tissues (Culver et al., 1992). These producer cells would then provide a continuous source of vector particles. This technique has been approved for use in humans with inoperable brain tumors.
  • Plasmid DNA of any size is combined with a polylysine-conjugated antibody specific to the adenovirus hexon protein, and the resulting complex is bound to an adenovirus vector.
  • the trimolecular complex is then used to infect cells.
  • The adenovirus vector permits efficient binding, internalization, and degradation of the endosome before the coupled DNA is damaged.
  • Liposome/DNA complexes have been shown to be capable of mediating direct in vivo gene transfer. While in standard liposome preparations the gene transfer process is nonspecific, localized in vivo uptake and expression have been reported in tumor deposits, for example, following direct in situ administration (Nabel, 1992), and may apply to particular tissues. Gene transfer techniques that target DNA directly to liver or other target tissues are preferred. Receptor-mediated gene transfer, for example, is accomplished by the conjugation of DNA (usually in the form of covalently closed supercoiled plasmid) to a protein ligand via polylysine. Ligands are chosen on the basis of the presence of the corresponding ligand receptors on the cell surface of the target cell/tissue type.
  • Ligand-DNA conjugates can be injected directly into the blood if desired and are directed to the target tissue where receptor binding and internalization of the DNA-protein complex occurs.
  • coinfection with adenovirus can be included to disrupt endosome function.
  • Peptides that have CHDl activity can be supplied to cells with mutant or missing CHDl alleles.
  • Protein can be produced by expression of the cDNA sequence in bacteria, for example, using known expression vectors.
  • CHDl polypeptide can be extracted from CHDl-producing mammalian cells.
  • The techniques of synthetic chemistry can be employed to synthesize CHDl protein. Any of such techniques can provide the composition of the present invention comprising the CHDl protein.
  • The composition is substantially free of other human proteins. This is most readily accomplished by synthesis in a microorganism or in vitro.
  • Active CHDl molecules can be introduced into cells by microinjection or by use of liposomes, for example. Alternatively, some molecules may be taken up by cells, actively or by diffusion. Extracellular application of the CHDl gene product may be sufficient to affect phenotype. Supply of molecules with CHDl activity should lead to partial reversal of the altered metabolic state. Other molecules with CHDl activity (for example, peptides, drugs or organic compounds) may also be used to effect such a reversal. Modified polypeptides having substantially similar function are also used for peptide therapy.
  • RESULTS OF USE TRANSFORMED HOSTS
  • The cells are typically cultured epithelial cells. These may be isolated from individuals with CHDl mutations. Alternatively, the cell line can be engineered to carry the mutation in the CHDl allele, as described above. After a test substance is applied to the cells, the phenotype of the cell is determined. Any metabolic trait of mutant cell lines, such as lipid metabolism or glucose metabolism, can be assayed. Assays for each of these traits are known in the art. Animals for testing therapeutic agents can be selected after mutagenesis of whole animals or after treatment of germline cells or zygotes. The latter approach includes insertion of mutant CHDl alleles, usually from a second animal species, as well as insertion of disrupted homologous genes.
  • The endogenous CHDl gene(s) of the animals may be disrupted by insertion or deletion mutation or other genetic alterations using conventional techniques (Capecchi, 1989; Valancius and Smithies, 1991; Hasty et al., 1991; Shinkai et al., 1992; Mombaerts et al., 1992; Philpott et al., 1992; Snouwaert et al., 1992; Donehower et al., 1992).
  • Appropriate metabolic profiles must be assessed. If the test substance alters cellular or organismal metabolism in an appropriate way, then the test substance is a candidate therapeutic agent for the treatment of the metabolic disorders identified herein.
  • Extensive coronary heart disease prone kindreds were ascertained from index cases with early coronary heart disease (before the age of 50 for men or 55 for women). These probands were contacted, and those with familial history of coronary heart disease were recruited for lipid profiling. Large kindreds were expanded from those with extensive clustering of coronary heart disease or dyslipidemia. The large number of meioses present in these large kindreds provided the power to detect whether the CHDl locus was segregating, and increased the opportunity for informative recombinants to occur within the small region being investigated. This vastly improved the chances of establishing linkage to the CHDl region, and greatly facilitated the reduction of the CHDl region to a manageable size, which permits identification of the CHDl locus itself.
  • Each kindred was extended through all available connecting relatives and to all informative first degree relatives of each proband or affected relative. Medical records or death certificates were obtained for confirmation of coronary heart disease. Each key connecting individual and all informative individuals were invited to participate by providing a blood sample from which DNA was extracted, and for extensive lipid profiling, and medical histories were gathered. We also sampled spouses, siblings, and offspring of deceased cases so that the genotype of the deceased cases could be inferred from the genotypes of their relatives. The criteria for selection of kindreds to analyze for CHDl linkage were: 1) genotypes available, or inferable, for 6 or more coronary heart disease or dyslipidemic cases, and 2) at least genotyped cases within a second degree of relationship to another genotyped case.
  • The liability classes were: Class 1, strongly affected, HDL-C less than 10% of population (corrected for age and sex); Class 2, weakly affected, HDL-C greater than 10% but less than 25%; Class 3, HDL-C greater than 25%.
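The three liability classes amount to a threshold rule on the age- and sex-corrected HDL-C percentile. The sketch below is illustrative only: the function name is hypothetical, and because the text does not say how values of exactly 10% or 25% are treated, the code assumes they fall into the higher-numbered class.

```python
def liability_class(hdl_percentile):
    """Assign a liability class from an age- and sex-corrected HDL-C percentile.

    Class 1: below the 10th percentile (strongly affected)
    Class 2: from the 10th up to, but not including, the 25th (weakly affected)
    Class 3: 25th percentile and above
    Boundary handling at exactly 10 and 25 is an assumption.
    """
    if not 0 <= hdl_percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    if hdl_percentile < 10:
        return 1
    if hdl_percentile < 25:
        return 2
    return 3


print([liability_class(p) for p in (4, 18, 60)])
```

Corrections for confounders such as body mass index or medication would be applied before the percentile is passed to this rule.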
  • Each individual was scrutinized for the presence of factors that might confound HDL-C levels, and the phenotype corrected accordingly. Confounding factors included body mass index, alcohol consumption and prescription drug use.
  • EXAMPLE 2 Selection of Kindreds That Are Linked to
  • Nuclear pellets were extracted from 16 ml of ACD blood, and DNA extracted with phenol and chloroform, precipitated with ethanol, and resuspended in Tris-EDTA.
  • The markers used for genotyping were short tandem repeat (STR) loci at 11q23 which flanked the most likely CHDl location as indicated from preliminary genomic search data.
  • The region showing preliminary linkage was from D11S924 to D11S912, an interval of about 20 cM (centiMorgans).
  • This model assumed a rare autosomal dominant susceptibility locus (gene frequency of 0.01) and allowed for a sporadic rate of coronary heart disease. Marker allele frequencies were estimated from unrelated individuals present in the kindreds. A total of 42 kindreds were analyzed with at least nine markers from the region (see Table 3 for a list of markers) . Most of these were haplotyped by hand, and segregating haplotypes were assigned as rare alleles at a single locus. The genotypes and haplotypes were analyzed for linkage using a number of different dyslipidemic phenotypes.
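To illustrate the kind of computation that underlies a linkage analysis, the following minimal sketch evaluates a two-point LOD score for phase-known meioses. It is a textbook simplification and not the model described above, which additionally accounts for liability classes, a sporadic rate and estimated marker allele frequencies; the function name and its arguments are assumptions for illustration.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10[ theta^k * (1 - theta)^(n - k) / 0.5^n ]
    where k is the number of recombinants among n meioses.
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    nonrecombinants = meioses - recombinants
    if theta == 0.0 and recombinants > 0:
        # Any observed recombinant is impossible at theta = 0.
        return float("-inf")
    log_likelihood = 0.0
    if recombinants:
        log_likelihood += recombinants * math.log10(theta)
    if nonrecombinants:
        log_likelihood += nonrecombinants * math.log10(1.0 - theta)
    return log_likelihood - meioses * math.log10(0.5)


# Hypothetical data: 1 recombinant in 20 meioses, evaluated at several theta values.
for theta in (0.01, 0.05, 0.10, 0.20):
    print(theta, round(lod_score(1, 20, theta), 2))
```

By convention a LOD score of 3 or more is taken as evidence for linkage, and the theta that maximizes the score estimates the recombination fraction.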
  • Genomic clone contig assembly in the CHDl region started from the 1993-94 Genethon human genetic linkage map (Gyapay et al., 1994). YACs located in the interval between D11S1353 and D11S933 were ordered from Genome Systems. Primer pairs for markers located
  • Primer pairs defining BAC or PAC end markers were designed from these sequences. These new markers were checked against the YACs and a human chromosome 11-containing rodent cell line to ensure that they mapped to the correct chromosome and within the correct interval. These new markers were checked against the already identified BACs/PACs to determine the positions of these clones relative to each other. The outside markers from each clone contig were used to screen the Myriad BAC library; those that were negative on that BAC library were used to screen the BAC and PAC libraries at Genome Systems. Repeated cycles of library screening and marker development allowed for the construction of a BAC/PAC contig that spanned the minimal recombinant interval (Figure 2).
  • A 15-clone BAC/PAC contig spans the interval between D11S1353 and D11S933 (markers 1 and 14, see Table 3).
  • The CHDl locus must lie in the interval between the markers 1 and 54 (Figure 1 and Table 3).
  • This interval is spanned by a 15 clone BAC/PAC contig.
  • Based on the sizes of these BACs and PACs in the contig and extensive sequencing of those BACs and PACs, we estimate the size of the minimal genetically defined interval containing CHDl to be about 1 megabase.
  • BAC or PAC DNA was sheared by sonication.
  • the sonicated DNA was incubated with mung-bean nuclease (Pharmacia Biotech) followed by treatment with a Pfu polishing kit
  • the DNA fragments were fractionated by size on a 0.8% TAE agarose gel, and fragments in the size range of 1.0 - 1.6 kb were excised under longwave (365 nm) ultraviolet light.
  • the excised gel slice was rotated 180 degrees relative to the original direction of electrophoresis and then placed into a new gel tray containing 1.0% GTG-Seaplaque low melting temperature agarose (FMC corporation) before the gel solidified. Electrophoresis was repeated for the same time and voltage as the first run, resulting in a concentration of the DNA fragments in a small volume of agarose, and the gel slice containing the DNA fragments was once again excised from the gel.
  • DNA fragments were purified from the agarose by incubating the gel slice with beta-agarase (New England Biolabs), followed by removal of the agarose monomers using disposable microconcentrators (Amicon) that employ a 50,000 Daltons molecular weight cutoff filter.
  • DNA fragments were ligated into the Hinc II site of the plasmid pMYG2, a pBluescript (Stratagene) derivative where the polylinker has been replaced by the pMYG2 polylinker (Table 4).
  • The vector was prepared by digestion with Hinc II followed by dephosphorylation with calf alkaline phosphatase (Boehringer Mannheim).
  • Ligated products were transformed into DH5α E. coli competent cells (Life Technologies, Inc.) and plated on LB plates containing ampicillin, IPTG, and Bluo-gal (Sigma; Life Technologies, Inc.). White colonies were used to inoculate individual wells of 1 ml 96-well microtiter plates (Beckman) containing 200 microliters of LB media supplemented with ampicillin at 50 micrograms per milliliter. The plates were incubated for 16-20 hours in a shaking incubator at 37° Celsius. After incubation, 20 microliters of dimethyl sulfoxide were added to each well and the plates were stored frozen. The inserts of random-sheared clones were amplified from E. coli cultures by PCR with vector primers, and the PCR products were sequenced with M13 forward or reverse fluorescent energy transfer (FET) dye-labeled primers on ABI 377 sequencers.
  • DNA sequencing gel files were examined for lane tracking accuracy and adjusted where necessary before data extraction.
  • ABI sample files resulting from gel files were converted to the Standard Chromatogram Format (SCF) and trimmed of sequencing vector (pMYG1 or pMYG2). Trimmed sequences were assembled using Acembly (Gold-Mieg et al., 1995; Durbin and Thierry-Mieg, 1991). Contiguous sequence resulting from automatic assembly was screened for residual vector sequence (both sequencing vector and cloning vector) as well as for bacterial contamination using BLAST (Altschul et al., 1990).
  • Remaining sequences were arranged according to sequence similarity to overlapping genomic clones. Repetitive sequence was masked from the sequence contigs using xblast (Claverie and States, 1993). These masked sequences were placed in a Genetic Data Environment (GDE) (Smith et al., 1994) local database for subsequent similarity searches. Similarities among genomic DNA sequences and hybrid-selected cDNA clones as well as GenBank entries, both DNA and protein, were identified using BLAST. The DNA sequences were also characterized with respect to short period repeats, CpG content, and long open reading frames.
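Two of the sequence characterizations mentioned above, CpG content and long open reading frames, can be sketched in a few lines. The code below is a simplified illustration and not the software used in the analysis: the function names are hypothetical, the CpG measure is the common observed/expected ratio, and the ORF search covers only the three forward reading frames.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def cpg_observed_expected(seq):
    """Observed/expected CpG ratio: (count of CG * length) / (count of C * count of G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def longest_forward_orf(seq):
    """Longest ATG-to-stop open reading frame in the three forward frames."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None
    return best


# Toy sequence, not genomic CHDl sequence.
print(cpg_observed_expected("CGCGATAT"), longest_forward_orf("CCATGAAATAGCC"))
```

A full analysis would also scan the reverse complement and apply a minimum ORF length.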
  • EXAMPLE 5 Gene Identification cDNA Preparation.
  • First-strand cDNA molecules were synthesized from poly(A)-enriched RNA from different human tissues (listed in Table 5) using the tailed random primer FSnnN10 (all primers are listed in Table 6) or the tailed oligo-deoxythymidine primer FSnnT12 and Superscript II reverse transcriptase (Gibco BRL) in reverse transcription reactions.
  • 'nn' refers to a dinucleotide specific for the tissue source (Table 5)
  • 'N' refers to any of the four nucleotides.
  • Hybrid selection was performed by a modified procedure of Lovett et al. (1991). Selection probes were prepared from purified BACs or PACs by digestion with Hinf I and Exonuclease III. The single-stranded probe was photolabelled with photobiotin (Gibco BRL) according to the manufacturer's recommendations. Probe, cDNA and Cot-1 DNA and poly A RNA were hybridized overnight at 40°C in 2.4 M TEA-Cl, 10 mM NaPO4, 1 mM EDTA.
  • Hybridized cDNAs were captured on streptavidin-paramagnetic particles (Dynal), eluted, re-amplified with UCP.A and FS, and large cDNA molecules (>400 bp) were fractionated by gel electrophoresis and purified from the gel. The selected, amplified cDNA molecules were hybridized with an additional aliquot of probe and Cot-1 DNA. Captured and eluted products were amplified again with UCP.A and FS, size-selected by gel electrophoresis, and cloned into dephosphorylated pUC18 digested with Hinc II. Ligation products were transformed into XL2-Blue ultra-competent cells (Stratagene).
  • Insert-containing clones were identified by blue/white selection on Xgal or Bluo-gal plates. Inserts were amplified by colony PCR with vector primers. The colony PCR products were arrayed on a dot-blot apparatus, and filters hybridized separately with Cot-1 DNA or probe DNA prepared by the "nick translation" method. Probe positive, Cot-1 negative clones were then sequenced on ABI 377 sequencers. Alignment of these cDNA sequences to corresponding genomic sequences, and parsing of the cDNA sequences across those genomic sequences revealed exons, allowing for the initial characterization of genes located within the region.
  • Inter-exon (island hopping) PCR. Following sequence analysis of the first hybrid selected clones that originated from CHDl, several primers were designed to try to amplify CHDl products from various tissue cDNAs. Internal exons and the 3' terminal exons (identified as sequences in the dbEST database that were homologous to genomic sequences adjacent to known exons of CHDl) were identified and confirmed by amplification using the primer pairs described in Table 6 (e.g., IH.F1 and IH.R1 formed one such primer pair). Amplified products were fractionated by gel electrophoresis and purified and either directly sequenced using dye terminator chemistry or cloned and sequenced as described above for hybrid selection.
  • 5' RACE The 5' end exon of CHDl was identified by a modified RACE protocol.
  • Amplified cDNA molecules from liver and brain were further amplified through two rounds of nested PCR, using the primer pairs GS1 or NR1 and UCP.A followed by GS2 or any of NR2-NR6 and UCP.B (Table 6) .
  • the gene-specific primer was at about 5-fold excess over the anchor primer (UCP.A) to increase the proportion of specifically primed products.
  • Amplified products were subjected to gel electrophoresis, purified, cloned and sequenced as described above for hybrid selection.
  • The primers NR3 through NR6 allow for 5' RACE specific to each of cDNAs 1-4.
  • Genomic DNA. Using genomic DNAs from CHD kindred members, population control individuals and diabetes affecteds, nested PCR amplifications were performed to generate PCR products of the candidate genes that were screened for CHDl mutations.
  • The primers listed in Table 7 were used to produce amplicons of the CHDl gene.
  • PCR conditions were an initial denaturation step at 95°C for 1 minute (TaqPlus) or 10 minutes (AmpliTaq Gold), followed by cycles of denaturation at 96°C (12 seconds), annealing at 55°C (15 seconds) and extension at 72°C (45-60 seconds).
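As a rough check on how an annealing temperature relates to primer composition, the following sketch applies the Wallace rule, Tm = 2(A+T) + 4(G+C) in °C, to one of the Table 7 primers (MS7.6P1). The Wallace rule is only a coarse estimate intended for short oligonucleotides, and nothing in the text says it was used to choose the 55°C annealing step; the function name is hypothetical.

```python
def wallace_tm(primer):
    """Estimate melting temperature in °C with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = primer.replace(" ", "").upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


# MS7.6P1 from Table 7 (20 nucleotides).
print(wallace_tm("CTC CTT TGT CCG CCT CTC TG"))
```

An estimate of 64°C for this primer is consistent with annealing several degrees lower, as is common practice.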
  • PCR products were sequenced with M13 forward or reverse fluorescent energy transfer (FET) dye-labeled primers on ABI 377 sequencers.
  • Chromatograms were analyzed for the presence of polymorphisms or sequence aberrations in either the Macintosh program Sequencher (Gene Codes) or the Java program Mutscreen (Myriad, proprietary).
  • Table 7. Oligonucleotides for Mutation Screening from Genomic DNA (SEQ ID NOS: 38-175, respectively). Columns: genomic name, alias, sequence, position.
  • primary amplicon for exons A and B
  • MS7.6P1 ms7.6p1 CTC CTT TGT CCG CCT CTC TG 10506
  • MS7.6A2 ms7.6a2 CAC ATC TCC GTC ATG GTT GGT G 9601
  • MS7.6P2 ms7.6p2 GCT CCT TTG TCC GCC TCT CTG 10507
  • secondary amplicon for intron GF
  • MS7.6R2 ms7.6q2 AGG AAA CAG CTA TGA CCA TGA CAA
  • MS7.17F1 ms7.5c1 GTT TTC CCA GTC ACG ACG CTA GAG CTG CTT GTG CTG G 10513
  • MS7.17R1 ms7.5r1 AGG AAA CAG CTA TGA CCA T CAT GGG GCT CAT GGT ATA TG 10800
  • MS7.17F2 ms7.5c2 GTT TTC CCA GTC ACG ACG GTG CTG GAA CAA TTT CTT AC 10525
  • MS7.17R2 ms7.5r2 AGG AAA CAG CTA TGA CCA T CAT GGT ATA TGA GCA ACC C 10791
  • primary amplicon for exon G
  • MS7.7Q2 ms7.7q2 AGG AAA CAG CTA TGA CCA TGCT CTA ACT TCC TAA GAT CCC 9806
  • primary amplicon for 3' UTR
  • GAG GAC CCA CTC AG 15105
  • MS7.10R2 ms7.4q2 AGG AAA CAG CTA TGA CCA TGT AGA
  • MS7.8R1 AGG AAA CAG CTA TGA CCA TAA GGA GGA GCT GAA GGT TAT C 11170
  • MS7.8R2 AGG AAA CAG CTA TGA CCA TGG GAT GCG CAG GCC TGC ACT G 11132
  • MS7.8D1 GTT TTC CCA GTC ACG ACG CAG GCT GGG GGT GGT GAG AGA 11090
  • MS7.8S1 AGG AAA CAG CTA TGA CCA TCC GCT CCT AAA TGC ACC GTC T 11382
  • MS7.9P1 CAA AAT CCT GGG AAT GAC ACG 9269
  • MS7.9A2 GTG CCT GTT ACG TGC CAG TGC 8122
  • MS7.9P2 CAC CAG CTA TTA TCT TTC TAA 9358
  • secondary amplicon for promoter 1
  • MS7.9B1 GTT TTC CCA GTC ACG ACG GAT AGT
  • MS7.9R1 AGG AAA CAG CTA TGA CCA TCT TGG
  • MS7.9R2 AGG AAA CAG CTA TGA CCA TCG AGC
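Many of the secondary-amplicon oligonucleotides in Table 7 begin with one of two shared prefixes, which appear to be the M13 forward and reverse sequencing tails used in the sequencing step. The sketch below separates such a tail from the gene-specific part of a primer. The two tail constants are simply the common prefixes observed in the table, and the function name is hypothetical.

```python
# Shared 5' prefixes observed in Table 7 (assumed here to be the M13 tails).
M13_FORWARD_TAIL = "GTTTTCCCAGTCACGACG"
M13_REVERSE_TAIL = "AGGAAACAGCTATGACCAT"

def split_tail(oligo):
    """Return (tail_label, gene_specific_sequence) for one table entry.

    Whitespace in the triplet-grouped table sequence is ignored.
    tail_label is "forward", "reverse", or None for untailed primers.
    """
    seq = "".join(oligo.split()).upper()
    for label, tail in (("forward", M13_FORWARD_TAIL), ("reverse", M13_REVERSE_TAIL)):
        if seq.startswith(tail):
            return label, seq[len(tail):]
    return None, seq
```

Applied to MS7.8D1, the function reports a forward tail and the gene-specific remainder `CAGGCTGGGGGTGGTGAGAGA`; the untailed primary-amplicon primers come back unchanged.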
  • cDNA Total RNAs prepared from CHD kindred lymphocytes were treated with DNase I (Boehringer Mannheim) to remove contaminating genomic DNA, and then converted to heteroduplex cDNA with a mix of N10 random primers and a tailed oligo dT primer, and Superscript II reverse transcriptase (Life Technologies) in a reverse transcription reaction. These cDNA molecules were used as the template for nested PCR amplifications to generate the cDNA PCR products of the candidate genes that were screened for CHD1 mutations.
  • PCR products were fractionated by gel electrophoresis and purified and then sequenced with M13 forward or reverse fluorescent energy transfer (FET) dye-labeled primers on ABI 377 sequencers. The sequences of these products were analyzed in GDE to determine their exon structure. Chromatograms were analyzed for the presence of polymorphisms or sequence aberrations in either the Macintosh program Sequencher (Gene Codes) or the Java program Mutscreen (Myriad, proprietary).
  • EXAMPLE 7 CHD1 Gene Structure The CHD1 gene sequence has been determined.
  • SEQ ID NO: 9 is the sequence for CHD1 including exons and flanking genomic sequence.
  • The DNA sequence of SEQ ID NO: 206 differs from SEQ ID NO: 9 by a single G/C base-pair deletion at position 533 of SEQ ID NO: 9.
  • The DNA sequence of SEQ ID NO: 206 may be produced by deleting one G/C base-pair at position 533 by in vitro mutagenesis procedures or recombinant DNA protocols well known in the art (see, e.g., Ausubel et al., 1992) from the DNA sequence produced by the methods described above.
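At the sequence level, the relationship between SEQ ID NO: 9 and SEQ ID NO: 206 is a deletion of one base at a 1-based position. A minimal sketch follows; the function name is hypothetical and the demonstration string is a toy placeholder, not the actual CHD1 sequence.

```python
def delete_base(seq, position):
    """Return `seq` with the base at 1-based `position` removed."""
    if not 1 <= position <= len(seq):
        raise ValueError("position outside the sequence")
    # Python slices are 0-based, so position N corresponds to index N - 1.
    return seq[:position - 1] + seq[position:]

# Toy placeholder with a G at position 533 (NOT the real SEQ ID NO: 9).
toy_reference = "A" * 532 + "G" + "T" * 10
toy_variant = delete_base(toy_reference, 533)
```

The variant is exactly one base shorter than the reference, and every position downstream of 533 shifts by one.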
  • SEQ ID NO: 209 is the genomic sequence of exon J and the promoter region of CHD1.
  • The last nucleotide of SEQ ID NO: 209 is one nucleotide before the first nucleotide of SEQ ID NO: 9 or SEQ ID NO: 206.
  • SEQ ID NO: 210 is the genomic sequence of CHD1 comprising SEQ ID NO: 209 and SEQ ID NO: 9. Positions 1 to 2,933 of SEQ ID NO: 210 are SEQ ID NO: 209 and positions 2,934 to 23,071 are SEQ ID NO: 9.
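Because SEQ ID NO: 210 is the concatenation of SEQ ID NO: 209 and SEQ ID NO: 9, a position in one listing can be converted to the other by a fixed offset. The sketch below uses only the boundaries stated above; the function names are illustrative.

```python
SEQ209_LENGTH = 2933    # positions 1..2,933 of SEQ ID NO: 210
SEQ210_LENGTH = 23071   # positions 2,934..23,071 are SEQ ID NO: 9

def locate(pos210):
    """Map a 1-based position in SEQ ID NO: 210 to (component, 1-based position)."""
    if not 1 <= pos210 <= SEQ210_LENGTH:
        raise ValueError("position outside SEQ ID NO: 210")
    if pos210 <= SEQ209_LENGTH:
        return ("SEQ ID NO: 209", pos210)
    return ("SEQ ID NO: 9", pos210 - SEQ209_LENGTH)

def seq9_to_seq210(pos9):
    """Map a 1-based position in SEQ ID NO: 9 to its position in SEQ ID NO: 210."""
    if not 1 <= pos9 <= SEQ210_LENGTH - SEQ209_LENGTH:
        raise ValueError("position outside SEQ ID NO: 9")
    return pos9 + SEQ209_LENGTH
```

Under these boundaries SEQ ID NO: 9 is 20,138 nucleotides long, and position 533 of SEQ ID NO: 9 corresponds to position 3,466 of SEQ ID NO: 210.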
  • SEQ ID NO: 1 is the sequence of an alternative CHD1 transcript (cDNA1).
  • SEQ ID NO: 3 is the sequence of an alternative CHD1 transcript (cDNA2).
  • SEQ ID NO: 5 is the sequence of an alternative CHD1 transcript (cDNA3).
  • SEQ ID NO: 7 is the sequence of an alternative CHD1 transcript (cDNA4).
  • SEQ ID NO: 10 is the sequence of alternative 5' exon J.
  • SEQ ID NO: 11 is the sequence of alternative 5' exon I (-21).
  • SEQ ID NO: 12 is the sequence of alternative 5' exon I (+21).
  • SEQ ID NO: 13 is the sequence of alternative 5' exon H.
  • SEQ ID NO: 14 is the sequence of alternative 5' exon G.
  • The alternative transcripts represent alternative splice donors and acceptors at the ends of intron FA.
  • SEQ ID NO: 2 is the protein encoded by cDNA1; SEQ ID NO: 4 by cDNA2; SEQ ID NO: 6 by cDNA3; and SEQ ID NO: 8 by cDNA4. These sequences are shown in Table 9. The genomic nucleotide positions of the alternative splices are described in Table 10. Alternative transcripts encoding all four alternative proteins have been detected in cDNAs from various sources. Different tissues contain the alternative transcripts at different relative abundances.
  • CHD1 cDNA2 SEQ ID NO: 3 GGCCCTTGGAAGAAAATCCTCGCTGTGTCCAGGCTGAGGCGGGGGGCTAATGACA GTGTGAGCTCTAGATGGTGTGAGACCACCCCAAAGCCAAGAAATGGCTACAGCCG TGGAACCAGAGGACCAGGATCTTTGGGAAGAAGAGGGAATTCTGATGGTGAAACT
  • CHD1 protein encoded by cDNA3 SEQ ID NO: 6 MATAVEPEDQDLWEEEGILMVKLEDDFTCRPESVLQRDDPVLETSHQNFRRFRYQ EAASPREALIRLRELCHQWLRPERRTKEQILELLVLEQFLTVLPGELQSWVRGQR PESGEEAVTLVEGLQKQPRRPRR
  • AAAGACTTCCAGAAGCTGTAAAGACTTCCAGAAGCAAGAAGATTCAACCATCT AAAACGCCATGCAGGAAAATAGCCAAACCTTCTCCATTTAAGTAGAGAATAAATC TTAGTAGCGTTCTCTGCAGAATATAACAACGCTGCAAAAAGGCCATTTCACAGGA ATATAATCAAAACTGCAGATTCTCAGGGTTTCCCGTAAGACGACTTCTCTGCTCTCT TCTGTTTGTGGTTTCTTTTAGTTGTACATCTCCTAGACAAGTCCAAGGAAC
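The relationship between a cDNA listing and its encoded protein can be checked by translating from the start codon with the standard genetic code. The sketch below translates a fragment copied from the cDNA2 excerpt above (whitespace and the stray line number removed). The start position is located by searching for `ATGGCT`, which is an assumption made for illustration, since earlier ATG triplets occur in the 5' region of the excerpt.

```python
# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(cdna, start):
    """Translate from 0-based index `start` until a stop codon or the end."""
    seq = "".join(cdna.split()).upper()
    protein = []
    for i in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

# Fragment of the cDNA2 excerpt surrounding the presumed start codon.
FRAGMENT = (
    "GCCAAGAAATGGCTACAGCCG"
    "TGGAACCAGAGGACCAGGATCTTTGGGAAGAAGAGGGAATTCTGATGGTGAAACT"
)
peptide = translate(FRAGMENT, FRAGMENT.find("ATGGCT"))
```

The translated fragment agrees with the first residues of the protein listing shown above for cDNA3, which is consistent with the alternative proteins sharing their N-terminal region.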
  • The CHD1 proteins have three conserved homology domains (Figure 3).
  • The most informative of these is a set of eight C2H2 zinc-finger motifs in the proteins encoded by cDNAs 1 and 2 (all in exon E, at amino acid residues 399-419, 427-447, 483-503, 511-531, 539-559, 567-587, 595-615 and 623-643 in the protein encoded by cDNA1).
  • Zinc-finger motifs often serve as nucleic acid binding motifs, and can also serve as protein interaction motifs.
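The residue ranges given for the eight zinc-finger motifs can be checked with simple arithmetic. The sketch below computes the length of each motif and the number of residues separating consecutive motifs; the variable names are illustrative.

```python
# Residue ranges (inclusive) of the eight C2H2 motifs in the cDNA1-encoded protein.
ZINC_FINGERS = [
    (399, 419), (427, 447), (483, 503), (511, 531),
    (539, 559), (567, 587), (595, 615), (623, 643),
]

# Length of each motif, counting both end residues.
lengths = [end - start + 1 for start, end in ZINC_FINGERS]

# Residues lying strictly between one motif and the next.
linkers = [
    nxt_start - prev_end - 1
    for (_, prev_end), (nxt_start, _) in zip(ZINC_FINGERS, ZINC_FINGERS[1:])
]
```

Each motif spans 21 residues. All linkers are 7 residues except the one between the second and third motifs, which is 35 residues, so the fingers fall into a group of two followed by a group of six.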
  • A leucine-rich SCAN domain is found near the N-terminus of all of the alternative proteins (amino acids 49-125).
  • Figure 8 displays a comparison between the CHD1 SCAN domain and a consensus SCAN domain sequence derived from homology analysis of SCAN domain containing zinc finger proteins in the GenBank database.
  • Yeast two-hybrid experiments as well as in vitro interaction studies indicate that the SCAN domain acts as a protein-protein interaction surface leading to homo- and/or heterodimerization of two SCAN containing peptides.
  • The functional form of CHD1 may therefore include a homo- and/or heterodimer of different CHD1 isoforms or CHD1 and other SCAN domain containing zinc finger proteins.
  • Precedents for transcription factors acting as dimers include members of the bZIP family, bHLH proteins and nuclear receptors (Kouzarides and Ziff, 1988, Fairman et al., 1993, Fawell et al., 1990).
  • A third domain, the KRAB domain (amino acids 235-276 in the protein encoded by cDNA1), is found in many zinc-finger containing transcription factors. It is often a site for protein-protein interaction that mediates transcriptional repression (Kim et al., 1996, Moosmann et al., 1996). These motifs together suggest that CHD1 serves as a sequence-specific DNA binding transcription factor. The presence of a KRAB domain raises the possibility that at least one function of CHD1 is that of a repressor, being able to reduce the transcriptional activity of genes it regulates.
  • Two of the alternative cDNAs encode small proteins largely identical to the N-terminus of the longer protein products (-1 and -2, respectively).
  • Tagged fusion proteins have identified the subcellular localization of some of these proteins.
  • The protein encoded by cDNA1 is largely localized to the nucleus, whereas the protein encoded by cDNA3 is found to be diffuse throughout the cell. These localizations were monitored by fusing the relevant CHD1 open reading frame to green fluorescent protein under the control of the cytomegalovirus promoter, transfecting these constructs into 293 cells and monitoring expression with fluorescent microscopy.
  • The presence of multiple protein products raises the interesting possibility that their relative proportion may influence function.
  • The N-terminus may interact with another protein, call it "protein X", and target protein X to the transcriptional control region of relevant genes.
  • The presence of a fragment of the CHD1 protein that also binds protein X but lacks a DNA binding motif could regulate the effective concentration of protein X, and the function of the protein complex bound to the regulatory region.
  • Such alternative transcripts retaining only partial function have been described for transcription factors and found to serve as competitive regulators (Chen et al., 1994, Arshura et al., 1995, and Walker et al., 1996).
  • CHD1 Fusion Proteins Three coding sequence fragments corresponding to predicted zinc fingers 1 through 2, 1 through 8 and 3 through 8 of CHD1 were amplified from random-primed liver cDNA using PCR with Pfu enzyme (Stratagene).
  • The primer sequences are shown in Table 11.
  • The PCR primers incorporated restriction sites in the same translational reading frame as the same sites in the polylinker of pGEX-4T-3 (Pharmacia), a GST fusion protein expression vector.
  • The PCR fragments were cloned into this vector using these restriction sites.
  • The ligation reactions were transformed into DH5 cells. Protein expression from these clones was confirmed by SDS-PAGE.
  • The pGEX-4T-3 clones were transferred to BL21 cells for large scale production of proteins.
  • Proteins for use in the in vitro selection and gel shift experiments were synthesized according to the manufacturer's instructions (Pharmacia).
  • The fusion proteins were retained on the sepharose matrix.
  • Proteins for gel shift experiments were eluted from the glutathione-sepharose and dialyzed to remove residual glutathione. Protein concentration was estimated from SDS-polyacrylamide gels.
  • Probes were prepared by PCR amplification of genomic DNA using Pfu and Taq plus long enzymes (Stratagene), or by direct synthesis of plus and minus strands (ABI model 3948). Single stranded oligonucleotides were annealed to generate duplex DNA and the unannealed oligonucleotides were removed (Qiagen Gel Purification Kit). DNA fragments were end-labeled with 32P; unincorporated label and PCR primers were removed (Qiagen PCR Purification Kit); and the concentration of probe was determined by direct counting.
  • Protein-DNA binding reactions and gel electrophoresis were similar to those described for other zinc finger proteins (see, for example, Pedone et al., 1996, Morris et al., 1994, Cook et al., 1996).
  • A GST fusion protein containing the last six zinc fingers of CHD1 (CHD1.ZnF3-8) was expressed in bacteria, purified and used to define a consensus binding site by selection of specific sequences from random oligonucleotides, essentially as described in Morris et al., 1994.
  • A consensus binding motif (GGGGT) resulted. This motif was found in multiple copies in the regulatory regions upstream of the start of transcription in several genes known to be involved in lipid metabolism. Several promoter fragments containing these sequences were amplified from genomic DNA and gel shift assays were performed.
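Searching a regulatory region for copies of the consensus motif can be sketched as a scan of both strands. The code below is a simple exact-match illustration, not the procedure used in the source; the function names and the toy input are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_sites(seq, motif="GGGGT"):
    """Return sorted (1-based start, strand) pairs for exact motif matches.

    Minus-strand hits are reported at the position where the reverse
    complement of the motif begins on the plus strand.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start + 1, strand))
            start = seq.find(pattern, start + 1)  # step by one to allow overlaps
    return sorted(hits)
```

On the toy input `TTGGGGTACACCCCAT` the scan reports one plus-strand site at position 3 and one minus-strand site at position 10. A real analysis would typically also tolerate mismatches or use a position weight matrix.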
  • The promoter fragments that CHD1 binds to can be grouped into several classes (Table 12).
  • Class 1 includes the HDL structural proteins ApoAIV and ApoE, as well as the ApoCIII enhancer, which regulates the liver specific expression of the ApoAI, CIII, AIV genes (reviewed in Kardassis et al., 1996).
  • The ApoAI, ApoCIII and ApoAIV loci have been genetically associated with several dyslipidemias and atherosclerosis.
  • ApoE is a component of many circulating lipoproteins, and mediates interactions of these proteins with the LDL receptor.
  • Common polymorphisms of ApoE alter its affinity for the LDL receptor, and can cause dyslipidemic phenotypes and predisposition to atherosclerosis (Xu et al., 1991, Davignon et al., 1988).
  • The second class of promoters that bind to CHD1 includes several enzymes known to influence lipoprotein composition.
  • Class 2 includes the lipoprotein lipase gene (LPL), the lecithin:cholesterol acyltransferase gene (LCAT), the phospholipid transport protein gene (PLTP) and the hepatic triglyceride lipase gene (HTGL).
  • LPL and LCAT deficiencies are associated with atherosclerosis and HDL-C levels (Cohen et al., 1994, Kuivenhoven et al., 1997).
  • PLTP and HTGL can alter the composition of HDL particles in vitro (e.g.
  • The third class of promoters that bind to CHD1 protein includes several other genes implicated directly in the etiology of atherosclerosis, obesity and diabetes.
  • Vascular endothelial growth factor (VEGF) is involved in atherosclerosis and angiogenesis, and modulation of its activity is the focus of several atherosclerosis intervention studies and drug discovery programs (Waltenberger 1997, Sueishi et al., 1997, Ferrara and Davis-Smyth, 1997).
  • IA-1 is an insulinoma associated zinc finger gene, expression of which is regulated in a similar way to several genes involved in maturity onset diabetes of the young (MODY) (see background section for review of MODY).
  • A common polymorphism of the beta-3 adrenergic receptor (β3AR) gene is associated with obesity (Silver et al., 1997), insulin resistance and weight control in NIDDM patients (Sakane et al., 1997), and with visceral obesity and decreased serum triglycerides (e.g. Kim-Motoyama et al., 1997).
  • Insulin resistant syndrome X may be partly explained by a common variant of β3AR (reviewed in Groop, 1997).
  • This gene is the target of a number of drug discovery programs for the treatment of obesity and diabetes (reviewed in Strosberg and Pietri-Rouxel, 1996).
  • CHD1 has been found to bind to a promoter fragment of the HNF4 gene (hepatic nuclear factor 4). Transfection assays indicate that CHD1 represses transcription from this promoter, suggesting that CHD1 may regulate HNF4 expression in vivo. Pathological consequences of CHD1 dysfunction are likely to include deregulation of HNF4 expression that may be counteracted by agonists/antagonists of HNF4.
  • HNF4 is a member of the nuclear receptor superfamily, a class of ligand-activated transcription factors. HNF4 functions as a major regulator of liver-specific gene expression, and is involved in the expression of apolipoproteins AI, AII, AIV, B and CIII (Kardassis et al., 1996). Mutations in HNF4 have been identified in MODY1 (maturity-onset diabetes of the young) cases (Yamagata et al., 1996, Furuta et al., 1997), linking HNF4 to diabetes. As a ligand-activated nuclear receptor, HNF4 presents an excellent target for drug development.
  • CHD1 is a sequence specific DNA binding protein. It binds to fragments of the regulatory regions of a subset of apolipoprotein genes, a set of genes known to be intimately involved in the regulation of plasma lipoprotein metabolism, and a set of genes that have links to atherosclerosis, obesity, NIDDM and insulin resistant syndrome X. CHD1 has also been shown to bind to the regulatory region of HNF4, whose gene product is involved in regulating the expression of several apolipoprotein genes. The binding of CHD1 to these regulatory regions makes it very probable that CHD1 is involved in their regulation, and in the pathophysiology of these disorders.
  • The DNA samples that were screened for CHD1 mutations were extracted from blood of patients with CHD or other metabolic disorders who were participating in research studies on the genetics of coronary heart disease. All subjects signed appropriate informed consent. All exons of CHD1 and intron sequences within about 20-30 bases of the exons were screened for mutations in a set of 75 affected individuals from 43 kindreds, using the mutation screening protocol and primers described in Example 6. These represent individuals segregating haplotypes in the region of CHD1, and 9 spouses from the most likely linked families. In addition, a set of samples from diabetics was also screened for mutations. The number of samples screened for each exon is shown in Tables 14 and 15.
  • The other polymorphisms are in the non-coding regions 5' or 3' to the open reading frame, in intron GF in positions unlikely to directly alter splicing patterns, or in third positions of codons that do not alter the sequence of the encoded protein.
  • The frequency of selected common polymorphisms was compared between the 75 mutation screening samples (CHD in Table 14) and 120 CEPH control individuals (CEPH in Table 14).
  • The CEPH controls are grandparents of the Utah CEPH kindreds (obtained from the Coriell Institute for Medical Research), and represent a good population control for the CHD kindreds.
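A comparison of allele frequencies between the two groups can be sketched as a 2x2 chi-square test on allele counts. The source does not say which statistic was used, so this is an assumed approach, and the counts in the usage note are hypothetical rather than taken from Table 14. The sketch also assumes two chromosomes per individual and treats them as independent, which is only approximate when affected individuals come from the same kindreds.

```python
import math

def allele_frequency(allele_count, individuals):
    """Frequency of an allele among 2 * individuals chromosomes (diploid assumption)."""
    return allele_count / (2 * individuals)

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 degree of freedom, no continuity correction).

    a, b: variant and reference allele counts in the first group
    c, d: variant and reference allele counts in the second group
    Returns (chi2, p_value).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

With hypothetical counts of 30 variant alleles among the 150 chromosomes of 75 screened individuals and 24 among the 240 chromosomes of 120 controls, the call would be `chi_square_2x2(30, 120, 24, 216)`. For small counts an exact test would be a better choice.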
  • Haplotypes associated with ten polymorphisms in CHD1, and the frequencies of these haplotypes in CHD patients and CEPH controls, are described in Table 16. The five most common haplotypes have all been observed in homozygous individuals. These haplotypes may be useful in identifying intragenic deletions in individual samples, or in segregation analysis of possible mutations in linked families.
  • Genomic sequences including exon J and promoter elements for CHD1 have been identified (Example 7).
  • Five additional polymorphisms and one insertion were found in CHD and CEPH samples. Their positions and frequencies are listed in Table 17.
  • Genomic position of the polymorphisms was derived by setting the nucleotide position number 2,934 of SEQ ID NO: 210 as +1.
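The numbering convention just described can be expressed as a small conversion function. The source defines only that position 2,934 of SEQ ID NO: 210 is +1. Numbering the preceding nucleotide as -1, with no position 0, is an assumption borrowed from common sequence-numbering practice, and the function names are hypothetical.

```python
PLUS_ONE = 2934  # nucleotide of SEQ ID NO: 210 that is designated +1

def to_relative(pos210):
    """Convert a 1-based SEQ ID NO: 210 position to the +1-based numbering.

    Assumption: upstream positions are numbered -1, -2, ... with no zero.
    """
    if pos210 >= PLUS_ONE:
        return pos210 - PLUS_ONE + 1
    return pos210 - PLUS_ONE

def to_seq210(relative):
    """Inverse of to_relative; 0 is not a valid position under this convention."""
    if relative == 0:
        raise ValueError("there is no position 0 in this numbering")
    if relative > 0:
        return relative + PLUS_ONE - 1
    return relative + PLUS_ONE
```

Under this convention, promoter and exon J positions from SEQ ID NO: 209 receive negative numbers, while positive numbers coincide with positions in SEQ ID NO: 9.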
  • CHD1 gene The structure and function of the CHD1 gene are determined according to the following methods.
  • Because CHD1 binds to DNA sequence-specifically and binds to promoter fragments of genes whose gene products are involved in lipid metabolism (Example 8), biological experiments are designed to address its role in transcription regulation.
  • The full length protein is expressed in appropriate cells to assess the role of CHD1 in transcription.
  • Inducible expression of the gene in tissue culture cells, such as HepG2 cells, will be used to study any alterations in the expression of other genes that are caused by CHD1, including those genes identified in Example 8.
  • The ability of CHD1 to regulate transcription of these and other genes is analyzed by transient reporter expression systems in mammalian cells.
  • Segments of CHD1 coding sequence are expressed as fusion proteins in E. coli.
  • The proteins, expressed at high levels, are purified by gel elution and used to immunize rabbits and mice using a procedure similar to the one described by Harlow and Lane, 1988. This procedure has been shown to generate antibodies against various other proteins (for example, see Kraemer et al., 1993).
  • Rabbits are immunized with 100 µg of the protein in complete Freund's adjuvant and boosted twice at three-week intervals, first with 100 µg of immunogen in incomplete Freund's adjuvant followed by 100 µg of immunogen in phosphate buffered saline (PBS).
  • Antibody containing serum is collected three weeks thereafter. This procedure can be repeated to generate antibodies against the mutant forms of the CHD1 protein.
  • These antibodies, in conjunction with antibodies to wild type CHD1, are used to detect the presence and the relative level of the mutant forms in various tissues and biological fluids.
  • Monoclonal antibodies are generated according to the following protocol. Mice are immunized with immunogen comprising intact CHD1 or CHD1 peptides (wild type or mutant) conjugated to keyhole limpet hemocyanin using glutaraldehyde or EDC as is well known in the art.
  • The immunogen is mixed with an adjuvant.
  • Each mouse receives four injections of 10 to 100 µg of immunogen and after the fourth injection blood samples are taken from the mice to determine if the serum contains antibody to the immunogen.
  • Serum titer is determined by ELISA or RIA. Mice with sera positive for the presence of antibody to the immunogen are selected for hybridoma production.
  • Spleens are removed from immune mice and a single cell suspension is prepared (see Harlow and Lane, 1988). Briefly, P3.65.3 myeloma cells (American Type Culture Collection, Rockville, MD) are fused with immune spleen cells using polyethylene glycol as described by Harlow and Lane, 1988. Cells are plated at a density of 2xlO cells/well in 96 well tissue culture plates. Individual wells are examined for growth and the supernatants of wells with growth are tested for the presence of CHD1 specific antibodies by ELISA or RIA using wild type or mutant CHD1 target protein. Cells in positive wells are expanded and subcloned to establish and confirm monoclonality. Clones with the desired specificities are expanded and grown as ascites in mice or in a hollow fiber system to produce sufficient quantities of antibody for characterization and assay development.
  • Peptides that bind to the CHD1 gene product are isolated from both chemical and phage-displayed random peptide libraries as follows. Fragments of the CHD1 gene product are expressed as glutathione-S-transferase (GST) and six histidine (His-tag) fusion proteins in both E. coli and SF9 cells.
  • The fusion protein is isolated using either a glutathione matrix (for GST fusion proteins) or nickel chelation matrix (for His-tag fusion proteins). This target fusion protein preparation is either screened directly as described below, or eluted with glutathione or imidazole.
  • The target protein is immobilized to either a surface such as polystyrene; or a resin such as agarose; or solid supports using either direct absorption, covalent linkage reagents such as glutaraldehyde, or linkage agents such as biotin-avidin.
  • Two types of random peptide libraries of varying lengths are generated: synthetic peptide libraries that may contain derivatized residues, for example by phosphorylation or myristylation, and phage-displayed peptide libraries which may be phosphorylated. These libraries are incubated with immobilized CHD1 gene product in a variety of physiological buffers. Next, unbound peptides are removed by repeated washes, and bound peptides recovered by a variety of elution reagents such as low or high pH, strong denaturants, glutathione, or imidazole. Recovered synthetic peptide mixtures are sent to commercial services for peptide micro-sequencing to identify enriched residues. Recovered phage are amplified and rescreened. The positive plaques are purified, and the DNAs encoding the peptides are then sequenced to determine the identity of the displayed peptides.
  • Peptides identified from the above screens are synthesized in larger quantities as biotin conjugates by commercial services. These peptides are used in both solid and solution phase competition assays with CHD1 and its interacting partners identified in yeast two-hybrid screens. Versions of these peptides that are fused to membrane-permeable motifs (Lin et al., 1995; Rojas et al., 1996) will be chemically synthesized and added to cultured cells, and the effects on growth, apoptosis, differentiation, cofactor response, and internal changes will be assayed.
  • EXAMPLE 1 Sandwich Assay for CHD1 Monoclonal antibody is attached to a solid surface such as a plate, tube, bead, or particle.
  • The antibody is attached to the well surface of a 96-well ELISA plate.
  • A 100 µl sample (e.g., serum, urine, tissue cytosol) containing the CHD1 peptide/protein (wild-type or mutant) is added to the solid phase antibody.
  • The sample is incubated for 2 hours at room temperature. Next, the sample fluid is decanted, and the solid phase is washed with buffer to remove unbound material.
  • A second monoclonal antibody (to a different antigenic determinant on the CHD1 peptide/protein) is added to the solid phase.
  • This antibody is labeled with a detector molecule (e.g., - I, enzyme, fluorophore, or a chromophore) and the solid phase with the second antibody is incubated for two hours at room temperature.
  • The second antibody is decanted and the solid phase is washed with buffer to remove unbound material.
  • The amount of bound label, which is proportional to the amount of CHD1 peptide/protein present in the sample, is quantitated.
  • Separate assays are performed using monoclonal antibodies that are specific for the wild-type CHD1 as well as monoclonal antibodies specific for each of the mutations identified in CHD1.
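Since the bound label is described as proportional to the amount of analyte, quantitation can be sketched as a straight-line calibration through the origin. The source does not describe a calibration procedure, so the use of blank-subtracted standards, the function names, and the numbers in the usage note are all assumptions for illustration.

```python
def fit_slope(standards, blank=0.0):
    """Least-squares slope for signal = slope * amount (line through the origin).

    standards: iterable of (known_amount, measured_signal) pairs
    blank: background signal subtracted from every measurement
    """
    pairs = [(amount, signal - blank) for amount, signal in standards]
    numerator = sum(amount * signal for amount, signal in pairs)
    denominator = sum(amount * amount for amount, _ in pairs)
    if denominator == 0:
        raise ValueError("at least one standard must have a non-zero amount")
    return numerator / denominator

def estimate_amount(signal, slope, blank=0.0):
    """Convert a measured signal to an analyte amount using the fitted slope."""
    return (signal - blank) / slope
```

With hypothetical standards of 1, 2 and 4 units giving signals of 200, 400 and 800, the fitted slope is 200 signal units per unit of analyte, so a sample reading of 500 corresponds to 2.5 units. Real immunoassays often saturate at high concentrations, where a non-linear standard curve would be needed.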
  • Yeast colonies are assayed for expression of the lacZ reporter gene by β-galactosidase filter assay. Colonies that both grow in the absence of histidine and are positive for production of β-galactosidase are chosen for further characterization.
  • The activation domain fusion plasmid is purified from positive colonies by the smash-and-grab technique. These plasmids are introduced into E. coli DH5 by electroporation and purified from E. coli by the alkaline lysis method. To test for the specificity of the interaction, specific activation domain fusion plasmids are cotransformed into strain Y190 with plasmids encoding various DNA-binding domain fusion proteins, including fusions to CHD1 and human lamin C. Transformants from these experiments are assayed for expression of the HIS3 and lacZ reporter genes. Positives that express reporter genes with CHD1 constructs and not with lamin C constructs encode bona fide CHD1 interacting proteins.
  • Proteins are identified and characterized by sequence analysis of the insert of the appropriate activation domain plasmid. This procedure is repeated with mutant forms of the CHD1 gene, to identify proteins that interact with only the mutant protein or to determine whether a mutant form of the CHD1 protein can or cannot interact with a protein known to interact with wild-type CHD1.
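The two-step selection logic of the yeast two-hybrid screen can be summarized as boolean filters. This is an illustrative sketch of the decision rules stated above; the function names and the dictionary layout are hypothetical.

```python
def is_primary_positive(grows_without_histidine, lacz_positive):
    """A colony is kept only if both reporters (HIS3 and lacZ) are active."""
    return grows_without_histidine and lacz_positive

def is_specific_interactor(reporter_activation):
    """Specificity retest after cotransformation into strain Y190.

    reporter_activation maps a DNA-binding domain fusion ("CHD1", "lamin C")
    to True when both reporter genes are expressed with that fusion.
    A bona fide interactor activates reporters with CHD1 but not lamin C.
    """
    with_chd1 = reporter_activation.get("CHD1", False)
    with_lamin_c = reporter_activation.get("lamin C", False)
    return with_chd1 and not with_lamin_c
```

A clone that activates the reporters with both CHD1 and lamin C fusions is rejected, since it is likely activating the reporters non-specifically.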
  • Ace.mbly A graphic interactive program to support shotgun and directed sequencing projects.


Abstract

The invention relates to the human coronary heart disease susceptibility gene (CHD1), some alleles of which are associated with susceptibility to coronary heart disease; to germline mutations in the CHD1 gene and their use in diagnosing predisposition to coronary heart disease and to metabolic disorders, including hypoalphalipoproteinemia, familial combined hyperlipidemia, insulin resistant syndrome X or multiple metabolic disorder, obesity, diabetes and dyslipidemic hypertension; to presymptomatic therapy of individuals carrying deleterious alleles (including gene therapy, protein replacement therapy, or administration of protein mimetics and inhibitors); and to the screening of drugs for the treatment of dyslipidemias.
PCT/US1999/004682 1998-03-04 1999-03-04 Chromosome 11-linked coronary heart disease susceptibility gene CHD1 WO1999045112A2 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU30680/99A AU3068099A (en) 1998-03-04 1999-03-04 Chromosome 11-linked coronary heart disease susceptibility gene chd1

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US3494198A 1998-03-04 1998-03-04
US09/034,941 1998-03-04
US8093498P 1998-04-06 1998-04-06
US60/080,934 1998-04-06

Publications (2)

Publication Number Publication Date
WO1999045112A2 true WO1999045112A2 (fr) 1999-09-10
WO1999045112A3 WO1999045112A3 (fr) 1999-11-04

Family

ID=26711582

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/004682 WO1999045112A2 (fr) 1998-03-04 1999-03-04 Chromosome 11-linked coronary heart disease susceptibility gene CHD1

Country Status (2)

Country Link
AU (1) AU3068099A (fr)
WO (1) WO1999045112A2 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104050344A (zh) * 2013-09-30 2014-09-17 西安时代基因健康科技有限公司 冠心病的表征参数的获取方法
CN116144753A (zh) * 2022-12-27 2023-05-23 湖南家辉生物技术有限公司 一种致病基因chd1突变位点的应用及检测试剂

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
HILLIER ET AL.: "The WashU-Merck EST Project" EMBL SEQUENCE DATABASE, 7 June 1996 (1996-06-07), XP002114882 HEIDELBERG DE *
HILLIER ET AL.: "WashU-merck EST Project 1997" EMBL SEQUENCE DATABASE, 25 April 1997 (1997-04-25), XP002114883 HEIDELBERG DE *
LEE ET AL.: "Zinc finger protein" EMBL SEQUENCE DATABASE, 1 November 1996 (1996-11-01), XP002114884 HEIDELBERG DE & LEE ET AL.: GENOMICS, vol. 43, 1997, pages 191-201, *
MONACO ET AL.: "Homo sapiens ZNF202 beta (ZNF202) mRNA, complete cds" EMBL SEQUENCE DATABASE, 16 November 1998 (1998-11-16), XP002114887 HEIDELBERG DE -& MONACO ET AL.: GENOMICS, vol. 52, 1998, pages 358-362, XP002114888 *
STOKES ET AL.: "DNA.binding and chromatin localization properties of CHD1" MOLECULAR AND CELLULAR BIOLOGY, vol. 15, no. 5, May 1995 (1995-05), pages 2745-2753, XP002114885 *
WOODAGE ET AL.: "Characterization of the CHD family of proteins" PROC NATL ACAD SCI USA, vol. 94, October 1997 (1997-10), pages 11472-11477, XP002114886 *

Also Published As

Publication number Publication date
WO1999045112A3 (fr) 1999-11-04
AU3068099A (en) 1999-09-20

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
AK Designated states

Kind code of ref document: A3

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
NENP Non-entry into the national phase

Ref country code: KR

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase