WO2003093826A2 - Assays for identifying cholesterol-lowering molecules - Google Patents

Assays for identifying cholesterol-lowering molecules

Publication number
WO2003093826A2
Authority
WO
WIPO (PCT)
Prior art keywords
kalpa
polypeptide
cholesterol
compound
activity
Prior art date
Application number
PCT/IB2003/002376
Other languages
French (fr)
Other versions
WO2003093826A3 (en
Inventor
Stephane Benjamin
Catherine Clusel
Aurelie Daury
Laurent Essioux
Original Assignee
Clinigenetics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Clinigenetics
Priority to AU2003233124A
Publication of WO2003093826A2
Publication of WO2003093826A3

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/92 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving lipids, e.g. cholesterol, lipoproteins, or their receptors
    • G01N2500/00 Screening for compounds of potential therapeutic value
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/04 Endocrine or metabolic disorders
    • G01N2800/044 Hyperlipemia or hypolipemia, e.g. dyslipidaemia, obesity

Definitions

  • the invention relates to a gene and the encoded protein involved in cholesterol-lowering, and their use in diagnostics, treatment of disease, and in the identification of molecules for the treatment and prevention of disease.
  • the present invention thus discloses methods of screening for the identification of molecules useful in the treatment of heart disease, coronary artery disease, myocardial infarct, and lipid related metabolic disorders such as the dysmetabolic syndrome, Obesity and Diabetes type II.
  • Complications of atherosclerosis, such as myocardial infarction, stroke and peripheral vascular disease, are a major cause of mortality and morbidity.
  • the quality of life of millions of people is adversely affected by angina and heart failure caused by coronary heart disease.
  • Hyperlipidaemia has been associated with an increased risk of developing these conditions. For this reason it is desirable to understand the etiology of hyperlipidaemia and to develop effective treatments for this condition.
  • Hyperlipidaemia has been defined as plasma cholesterol and triglyceride levels that exceed "normal" (95th percentile of levels of the general population) levels. However, the ideal cholesterol level is much less than the normal level of the general population.
  • hypercholesterolaemia
  • CAD coronary artery disease
  • hypertriglyceridaemia may also be involved in atherosclerosis and can, in extreme cases, cause potentially life-threatening pancreatitis. Hyperlipidaemia can arise through a genetic disorder, as a result of other medical conditions or environmental influences, or a combination of these factors.
  • Familial hypercholesterolemia is the most common (frequency 1/500) autosomal-dominant disease affecting lipid metabolism (Brown and Goldstein, 1986, Science 232:34-47); heterozygous affected persons have LDL levels twice normal levels and develop premature coronary disease, whereas homozygous individuals have sixfold-elevated LDL levels and often die of cardiovascular disease at age <20 years.
  • Cholesterol-synthesis inhibitors have exerted a gratifying effect on the course of atherosclerosis (Gould et al., 1998, Circulation 97:946-952). However, it would be preferable to stimulate endogenous mechanisms that have a similar cholesterol-lowering effect.
  • the present inventors have, by conducting genotype-phenotype association studies and expression studies, narrowed this region and identified the Kalpa gene as demonstrating an association with lowered cholesterol.
  • the inventors have also provided screening assays which can be used to identify molecules for therapeutic treatment and diagnostics for cholesterol-related disorders.
  • the invention includes diagnostic and activity assays, and uses in therapeutics, for
  • CHOL cholesterol regulation
  • HDL HDL-cholesterol
  • LDL LDL-cholesterol
  • TGRL Triglycerides
  • LDL/HDL ratio LDL/HDL ratio
  • compositions and methods of the invention are useful for example in the diagnosis and treatment of diseases such as heart disease, coronary artery disease, myocardial infarct, and lipid related metabolic disorders such as the dysmetabolic syndrome, Obesity and Diabetes type II.
  • the present invention thus also relates to nucleic acid molecules, including in particular the complete cDNA sequences encoding Kalpa, portions thereof encoding polypeptides homologous thereto, as well as to polypeptides encoded by the Kalpa gene.
  • Figure 1 depicts a cDNA sequence encoding the Kalpa protein, including the sequence referred to herein as SEQ ID No 3.
  • TM domains (TM from 1 to 10) are yellow boxes
  • TPR domains (TPR from 1 to 7) are white boxes
  • Figure 2 depicts nucleic acid sequences comprising SNPs in and surrounding the Kalpa gene in the human chromosome 13q region, including the sequences referred to herein as SEQ ID Nos 5 to 23.
  • Figures 3A and 3B show the results of RT-PCR analysis, demonstrating amplification of a specific product in brain and liver.
  • Figure 3C shows the results of Northern blot analysis, demonstrating the existence of one Kalpa transcript at 3.63 kb. There appears to be a tissue-specific distribution of the band. The most intense signal is detected in the skeletal muscle. The signals detected in the kidney, liver and brain are quite intense, while the signals detected in the placenta, spleen, small intestine, tongue, thyroid, stomach, spinal cord and prostate are less intense. The signals detected in the brain and liver (blot 1 lanes 1 and 8) were in agreement with the RT-PCR results.
  • SEQ ID No 1 is a genomic DNA sequence encoding the human Kalpa protein.
  • SEQ ID No 2 and SEQ ID No 3 are cDNA sequences encoding the human Kalpa protein.
  • SEQ ID No 4 is an amino acid sequence of the human Kalpa protein.
  • SEQ ID Nos 5 to 23 are nucleic acid sequences comprising SNPs in and surrounding the Kalpa gene in the human chromosome 13q region.
  • the molecular regulation of cellular sterol metabolism has been elucidated by Brown and Goldstein and their colleagues.
  • the LDLR gene promoter contains a sterol response element (SRE) that is required for regulating transcription of the gene encoding LDLR in response to cellular sterol content.
  • SRE sterol response element
  • Two SRE-binding proteins (SREBP-1 and -2) contain two transmembrane domains and are localized to the endoplasmic reticulum (ER).
  • SCAP SREBP-cleavage activating protein
  • When cells are overloaded with sterols, SREBP in the SCAP complex is no longer accessible to S1P cleavage and remains membrane-bound. Conversely, when sterol concentrations are limiting, the current data suggest that the SCAP/SREBP complex can move to a post-ER compartment. At this location, the complex encounters S1P and S2P, leading to the release of soluble SREBP (reviewed by Brown and Goldstein, 1999).
  • the statins inhibit HMGCoA reductase, the rate-limiting enzyme in cholesterol biosynthesis, reducing cellular sterol content and thereby de-repressing the transport of the SCAP-SREBP complex and resulting in the upregulation of LDLR, which in turn transports blood LDL into the liver, reducing blood levels of LDL.
  • LDL low-density lipoprotein
  • statins e.g. Zocor, Simvastatin, Lovastatin, Atorvastatin, Pravastatin
  • drugs that inhibit the activity of 3-hydroxy-3-methylglutaryl-coenzyme A reductase, an enzyme in the cholesterol biosynthetic pathway.
  • people vary in their responsiveness to these drugs. In particular, some patients with severe forms of hypercholesterolemia are not very responsive to statins or to any other known drug therapy.
  • LDL lipoprotein particles are the major form of plasma cholesterol. Most LDL particles arise from the conversion of very low density (VLDL) particles secreted by the liver. LDL particles are thus not directly synthesized. Rather, the liver produces very low density lipoprotein (VLDL), which is secreted into the bloodstream. While in the bloodstream, VLDL is converted into LDL. This occurs through the action of lipoprotein lipase (LPL), an enzyme residing on the lumenal surface of the capillary endothelium.
  • LPL lipoprotein lipase
  • LPL catalyzes the hydrolysis of the triglycerides in the VLDL particle, thus shrinking the diameter of the particle and enriching it for cholesterol and cholesterol ester (cholesterol ester is not a substrate for LPL).
  • VLDL also acquires cholesterol ester through the action of cholesterol ester transfer protein (CETP).
  • CETP cholesterol ester transfer protein
  • CETP is in the bloodstream and promotes the transfer of cholesterol ester from HDL to VLDL and the reciprocal transfer of triglyceride from VLDL to HDL.
  • the actions of LPL and CETP lead to the conversion of a triglyceride-rich particle, VLDL, to a cholesterol-rich particle, LDL.
  • Elevated LDL levels can be caused by diminished clearance of LDL particles from the circulation or by increased production of LDL or both.
  • the clearance of LDL from the circulation is largely mediated by the LDL receptor.
  • patients with familial hypercholesterolemia, a disease known to be caused by LDL receptor mutations, have LDL levels 8-fold elevated (in the homozygous form) or 2-fold elevated (in the heterozygous form), as compared to patients with normal LDL receptor. This observation provides strong support for the key role of the LDL receptor in LDL metabolism.
  • This 37 cM region, identified in a large Arab family and a population of healthy twins, has been narrowed down to a genetic distance of 12 cM corresponding to 13.6 Mb, between markers D13S265 and D13S158.
  • the invention thus provides methods of treating and preventing disease and identifying medicaments for the treatment of disease, the methods involving acting on biological pathways and biological functions mediated by the Kalpa gene.
  • the inventors have further characterized genes involved in cholesterol-lowering using differential expression analysis, thereby independently identifying the gene Kalpa as a candidate involved in the cholesterol-lowering pathway.
  • a "reference" phenotype and a related "affected” phenotype can represent a hepatic cell treated and untreated with a drug.
  • the total RNA is extracted, the mRNA purified and the cDNA synthesized.
  • the two cDNAs are then mixed together and submitted to a subtractive hybridization process in order to select those genes which are differentially expressed.
  • the recovered cDNA is inserted in a cloning vector and the corresponding clones are randomly sequenced.
  • differential expression analysis was carried out for the study of the pathogenesis of cardiovascular diseases, with a special emphasis on the pharmacogenomics of 3-hydroxy-3-methylglutaryl coenzyme A reductase inhibitors, known as statins.
  • Statins exert the greatest effect on plasma LDL cholesterol.
  • the beneficial effects of Statins have been shown in a series of epidemiological primary and secondary prevention studies. In addition to their action on LDL cholesterol, these drugs also increase HDL cholesterol, reduce triglycerides and have a beneficial effect on some of the fundamental mechanisms involved in the development of arteriosclerosis.
  • Kalpa was identified as differentially expressed in cells treated with statins versus untreated cells, and the chromosomal location of all differentially expressed genes was analysed. Kalpa was found to map to the locus of interest between D13S265 and D13S158.
  • the inventors further carried out an association study to study more closely polymorphisms in the Kalpa gene that are associated with the low-cholesterol phenotype.
  • Kalpa polymorphisms were identified, including one frequent coding polymorphism (SNP5.3) in exon 10 of the Kalpa gene, a non-conservative polymorphism (Isoleucine to Valine) with an allelic frequency of 0.34.
  • SNP5.3 frequent coding polymorphism
  • a non-conservative polymorphism Isoleucine to Valine
  • An association test of LDL-Cholesterol adjusted for age, BMI and gender with this coding polymorphism was performed. An association was found using a family-based association method (ref Abecassis et al., 2001).
  • the p-value associated with the test was 0.003. In an independent cohort of 143 German adults this association was confirmed using analysis of variance with LDL-Cholesterol adjusted for age, BMI and gender, giving a p-value of 0.03. The size of the effect was similar in both populations. These two independent association studies show that the non-conservative coding polymorphism in the Kalpa gene impacts LDL-cholesterol metabolism in the general population. To further characterize Kalpa, a study of the expression profile was carried out. A northern blot analysis was performed on 24 different human tissues. A probe of the Kalpa gene was obtained by RT-PCR experiments and the sequence was obtained in order to confirm the specificity of the probe to the Kalpa gene sequence.
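The analysis-of-variance step used to confirm the association can be sketched as a one-way ANOVA F statistic computed over covariate-adjusted LDL-cholesterol values grouped by genotype. This is only an illustrative sketch: the function name, the three-group layout and the numbers are hypothetical placeholders, not data or code from the study, and the adjustment for age, BMI and gender is assumed to have been done beforehand.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance.

    `groups` is a list of lists; each inner list holds the adjusted trait
    values (e.g. LDL-cholesterol residuals) for one genotype class.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]
    # Variation explained by genotype class.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Residual variation inside each genotype class.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical placeholder values for three genotype classes (not study data).
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(example_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The p-value would then be read from the F distribution with the returned degrees of freedom; that step is left out here to keep the sketch free of external dependencies.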
  • tissue-specific expression of the gene in: skeletal muscle, kidney, liver, placenta, brain and thyroid. No expression was detected in many different tissues such as heart, colon, thymus, lung and adrenal gland.
  • the messenger RNA was expected to be approximately 3.8 kb in size, and a single isoform was detected by using the probe.
  • oligonucleotides include RNA, DNA, or RNA/DNA hybrid sequences of more than one nucleotide in either single chain or duplex form.
  • nucleic acids and “nucleic acid molecule” is intended to include DNA molecules (e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and analogs of the DNA or RNA generated using nucleotide analogs.
  • the nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.
  • nucleotide sequence may be employed to designate indifferently a polynucleotide or a nucleic acid. More precisely, the expression “nucleotide sequence” encompasses the nucleic material itself and is thus not restricted to the sequence information (i.e.
  • nucleic acids oligonucleotides
  • polynucleotides polynucleotides
  • an “isolated” nucleic acid molecule is one which is separated from other nucleic acid molecules which are present in the natural source of the nucleic acid.
  • an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived.
  • the isolated Kalpa nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived.
  • an "isolated" nucleic acid molecule such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.
  • a nucleic acid molecule of the present invention, e.g., a nucleic acid molecule having the nucleotide sequence of SEQ ID Nos 1 or 2 or 3, or a portion thereof, can be isolated using standard molecular biology techniques and the sequence information provided herein. Using all or a portion of the nucleic acid sequence of SEQ ID Nos 1 or 2 or 3 as a hybridization probe, Kalpa nucleic acid molecules can be isolated using standard hybridization and cloning techniques (e.g., as described in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).
  • nucleic acid molecule encompassing all or a portion of e.g. SEQ ID Nos 1 or 2 or 3, can be isolated by the polymerase chain reaction (PCR) using synthetic oligonucleotide primers designed based upon the sequence of SEQ ID No 1 or 2 or 3.
  • PCR polymerase chain reaction
  • a nucleic acid of the invention can be amplified using cDNA, mRNA or alternatively, genomic DNA, as a template and appropriate oligonucleotide primers according to standard PCR amplification techniques.
  • the nucleic acid so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis.
  • oligonucleotides corresponding to Kalpa nucleotide sequences can be prepared by standard synthetic techniques, e.g., using an automated DNA synthesizer.
  • hybridizes to is intended to describe conditions for moderate stringency or high stringency hybridization, preferably where the hybridization and washing conditions permit nucleotide sequences at least 60% homologous to each other to remain hybridized to each other.
  • the conditions are such that sequences at least about 70%, more preferably at least about 80%, even more preferably at least about 85%, 90%, 95% or 98% homologous to each other typically remain hybridized to each other.
  • Stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6.
  • a preferred, non-limiting example of stringent hybridization conditions is as follows: the hybridization step is realized at 65°C in the presence of 6 x SSC buffer, 5 x Denhardt's solution, 0.5% SDS and 100 µg/ml of salmon sperm DNA. The hybridization step is followed by four washing steps:
  • hybridization conditions being suitable for a nucleic acid molecule of about 20 nucleotides in length. It will be appreciated that the hybridization conditions described above are to be adapted according to the length of the desired nucleic acid, following techniques well known to the one skilled in the art, for example be adapted according to the teachings disclosed in Hames B.D. and Higgins S.J. (1985) Nucleic Acid Hybridization: A Practical Approach.
  • an isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to a sequence of SEQ ID No 1 or 2 or 3 corresponds to a naturally-occurring nucleic acid molecule.
  • a "naturally-occurring" nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).
  • sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second amino or nucleic acid sequence and non-homologous sequences can be disregarded for comparison purposes).
  • the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, even more preferably at least 60%, and even more preferably at least 70%, 80%, 90% or 95% of the length of the reference sequence, preferably at least 100, preferably at least 200, more preferably at least 300, even more preferably at least 400, and even more preferably at least 500, 600, at least 700, at least 800, at least 900, at least 1000, at least 1200, at least 1400, at least 1600, at least 1800, or at least 2000 nucleotides are aligned. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared.
  • amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid "homology”
  • the comparison of sequences and determination of percent homology between two sequences can be accomplished using a mathematical algorithm.
  • a preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Karlin and Altschul, 1990, Proc. Natl. Acad. Sci. USA 87:2264-68, modified as in Karlin and Altschul, 1993, Proc. Natl. Acad. Sci. USA 90:5873-77, the disclosures of which are incorporated herein by reference in their entireties.
  • Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) of Altschul, et al., 1990, J. Mol. Biol. 215:403-10.
  • Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Research 25(17):3389-3402.
  • the default parameters of the respective programs e.g., XBLAST and NBLAST
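Once two sequences have been aligned, the position-by-position comparison described above reduces to counting matching columns. The following is a minimal sketch, assuming the alignment has already been produced by some other tool and that gaps are written as `-`; it does not reimplement BLAST or the Karlin and Altschul statistics, and the choice to count a residue-versus-gap column as a mismatch is one common convention rather than something the text specifies. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    '-' marks a gap. Columns that are gaps in both sequences are skipped;
    a residue facing a gap is counted as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # nothing to compare in this column
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment: five of six columns match.
print(round(percent_identity("ACGT-A", "ACGTTA"), 1))
```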
  • polypeptide refers to a polymer of amino acids without regard to the length of the polymer; thus, peptides, oligopeptides, and proteins are included within the definition of polypeptide. This term also does not specify or exclude post-expression modifications of polypeptides, for example, polypeptides which include the covalent attachment of glycosyl groups, acetyl groups, phosphate groups, lipid groups and the like are expressly encompassed by the term polypeptide.
  • polypeptides which contain one or more analogs of an amino acid (including, for example, non-naturally occurring amino acids, amino acids which only occur naturally in an unrelated biological system, modified amino acids from mammalian systems etc.), polypeptides with substituted linkages, as well as other modifications known in the art, both naturally occurring and non-naturally occurring.
  • an “isolated” or “purified” protein or biologically active portion thereof is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the Kalpa polypeptide, or a biologically active fragment or homologue thereof, is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized.
  • the language “substantially free of cellular material” includes preparations of a protein according to the invention (e.g. Kalpa polypeptide, or a biologically active fragment or homologue thereof) in which the protein is separated from cellular components of the cells from which it is isolated or recombinantly produced.
  • the language "substantially free of cellular material” includes preparations of a protein according to the invention having less than about 30% (by dry weight) of protein other than Kalpa (also referred to herein as a "contaminating protein”), more preferably less than about 20% of protein other than the protein according to the invention, still more preferably less than about 10% of protein other than the protein according to the invention, and most preferably less than about 5% of protein other than the protein according to the invention.
  • when the protein according to the invention or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation.
  • the language “substantially free of chemical precursors or other chemicals” includes preparations of Kalpa polypeptide, or a biologically active fragment or homologue thereof in which the protein is separated from chemical precursors or other chemicals which are involved in the synthesis of the protein.
  • the language “substantially free of chemical precursors or other chemicals” includes preparations of Kalpa having less than about 30% (by dry weight) of chemical precursors or non-Kalpa chemicals, more preferably less than about 20% chemical precursors or non-Kalpa chemicals, still more preferably less than about 10% chemical precursors or non-Kalpa chemicals, and most preferably less than about 5% chemical precursors or non-Kalpa chemicals.
  • recombinant polypeptide is used herein to refer to polypeptides that have been artificially designed and which comprise at least two polypeptide sequences that are not found as contiguous polypeptide sequences in their initial natural environment, or to refer to polypeptides which have been expressed from a recombinant polynucleotide.
  • antibody refers to immunoglobulin molecules and immunologically active portions of immunoglobulin molecules, i.e., molecules that contain an antigen binding site which specifically binds (immunoreacts with) an antigen, such as a Kalpa polypeptide, or a biologically active fragment or homologue thereof.
  • immunologically active portions of immunoglobulin molecules include F(ab) and F(ab')2 fragments which can be generated by treating the antibody with an enzyme such as pepsin.
  • the invention provides polyclonal and monoclonal antibodies that bind a Kalpa polypeptide, or a biologically active fragment or homologue thereof.
  • monoclonal antibody or “monoclonal antibody composition”, as used herein, refers to a population of antibody molecules that contain only one species of an antigen binding site capable of immunoreacting with a particular epitope of a Kalpa polypeptide.
  • a monoclonal antibody composition thus typically displays a single binding affinity for a particular Kalpa with which it immunoreacts.
  • primer denotes a specific oligonucleotide sequence which is complementary to a target nucleotide sequence and used to hybridize to the target nucleotide sequence.
  • a primer serves as an initiation point for nucleotide polymerization catalyzed by DNA polymerase, RNA polymerase or reverse transcriptase.
  • probe denotes a defined nucleic acid segment (or nucleotide analog segment, e.g., polynucleotide as defined herein) which can be used to identify a specific polynucleotide sequence present in samples, said nucleic acid segment comprising a nucleotide sequence complementary to the specific polynucleotide sequence to be identified.
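The complementarity that defines primers and probes can be illustrated with a reverse-complement helper and a simple exact-match search for the annealing site. This is a sketch under stated assumptions: both sequences are written 5' to 3', only perfect matches are considered, and the function names and the short sequences are hypothetical and are not taken from SEQ ID Nos 1 to 23.

```python
# Translation table for Watson-Crick complements of DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def annealing_site(target: str, oligo: str) -> int:
    """Return the 0-based start of the stretch of `target` that is exactly
    complementary to `oligo`, or -1 if there is no perfect match."""
    return target.upper().find(reverse_complement(oligo).upper())


# Hypothetical target strand and a probe complementary to part of it.
target = "ATGCCGTAAGCT"
probe = "GCTTAC"
print(reverse_complement(probe))      # the target stretch the probe pairs with
print(annealing_site(target, probe))  # where that stretch begins in the target
```

Real hybridization tolerates mismatches depending on stringency, so an exact-match search is only the simplest case.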
  • cholesterol-related disorder refers to any condition arising from, influenced by or influencing regulation of cholesterol or cholesterol levels.
  • a “cholesterol-related disorder” may be a condition arising from, influenced by or characterized by abnormal levels of or regulation of cholesterol.
  • a “cholesterol-related disorder” includes diseases or disorders arising from, influenced by or influencing HDL-cholesterol (HDL) regulation, LDL-cholesterol (LDL) regulation, triglycerides (TGRL) regulation or LDL/HDL ratio (LDL/HDL) regulation.
  • lowering cholesterol or “cholesterol-lowering” may refer to lowering total cholesterol, lowering HDL-cholesterol, lowering LDL-cholesterol, lowering triglycerides and modulating, preferably lowering the LDL/HDL ratio.
  • trait and “phenotype” are used interchangeably herein and refer to any visible, detectable or otherwise measurable property of an organism such as symptoms of, or susceptibility to a disease for example.
  • allele is used herein to refer to variants of a nucleotide sequence.
  • a biallelic polymorphism has two forms. Typically the first identified allele is designated as the original allele whereas other alleles are designated as alternative alleles. Diploid organisms may be homozygous or heterozygous for an allelic form.
  • the term "heterozygosity rate” is used herein to refer to the incidence of individuals in a population, which are heterozygous at a particular allele. In a biallelic system the heterozygosity rate is on average equal to 2Pa (1-Pa), where Pa is the frequency of the least common allele. In order to be useful in genetic studies a genetic marker should have an adequate level of heterozygosity to allow a reasonable probability that a randomly selected person will be heterozygous.
  • genotype refers to the identity of the alleles present in an individual or a sample.
  • a genotype preferably refers to the description of the polymorphism alleles present in an individual or a sample.
  • genotyping a sample or an individual for a polymorphism consists of determining the specific allele or the specific nucleotide carried by an individual at a polymorphism.
  • mutation refers to a difference in DNA sequence between or among different genomes or individuals which has a frequency below 1%.
  • haplotype refers to a combination of alleles present in an individual or a sample.
  • a haplotype preferably refers to a combination of polymorphism alleles found in a given individual and which may be associated with a phenotype.
  • polymorphism refers to the occurrence of two or more alternative genomic sequences or alleles between or among different genomes or individuals.
  • Polymorphic refers to the condition in which two or more variants of a specific genomic sequence can be found in a population.
  • a “polymorphic site” is the locus at which the variation occurs.
  • a “single nucleotide polymorphism” is a single base pair change. Typically a single nucleotide polymorphism is the replacement of one nucleotide by another nucleotide at the polymorphic site.
  • single nucleotide polymorphism preferably refers to a single nucleotide substitution.
  • the polymorphic site may be occupied by two different nucleotides.
  • polymorphism includes “biallelic marker”, which as used herein refers to a polymorphism having two alleles at a fairly high frequency in the population, preferably a single nucleotide polymorphism (SNP).
  • SNP single nucleotide polymorphism
  • the terms “biallelic marker” and “biallelic polymorphisms” may also be used interchangeably with the terms “single nucleotide polymorphism”.
  • a “polymorphism allele” refers to the nucleotide variants present at a polymorphic site.
  • the frequency of the less common allele of the polymorphisms of the present invention has been validated to be greater than 1%, preferably the frequency is greater than 10%, more preferably the frequency is at least 20% (i. e. heterozygosity rate of at least 0.32), even more preferably the frequency is at least 30% (i. e. heterozygosity rate of at least 0.42).
  • a polymorphism wherein the frequency of the less common allele is 30% or more is termed a "high quality polymorphism."
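The heterozygosity rate defined above, 2Pa(1-Pa), and the allele-frequency thresholds that follow from it can be checked with a few lines of code. This is a minimal sketch: the function names are hypothetical, and the value 0.34 is the allelic frequency reported above for SNP5.3, assumed here to be the frequency of the less common allele.

```python
def heterozygosity_rate(pa: float) -> float:
    """Expected heterozygosity 2*Pa*(1-Pa) of a biallelic marker,
    where Pa is the frequency of the least common allele."""
    if not 0.0 <= pa <= 0.5:
        raise ValueError("frequency of the least common allele must lie in [0, 0.5]")
    return 2.0 * pa * (1.0 - pa)


def is_high_quality(pa: float) -> bool:
    """True when the less common allele has a frequency of 30% or more."""
    return pa >= 0.30


for pa in (0.20, 0.30, 0.34):
    print(f"Pa={pa:.2f}  heterozygosity={heterozygosity_rate(pa):.4f}  "
          f"high quality={is_high_quality(pa)}")
```

Frequencies of 20% and 30% give heterozygosity rates of 0.32 and 0.42, matching the figures quoted in the text.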
  • a "promoter” refers to a DNA sequence recognized by the synthetic machinery of the cell required to initiate the specific transcription of a gene.
  • operably linked refers to a linkage of polynucleotide elements in a functional relationship.
  • a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the coding sequence.
  • two DNA molecules are said to be "operably linked" if the nature of the linkage between the two polynucleotides does not (1) result in the introduction of a frame-shift mutation or (2) interfere with the ability of the polynucleotide containing the promoter to direct the transcription of the coding polynucleotide.
  • upstream is used herein to refer to a location, which is toward the 5' end of the polynucleotide from a specific reference point.
  • “base paired” and “Watson & Crick base paired” are used interchangeably herein to refer to nucleotides which can be hydrogen bonded to one another by virtue of their sequence identities in a manner like that found in double-helical DNA, with thymine or uracil residues linked to adenine residues by two hydrogen bonds and cytosine and guanine residues linked by three hydrogen bonds.
  • an “epitope” refers to an antigenic determinant of a polypeptide.
  • An epitope can comprise as few as 3 amino acids in a spatial conformation which is unique to the epitope. Generally an epitope consists of at least 6 such amino acids, and more usually at least 8-10 such amino acids.
  • Methods for determining the amino acids which make up an epitope include x-ray crystallography, 2-dimensional nuclear magnetic resonance, and epitope mapping e.g. the Pepscan method described by H. Mario Geysen et al., 1984, Proc. Natl. Acad. Sci. U.S.A. 81:3998-4002; PCT Publication No. WO 84/03564; and PCT Publication No. WO 84/03506.
  • Kalpa-related polymorphism relates to one or a set of polymorphisms in linkage disequilibrium with the Kalpa gene.
  • the term Kalpa-related polymorphisms also encompasses polymorphisms located on SEQ ID No 2 or 3, or preferably the polymorphisms disclosed in Table 3 and as identified in SEQ ID Nos 5 to 23.
  • the preferred Kalpa protein-related polymorphisms alleles of the present invention can include each a Kalpa allele, the allele(s) described individually or in groups consisting of all the possible combinations of the alleles.
  • non-genic is used herein to describe genomic sequences, as well as polynucleotides and primers which occur outside the nucleotide positions shown in the human Kalpa genomic sequence of SEQ ID No 1.
  • genic is used herein to describe the Kalpa gene as well as polynucleotides and primers which do occur in the nucleotide positions shown in the human Kalpa genomic sequence of SEQ ID No 1.
  • Kalpa is a membrane protein involved in SREBP traffic
  • the molecular regulation of cellular sterol metabolism has been elucidated by Brown and Goldstein and their colleagues.
  • the LDLR gene promoter contains a sterol response element (SRE) that is required for regulating transcription of the gene encoding LDLR in response to cellular sterol content.
  • SREBP-1 and -2: two SRE-binding proteins that contain two transmembrane domains and are localized to the endoplasmic reticulum (ER).
  • SCAP: SREBP-cleavage activating protein
  • When cells are overloaded with sterols, SREBP in the SCAP complex is no longer accessible to S1P cleavage and remains membrane-bound. Conversely, when sterol concentrations are limiting, the current data suggest that the SCAP/SREBP complex can move to a post-ER compartment. At this location, the complex encounters S1P and S2P, leading to the release of soluble SREBP (reviewed by Brown and Goldstein, Proc. Natl. Acad. Sci. USA, 1999, 96(20):11041-8).
  • statins inhibit HMGCoA reductase, the rate-limiting enzyme in cholesterol biosynthesis, reducing cellular sterol content and thereby de-repressing the transport of the SCAP-SREBP complex and resulting in the upregulation of LDLR, which in turn transports blood LDL into the liver, reducing blood levels of LDL.
  • the Kalpa protein of the invention contains five transmembrane domains in the N-terminal part of the protein and a TPR motif.
  • the TPR motif may be involved in protein-protein interaction (e.g. interactions with Kalpa targets or Kalpa-target proteins). It has further been proposed that the TPR protein preferably interacts with WD-40 repeat proteins.
  • SCAP contains a WD-40 motif which is known to interact with the membrane-bound SREBP.
  • the invention provides assays based on interactions of the TPR motif of the Kalpa protein with a WD-40 domain.
  • the TPR motif, and hence the Kalpa protein, interacts with a SCAP polypeptide.
  • the Kalpa protein of the invention participates in the internalization and the transport of the precursor SREBPs from the ER to the Golgi, where S1P and S2P sequentially cleave the SREBPs and liberate the mature and active SREBP proteins.
  • the Kalpa protein also contains an aldo-keto reductase (AKR) domain.
  • the assays of the invention involve detecting aldo-keto reductase activity.
  • Aldo-keto reductases (AKRs) form an enzyme superfamily including aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases and dihydrodiol dehydrogenases. They are monomeric NADPH-dependent oxidoreductases, about 320 residues in size, with broad substrate specificities (Bohren et al., 1989, J. Biol. Chem. 264:9547-51; Jez et al., 1997, Biochem. J.
  • AKRs are found in mammals, amphibians, plants, yeast, protozoa and bacteria. They metabolize a diverse range of substrates, including aliphatic and aromatic aldehydes, monosaccharides, steroids, prostaglandins, polycyclic aromatic hydrocarbons and isoflavonoids. These enzymes catalyze the reduction of carbonyl-containing compounds, such as carbonyl-containing sugars and aromatic compounds, to the corresponding alcohols.
  • the aldo-keto reductase enzymes are structurally very similar.
  • One known reaction catalyzed by a family member, aldose reductase, is the reduction of circulating glucose to sorbitol, a hyperosmotic sugar, which is then further metabolized to fructose by sorbitol dehydrogenase. Under normal conditions, the reduction of glucose to sorbitol is a minor pathway.
  • In hyperglycemic states, however, the accumulation of sorbitol is implicated in the development of diabetic complications (OMIM *103880, aldo-keto reductase family 1, member B1). Members of this enzyme family are also highly expressed in some liver cancers (Cao et al., 1998, J. Biol. Chem. 273:11429-35). Similarly, the mammalian aldehyde reductases play a significant role in the metabolism of neurotransmitter aldehydes produced by monoamine oxidase, and aldehyde reductase inhibitors may have anti-depressant properties.
  • hydroxysteroid dehydrogenases of this superfamily have the potential to act as molecular switches, by converting potent steroid hormones into inactive metabolites, thereby regulating the amount of hormone that can bind and activate nuclear receptors.
  • Another member of the aldo-keto reductase family, the bile acid binding protein (AKR1C2, also named DD2 and 3α-HSD type III), has been isolated from the liver and shown to have high-affinity binding for bile (Hara et al., 1996, Biochem. J., 313:373-76). This AKR enzyme may assist in the rapid intracellular transport of bile acids from the sinusoidal to the canalicular pole of the cell.
  • An aldose reductase-like sequence similar to human and rat aldose reductase has also been reported in the mouse: the mouse vas deferens protein (MVDP), whose expression is regulated by cyclic AMP at the protein and mRNA levels (Aigueperse et al., 1999, J. Endocrinol., 160:147-54). Since aldose reductase is a major reductase for isocaproaldehyde, a product of side-chain cleavage of cholesterol, in human and animal adrenal glands, this may offer a clue to the function of MVDP in adrenals.
  • MVDP: mouse vas deferens protein
  • AKRs are potential therapeutic targets, and structure-based drug design may lead to compounds with the desired specificity and clinical efficacy.
  • the crystallographic structures of aldehyde and aldose reductase are known, as well as those of the aldose reductase homologues FR-1 and CHO reductase.
  • Current research is focused on the structural differences that allow subtle but distinct substrate specificities and on refining the catalytic mechanism through site-directed mutagenesis of active site residues.
  • the invention thus provides a method for the identification of aldo-keto reductase inhibitors and activators. Briefly, compounds to be tested are arrayed in the wells of a multi-well plate in varying concentrations along with an appropriate buffer and substrate.
  • Aldo-keto reductase activity is measured for each well, and the ability of each compound to inhibit the aldo-keto reductase activity can be determined, as well as the dose-response profiles. This assay can also be used to identify molecules which enhance aldo-keto reductase activity.
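The dose-response analysis outlined above can be illustrated with a short sketch. It assumes that each well yields a single activity readout and that an uninhibited control well is available; the function names, the example values and the log-linear interpolation used to estimate a half-maximal inhibitory concentration are illustrative assumptions, not a procedure specified in the text.

```python
import math


def percent_inhibition(activity: float, uninhibited_activity: float) -> float:
    """Percent inhibition of a well relative to an uninhibited control well."""
    if uninhibited_activity <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * (1.0 - activity / uninhibited_activity)


def interpolate_ic50(concentrations, inhibitions):
    """Estimate the concentration giving 50% inhibition.

    Uses log-linear interpolation between the two tested concentrations
    that bracket 50% inhibition. Concentrations must be positive.
    Returns None when 50% inhibition is not bracketed by the data.
    """
    points = sorted(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


if __name__ == "__main__":
    # Hypothetical readouts (arbitrary units) for one compound at four concentrations.
    control = 100.0
    doses = [0.1, 1.0, 10.0, 100.0]      # hypothetical concentrations, e.g. in micromolar
    readouts = [95.0, 80.0, 40.0, 10.0]  # hypothetical enzyme activity per well
    inhibitions = [percent_inhibition(r, control) for r in readouts]
    print(inhibitions, interpolate_ic50(doses, inhibitions))
```

A negative percent inhibition in this scheme would flag a compound that enhances activity, matching the use of the same assay to identify activators.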
  • aldo-keto reductase inhibitors are available and can be used, including: (a) hexoestrol analogues; (b) 17 beta-oestradiol; (c) phenolphthalein; (d) flufenamic acid; (e) 3,5,3',5'-tetraiodothyropropionic acid analogs.
  • DDH1 and DDH2: dihydrodiol dehydrogenases
  • CHDR: chlordecone reductase
  • a Kalpa activity refers to an activity exerted by a Kalpa polypeptide or nucleic acid molecule, or a biologically active fragment or homologue thereof as determined in vivo, or in vitro, according to standard techniques.
  • a Kalpa activity is a direct activity, such as an association with a Kalpa-target molecule, glycosyltransferase activity such as preferably an O-linked glycosylation activity (preferably O-GlcNAc transferase activity) and/or aldo-keto reductase activity.
  • a "Kalpa target molecule” is a molecule with which a Kalpa binds or interacts in nature, such that a Kalpa-mediated function is achieved.
  • a Kalpa target molecule can be another Kalpa protein or polypeptide which is substantially identical or which shares structural similarity (e.g. forming a dimer or multimer).
  • a Kalpa target molecule can be a non-Kalpa comprising protein molecule, or a non-self molecule, such as a polypeptide containing a WD-40 domain, or more preferably a SCAP protein.
  • Binding or interaction with a Kalpa target molecule or with other targets can be detected for example using a two hybrid-based assay in yeast to find drugs that disrupt interaction of the Kalpa bait with the target prey, or an in vitro interaction assay with recombinant Kalpa and target proteins.
  • a Kalpa activity may be an indirect activity, such as an activity mediated by interaction of the Kalpa with a Kalpa target molecule such that the target molecule modulates a downstream cellular activity (e.g., interaction of a Kalpa molecule with a Kalpa target molecule can modulate the activity of that target molecule on an intracellular signaling pathway).
  • interaction of Kalpa molecule with a SCAP protein may modulate a SREBP activity or localization and/or LDL receptor expression.
  • a Kalpa activity is detected by assessing any of the following activities: cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, triglycerides regulation, LDL/HDL ratio regulation; LDL-R expression, or any suitable therapeutic endpoint discussed herein in the section titled "Methods of Treatment”. Kalpa activity may be assessed either in vitro (cell or non-cell based) or in vivo depending on the assay type and format.
  • a Kalpa polypeptide, or a biologically active fragment or homologue thereof can be the minimum region of a polypeptide that is necessary and sufficient for activity.
  • a functional Kalpa protein may be only a small portion of the respective protein, about 10 amino acids to about 15 amino acids, or from about 20 amino acids to about 25 amino acids, or from about 30 amino acids to about 35 amino acids, or from about 40 amino acids to about 45 amino acids, or from about 50 amino acids to about 55 amino acids, or from about 60 amino acids to about 70 amino acids, or from about 80 amino acids to about 90 amino acids, or about 100 amino acids in length.
  • Kalpa or Kalpa polypeptide activity may require a larger portion of the native protein than may be defined by protein-protein interaction, DNA binding, cell assays or by sequence alignment.
  • the invention includes novel protein domains of the Kalpa protein, including methods of screening for modulators of Kalpa which act on the novel domains.
  • the invention thus encompasses a Kalpa polypeptide comprising a polypeptide having at least a Kalpa sequence in the protein or corresponding nucleic acid molecule, preferably a Kalpa sequence corresponding of SEQ ID No 4.
  • a Kalpa member may comprise an amino acid sequence of at least about 25, 30, 34, 40, 45, 50, 60, 70, 80 to 90 amino acid residues in length, of which at least about 50-80%, preferably at least about 60-10%, more preferably at least about 65%, 75% or 90% of the amino acid residues are identical or similar to the amino acids of a functional domain of Kalpa of SEQ ID No 4.
  • Isolated proteins of the present invention, preferably Kalpa polypeptides or biologically active fragments or homologues thereof, have an amino acid sequence sufficiently homologous to the respective amino acid sequence of SEQ ID No 4.
  • the term "sufficiently homologous" refers to a first amino acid or nucleotide sequence which contains a sufficient or minimum number of identical or equivalent (e.g., an amino acid residue which has a similar side chain) amino acid residues or nucleotides to a second amino acid or nucleotide sequence such that the first and second amino acid or nucleotide sequences share common structural domains or motifs and/or a common functional activity.
  • amino acid or nucleotide sequences which share common structural domains, have at least about 30-40% identity, preferably at least about 40-50% identity, more preferably at least about 50-60%, and even more preferably at least about 60-70%, 70-80%, 80%, 90%, 95%, 97%, 98%, 99% or 99.8% identity across the amino acid sequences of the domains, and contain at least one and preferably two structural domains or motifs, are defined herein as sufficiently homologous.
  • amino acid or nucleotide sequences which share at least about 30%, preferably at least about 40%, more preferably at least about 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99% or 99.8% identity and share a common functional activity are defined herein as sufficiently homologous.
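The identity thresholds above presuppose some way of scoring two sequences. The sketch below computes percent identity over a pair of sequences that have already been aligned (gaps written as "-"). The choice of denominator (all alignment columns except those where both sequences have a gap) is an assumption; other conventions exist, and the text does not specify one here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap ('-') are ignored; a column
    with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned peptide fragments, for illustration only.
    print(percent_identity("MKTLLV-AG", "MKSLLVQAG"))
```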
  • the invention encompasses any of the Kalpa polypeptides, as well as fragments thereof, nucleic acids complementary thereto and nucleic acids capable of hybridizing thereto under stringent conditions.
  • the genomic sequence of the Kalpa gene for use in accordance with the present invention is provided in SEQ ID No 1.
  • the Kalpa gene genomic sequence comprises exons and introns as well as sequences approximately 10 kb upstream and downstream of the first and last Kalpa exon.
  • the positions of the exons on the respective SEQ ID No for the Kalpa genes are provided in Table 1.
  • Particularly preferred genomic sequences of the Kalpa gene include isolated, purified, or recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of SEQ ID No 1.
  • Preferably said contiguous span comprises at least one single nucleotide polymorphism, or the complements thereof.
  • the present invention provides Kalpa intron and exon polynucleotide sequences including polymorphisms.
  • Particularly preferred polynucleotides of the present invention include purified, isolated or recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of a sequence of SEQ ID No 1 or the complements thereof, wherein said span includes a polymorphism.
  • said polymorphism is selected from the polymorphisms at the nucleotide positions on SEQ ID No 1 described in the column titled "Gene_position" in Table 3.
  • This column of Table SNP provides the position of each SNP of SEQ ID Nos 5 to 23 on the respective genomic sequence of SEQ ID No 1. It will be appreciated that either allele specified at position 25 in the respective SEQ ID No of SEQ ID Nos 5 to 23 may be present at the polymorphic base.
  • the nucleic acids defining the Kalpa gene intronic polynucleotides may be used as oligonucleotide primers or probes in order to detect the presence of a copy of a Kalpa gene in a test sample, or alternatively in order to amplify a target nucleotide sequence within the Kalpa sequences.
  • the genomic sequence of the Kalpa gene contains regulatory sequences both in the non-coding 5'-flanking region and in the non-coding 3'-flanking region that border the Kalpa transcribed region containing the exons of the gene.
  • the promoter activity of the regulatory regions contained in the Kalpa gene polynucleotide sequence of SEQ ID No 1 can be assessed by any known method. Methods for identifying the polynucleotide fragments of SEQ ID No 1 involved in the regulation of the expression of the Kalpa gene are well known to those skilled in the art (see Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1989).
  • the reporter gene, for example beta-galactosidase or chloramphenicol acetyl transferase
  • Genomic sequences located upstream of the first exon of a Kalpa gene may be cloned into any suitable promoter reporter vector, such as the pSEAP-Basic, pSEAP-Enhancer, pβgal-Basic, pβgal-Enhancer, or pEGFP-1 Promoter Reporter vectors available from Clontech, or the pGL2-basic or pGL3-basic promoterless luciferase reporter gene vectors from Promega.
  • Each of these promoter reporter vectors includes multiple cloning sites positioned upstream of a reporter gene encoding a readily assayable protein such as secreted alkaline phosphatase, luciferase, beta-galactosidase, or green fluorescent protein.
  • the sequences upstream of the first exon of a Kalpa gene are inserted into the cloning sites upstream of the reporter gene in both orientations and introduced into an appropriate host cell.
  • the level of reporter protein is assayed and compared to the level obtained with a vector lacking an insert in the cloning site.
  • the presence of an elevated expression level in the vector containing the insert with respect to the control vector indicates the presence of a promoter in the insert.
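The comparison between insert-containing and insert-free reporter vectors can be sketched as a simple fold-over-control calculation. The replicate readings, the function names and the two-fold cutoff below are hypothetical; the text only requires an "elevated expression level" and does not set a numerical threshold.

```python
from statistics import mean


def fold_over_control(insert_readings, control_readings) -> float:
    """Mean reporter signal with the insert divided by the mean signal of the empty vector."""
    control_mean = mean(control_readings)
    if control_mean <= 0:
        raise ValueError("control signal must be positive")
    return mean(insert_readings) / control_mean


def suggests_promoter(insert_readings, control_readings, min_fold: float = 2.0) -> bool:
    """True when the insert raises reporter signal by at least min_fold (hypothetical cutoff)."""
    return fold_over_control(insert_readings, control_readings) >= min_fold


if __name__ == "__main__":
    # Hypothetical triplicate reporter readings (arbitrary units).
    forward_insert = [300, 330, 270]
    reverse_insert = [120, 100, 110]
    empty_vector = [100, 110, 90]
    print(fold_over_control(forward_insert, empty_vector),
          suggests_promoter(forward_insert, empty_vector))
    print(fold_over_control(reverse_insert, empty_vector),
          suggests_promoter(reverse_insert, empty_vector))
```

Running the same calculation on each nested deletion fragment would show at which deletion the fold value drops, which is one way to read out the promoter boundaries.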
  • Promoter sequences within the 5' non-coding regions of the Kalpa gene may be further defined by constructing nested 5' and/or 3' deletions using conventional techniques such as Exonuclease III or appropriate restriction endonuclease digestion.
  • the resulting deletion fragments can be inserted into the promoter reporter vector to determine whether the deletion has reduced or obliterated promoter activity, such as described, for example, by Coles et al. (Hum. Mol. Genet., 7: 791-800, 1998). In this way, the boundaries of the promoter
  • the activity and the specificity of the promoter of a Kalpa gene can further be assessed by monitoring the expression level of a detectable polynucleotide operably linked to the respective Kalpa gene promoter in different types of cells and tissues.
  • the detectable polynucleotide may be either a polynucleotide that specifically hybridizes with a predefined oligonucleotide probe, or a polynucleotide encoding a detectable protein, including a Kalpa gene polypeptide or a fragment or a variant thereof. This type of assay is well known to those skilled in the art and is described in US 5,502,176, and US 5,266,488.
  • Polynucleotides carrying the regulatory elements located both at the 5' end and at the 3' end of the Kalpa gene coding region may be advantageously used to control the transcriptional and translational activity of a heterologous polynucleotide of interest, said polynucleotide being heterologous with regard to the Kalpa gene regulatory region.
  • a "biologically active" fragment of a Kalpa regulatory element, preferably an element comprised in SEQ ID No 1, according to the present invention is a polynucleotide comprising or alternatively consisting of a fragment of said polynucleotide which is functional as a regulatory region for expressing a recombinant polypeptide or a recombinant polynucleotide in a recombinant cell host.
  • a nucleic acid or polynucleotide is "functional" as a regulatory region for expressing a recombinant polypeptide or a recombinant polynucleotide if said regulatory polynucleotide contains nucleotide sequences which contain transcriptional and translational regulatory information, and such sequences are "operably linked" to nucleotide sequences which encode the desired polypeptide or the desired polynucleotide.
  • the regulatory polynucleotides according to the invention may be advantageously part of a recombinant expression vector that may be used to express a coding sequence in a desired host cell or host organism.
  • a further object of the invention consists of an isolated polynucleotide comprising: a) a nucleic acid comprising a regulatory nucleotide sequence selected from the group consisting of a nucleotide sequence comprising a polynucleotide of SEQ ID No 1; b) a polynucleotide encoding a desired polypeptide or a nucleic acid of interest, operably linked to the nucleic acid defined in (a) above.
  • the polypeptide encoded by the nucleic acid described above may be of various nature or origin, encompassing proteins of prokaryotic or eukaryotic origin.
  • among the polypeptides expressed under the control of a Kalpa gene regulatory region there may be cited bacterial, fungal or viral antigens.
  • the desired nucleic acids encoded by the above described polynucleotide may be complementary to a desired coding polynucleotide, for example to the Kalpa gene coding sequence, and thus useful as an antisense polynucleotide.
  • Such a polynucleotide may be included in a recombinant expression vector in order to express the desired polypeptide or the desired nucleic acid in a host cell or in a host organism.
  • cDNA sequences of the Kalpa gene: as mentioned, the invention relates to the use of Kalpa proteins.
  • Kalpa cDNAs for use in accordance with the present invention are described in SEQ ID No 2 or 3.
  • Figure 2 shows the location on the SEQ ID No 3 cDNA of Kalpa functional domains.
  • the Open Reading Frame encoding the respective Kalpa proteins spans from the nucleotide positions of SEQ ID No 2 or 3 as shown in Table 2.
  • Additional preferred cDNA polynucleotides of the invention include isolated, purified or recombinant polynucleotides comprising a contiguous span of at least 25, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, 500, or 1000 nucleotides from a sequence of SEQ ID No 2 or 3 and the complements thereof.
  • Preferably said contiguous span is a contiguous span selected from the group consisting of the sequences of nucleic acid positions 112 to 2493 of SEQ ID No 2 or of the sequences of nucleic acid positions 287 to 2664 of SEQ ID No 3, and the complements thereof.
  • Additional preferred polynucleotides include isolated, purified or recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 300, 400, 500, or 1000 nucleotides from a sequence of SEQ ID No 2 or 3, wherein said contiguous span comprises one of the alleles at a polymorphic base.
  • a Kalpa cDNA may comprise, consist or consist essentially of a functional domain referred to in Table 4.
  • Table 4 provides the position of the respective domain (referred to in Table 4 as "feature") on the cDNA of the respective gene.
  • Additional preferred cDNA polynucleotides of the invention include isolated, purified or recombinant polynucleotides comprising a contiguous span of at least 25, 30, 35, 40, 50, 75, 100, 150 or 200 nucleotides from a sequence selected from the group consisting of the positions on SEQ ID No 2 or 3 of the functional domains listed in Table 4, and the complements thereof.
  • the polynucleotide disclosed above that contains the coding sequence of the Kalpa gene of the invention may be expressed in a desired host cell or a desired host organism, when this polynucleotide is placed under the control of suitable expression signals.
  • the expression signals may be either the expression signals contained in the regulatory regions in the Kalpa gene of the invention or may be exogenous regulatory nucleic sequences.
  • Such a polynucleotide, when placed under the suitable expression signals, may also be inserted in a vector for its expression.
  • nucleic acid for use in accordance with the invention is a purified, isolated, or recombinant nucleic acid comprising the nucleotide sequence of SEQ ID No 2 or 3, complementary sequences thereto, and fragments thereof.
  • the invention also pertains to a purified or isolated nucleic acid comprising a polynucleotide having at least 70%, 80%, 85%, 90% or 95% nucleotide identity with a polynucleotide of SEQ ID No 2 or 3, or advantageously 99% nucleotide identity, preferably 99.5% nucleotide identity and most preferably 99.8% nucleotide identity with a polynucleotide of SEQ ID No 2 or 3, or a sequence complementary thereto or a biologically active fragment thereof.
  • Another object of the invention relates to purified, isolated or recombinant nucleic acids comprising a polynucleotide that hybridizes, under the stringent hybridization conditions defined herein, with a polynucleotide of SEQ ID No 2 or 3, or a sequence complementary thereto or a variant thereof or a biologically active fragment thereof. Also encompassed is a purified, isolated, or recombinant nucleic acid polynucleotide encoding a Kalpa polypeptide, as further described herein.
  • the invention pertains to purified or isolated nucleic acid molecules that encode a portion or variant of a Kalpa protein, wherein the portion or variant displays a Kalpa activity.
  • said portion or variant is a portion or variant of a naturally occurring full-length Kalpa comprising, consisting essentially of, or consisting of a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of SEQ ID No 2 or 3, wherein said nucleic acid encodes a portion or variant having a Kalpa activity described herein.
  • the invention relates to a polynucleotide encoding a portion consisting of 8-20, 20-50, 50-70, 60-100, 100 - 150, 150- 200, 200-250 or 250 - 350 amino acids, of SEQ ID No 4, or a variant thereof, wherein said portion displays a Kalpa activity described herein.
  • a Kalpa variant nucleic acid may, for example, encode a biologically active Kalpa comprising at least 1, 2, 3, 5, 10, 20 or 30 amino acid changes from the respective sequence selected from the group consisting of SEQ ID No 4, or may encode a biologically active Kalpa comprising at least 1%, 2%, 3%, 5%, 8%, 10% or 15% changes in amino acids from the respective sequence of SEQ ID No 4.
  • nucleic acid molecules which are complementary to Kalpa nucleic acids described herein.
  • a complementary nucleic acid is sufficiently complementary to the nucleotide respective sequence shown in SEQ ID No 2 or 3, such that it can hybridize to said nucleotide sequence shown in SEQ ID No 2 or 3, thereby forming a stable duplex.
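Complementarity in the Watson-Crick sense can be checked computationally by comparing one strand with the reverse complement of the other. The sketch below handles only perfect complementarity of unambiguous DNA bases; "sufficiently complementary" in the sense used above also covers partial matches that still form a stable duplex, which this example does not model. The function names and the example sequence are hypothetical.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_perfect_complement(strand_a: str, strand_b: str) -> bool:
    """True when strand_b is exactly the reverse complement of strand_a."""
    return reverse_complement(strand_a).upper() == strand_b.upper()


if __name__ == "__main__":
    probe = "ATGCCGTA"  # hypothetical sequence, not taken from the Sequence Listing
    print(reverse_complement(probe))
    print(is_perfect_complement(probe, reverse_complement(probe)))
```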
  • Another object of the invention is a purified, isolated, or recombinant nucleic acid encoding a Kalpa polypeptide comprising, consisting essentially of, or consisting of an amino acid sequence selected from the group consisting of SEQ ID No 4, or fragments thereof, wherein the isolated nucleic acid molecule encodes a functional domain of a Kalpa (e.g.
  • a Kalpa nucleic acid encodes a Kalpa polypeptide comprising at least two Kalpa functional domains, such as for example a glycosyltransferase domain and a TPR repeat domain, or an aldo-keto reductase domain and a TPR repeat domain, or at least two TPR repeat domains.
  • nucleic acids of the invention include isolated, purified, or recombinant Kalpa nucleic acids comprising, consisting essentially of, or consisting of a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or
  • a nucleic acid fragment encoding a biologically active portion of a Kalpa can be prepared by isolating a portion of a nucleotide sequence of SEQ ID No 2 or 3, which encodes a polypeptide having a Kalpa biological activity (the biological activities of the Kalpa described herein), expressing the encoded portion of the Kalpa (e.g., by recombinant expression in vitro or in vivo) and assessing the activity of the encoded portion of the Kalpa.
  • the invention further encompasses nucleic acid molecules that differ from the Kalpa nucleotide sequences of the invention due to degeneracy of the genetic code and encode the same Kalpa, or fragment thereof, of the invention.
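Degeneracy of the genetic code means that distinct nucleotide sequences can encode an identical polypeptide. The sketch below translates two short coding sequences with the standard genetic code and shows that they yield the same peptide. The sequences are hypothetical illustrations and are not taken from SEQ ID No 2 or 3.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a DNA coding sequence codon by codon, stopping at the first stop codon."""
    seq = coding_sequence.upper()
    peptide = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


if __name__ == "__main__":
    variant_1 = "ATGATTGTTTAA"  # hypothetical: Met-Ile-Val-stop
    variant_2 = "ATGATCGTGTGA"  # hypothetical synonymous variant
    print(translate(variant_1), translate(variant_2))
```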
  • the polymorphic base is present at nucleotide position 25 in each of the 49-mer nucleotide sequences of SEQ ID Nos 5 to 23.
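Because the polymorphic base sits at position 25 of a 49-mer, each marker sequence carries 24 flanking bases on either side of the SNP. The following sketch builds the sequence corresponding to each allele from such a marker; the example sequence and alleles are hypothetical placeholders, since the actual SEQ ID Nos 5 to 23 are not reproduced here.

```python
def allele_sequences(marker: str, alleles, snp_position: int = 25) -> dict:
    """Return one 49-mer per allele, substituting the base at the 1-based SNP position."""
    if len(marker) != 49:
        raise ValueError("expected a 49-mer marker sequence")
    idx = snp_position - 1  # convert the 1-based position to a 0-based index
    return {
        allele: marker[:idx] + allele + marker[idx + 1:]
        for allele in alleles
    }


if __name__ == "__main__":
    # Hypothetical marker: 24 flanking bases, the polymorphic base, 24 flanking bases.
    marker = "A" * 24 + "C" + "G" * 24
    for allele, seq in allele_sequences(marker, ("C", "T")).items():
        print(allele, seq)
```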
  • Table 3 summarizes the position of each SNP in a Kalpa gene, and provides the position of the SNP in the genomic sequence of the Kalpa gene by referring to the genomic sequences of SEQ ID No 1.
  • the polymorphism at nucleotide position 1696 of SEQ ID No 3 is an exonic non-conservative polymorphism changing Isoleucine to Valine at amino acid residue 438 of the depicted encoded protein or at amino acid position 472 of SEQ ID No 4.
  • the present invention encompasses polynucleotides for use as primers and probes in the methods of the invention.
  • These polynucleotides may consist of, consist essentially of, or comprise a contiguous span of nucleotides of a sequence from any sequence in the Sequence Listing as well as sequences which are complementary thereto ("complements thereof").
  • the "contiguous span” may be at least 25, 35, 40, 50, 70, 80, 100, 250, 500 or 1000 nucleotides in length, to the extent that a contiguous span of these lengths is consistent with the lengths of the particular Sequence ID.
  • flanking sequences surrounding the polymorphic bases are enumerated in the Sequence Listing. Rather, it will be appreciated that the flanking sequences surrounding the polymorphisms, or any of the primers or probes of the invention which are more distant from the markers, may be lengthened or shortened to any extent compatible with their intended use, and the present invention specifically contemplates such sequences. It will be appreciated that the polynucleotides referred to in the Sequence Listing may be of any length compatible with their intended use. Also, the flanking regions outside of the contiguous span need not be homologous to native flanking sequences which actually occur in human subjects.
  • nucleotide sequence which is compatible with the nucleotide's intended use is specifically contemplated.
  • the contiguous span may optionally include the Kalpa-related polymorphism in said sequence.
  • SNPs, or polymorphisms, generally consist of a polymorphism at one single base position. Each polymorphism therefore corresponds to two forms of a polynucleotide sequence which, when compared with one another, present a nucleotide modification at one position.
  • the nucleotide modification involves the substitution of one nucleotide for another.
  • either the original or the alternative allele of the polymorphisms disclosed in Table 3, or the first or second allele disclosed in SEQ ID Nos 5 to 23, may be specified as being present at the Kalpa-related SNP.
  • the original allele can be obtained from the genomic sequence of SEQ ID No 1.
  • the SNPs may be specified which consist of more complex polymorphisms including insertions/deletions of at least one nucleotide.
  • Preferred polynucleotides may consist of, consist essentially of, or comprise a contiguous span of nucleotides of a sequence from SEQ ID Nos 1, 2 or 3, or 5 to 23 as well as sequences which are complementary thereto.
  • the "contiguous span” may be at least 8, 10, 12, 15, 50, 70, 80, 100, 250, 500 or 1000 nucleotides in length, to the extent that a contiguous span of these lengths is consistent with the lengths of the particular Sequence ID.
  • the contiguous span may optionally comprise a polymorphism selected from the group consisting of polymorphisms of Table 3 or present in SEQ ID Nos 5 to 23.
  • the invention also relates to polynucleotides that hybridize, under conditions of high or intermediate stringency, to a polynucleotide of a sequence from any sequence in the Sequence Listing as well as sequences which are complementary thereto.
  • polynucleotides are at least 20, 25, 35, 40, 50, 70, 80, 100, 250, 500 or 1000 nucleotides in length, to the extent that a polynucleotide of these lengths is consistent with the lengths of the particular Sequence ID.
  • Preferred polynucleotides comprise a Kalpa-related polymorphism.
  • either allele (e.g. the original or the alternative allele) of the polymorphisms disclosed in may be specified as being present at the Kalpa-related polymorphism.
  • Particularly preferred polynucleotides of the invention include isolated, purified or recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of SEQ ID No 1, wherein said contiguous span comprises at least 1, 2, 3, 4, 5 or 10 of the nucleotide positions of polymorphic bases listed in Table 3, and the complements thereof. Said nucleotide positions may specify either of the alleles indicated in the sequence of SEQ ID NO 5 to 23 as corresponding to the particular polymorphism.
  • Additional preferred polynucleotides of the invention include isolated, purified or recombinant polynucleotides comprising a contiguous span of at least 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides from a sequence of SEQ ID No 2 or 3, wherein said contiguous span comprises at least one Kalpa-related polymorphism.
  • the present invention further embodies isolated, purified, and recombinant polynucleotides which encode polypeptides comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of SEQ ID No 4, wherein said contiguous span comprises at least one Kalpa-related polymorphism.
  • the primers of the present invention may be designed from the disclosed sequences for any method known in the art.
  • a preferred set of primers is fashioned such that the 3' end of the contiguous span of identity with the sequences of the Sequence Listing is present at the 3' end of the primer.
  • Such a configuration allows the 3' end of the primer to hybridize to a selected nucleic acid sequence and dramatically increases the efficiency of the primer for amplification or sequencing reactions.
  • the contiguous span is found in one of the nucleic sequences described in the sequence listing. Allele specific primers may be designed such that a polymorphism is at the 3' end of the contiguous span and the contiguous span is present at the 3' end of the primer.
  • Such allele specific primers tend to selectively prime an amplification or sequencing reaction so long as they are used with a nucleic acid sample that contains one of the two alleles present at a polymorphism.
  • the 3' end of primers of the invention may be located within or at least 2, 4, 6, 8, 10, 12, 15, 18, 20, 25, 49, 50, 100, 250, 500, or 1000, to the extent that this distance is consistent with the particular Sequence ID, nucleotides upstream of a Kalpa-related polymorphism in said sequence or at any other location which is appropriate for their intended use in sequencing, amplification or the location of novel sequences or markers.
  • a preferred set of amplification primers is derived from sequences described in SEQ ID Nos. 5 to 23. Primers with their 3' ends located 1 nucleotide upstream of a Kalpa-related polymorphism have a special utility as microsequencing assays.
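The primer geometry described above (a microsequencing primer whose 3' end lies 1 nucleotide upstream of the polymorphism, and allele specific primers whose 3' end is the polymorphic base itself) can be illustrated with a minimal sketch. It assumes a plain DNA string read 5' to 3' and a 0-based index for the polymorphic base; the function names, the example sequence and the primer lengths are hypothetical and are not taken from the Sequence Listing.

```python
from typing import Dict, Tuple


def microsequencing_primer(seq: str, snp_index: int, length: int = 20) -> str:
    """Return up to `length` bases ending 1 nucleotide upstream of the polymorphism."""
    if not 0 <= snp_index < len(seq):
        raise ValueError("snp_index lies outside the sequence")
    start = max(0, snp_index - length)
    # The slice stops just before the polymorphic base, so the primer's
    # 3' end is the nucleotide immediately upstream of it.
    return seq[start:snp_index]


def allele_specific_primers(
    seq: str, snp_index: int, alleles: Tuple[str, str], length: int = 20
) -> Dict[str, str]:
    """Return one primer per allele, each carrying the polymorphic base at its 3' end."""
    upstream = microsequencing_primer(seq, snp_index, length - 1)
    return {allele: upstream + allele for allele in alleles}


# Hypothetical sequence; "R" at index 12 marks an A/G polymorphism.
example = "ACGTTGCAAGCTRGGATCCA"
print(microsequencing_primer(example, 12, length=8))
print(allele_specific_primers(example, 12, ("A", "G"), length=8))
```

Only the upstream (forward strand) case is shown; a primer on the complementary strand would be derived the same way from the reverse complement.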
  • the probes of the present invention may be designed from the disclosed sequences for any method known in the art, particularly methods which allow for testing if a particular sequence or marker disclosed herein is present.
  • a preferred set of probes may be designed for use in the hybridization assays of the invention in any manner known in the art such that they selectively bind to one allele of a polymorphism, but not the other under any particular set of assay conditions.
  • Preferred hybridization probes may consist of, consist essentially of, or comprise a contiguous span which ranges in length from 8, 10, 12, 15, 18 or 20 to 25, 35, 40, 50, 60, 70, or 80 nucleotides, or be specified as being 12, 15, 18, 20, 25, 35, 40, or 50 nucleotides in length and including a Kalpa-related polymorphism of said sequence.
  • said polymorphism may be within 6, 5, 4, 3, 2, or 1 nucleotides of the center of the hybridization probe or at the center of said probe.
  • nucleotides in a polynucleotide with respect to the center of the polynucleotide is described herein in the following manner.
  • the nucleotide at an equal distance from the 3' and 5' ends of the polynucleotide is considered to be "at the center" of the polynucleotide, and any nucleotide immediately adjacent to the nucleotide at the center, or the nucleotide at the center itself, is considered to be "within 1 nucleotide of the center."
  • any of the five nucleotides positions in the middle of the polynucleotide would be considered to be within 2 nucleotides of the center, and so on.
  • the polymorphism or allele is "at the center" of a polynucleotide if the difference between the distance from the substituted, inserted, or deleted polynucleotides of the polymorphism and the 3' end of the polynucleotide, and the distance from the substituted, inserted, or deleted polynucleotides of the polymorphism and the 5' end of the polynucleotide is zero or one nucleotide.
  • If the difference is 0 to 3, the polymorphism is considered to be "within 1 nucleotide of the center." If the difference is 0 to 5, the polymorphism is considered to be "within 2 nucleotides of the center." If the difference is 0 to 7, the polymorphism is considered to be "within 3 nucleotides of the center," and so on.
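The centering rule above reduces to simple arithmetic on the two end distances. The following sketch is one possible reading of it: position 0 is assumed to be the 5' end, the polymorphism occupies the inclusive 0-based range `start..end`, and the pattern "difference of 0 to 5 is within 2, 0 to 7 is within 3" is extended to "difference of at most 2k + 1 is within k". The function name and the indexing convention are assumptions made for illustration.

```python
from typing import Optional


def center_offset(length: int, start: int, end: Optional[int] = None) -> int:
    """Smallest k for which the polymorphism is 'within k nucleotides of the center'.

    A return value of 0 means the polymorphism is 'at the center'.
    """
    if end is None:
        end = start  # single-base polymorphism
    if not 0 <= start <= end < length:
        raise ValueError("polymorphism must lie inside the polynucleotide")
    dist_5 = start               # nucleotides between the 5' end and the polymorphism
    dist_3 = length - 1 - end    # nucleotides between the polymorphism and the 3' end
    diff = abs(dist_5 - dist_3)
    # diff 0..1 -> 0 (at the center), 2..3 -> 1, 4..5 -> 2, 6..7 -> 3, ...
    return diff // 2


# A 25-mer probe with a single-base polymorphism at index 12 is centered.
print(center_offset(25, 12))
# The same probe with the polymorphism at index 9 is within 3 nucleotides of the center.
print(center_offset(25, 9))
```

For an even-length polynucleotide the two middle positions both give a difference of one, so both count as "at the center", which matches the "zero or one nucleotide" wording.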
  • any of the polynucleotides of the present invention can be labeled, if desired, by incorporating a label detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means.
  • useful labels include radioactive substances, fluorescent dyes or biotin.
  • polynucleotides are labeled at their 3' and 5' ends.
  • a label can also be used to capture the primer, so as to facilitate the immobilization of either the primer or a primer extension product, such as amplified DNA, on a solid support.
  • a capture label is attached to the primers or probes and can be a specific binding member which forms a binding pair with the solid phase reagent's specific binding member (e. g. biotin and streptavidin). Therefore depending upon the type of label carried by a polynucleotide or a probe, it may be employed to capture or to detect the target DNA. Further, it will be understood that the polynucleotides, primers or probes provided herein, may, themselves, serve as the capture label.
  • a solid phase reagent's binding member is a nucleic acid sequence
  • it may be selected such that it binds a complementary portion of a primer or probe to thereby immobilize the primer or probe to the solid phase.
  • a polynucleotide probe itself serves as the binding member
  • the probe will contain a sequence or "tail" that is not complementary to the target.
  • a polynucleotide primer itself serves as the capture label
  • at least a portion of the primer will be free to hybridize with a nucleic acid on a solid phase.
  • DNA Labeling techniques are well known to the skilled technician.
  • any of the polynucleotides, primers and probes of the present invention can be conveniently immobilized on a solid support.
  • Solid supports are known to those skilled in the art and include the walls of wells of a reaction tray, test tubes, polystyrene beads, magnetic beads, nitrocellulose strips, membranes, microparticles such as latex particles, sheep (or other animal) red blood cells, duracytes and others.
  • the solid support is not critical and can be selected by one skilled in the art.
  • latex particles, microparticles, magnetic or nonmagnetic beads, membranes, plastic tubes, walls of microtiter wells, glass or silicon chips, sheep (or other suitable animal's) red blood cells and duracytes are all suitable examples.
  • a solid support refers to any material which is insoluble, or can be made insoluble by a subsequent reaction.
  • the solid support can be chosen for its intrinsic ability to attract and immobilize the capture reagent.
  • the solid phase can retain an additional receptor which has the ability to attract and immobilize the capture reagent.
  • the additional receptor can include a charged substance that is oppositely charged with respect to the capture reagent itself or to a charged substance conjugated to the capture reagent.
  • the receptor molecule can be any specific binding member which is immobilized upon (attached to) the solid support and which has the ability to immobilize the capture reagent through a specific binding reaction.
  • the receptor molecule enables the indirect binding of the capture reagent to a solid support material before the performance of the assay or during the performance of the assay.
  • the solid phase thus can be a plastic, derivatized plastic, magnetic or non-magnetic metal, glass or silicon surface of a test tube, microtiter well, sheet, bead, microparticle, chip, sheep (or other suitable animal's) red blood cells, duracytes and other configurations known to those of ordinary skill in the art.
  • polynucleotides of the invention can be attached to or immobilized on a solid support individually or in groups of at least 2, 5, 8, 10, 12, 15, 20, or 25 distinct polynucleotides of the invention on a single solid support.
  • polynucleotides other than those of the invention may be attached to the same solid support as one or more polynucleotides of the invention.
  • any polynucleotide provided herein may be attached in overlapping areas or at random locations on the solid support.
  • the polynucleotides of the invention may be attached in an ordered array wherein each polynucleotide is attached to a distinct region of the solid support which does not overlap with the attachment site of any other polynucleotide.
  • such an ordered array of polynucleotides is designed to be "addressable" where the distinct locations are recorded and can be accessed as part of an assay procedure.
  • Addressable polynucleotide arrays typically comprise a plurality of different oligonucleotide probes that are coupled to a surface of a substrate in different known locations.
  • VLSIPS (Very Large Scale Immobilized Polymer Synthesis)
  • Oligonucleotide arrays may comprise at least one of the sequences selected from the group consisting of SEQ ID Nos. 1, 2 or 3 or 5 to 23, and the sequences complementary thereto or a fragment thereof of at least 8, 10, 12, 15, 18, 20, 25, 35, 40, 50, 70, 80, 100, 250, 500 or 1000 consecutive nucleotides, to the extent that fragments of these lengths are consistent with the lengths of the particular Sequence ID, for determining whether a sample contains one or more alleles of the polymorphisms of the present invention. Oligonucleotide arrays may also comprise at least one of the sequences selected from the group consisting of SEQ ID Nos.
  • arrays may also comprise at least one of the sequences selected from the group consisting of SEQ ID Nos.
  • the oligonucleotide array may comprise at least one of the sequences selected from the group consisting of SEQ ID Nos.
  • the present invention also encompasses diagnostic kits comprising one or more polynucleotides of the invention, optionally with a portion or all of the necessary reagents and instructions for genotyping a test subject by determining the identity of a nucleotide at a Kalpa-related polymorphism.
  • the polynucleotides of a kit may optionally be attached to a solid support, or be part of an array or addressable array of polynucleotides.
  • the kit may provide for the determination of the identity of the nucleotide at a marker position by any method known in the art including, but not limited to, a sequencing assay method, a microsequencing assay method, a hybridization assay method, an allele specific amplification method, or a mismatch detection assay based on polymerases and/or ligases.
  • such a kit may include instructions for scoring the results of the determination with respect to the test subject's risk of contracting a disease involving cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, triglycerides regulation and/or LDL/HDL ratio regulation, or likely response to an agent acting on cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, triglycerides regulation and/or LDL/HDL ratio regulation, or chances of suffering from side effects of an agent acting on cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, triglycerides regulation and/or LDL/HDL ratio regulation.
  • diseases include heart disease, coronary artery disease, myocardial infarct, and lipid related metabolic disorders such as the dysmetabolic syndrome, obesity and type II diabetes.
  • nucleotide can be adenine, guanine, cytosine or thymine.
  • Kalpa polypeptides and their use in drug screening assays and therapy for cholesterol lowering are described.
  • the invention also relates to polypeptides encoded by the polynucleotides of the invention.
  • the invention relates to Kalpa from humans, including isolated or purified Kalpa consisting of, consisting essentially of, or comprising the sequence of SEQ ID No 4.
  • the invention concerns the polypeptides encoded by a nucleotide sequence of SEQ ID No 2 or 3, a complementary sequence thereof and a fragment thereof.
  • the present invention embodies isolated, purified, and recombinant polypeptides comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, 100, 150, 200, 300 or 500 amino acids, of SEQ ID No 4.
  • the contiguous stretch of amino acids comprises the site of a mutation or functional mutation, including a deletion, addition, swap or truncation of the amino acids in the Kalpa sequence.
  • a Kalpa protein of the invention may comprise, consist or consist essentially of a functional domain referred to in Table 4 or Table B. Table 4 provides the positions of the respective domain (referred to in Table 4 as "feature") on the amino acid sequences of SEQ ID No 4.
  • Polypeptides of the invention include isolated, purified, and recombinant polypeptides comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, 100 amino acids of a sequence selected from the group consisting of the positions of amino acid positions 399 to 414 and 501 to 772 or of the positions of amino acid positions of the functional domains listed in Table 4 on SEQ ID No 4.
  • Kalpa polypeptides at least 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99% or 99.8% identical to the amino acid sequences of SEQ ID No 4, or to a fragment or functional domain thereof.
  • the functional domain is a functional domain listed in [Table 4].
  • the Kalpa protein has 10 transmembrane domains.
  • the invention thus encompasses isolated, purified, and recombinant polypeptides comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, 100 amino acids of a sequence selected from the group consisting of transmembrane and extracellular domains of amino acid positions 1 to 129 (outside); 130 to 152 (Tmhelix); 153 to 184 (outside); 185 to 207 (Tmhelix); 208 to 221 (outside); 222 to 239 (Tmhelix); 240 to 245 (outside); 246 to 268 (Tmhelix); 269 to 287 (outside); 288 to 307 (Tmhelix); 308 to 336 (outside); 337 to 356 (Tmhelix); 357 to 370 (outside); 371 to 393 (Tmhelix); 394 to 404 (outside);
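The domain boundaries listed above lend themselves to a simple lookup table. The sketch below encodes only the ranges that appear in the passage (the list breaks off after position 404, so later domains are deliberately not filled in) and returns the region containing a given 1-based amino acid position of SEQ ID No 4. The names `TOPOLOGY` and `region_at` are hypothetical.

```python
from bisect import bisect_right
from typing import Optional, Tuple

# (first position, last position, label), 1-based and inclusive, as listed above.
TOPOLOGY = [
    (1, 129, "outside"), (130, 152, "Tmhelix"),
    (153, 184, "outside"), (185, 207, "Tmhelix"),
    (208, 221, "outside"), (222, 239, "Tmhelix"),
    (240, 245, "outside"), (246, 268, "Tmhelix"),
    (269, 287, "outside"), (288, 307, "Tmhelix"),
    (308, 336, "outside"), (337, 356, "Tmhelix"),
    (357, 370, "outside"), (371, 393, "Tmhelix"),
    (394, 404, "outside"),
]
_STARTS = [start for start, _, _ in TOPOLOGY]


def region_at(position: int) -> Optional[Tuple[int, int, str]]:
    """Return the listed region containing `position`, or None if it is not covered."""
    i = bisect_right(_STARTS, position) - 1
    if i < 0:
        return None
    start, end, label = TOPOLOGY[i]
    return (start, end, label) if start <= position <= end else None


print(region_at(200))   # falls in a listed transmembrane helix
print(region_at(405))   # beyond the ranges recited in the passage
```

A table like this makes it easy to check whether a candidate contiguous span falls entirely inside one listed domain before it is synthesized or expressed.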
  • Kalpa can be isolated from cells or tissue sources by an appropriate purification scheme using standard protein purification techniques.
  • Kalpa are produced by recombinant DNA techniques.
  • a Kalpa or polypeptide can be synthesized chemically using standard peptide synthesis techniques.
  • Biologically active portions of a Kalpa include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the Kalpa, e.g., an amino acid sequence shown in SEQ ID No 4, which include fewer amino acids than the respective full length Kalpa, and exhibit at least one activity of the Kalpa.
  • the present invention also embodies isolated, purified, and recombinant portions or fragments of a Kalpa polypeptide comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, 100, 150, 200, 300 or 500 amino acids, to the extent that said span is consistent with the particular SEQ ID NO, of a sequence selected from the group consisting of SEQ ID No 4. Also encompassed are Kalpa polypeptides which comprise between 10 and 20, between 20 and 50, between 30 and 60, between 50 and 100, or between 100 and 200 amino acids of a sequence selected from the group consisting of SEQ ID No 4.
  • the contiguous stretch of amino acids comprises the site of a mutation or functional mutation, including a deletion, addition, swap or truncation of the amino acids in the Kalpa sequence.
  • a biologically active Kalpa may, for example, comprise at least 1, 2, 3, 5, 10, 20 or 30 amino acid changes from the sequence of SEQ ID No 4, or may encode a biologically active Kalpa comprising at least 1%, 2%, 3%, 5%, 8%, 10% or 15% changes in amino acids from the sequence of SEQ ID No 4.
  • the invention further provides methods of testing the activity of, or obtaining, functional fragments and variants of Kalpa nucleotide sequences involving providing a variant or modified Kalpa nucleic acid and assessing whether a polypeptide encoded thereby displays a Kalpa activity of the invention.
  • a method of assessing the function of a Kalpa polypeptide comprising: (a) providing a Kalpa polypeptide, or a biologically active fragment or homologue thereof; and (b) testing said Kalpa polypeptide, or a biologically active fragment or homologue thereof, for a Kalpa activity.
  • any suitable format may be used, including cell free, cell-based and in vivo formats.
  • said assay may comprise expressing a Kalpa nucleic acid in a host cell, and observing Kalpa activity in said cell.
  • a Kalpa polypeptide, or a biologically active fragment or homologue thereof is introduced to a cell, and a Kalpa activity is observed.
  • Kalpa activity may be any activity as described herein.
  • variants including 1) one in which one or more of the amino acid residues are substituted with a conserved or non-conserved amino acid residue and such substituted amino acid residue may or may not be one encoded by the genetic code, or 2) one in which one or more of the amino acid residues includes a substituent group, or 3) one in which the mutated Kalpa polypeptide is fused with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol), or 4) one in which the additional amino acids are fused to the mutated Kalpa polypeptide, such as a leader or secretory sequence or a sequence which is employed for purification of the mutated Kalpa polypeptide or a preprotein sequence.
  • variants are deemed to be within the scope of those skilled in the art.
  • nucleotide substitutions leading to amino acid substitutions can be made in the sequence of SEQ ID No 2 or 3 that do not substantially change the biological activity of the protein.
  • amino acid residues that are conserved among the homologs of the respective Kalpa of the present invention are predicted to be less amenable to alteration.
  • additional conserved amino acid residues may be amino acids that are conserved between the proteins or variants related to the respective Kalpa.
  • the invention pertains to nucleic acid molecules encoding Kalpa polypeptides, or biologically active fragments or homologues thereof that contain changes in amino acid residues that are not essential for activity. Such Kalpa differ in amino acid sequence from SEQ ID No 4 yet retain biological activity.
  • the isolated nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the protein comprises an amino acid sequence at least about 60% homologous to an amino acid sequence of SEQ ID NO 4.
  • the protein encoded by the nucleic acid molecule is at least about 65-70% homologous to an amino acid sequence of SEQ ID NO 4, more preferably sharing at least about 75-80% identity with an amino acid sequence of SEQ ID NO 4, even more preferably sharing at least about 85%, 90%, 92%, 95%, 97%, 98%, 99% or 99.8% identity with an amino acid sequence selected from the group consisting of SEQ ID NO 4.
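The identity thresholds recited above presuppose some way of scoring two sequences against each other. As a rough sketch, assuming the two sequences have already been aligned to equal length with "-" as the gap character, percent identity can be taken as the fraction of aligned columns holding the same residue. This is only one of several conventions in use (others divide by the shorter sequence or ignore gapped columns), and the function names and the threshold tuple are illustrative.

```python
from typing import Optional

# Lower bounds recited in the passage above (60%, 65-70%, 75-80%, 85% ... 99.8%).
THRESHOLDS = (60, 65, 75, 85, 90, 92, 95, 97, 98, 99, 99.8)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns with identical residues ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Columns where both sequences have a gap carry no information and are skipped.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float) -> Optional[float]:
    """Return the largest recited threshold that `identity` reaches, if any."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


# Hypothetical 10-residue peptides differing at one position.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
print(identity, highest_threshold_met(identity))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch covers only the scoring step.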
  • the invention pertains to nucleic acid molecules encoding Kalpa that contain changes in amino acid residues that result in increased biological activity, or a modified biological activity.
  • the invention pertains to nucleic acid molecules encoding Kalpa that contain changes in amino acid residues that are essential for a Kalpa activity. Such Kalpa differ in amino acid sequence from SEQ ID NO 4 and display reduced or essentially lack one or more Kalpa biological activities.
  • the invention also encompasses a Kalpa polypeptide, or a biologically active fragment or homologue thereof which may be useful as dominant negative mutant of a Kalpa polypeptide.
  • An isolated nucleic acid molecule encoding a Kalpa polypeptide, or a biologically active fragment or homologue thereof homologous to a protein of any one of SEQ ID NO 4 can be created by introducing one or more nucleotide substitutions, additions or deletions into the nucleotide sequence of SEQ ID NO 4 such that one or more amino acid substitutions, additions or deletions are introduced into the encoded protein. Mutations can be introduced into any of SEQ ID NO 4, by standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis. For example, conservative amino acid substitutions may be made at one or more predicted non-essential amino acid residues.
  • a “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain.
  • Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine).
  • a predicted nonessential amino acid residue in a Kalpa polypeptide, or a biologically active fragment or homologue thereof may be replaced with another amino acid residue from the same side chain family.
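The side chain families listed above can be written down directly as sets of one-letter amino acid codes, which gives a quick test for whether a proposed substitution stays within a family. The sketch below follows the family membership exactly as recited in the passage (including residues that belong to more than one family); the dictionary and function names are hypothetical.

```python
from typing import List

# One-letter codes for the families recited above.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),  # glycine ... cysteine
    "nonpolar": set("AVLIPFMW"),        # alanine ... tryptophan
    "beta-branched": set("TVI"),        # threonine, valine, isoleucine
    "aromatic": set("YFWH"),            # tyrosine, phenylalanine, tryptophan, histidine
}


def shared_families(original: str, replacement: str) -> List[str]:
    """Names of the families that contain both residues, in alphabetical order."""
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if original in members and replacement in members
    )


def is_conservative(original: str, replacement: str) -> bool:
    """True if the residues differ but share at least one side chain family."""
    return original != replacement and bool(shared_families(original, replacement))


print(is_conservative("K", "R"))   # both basic
print(is_conservative("D", "K"))   # acidic versus basic
print(shared_families("T", "V"))
```

Because some residues sit in two families, a pair such as tyrosine and phenylalanine counts as conservative through the aromatic family even though the two fall in different polarity families.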
  • mutations can be introduced randomly along all or part of a Kalpa coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for Kalpa biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO 4, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
  • a mutant Kalpa polypeptide, or a biologically active fragment or homologue thereof encoded by a Kalpa polypeptide, or a biologically active fragment or homolog thereof can be assayed for a Kalpa activity in any suitable assay, examples of which are provided herein.
  • the invention also provides Kalpa chimeric or fusion proteins.
  • As used herein, a Kalpa "chimeric protein" or "fusion protein" comprises a Kalpa polypeptide of the invention operatively linked, preferably fused in frame, to a non-Kalpa or non-Kalpa domain polypeptide.
  • a Kalpa fusion protein comprises at least one biologically active portion of a Kalpa.
  • a Kalpa fusion protein comprises at least two biologically active portions of a Kalpa.
  • the fusion protein is a GST-Kalpa fusion protein in which the Kalpa sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant Kalpa polypeptides.
  • the fusion protein is a Kalpa containing a heterologous signal sequence at its N-terminus, such as for example to allow for a desired cellular localization in a certain host cell.
  • the Kalpa fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. Moreover, the Kalpa fusion proteins of the invention can be used as immunogens to produce anti-Kalpa antibodies in a subject, to purify Kalpa ligands and in screening assays to identify molecules which inhibit the interaction of a Kalpa with a Kalpa-target molecule.
  • isolated peptidyl portions of the subject Kalpa can also be obtained by screening peptides recombinantly produced from the corresponding fragment of the nucleic acid encoding such peptides.
  • fragments can be chemically synthesized using techniques known in the art such as conventional Merrifield solid phase f-Moc or t-Boc chemistry.
  • a Kalpa of the present invention may be arbitrarily divided into fragments of desired length with no overlap of the fragments, or preferably divided into overlapping fragments of a desired length.
  • the fragments can be produced (recombinantly or by chemical synthesis) and tested to identify those peptidyl fragments which can function as either agonists or antagonists of a Kalpa activity, such as by microinjection assays or in vitro protein binding assays.
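Dividing a sequence into fragments of a desired length, with or without overlap, amounts to a sliding window. In the sketch below a step equal to the fragment length gives non-overlapping fragments and a smaller step gives overlapping ones; the function name, the handling of a short final fragment, and the example peptide are assumptions made for illustration.

```python
from typing import List, Optional


def split_into_fragments(seq: str, length: int, step: Optional[int] = None) -> List[str]:
    """Split `seq` into fragments of `length` residues, advancing `step` residues each time.

    step == length (the default) gives fragments with no overlap;
    step < length gives overlapping fragments. The last fragment may be
    shorter than `length` when the sequence does not divide evenly.
    """
    if step is None:
        step = length
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    fragments = []
    for i in range(0, len(seq), step):
        fragments.append(seq[i:i + length])
        if i + length >= len(seq):
            break  # this fragment already reaches the C-terminal end
    return fragments


# Hypothetical 10-residue peptide.
print(split_into_fragments("MKTAYIAKQR", 4))          # no overlap
print(split_into_fragments("MKTAYIAKQR", 4, step=2))  # 2-residue overlap
```

Overlapping fragments are the safer choice for mapping an activity, since a functional region that straddles a boundary in the non-overlapping set remains intact in at least one overlapping fragment.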
  • peptidyl portions of a Kalpa, a Kalpa target binding region can be tested for Kalpa activity by expression as thioredoxin fusion proteins, each of which contains a discrete fragment of the Kalpa (see, for example, U.S. Patents 5,270,181 and 5,292,646; and PCT publication WO94/02502, the disclosures of which are incorporated herein by reference).
  • the present invention also pertains to variants of a Kalpa protein which function as either Kalpa mimetics or as Kalpa inhibitors.
  • Variants of a Kalpa protein can be generated by mutagenesis, e.g., discrete point mutation or truncation of a Kalpa protein.
  • An agonist of a Kalpa protein can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a Kalpa protein.
  • An antagonist of a Kalpa protein can inhibit one or more of the activities of the naturally occurring form of the Kalpa protein by, for example, competitively inhibiting the association of a Kalpa with a Kalpa target molecule.
  • specific biological effects can be elicited by treatment with a variant of limited function.
  • variants of a Kalpa which function as either Kalpa agonists (mimetics) or as Kalpa antagonists can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a Kalpa for Kalpa agonist or antagonist activity.
  • a variegated library of Kalpa variants is generated by combinatorial mutagenesis at the nucleic acid level and is encoded by a variegated gene library.
  • a variegated library of Kalpa variants can be produced by, for example, enzymatically ligating a mixture of synthetic oligonucleotides into gene sequences such that a degenerate set of potential Kalpa sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage display) containing the set of Kalpa sequences therein.
  • libraries of fragments of a Kalpa coding sequence can be used to generate a variegated population of Kalpa fragments for screening and subsequent selection of variants of a Kalpa.
  • a library of coding sequence fragments can be generated by treating a double stranded PCR fragment of a Kalpa coding sequence with a nuclease under conditions wherein nicking occurs only about once per molecule, denaturing the double stranded DNA, renaturing the DNA to form double stranded DNA which can include sense/antisense pairs from different nicked products, removing single stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the resulting fragment library into an expression vector.
  • an expression library can be derived which encodes N-terminal, C-terminal and internal fragments of various sizes of the Kalpa.
  • Modified Kalpa can be used for such purposes as enhancing therapeutic or prophylactic efficacy, or stability (e.g., ex vivo shelf life and resistance to proteolytic degradation in vivo).
  • modified peptides when designed to retain at least one activity of the naturally occurring form of the protein, are considered functional equivalents of the Kalpa described in more detail herein.
  • modified peptide can be produced, for instance, by amino acid substitution, deletion, or addition. Whether a change in the amino acid sequence of a peptide results in a functional Kalpa homolog (e.g.
  • This invention further contemplates a method of generating sets of combinatorial mutants of the presently disclosed Kalpa, as well as truncation and fragmentation mutants, and is especially useful for identifying potential variant sequences which are functional in binding to a Kalpa-target protein but differ from a wild-type form of the protein by, for example, efficacy, potency and/or intracellular half-life.
  • One purpose for screening such combinatorial libraries is, for example, to isolate novel Kalpa homologs which function as either an agonist or an antagonist of the biological activities of the wild-type protein, or alternatively, possess novel activities altogether.
  • mutagenesis can give rise to Kalpa homologs which have intracellular half-lives dramatically different than the corresponding wild-type protein.
  • the altered protein can be rendered either more stable or less stable to proteolytic degradation or other cellular processes which result in destruction of, or otherwise inactivation of, a Kalpa.
  • Such Kalpa homologs, and the genes which encode them can be utilized to alter the envelope of expression for a particular recombinant Kalpa by modulating the half-life of the recombinant protein.
  • a short half-life can give rise to more transient biological effects associated with a particular recombinant Kalpa and, when part of an inducible expression system, can allow tighter control of recombinant protein levels within a cell.
  • such proteins, and particularly their recombinant nucleic acid constructs can be used in gene therapy protocols.
  • the amino acid sequences for a population of Kalpa homologs or other related proteins are aligned, preferably to promote the highest homology possible.
  • a population of variants can include, for example Kalpa homologs from one or more species, or Kalpa homologs from the same species but which differ due to mutation.
  • Amino acids which appear at each position of the aligned sequences are selected to create a degenerate set of combinatorial sequences.
  • the library of potential Kalpa homologs can be generated from a degenerate oligonucleotide sequence.
  • Synthesis of a degenerate gene sequence can be carried out in an automatic DNA synthesizer, and the synthetic genes can then be ligated into an appropriate gene for expression.
  • the purpose of a degenerate set of genes is to provide, in one mixture, all of the sequences encoding the desired set of potential Kalpa sequences.
  • the synthesis of degenerate oligonucleotides is well known in the art (see, for example, Narang, SA (1983) Tetrahedron 39:3; Itakura et al. (1981) Recombinant DNA, Proc 3rd Cleveland Sympos. Macromolecules, ed. AG Walton, Amsterdam: Elsevier pp. 273-289; Itakura et al. (1984) Annu. Rev.
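The alignment-based construction of a degenerate set described above can be sketched as follows. This is a minimal illustration at the amino acid level, assuming pre-aligned, gap-free sequences of equal length; the function names and the toy sequences are hypothetical and do not come from the specification.

```python
from itertools import product


def degenerate_positions(aligned):
    """Collect the residues observed at each position of the alignment."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [sorted(set(column)) for column in zip(*aligned)]


def library_size(positions):
    """Number of distinct sequences encoded by the degenerate set."""
    size = 1
    for residues in positions:
        size *= len(residues)
    return size


def enumerate_library(positions):
    """Expand the degenerate set into every combinatorial sequence."""
    return ["".join(combo) for combo in product(*positions)]


if __name__ == "__main__":
    homologs = ["MKLV", "MRLV", "MKIV"]  # toy alignment, not real Kalpa data
    positions = degenerate_positions(homologs)
    print(positions)
    print(library_size(positions), enumerate_library(positions))
```

Note that the expanded set can contain sequences that are absent from the input alignment, which is the point of a combinatorial library. Explicit enumeration is only practical for small sets; for realistic alignments only the library size would normally be computed.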
  • Kalpa homologs can be generated and isolated from a library by screening using, for example, alanine scanning mutagenesis and the like (Ruf et al. (1994) Biochemistry 33:1565-1572; Wang et al. (1994) J Biol. Chem. 269:3095-3099; Balint et al. (1993) Gene 137:109-118; Grodberg et al. (1993) Eur. J Biochem.
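For orientation, alanine scanning replaces one residue at a time with alanine so that the contribution of each side chain can be tested. The sketch below only generates the variant sequences; the function name and the naming scheme for variants (for example "K2A") are illustrative assumptions rather than anything specified in the text.

```python
def alanine_scan(protein):
    """Return (variant_name, sequence) pairs with one residue changed to Ala.

    Positions that are already alanine are skipped, since substituting
    them would reproduce the starting sequence.
    """
    variants = []
    for index, residue in enumerate(protein):
        if residue == "A":
            continue
        name = f"{residue}{index + 1}A"  # e.g. K2A: Lys at position 2 to Ala
        variants.append((name, protein[:index] + "A" + protein[index + 1:]))
    return variants


if __name__ == "__main__":
    for name, sequence in alanine_scan("MKA"):
        print(name, sequence)
```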
  • a wide range of techniques are known in the art for screening gene products of combinatorial libraries made by point mutations, as well as for screening cDNA libraries for gene products having a certain property. Such techniques will be generally adaptable for rapid screening of the gene libraries generated by the combinatorial mutagenesis of Kalpa.
  • the most widely used techniques for screening large gene libraries typically comprise cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates relatively easy isolation of the vector encoding the gene whose product was detected.
  • each of the illustrative assays described below is amenable to high-throughput analysis as necessary to screen large numbers of degenerate Kalpa sequences created by combinatorial mutagenesis techniques.
  • the candidate gene products are displayed on the surface of a cell or viral particle, and the ability of particular cells or viral particles to bind a Kalpa target molecule (protein or DNA) via this gene product is detected in a "panning assay".
  • the gene library can be cloned into the gene for a surface membrane protein of a bacterial cell, and the resulting fusion protein detected by panning (Ladner et al., WO 88/06630; Fuchs et al.
  • Kalpa target can be used to score for potentially functional Kalpa homologs.
  • Cells can be visually inspected and separated under a fluorescence microscope, or, where the morphology of the cell permits, separated by a fluorescence-activated cell sorter.
  • the gene library is expressed as a fusion protein on the surface of a viral particle.
  • foreign peptide sequences can be expressed on the surface of infectious phage, thereby conferring two significant benefits.
  • E. coli filamentous phages M13, fd, and f1 are most often used in phage display libraries, as either of the phage gIII or gVIII coat proteins can be used to generate fusion proteins without disrupting the ultimate packaging of the viral particle (Ladner et al. PCT publication WO 90/02909; Garrard et al., PCT publication WO 92/09690; Marks et al. (1992) J Biol. Chem. 267:16007-16010; Griffiths et al. (1993) EMBO J 12:725-734; Clackson et al. (1991) Nature 352:624-628; and Barbas et al.
  • the recombinant phage antibody system (RPAS, Pharmacia Catalog number 27-9400-01) can be easily modified for use in expressing Kalpa combinatorial libraries, and the Kalpa phage library can be panned on immobilized Kalpa target molecule (glutathione immobilized Kalpa target-GST fusion proteins or immobilized DNA).
  • Successive rounds of phage amplification and panning can greatly enrich for Kalpa homologs which retain an ability to bind a Kalpa target and which can subsequently be screened further for biological activities in automated assays, in order to distinguish between agonists and antagonists.
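Why successive rounds of panning enrich for binders can be seen from a simple toy model, sketched below. It assumes that each round recovers binding phage some fixed factor more efficiently than non-binding phage; that factor, the starting fraction and the function name are hypothetical illustration values, not measurements or parameters from the specification.

```python
def binder_fraction_after_rounds(initial_fraction, enrichment_per_round, rounds):
    """Track the fraction of binding clones over successive panning rounds.

    Toy model: in each round, binders are recovered 'enrichment_per_round'
    times more efficiently than non-binders, and the recovered pool is then
    amplified without changing its composition.
    """
    fraction = initial_fraction
    history = [fraction]
    for _ in range(rounds):
        binders = enrichment_per_round * fraction
        fraction = binders / (binders + (1 - fraction))
        history.append(fraction)
    return history


if __name__ == "__main__":
    # Assumed values: one binder per million clones, 1000-fold enrichment.
    for round_number, value in enumerate(binder_fraction_after_rounds(1e-6, 1000.0, 3)):
        print(round_number, value)
```

Under these assumed values a clone present at one in a million dominates the pool after three rounds, which is the qualitative behaviour the passage describes.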
  • the invention also provides for identification and reduction to functional minimal size of the Kalpa domains, particularly a TRP repeat, glycosyl transferase or aldoketo-reductase domain of the subject Kalpa to generate mimetics, e.g. peptide or non-peptide agents, which are able to disrupt binding of a polypeptide of the present invention with a Kalpa target molecule (protein or DNA).
  • mutagenic techniques as described above are also useful to map the determinants of Kalpa which participate in protein-protein or protein-DNA interactions involved in, for example, binding to a Kalpa target protein or DNA.
  • the critical residues of a Kalpa which are involved in molecular recognition of the Kalpa target can be determined and used to generate Kalpa target- 13 P-derived peptidomimetics that competitively inhibit binding of the Kalpa to the Kalpa target.
  • peptidomimetic compounds can be generated which mimic those residues in binding to a Kalpa target, and which, by inhibiting binding of the Kalpa to the Kalpa target molecule, can interfere with the function of a Kalpa in transcriptional regulation of one or more genes.
  • non-hydrolyzable peptide analogs of such residues can be generated using retro-inverso peptides (e.g., see U.S. Patents 5,116,947 and 5,219,089; and Pallai et al. (1983) Int J Pept Protein Res 21:84-92), benzodiazepine (e.g., see Freidinger et al. in Peptides: Chemistry and Biology, G.R. Marshall ed., ESCOM Publisher: Leiden, Netherlands, 1988), azepine (e.g., see Huffman et al. in Peptides: Chemistry and Biology, G.R.
  • an isolated Kalpa, or a portion or fragment thereof, can be used as an immunogen to generate antibodies that bind Kalpa using standard techniques for polyclonal and monoclonal antibody preparation.
  • a full-length Kalpa can be used or, alternatively, the invention provides antigenic peptide fragments of Kalpa for use as immunogens. Any fragment of the Kalpa which contains at least one antigenic determinant may be used to generate antibodies.
  • the antigenic peptide of a Kalpa comprises at least 8 amino acid residues of an amino acid sequence selected from the group consisting of SEQ ID No 4 and encompasses an epitope of a Kalpa such that an antibody raised against the peptide forms a specific immune complex with a Kalpa.
  • the antigenic peptide comprises at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.
  • Preferred epitopes encompassed by the antigenic peptide are regions of a Kalpa that are located on the surface of the protein, e.g., hydrophilic regions.
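One common way to look for hydrophilic, probably surface-exposed stretches is to scan the sequence with a hydropathy scale. The sketch below uses the Kyte-Doolittle scale and an 8-residue window (matching the minimum peptide length mentioned above); the choice of scale, the scoring by window average and the function name are assumptions made for illustration and are not prescribed by the text.

```python
# Kyte-Doolittle hydropathy values; lower values are more hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def most_hydrophilic_window(protein, window=8):
    """Return (start_index, peptide, mean_hydropathy) of the best window.

    The best window is the one with the lowest mean hydropathy; ties are
    resolved in favour of the window closest to the N-terminus.
    """
    if len(protein) < window:
        raise ValueError("protein is shorter than the window")
    best = None
    for start in range(len(protein) - window + 1):
        peptide = protein[start:start + window]
        score = sum(KYTE_DOOLITTLE[residue] for residue in peptide) / window
        if best is None or score < best[2]:
            best = (start, peptide, score)
    return best


if __name__ == "__main__":
    # Toy sequence, not a real Kalpa sequence.
    print(most_hydrophilic_window("LLLLLLLLKDEKRDEKLLLLLLLL"))
```

A hydropathy scan is only a first filter; it does not establish that a peptide is actually exposed or immunogenic.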
  • a Kalpa immunogen typically is used to prepare antibodies by immunizing a suitable subject (e.g., rabbit, goat, mouse or other mammal) with the immunogen.
  • An appropriate immunogenic preparation can contain, for example, recombinantly expressed Kalpa or a chemically synthesized Kalpa polypeptide.
  • the preparation can further include an adjuvant, such as Freund's complete or incomplete adjuvant, or similar immunostimulatory agent. Immunization of a suitable subject with an immunogenic Kalpa preparation induces a polyclonal anti-Kalpa antibody response.
  • antibody compositions for use in accordance with the invention include either polyclonal or monoclonal antibodies capable of selectively binding, or which selectively bind, to an epitope-containing polypeptide comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, 100, or more than 100 amino acids of an amino acid sequence of a functional domain of a Kalpa of SEQ ID No 4.
  • the invention also concerns a purified or isolated antibody capable of specifically binding to a mutated or variant Kalpa or to a fragment thereof comprising an epitope of the mutated Kalpa.
  • vectors preferably expression vectors, containing a nucleic acid encoding a Kalpa (or a portion thereof).
  • vector refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked.
  • plasmid refers to a circular double stranded DNA loop into which additional DNA segments can be ligated.
  • Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome.
  • Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors).
  • Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome.
  • certain vectors are capable of directing the expression of genes to which they are operatively linked.
  • Such vectors are referred to herein as "expression vectors".
  • expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.
  • plasmid and vector can be used interchangeably as the plasmid is the most commonly used form of vector.
  • the invention is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions.
  • the recombinant expression vectors of the invention comprise a Kalpa nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vectors include one or more regulatory sequences, selected on the basis of the host cells to be used for expression, which is operatively linked to the nucleic acid sequence to be expressed.
  • operably linked is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell).
  • regulatory sequence is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Such regulatory sequences are described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990), the disclosure of which is incorporated herein by reference in its entirety.
  • Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in many types of host cell and those which direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc.
  • the expression vectors of the invention can be introduced into host cells to thereby produce proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., Kalpa, mutant forms of Kalpa, fusion proteins, or fragments of any of the preceding proteins, etc.).
  • the recombinant expression vectors of the invention can be designed for expression of Kalpa in prokaryotic or eukaryotic cells.
  • Kalpa can be expressed in bacterial cells such as E. coli, insect cells (using baculovirus expression vectors) yeast cells, or mammalian cells. Suitable host cells are discussed further in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990), the disclosure of which is incorporated herein by reference in its entirety.
  • the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase. Expression of proteins in prokaryotes is most often carried out in E.
  • Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein.
  • Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification.
  • a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein.
  • enzymes, and their cognate recognition sequences include Factor Xa, thrombin and enterokinase.
  • Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S.
  • fusion proteins can be utilized in Kalpa activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for Kalpa, for example.
  • a Kalpa fusion protein expressed in a retroviral expression vector of the present invention can be utilized to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six (6) weeks).
  • Suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et al., (1988) Gene 69:301-315) and pET 11d (Studier et al., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 60-89), the disclosures of which are incorporated herein by reference in their entireties.
  • Target gene expression from the pTrc vector relies on host RNA polymerase transcription from a hybrid trp-lac fusion promoter.
  • Target gene expression from the pET 11d vector relies on transcription from a T7 gn10-lac fusion promoter mediated by a coexpressed viral RNA polymerase (T7 gn1).
  • This viral polymerase is supplied by host strains BL21(DE3) or HMS174(DE3) from a resident prophage harboring a T7 gn1 gene under the transcriptional control of the lacUV5 promoter.
  • One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128, the disclosure of which is incorporated herein by reference in its entirety).
  • Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118, the disclosure of which is incorporated herein by reference in its entirety). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
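The codon substitution strategy can be sketched as a synonymous recoding of the open reading frame. The example below uses the standard genetic code and a deliberately small preference table; the three preferred codons shown are illustrative assumptions, and a real design would take its preferences from a measured E. coli codon usage table such as the one in the cited reference. Function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumed, partial preference table (amino acid -> preferred codon).
ILLUSTRATIVE_PREFERENCES = {"L": "CTG", "R": "CGT", "K": "AAA"}


def translate(dna):
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna, preferences):
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    recoded = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        preferred = preferences.get(amino_acid, codon)
        if CODON_TABLE[preferred] != amino_acid:
            raise ValueError(f"{preferred} does not encode {amino_acid}")
        recoded.append(preferred)
    return "".join(recoded)


if __name__ == "__main__":
    original = "ATGCTAAGAAAGTAA"  # toy reading frame, not a Kalpa sequence
    optimized = recode(original, ILLUSTRATIVE_PREFERENCES)
    print(optimized, translate(original) == translate(optimized))
```

Because only synonymous codons are exchanged, the encoded protein is unchanged, which the final comparison of the two translations confirms.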
  • the Kalpa expression vector is a yeast expression vector.
  • yeast expression vectors for expression in yeast S. cerevisiae include pYepSec 1 (Baldari, et al., (1987) Embo J. 6:229-234), pMFa (Kurjan and Herskowitz, (1982) Cell 30:933-943), pJRY88 (Schultz et al., (1987) Gene 54:113-123), pYES2 (Invitrogen Corporation, San Diego, Calif), and picZ (InVitrogen Corp, San Diego, Calif), the disclosures of which are incorporated herein by reference in their entireties.
  • Kalpa can be expressed in insect cells using baculovirus expression vectors.
  • Baculovirus vectors available for expression of proteins in cultured insect cells include the pAc series (Smith et al. (1983) Mol. Cell Biol. 3:2156-2165) and the pVL series (Lucklow and Summers (1989) Virology 170:31-39), the disclosures of which are incorporated herein by reference in their entireties.
  • Kalpa are expressed according to Kamiski et al, Am. J. Physiol. (1998) 275: F79-87, the disclosure of which is incorporated herein by reference in its entirety.
  • a nucleic acid of the invention is expressed in mammalian cells using a mammalian expression vector.
  • mammalian expression vectors include pCDM8 (Seed, B. (1987) Nature 329:840) and pMT2PC (Kaufman et al. (1987) EMBO J. 6:187-195), the disclosures of which are incorporated herein by reference in their entireties.
  • the expression vector's control functions are often provided by viral regulatory elements.
  • commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.
  • the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid).
  • tissue-specific regulatory elements are known in the art.
  • suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277, the disclosure of which is incorporated herein by reference in its entirety), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J.
  • Developmentally-regulated promoters are also encompassed, for example the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the alpha-fetoprotein promoter (Campes and Tilghman (1989) Genes Dev. 3:537-546), the disclosures of which are incorporated herein by reference in their entireties.
  • the invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. That is, the DNA molecule is operatively linked to a regulatory sequence in a manner which allows for expression (by transcription of the DNA molecule) of an RNA molecule which is antisense to Kalpa mRNA.
  • Regulatory sequences operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the continuous expression of the antisense RNA molecule in a variety of cell types, for instance viral promoters and/or enhancers, or regulatory sequences can be chosen which direct constitutive, tissue specific or cell type specific expression of antisense RNA.
  • the antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus in which antisense nucleic acids are produced under the control of a high efficiency regulatory region, the activity of which can be determined by the cell type into which the vector is introduced.
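The relationship between a coding sequence and the antisense transcript produced from an insert cloned in the antisense orientation can be shown with a short sketch. It assumes the input is the sense (coding) strand written 5' to 3'; the function name and the example sequence are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_rna(sense_dna):
    """Return the antisense RNA (5' to 3') for a sense-strand DNA sequence.

    The antisense transcript is the reverse complement of the sense strand,
    written with U in place of T, so it can base-pair with the mRNA.
    """
    reverse_complement = "".join(
        COMPLEMENT[base] for base in reversed(sense_dna.upper())
    )
    return reverse_complement.replace("T", "U")


if __name__ == "__main__":
    print(antisense_rna("ATGGCC"))  # toy sequence, not a Kalpa sequence
```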
  • The terms "host cell" and "recombinant host cell" are used interchangeably herein. It is understood that such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.
  • a host cell can be any prokaryotic or eukaryotic cell.
  • a Kalpa can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells or human cells).
  • Other suitable host cells are known to those skilled in the art, including Xenopus laevis oocytes as further described in the Examples.
  • Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques.
  • The terms "transformation" and "transfection" are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation. Suitable methods for transforming or transfecting host cells can be found in Sambrook, et al. (Molecular Cloning: A Laboratory Manual.
  • a gene that encodes a selectable marker (e.g., resistance to antibiotics) is generally introduced into the host cells along with the gene of interest.
  • selectable markers include those which confer resistance to drugs, such as G418, hygromycin and methotrexate.
  • Nucleic acid encoding a selectable marker can be introduced into a host cell on the same vector as that encoding a Kalpa or can be introduced on a separate vector.
  • Cells stably transfected with the introduced nucleic acid can be identified by drug selection (e.g., cells that have incorporated the selectable marker gene will survive, while the other cells die).
  • a host cell of the invention such as a prokaryotic or eukaryotic host cell in culture, can be used to produce (i.e., express) a Kalpa. Accordingly, the invention further provides methods for producing a Kalpa using the host cells of the invention.
  • the method comprises culturing the host cell of the invention (into which a recombinant expression vector encoding a Kalpa has been introduced) in a suitable medium such that a Kalpa is produced. In another embodiment, the method further comprises isolating a Kalpa from the medium or the host cell.
  • the invention encompasses providing a cell capable of expressing a Kalpa, culturing said cell in a suitable medium such that a Kalpa is produced, and isolating or purifying the Kalpa from the medium or cell.
  • the host cells of the invention can also be used to produce nonhuman transgenic animals.
  • a host cell of the invention is a fertilized oocyte or an embryonic stem cell into which Kalpa-coding sequences have been introduced.
  • Such host cells can then be used to create non-human transgenic animals in which exogenous Kalpa sequences have been introduced into their genome or homologous recombinant animals in which endogenous Kalpa sequences have been altered.
  • Such animals are useful for studying the function and/or activity of a Kalpa polypeptide or fragment thereof and for identifying and/or evaluating modulators of Kalpa activity.
  • a "transgenic animal" is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene.
  • Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, etc.
  • a transgene is exogenous DNA which is integrated into the genome of a cell from which a transgenic animal develops and which remains in the genome of the mature animal, thereby directing the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal.
  • a "homologous recombinant animal" is a non-human animal, preferably a mammal, more preferably a mouse, in which an endogenous Kalpa gene has been altered by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.
  • a transgenic animal of the invention can be created by introducing a Kalpa-encoding nucleic acid into the male pronuclei of a fertilized oocyte, e.g., by microinjection or retroviral infection, and allowing the oocyte to develop in a pseudopregnant female foster animal.
  • the Kalpa cDNA sequence or a fragment thereof such as a sequence of SEQ ID No 2 or 3 can be introduced as a transgene into the genome of a non-human animal.
  • a nonhuman homologue of a human Kalpa gene, such as a mouse or rat Kalpa gene, can be used as a transgene.
  • Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene.
  • a tissue-specific regulatory sequence(s) can be operably linked to a Kalpa transgene to direct expression of a Kalpa to particular cells.
  • a transgenic founder animal can be identified based upon the presence of a Kalpa transgene in its genome and/or expression of Kalpa mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a Kalpa can further be bred to other transgenic animals carrying other transgenes.
  • a vector which contains at least a portion of a Kalpa gene into which a deletion, addition or substitution has been introduced to thereby alter, e.g., functionally disrupt, the Kalpa gene.
  • the Kalpa gene can be a human gene (e.g., the cDNAs of SEQ ID No 2 or 3), but more preferably, is a non-human homologue of a human Kalpa gene (e.g., a cDNA isolated by stringent hybridization with a nucleotide sequence of SEQ ID No 2 or 3).
  • a mouse Kalpa gene can be used to construct a homologous recombination vector suitable for altering an endogenous Kalpa gene in the mouse genome.
  • the vector is designed such that, upon homologous recombination, the endogenous Kalpa gene is functionally disrupted (i.e., no longer encodes a functional protein; also referred to as a "knock out" vector).
  • the vector can be designed such that, upon homologous recombination, the endogenous Kalpa gene is mutated or otherwise altered but still encodes functional protein (e.g., the upstream regulatory region can be altered to thereby alter the expression of the endogenous Kalpa).
  • the altered portion of the Kalpa gene is flanked at its 5' and 3' ends by additional nucleic acid sequence of the Kalpa gene to allow for homologous recombination to occur between the exogenous Kalpa gene carried by the vector and an endogenous Kalpa gene in an embryonic stem cell.
  • the additional flanking Kalpa nucleic acid sequence is of sufficient length for successful homologous recombination with the endogenous gene.
  • flanking DNA (both at the 5' and 3' ends) is included in the vector (see e.g., Thomas, K. R. and Capecchi, M. R.
  • the vector is introduced into an embryonic stem cell line (e.g., by electroporation) and cells in which the introduced Kalpa gene has homologously recombined with the endogenous Kalpa gene are selected (see e.g., Li, E. et al. (1992) Cell 69:915, the disclosure of which is incorporated herein by reference in its entirety).
  • the selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to form aggregation chimeras (see e.g., Bradley, A.
  • a chimeric embryo can then be implanted into a suitable pseudopregnant female foster animal and the embryo brought to term.
  • Progeny harboring the homologously recombined DNA in their germ cells can be used to breed animals in which all cells of the animal contain the homologously recombined DNA by germline transmission of the transgene. Methods for constructing homologous recombination vectors and homologous recombinant animals are described further in Bradley, A.
  • transgenic non-human animals can be produced which contain selected systems which allow for regulated expression of the transgene.
  • One example of such a system is the cre/loxP recombinase system of bacteriophage P1.
  • For a description of the cre/loxP recombinase system, see, e.g., Lakso et al. (1992) PNAS 89:6232-6236, the disclosure of which is incorporated herein by reference in its entirety.
  • Another example of a recombinase system is the FLP recombinase system of Saccharomyces cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355, the disclosure of which is incorporated herein by reference in its entirety).
  • mice containing transgenes encoding both the Cre recombinase and a selected protein are required.
  • Such animals can be provided through the construction of "double" transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase.
  • the invention provides a method (also referred to herein as a "screening assay") for identifying modulators, i.e., candidate or test compounds or agents (e.g., preferably small molecules, but also peptides, peptidomimetics or other drugs) which bind to Kalpa, have an inhibitory or activating effect on, for example, Kalpa expression or preferably Kalpa activity, or have an inhibitory or activating effect on, for example, the activity of a Kalpa target molecule.
  • small molecules can be generated using combinatorial chemistry or can be obtained from a natural products library.
  • Assays may be cell based or non-cell based assays.
  • Drug screening assays may be binding assays or more preferentially functional assays, as further described.
  • an assay is a cell-based assay in which a cell which expresses a Kalpa or biologically active portion thereof is contacted with a test compound and the ability of the test compound to inhibit, activate, or increase Kalpa activity is determined. Determining the ability of the test compound to inhibit, activate, or increase Kalpa activity can be accomplished by monitoring the bioactivity of the Kalpa or biologically active portion thereof.
  • the cell for example, can be of mammalian origin, bacterial origin or a yeast cell.
  • the cell can be a mammalian cell, bacterial cell or yeast cell which has been engineered to lack Kalpa activity or which naturally lacks Kalpa activity.
  • the invention further encompasses compounds capable of inhibiting or activating Kalpa activity.
  • a Kalpa inhibitor or activator is a selective Kalpa inhibitor or activator.
  • a Kalpa inhibitor is capable of inhibiting or increasing the activity of or binding to more than one (e.g. at least two, three, four) Kalpa -like proteins.
  • Assays of the invention may be used to screen any suitable collection of compounds.
  • an inhibitor or activator is capable of inhibiting or increasing a Kalpa activity. Assays may thus be designed according to well known means to detect the Kalpa activities described herein. In other aspects, an inhibitor or activator is capable of modulating cholesterol regulation (CHOL), HDL-cholesterol (HDL) regulation, LDL-cholesterol (LDL) regulation, triglycerides (TGRL) regulation or LDL/HDL ratio (LDL/HDL) regulation. Methods for detecting said endpoints are known in the art.
  • an inhibitor or activator is capable of modulating (e.g. activating or inhibiting) the expression of the LDL receptor (LDLR).
  • the inhibitor or activator is preferably able to increase the level of expression of LDLR.
  • the method involves identifying compounds capable of enhancing the expression of the LDLR, preferably involving detecting modulation of Kalpa activity. Any suitable assay endpoint indicating activity of LDLR function or expression can be used.
  • the invention provides assays for screening candidate or test compounds which are target molecules of a Kalpa or polypeptide or biologically active portion thereof.
  • the invention provides assays for screening candidate or test compounds which bind to or modulate the activity of a Kalpa or polypeptide or biologically active portion thereof.
  • test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the 'one-bead one-compound' library method; and synthetic library methods using affinity chromatography selection.
  • the biological library approach is used with peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K. S. (1997) Anticancer Drug Des. 12:145, the disclosure of which is incorporated herein by reference in its entirety).
  • Biotechniques 13:412-421 or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. '409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390); (Devlin (1990) Science 249:404-406); (Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382); (Felici (1991) J. Mol. Biol. 222:301-310); (Ladner supra.), the disclosures of which are incorporated herein by reference in their entireties.
  • Determining the ability of the test compound to inhibit or increase Kalpa activity can also be accomplished, for example, by coupling the Kalpa or biologically active portion thereof with a radioisotope or enzymatic label such that binding of the Kalpa or biologically active portion thereof to its cognate target molecule can be determined by detecting the labeled Kalpa or biologically active portion thereof in a complex.
  • compounds (e.g., Kalpa or biologically active portion thereof) can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.
  • the labeled molecule is placed in contact with its cognate molecule and the extent of complex formation is measured.
  • the extent of complex formation may be measured by immuno precipitating the complex or by performing gel electrophoresis.
  • a microphysiometer can be used to detect the interaction of a compound with its cognate target molecule without the labeling of either the compound or the target molecule. McConnell, H. M. et al. (1992) Science 257:1906-1912, the disclosure of which is incorporated herein by reference in its entirety.
  • a microphysiometer such as a cytosensor is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between compound and receptor.
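  • As a minimal sketch of how a change in acidification rate could be quantified, the following assumes that the instrument output has already been converted to pH readings at known time points. The function names, the use of a least-squares slope, and the units are illustrative assumptions, not part of the disclosed method.

```python
def acidification_rate(times_s, ph_values):
    """Least-squares slope of pH versus time, sign-flipped so that
    acidification (falling pH) gives a positive rate in pH units per second."""
    n = len(times_s)
    if n < 2 or n != len(ph_values):
        raise ValueError("need at least two paired time/pH readings")
    mean_t = sum(times_s) / n
    mean_ph = sum(ph_values) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (p - mean_ph) for t, p in zip(times_s, ph_values))
    return -sxy / sxx


def relative_rate_change(rate_before, rate_after):
    """Fractional change in acidification rate after adding the compound."""
    if rate_before == 0:
        raise ValueError("baseline rate must be non-zero")
    return (rate_after - rate_before) / rate_before
```

  • A positive relative change would suggest that the compound increased the acidification rate of the cells, and a negative value that it decreased it.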
  • the assay comprises contacting a cell which expresses a Kalpa or biologically active portion thereof with a target molecule to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to inhibit or increase the activity of the Kalpa or biologically active portion thereof, wherein determining the ability of the test compound to inhibit or increase the activity of the Kalpa or biologically active portion thereof comprises determining the ability of the test compound to inhibit or increase a biological activity of the Kalpa-expressing cell (e.g., determining the ability of the test compound to inhibit or increase transduction or protein:protein interactions).
  • the assay comprises contacting a cell which is responsive to a Kalpa or biologically active portion thereof, with a Kalpa or biologically-active portion thereof, to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to modulate the activity of the Kalpa or biologically active portion thereof, wherein determining the ability of the test compound to modulate the activity of the Kalpa or biologically active portion thereof comprises determining the ability of the test compound to modulate a biological activity of the Kalpa-responsive cell.
  • an assay is a cell-based assay comprising contacting a cell expressing a Kalpa target molecule (i.e. a molecule with which Kalpa interacts) with a test compound and determining the ability of the test compound to modulate (e.g. stimulate or inhibit) the activity of the Kalpa target molecule. Determining the ability of the test compound to modulate the activity of a Kalpa target molecule can be accomplished, for example, by determining the ability of the Kalpa to bind to or interact with the Kalpa target molecule. Examples of such cell and non-cell based interaction assays are described in Degterev et al, Nature Cell Biol. 3: 173-182 (2001) and Dandliker et al, Methods Enzymol. 74: 3-28 (1981), the disclosures of which are incorporated herein by reference.
  • cells used in the cellular assays of the invention are cultured HepG2 cells, skin fibroblasts, NIH 3T3 or muscle cells.
  • the cell is a Kalpa mutant, preferably having reduced or lacking Kalpa protein activity, or lacking or having reduced Kalpa expression.
  • Determining the ability of the Kalpa to bind to or interact with a Kalpa target molecule can be accomplished by one of the methods described above for determining direct binding.
  • determining the ability of the Kalpa to bind to or interact with a Kalpa target molecule can be accomplished by determining the activity of the target molecule.
  • the activity of the target molecule can be determined by contacting the target molecule with the Kalpa or a fragment thereof and measuring induction of a cellular second messenger of the target (i.e.
  • a reporter gene comprising a target-responsive regulatory element operatively linked to a nucleic acid encoding a detectable marker, e.g., luciferase
  • a target-regulated cellular response, for example, signal transduction or protein:protein interactions.
  • an assay of the present invention is a cell-free assay in which a Kalpa or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the Kalpa or biologically active portion thereof is determined. Binding of the test compound to the Kalpa can be determined either directly or indirectly as described above.
  • the assay includes contacting the Kalpa or biologically active portion thereof with a known compound which binds a Kalpa protein (e.g., a Kalpa target molecule) to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a Kalpa protein, wherein determining the ability of the test compound to interact with a Kalpa protein comprises determining the ability of the test compound to preferentially bind to a Kalpa protein or biologically active portion thereof as compared to the known compound.
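  • The competition format described above can be illustrated with a short calculation. This is only a sketch: it assumes the known compound carries a detectable label, that bound signal is read with and without the test compound, and that a non-specific background reading is available. The function and parameter names are hypothetical.

```python
def percent_displacement(bound_with_test, bound_without_test, nonspecific=0.0):
    """Percentage of specifically bound labeled known compound that is
    displaced when the test compound is present.

    bound_with_test    -- signal from labeled known compound plus test compound
    bound_without_test -- signal from labeled known compound alone
    nonspecific        -- background signal not due to Kalpa binding (assumed)
    """
    specific_control = bound_without_test - nonspecific
    if specific_control <= 0:
        raise ValueError("control binding must exceed non-specific background")
    specific_test = bound_with_test - nonspecific
    return 100.0 * (1.0 - specific_test / specific_control)
```

  • A high displacement value would be consistent with the test compound binding the Kalpa protein preferentially as compared to the known compound.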
  • the assay is a cell-free assay in which a Kalpa protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to modulate (e.g., stimulate or inhibit) the activity of the Kalpa protein or biologically active portion thereof is determined.
  • Determining the ability of the test compound to modulate the activity of a Kalpa protein can be accomplished, for example, by determining the ability of the Kalpa protein to bind to a Kalpa target molecule by one of the methods described above for determining direct binding. Determining the ability of the Kalpa protein to bind to a Kalpa target molecule can also be accomplished using a technology such as real-time Biomolecular Interaction Analysis (BIA). Sjolander, S.
  • BIA is a technology for studying biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the optical phenomenon of surface plasmon resonance (SPR) can be used as an indication of real-time reactions between biological molecules.
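  • One common way to interpret SPR data is a simple 1:1 binding model. The sketch below assumes such a model applies to the Kalpa and target molecule pair, which the text does not state. The rate constants, the maximal response and the function names are illustrative assumptions.

```python
def equilibrium_dissociation_constant(k_on, k_off):
    """K_D in molar units from the association rate constant k_on (1/(M*s))
    and the dissociation rate constant k_off (1/s), for 1:1 binding."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


def equilibrium_response(r_max, concentration, k_d):
    """Steady-state SPR response for 1:1 binding:
    R_eq = R_max * C / (K_D + C)."""
    if concentration < 0 or k_d <= 0:
        raise ValueError("concentration must be >= 0 and K_D must be > 0")
    return r_max * concentration / (k_d + concentration)
```

  • Under this model, an analyte concentration equal to K_D gives half of the maximal response, which offers a quick consistency check on fitted values.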
  • determining the ability of the test compound to modulate the activity of a Kalpa protein can be accomplished by determining the ability of the Kalpa protein to further modulate the activity of a downstream effector (e.g., a growth factor mediated signal transduction pathway component) of a Kalpa target molecule.
  • the activity of the effector molecule on an appropriate target can be determined or the binding of the effector to an appropriate target can be determined as previously described.
  • the cell-free assay involves contacting a Kalpa protein or biologically active portion thereof with a known compound which binds the Kalpa protein to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with the Kalpa, wherein determining the ability of the test compound to interact with the Kalpa protein comprises determining the ability of the Kalpa protein to preferentially bind to or modulate the activity of a Kalpa target molecule.
  • the cell-free assays of the present invention are amenable to use of both soluble and/or membrane-bound forms of isolated proteins (e.g. Kalpa protein or biologically active portions thereof or molecules to which Kalpa targets bind).
  • a solubilizing agent may be used such that the membrane-bound form of the isolated protein is maintained in solution.
  • solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton X-100, Triton X-114, Thesit, Isotridecypoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-
  • Binding of a test compound to a Kalpa protein, or interaction of a Kalpa protein with a target molecule in the presence and absence of a candidate compound can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtitre plates, test tubes, and micro-centrifuge tubes.
  • a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix.
  • glutathione-S-transferase/Kalpa fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtitre plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or Kalpa protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtitre plate wells are washed to remove any unbound components, the matrix immobilized in the case of beads, complex determined either directly or indirectly, for example, as described above.
  • the complexes can be dissociated from the matrix, and the level of Kalpa protein binding or activity determined using standard techniques.
  • Other techniques for immobilizing proteins on matrices can also be used in the screening assays of the invention.
  • a Kalpa protein or a Kalpa target molecule can be immobilized utilizing conjugation of biotin and streptavidin.
  • Biotinylated Kalpa protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques well known in the art (e.g., biotinylation kit, Pierce Chemicals, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).
  • antibodies reactive with Kalpa protein or target molecules but which do not interfere with binding of the Kalpa protein to its target molecule can be derivatized to the wells of the plate, and unbound target or Kalpa trapped in the wells by antibody conjugation.
  • Methods for detecting such complexes include immunodetection of complexes using antibodies reactive with the Kalpa protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the Kalpa protein or target molecule.
  • modulating Kalpa activity comprises modulating Kalpa expression.
  • Modulators of Kalpa expression are identified in a method wherein a cell is contacted with a candidate compound and the expression of Kalpa mRNA or protein in the cell is determined. The level of expression of Kalpa mRNA or protein in the presence of the candidate compound is compared to the level of expression of Kalpa mRNA or protein in the absence of the candidate compound. The candidate compound can then be identified as a modulator of Kalpa expression based on this comparison. For example, when expression of Kalpa mRNA or protein is greater (statistically significantly greater) in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of Kalpa mRNA or protein expression.
  • when expression of Kalpa mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of Kalpa mRNA or protein expression.
  • the level of Kalpa mRNA or protein expression in the cells can be determined by methods described herein for detecting Kalpa mRNA or protein.
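  • The comparison of expression in the presence and absence of a candidate compound can be sketched as follows. The text requires only that the difference be statistically significant and does not name a test, so the exact permutation test, the 0.05 threshold and all function names here are assumptions chosen for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Two-sided exact permutation test on the difference of group means."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(untreated))
    total = 0
    extreme = 0
    # Enumerate every way of relabelling the pooled measurements as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point round-off in ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_modulator(treated, untreated, alpha=0.05):
    """Label a candidate from replicate Kalpa mRNA or protein measurements.

    alpha is an assumed significance threshold, not a value from the text.
    """
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

  • With this particular test, at least four replicates per condition are needed before any result can fall below the 0.05 threshold, since the smallest attainable two-sided p-value with three replicates each is 2/20.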
  • the Kalpa can be used as "bait proteins" in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al.
  • Kalpa-binding proteins bind to or interact with Kalpa
  • Kalpa-binding proteins are also likely to be involved in the propagation of signals by the Kalpa or Kalpa targets as, for example, downstream elements of a Kalpa-mediated signaling pathway.
  • Kalpa-binding proteins are likely to be Kalpa inhibitors.
  • the two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains.
  • the assay utilizes two different DNA constructs.
  • the gene that codes for a Kalpa or a fragment thereof is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4).
  • a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein ("prey" or "sample") is fused to a gene that codes for the activation domain of the known transcription factor.
  • the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., LacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the Kalpa protein.
  • SREBPs are bound to the ER membrane and nuclear envelope.
  • the NH2-terminal and COOH-terminal domains, each about 500 amino acids in length, project into the cytosol. They are linked by a pair of membrane-spanning sequences that flank a short 31-amino acid hydrophilic loop that projects into the lumen of the ER and nuclear envelope (Brown and Goldstein, Cell, 89:331-340, 1997).
  • Site-1 protease (S1P) cleaves the SREBPs at a leucine-serine bond in the luminal loop, thereby separating the proteins into halves, each with a single membrane spanning sequence (Duncan et al., J. Biol.
  • S2P Site-2 protease
  • nSREBP nuclear SREBP
  • SCAP SREBP cleavage-activating protein
  • the Site-1 processing reaction is the target for feedback regulation of lipid biosynthesis and uptake in animal cells.
  • the Site-1 cleavage reaction is blocked (Brown and Goldstein, 1997).
  • the Site-2 cleavage reaction is blocked secondarily since it requires prior cleavage by S1P.
  • the sterol effect appears to be mediated by five of the eight membrane spanning sequences of SCAP, which are designated as the sterol sensor (Hua et al., 1996). Point mutations at two positions within the sterol sensor render SCAP constitutively active and prevent sterol-mediated suppression of Site-1 cleavage (Nohturfft et al., Proc. Natl. Acad. Sci.
  • the human gene for S2P has been described by Rawson et al. (Molecular Cell, 1:47-57, 1997). This gene encodes a unique hydrophobic zinc metalloprotease that cleaves the intermediate forms of SREBPs within their transmembrane sequences.
  • the invention thus provides several methods for identifying or selecting candidates among compounds that modulate Kalpa activity.
  • a method of inducing LDL receptor expression through the modulation of Kalpa activity by contacting a cell with a compound that modulates Kalpa activity (e.g. activity of the protein itself and/or expression of the protein), wherein the modulation of said Kalpa protein results in the induction of low density lipoprotein receptor expression.
  • an indirect Kalpa activity can be measured using means known in the art.
  • the invention comprises contacting a cell with a candidate Kalpa modulator and measuring LDL uptake in a cell.
  • the binding, internalization and degradation of LDL in fibroblasts can be measured using known means in the art. Examples of suitable methods are provided in Goldstein, J.L., et al. ((1983) Methods Enzymol. 98, 241), and International Patent Publication No WO 99/47566, the disclosures of which are incorporated herein by reference.
  • LDL uptake is measured in cultured HepG2 cells, skin fibroblasts, NIH 3T3 or muscle cells.
  • HepG2 cells are seeded in Biocat slides and pretreated with or without 25-hydroxycholesterol for 20 hours. Cells are incubated for 4 hours with 6 µg/ml fluorescent Di-LDL (fluorescent dye 3,3'-dioctadecylindocarbocyanine). Intracellular fluorescent dye is detected by microscopy using rhodamine filters.
  • Primary human and monkey hepatocytes are seeded for 24 hours in collagen-coated plates in Williams' E medium (0.1% FCS, 0.4 µg/ml insulin, 0.1 µM dexamethasone). FACS is also used in parallel to quantify fluorescence.
  • an indirect Kalpa activity can be assessed in an animal model.
  • the invention comprises administering a candidate Kalpa modulator to an animal, preferably a non-human animal, and measuring blood cholesterol levels, preferably the depletion of blood cholesterol levels.
  • Kalpa activity itself can be administered, and blood cholesterol levels are measured.
  • the method may comprise providing animals which transiently overexpress Kalpa, and measuring blood cholesterol levels.
  • a method of identifying or selecting a candidate compound that modulates Kalpa activity comprising administering a candidate Kalpa modulator to an animal and measuring blood cholesterol levels.
  • said animal is a non-human animal.
  • the present invention is also directed to a method of identifying or selecting candidate compounds that induce Kalpa-mediated LDL receptor expression.
  • candidate compounds are compounds that are known to or suspected of modulating Kalpa activity.
  • the method comprises contacting a cell with a candidate Kalpa modulator and measuring low-density lipoprotein receptor expression. Activation of Kalpa and induction of low-density lipoprotein receptor expression in the presence of the compound is indicative of the compound's ability to induce Kalpa-mediated LDL receptor expression.
  • Kalpa may be involved in generating active SREBP, which in turn is known to mediate the transcription of several proteins involved in cholesterol regulation including proteins involved in cholesterol uptake and synthesis.
  • assays comprise contacting a cell with a candidate Kalpa modulator, and detecting SREBP-mediated transcription, or levels of a protein encoded by a SREBP-mediated transcript.
  • the method comprises detecting LDL receptor expression (e.g. detecting protein and/or mRNA). Examples of assays for the detection of LDL receptor expression are described in International Patent Publication No. WO 94/26922, the disclosure of which is incorporated herein by reference.
  • Detection and/or the quantitation of human LDL receptor can also be carried out using specific monoclonal antibodies.
  • International Patent Publication No WO 01/68710 for example provides monoclonal antibodies, hybridomas producing said antibodies and methods for detecting LDL receptor.
  • LDL receptor can be detected using Mabs that can be used as a pair in an ELISA (Enzyme Linked Immuno Sorbent Assay) for detection of human soluble LDL receptor, or using Mabs for identification of the LDLR in Western Blot analysis.
  • Methods of detecting SREBP-related expression can be carried out using the methods of International Patent Publication No WO 94/26922. Screening methods are based upon cellular assays in which candidate substances are screened for their ability to stimulate SRE mediated transcription and gene expression, and particularly, reporter gene expression.
  • the preferred cellular assays comprise preparing a recombinant plasmid including a reporter gene, preferably a CAT gene or luciferase gene, under the transcriptional control of a functional SRE-1 sequence and introducing the plasmid into a recombinant host cell, such as a monkey CV-1 cell.
  • the host cell will also include and express an SREBP protein and optionally a Kalpa protein, either naturally, or due to the presence of a recombinant vector that expresses SREBP and/or Kalpa.
  • the host cell is then cultured under conditions effective to allow expression of the reporter gene, which expression is measured, and then the cell is contacted with the candidate substance and the new level of reporter gene expression is measured.
  • An increase in reporter gene expression in the presence of the candidate substance is indicative of a candidate substance capable of stimulating SRE mediated transcription.
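  • The before and after comparison of reporter gene expression can be expressed as a fold induction. The sketch below is illustrative only: the background correction, the two-fold cut-off and the function names are assumptions and are not specified in the text.

```python
def fold_induction(reporter_before, reporter_after, background=0.0):
    """Ratio of background-corrected reporter signal after adding the
    candidate substance to the signal measured before adding it."""
    baseline = reporter_before - background
    treated = reporter_after - background
    if baseline <= 0:
        raise ValueError("baseline reporter signal must exceed background")
    return treated / baseline


def stimulates_sre_transcription(reporter_before, reporter_after,
                                 background=0.0, min_fold=2.0):
    """True when reporter expression rises by at least min_fold.

    min_fold is an assumed threshold chosen for illustration.
    """
    return fold_induction(reporter_before, reporter_after, background) >= min_fold
```

  • The same calculation could be applied to cells cultured in the presence of sterols, with the sterol-treated reading taken as the baseline.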
  • Still further embodiments of the invention concern methods to assay for candidate substances capable of stimulating SRE mediated gene transcription in the presence of sterols.
  • These assays may be employed as a first screen, or a second screen to further analyze the properties of candidate substances which tested positive in earlier Kalpa activity or binding assays.
  • the sterol-responsive cellular screening method involves culturing the host cell in the presence of sterols and then adding the candidate substance, wherein an increase in reporter gene expression is indicative of a substance capable of stimulating SRE mediated transcription even in the presence of sterols.
  • assays contemplated by the inventors involve the co-introduction of SRE mediated reporter genes along with Kalpa-encoding genes into host cells. These recombinant host constructs are then employed to screen for agents that will act to modulate expression of the reporter gene and/or SREBP gene.
  • the use of the co-introduction approach may have advantages in terms of sensitivity, allowing a response that is readily detectable by automated detection means, such as by FACS or related technology.
  • a co-introduction system allows ready manipulation for the identification of selective agents that act specifically, e.g. through modulation of Kalpa levels and/or through modulation of downstream SREBP and LDL receptor modulating function.
  • Kalpa may be involved in cleavage of the membrane-bound form of SREBP.
  • assays comprising contacting a cell with a candidate Kalpa modulator, and detecting S1P- or S2P-mediated cleavage, preferably S1P-mediated cleavage. Preferably cleavage of an SREBP-related substrate is measured.
  • assays comprising contacting a cell with a candidate Kalpa modulator, and detecting S1P recruitment or binding.
  • assays comprising contacting a cell with a candidate Kalpa modulator, and detecting levels of membrane bound or non-membrane bound or DNA- bound SREBP protein.
  • methods comprise contacting a cell with a candidate Kalpa modulator, and detecting levels of or localization of SREBP protein.
  • a Kalpa inhibitor is assessed by screening for modified Site-1 protease cleavage, the method comprising the steps of:
  • the solution further comprises SREBP cleavage activating protein.
  • said methods may comprise monitoring Site-2 protease cleavage instead or in addition to Site-1 protease cleavage.
  • Kalpa may interact with SCAP and may be involved in mediating SCAP- SREBP interactions.
  • the methods of the invention thus further include assays that comprise contacting a cell with a candidate Kalpa modulator, and detecting binding of Kalpa to a SCAP protein (e.g. levels of), or cellular localization of SCAP protein.
  • the invention provides a method comprising contacting a cell with a candidate Kalpa modulator, and detecting a SCAP-SREBP complex.
  • the invention also provides cells overexpressing Kalpa, or lacking or having diminished Kalpa activity.
  • Cells lacking Kalpa activity or having diminished Kalpa activity e.g. 'knockdown', generated using mRNA interference methods for example
  • the in vivo effects of Kalpa on LDL and cholesterol metabolism can be studied using cells that transiently overexpressed hepatic Kalpa as a result of infection by intravenous infusion with a recombinant, replication defective adenovirus.
  • the Kalpa cDNA is under the control of the cytomegalovirus (CMV) immediate early enhancer/promoter.
  • Controls can include mice infected with a replication defective adenovirus lacking a cDNA transgene.
  • Kalpa expression can be determined by immunofluorescence microscopy and by immunoblotting. Kalpa expression can then be monitored in the livers of adenovirus treated animals over the course of several days. Levels and localization of the expression in the liver can then be assessed.
  • the effect of hepatic Kalpa overexpression on plasma cholesterol levels can be determined using known methods. Total cholesterol and plasma cholesterol can be measured. Fast protein liquid chromatography (FPLC) analysis of plasma can be performed to determine specifically the effects of hepatic Kalpa overexpression on the different classes of lipoproteins. Methods include determining the portion of cholesterol contained in the HDL, VLDL and LDL fractions.
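  • Determining the portion of cholesterol in each lipoprotein fraction can be sketched as a simple normalization. The example assumes that cholesterol has already been quantified per pooled FPLC fraction. The dictionary keys and the function name are hypothetical.

```python
def cholesterol_distribution(fraction_cholesterol):
    """Return each lipoprotein class's share of total cholesterol, plus the
    LDL/HDL ratio (None when no HDL cholesterol was measured).

    fraction_cholesterol maps a class name such as "VLDL", "LDL" or "HDL"
    to the cholesterol measured in that pooled fraction (any one unit).
    """
    total = sum(fraction_cholesterol.values())
    if total <= 0:
        raise ValueError("total cholesterol must be positive")
    shares = {name: amount / total
              for name, amount in fraction_cholesterol.items()}
    hdl = fraction_cholesterol.get("HDL", 0.0)
    ldl_hdl = fraction_cholesterol.get("LDL", 0.0) / hdl if hdl > 0 else None
    return shares, ldl_hdl
```

  • Comparing these shares between animals infused with control virus and animals infused with Kalpa adenovirus would show which lipoprotein classes account for any change in plasma cholesterol.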
  • mice are infused with either the control virus or with Kalpa adenovirus.
  • the present invention includes a compound or agent obtainable by a method comprising the steps of any one of the aforementioned screening assays (e.g., cell-based assays or cell-free assays).
  • the invention includes a compound or agent obtainable by a method comprising contacting a cell which expresses a Kalpa target molecule with a test compound and determining the ability of the test compound to bind to, or modulate the activity of, the Kalpa target molecule.
  • the invention includes a compound or agent obtainable by a method comprising contacting a cell which expresses a Kalpa target molecule with a Kalpa or biologically-active portion thereof, to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with, or modulate the activity of, the Kalpa target molecule.
  • the invention includes a compound or agent obtainable by a method comprising contacting a Kalpa or biologically active portion thereof with a test compound and determining the ability of the test compound to bind to, or modulate (e.g., stimulate or inhibit) the activity of, the Kalpa or biologically active portion thereof.
  • the present invention includes a compound or agent obtainable by a method comprising contacting a Kalpa or biologically active portion thereof with a known compound which binds the Kalpa to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with, or modulate the activity of, the Kalpa. Accordingly, it is within the scope of this invention to further use an agent identified as described herein in an appropriate animal model.
  • an agent identified as described herein (e.g., a Kalpa modulating agent, an antisense Kalpa nucleic acid molecule, a Kalpa-specific antibody, or a Kalpa-binding partner) can be used in an animal model to determine the efficacy, toxicity, or side effects of treatment with such an agent.
  • an agent identified as described herein can be used in an animal model to determine the mechanism of action of such an agent.
  • this invention pertains to uses of novel agents identified by the above-described screening assays for treatments as described herein.
  • the present invention also pertains to uses of novel agents identified by the above-described screening assays for diagnoses, prognoses, and treatments as described herein. Accordingly, it is within the scope of the present invention to use such agents in the design, formulation, synthesis, manufacture, and/or production of a drug or pharmaceutical composition for use in diagnosis, prognosis, or treatment, as described herein.
  • the present invention includes a method of synthesizing or producing a drug or pharmaceutical composition by reference to the structure and/or properties of a compound obtainable by one of the above-described screening assays.
  • a drug or pharmaceutical composition can be synthesized based on the structure and/or properties of a compound obtained by a method in which a cell which expresses a Kalpa target molecule is contacted with a test compound and the ability of the test compound to bind to, or modulate the activity of, the Kalpa target molecule is determined.
  • the present invention includes a method of synthesizing or producing a drug or pharmaceutical composition based on the structure and/or properties of a compound obtainable by a method in which a Kalpa or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to, or modulate (e.g., stimulate or inhibit) the activity of, the Kalpa or biologically active portion thereof is determined.
  • the modulatory method of the invention involves contacting a cell or Kalpa protein with an agent that modulates one or more Kalpa activities.
  • An agent that modulates Kalpa protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of the candidate protein (e.g., a phosphorylation or cleavage substrate), an antibody, an agonist or antagonist, a peptidomimetic of an agonist or antagonist, or other small molecule.
  • the agent stimulates one or more Kalpa activities.
  • stimulatory agents include active Kalpa protein and a nucleic acid molecule encoding the candidate gene that has been introduced into the cell.
  • the agent inhibits one or more of the candidate activities.
  • inhibitory agents include antisense nucleic acid molecules, and antibodies.
  • the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., upregulates or downregulates) Kalpa expression or activity.
  • the method involves administering the Kalpa protein or nucleic acid molecule as therapy to compensate for reduced or aberrant Kalpa expression or activity.
  • Stimulation of Kalpa activity is desirable in situations in which it is abnormally downregulated and/or in which increased activity is likely to have a beneficial effect.
  • stimulation of Kalpa activity is desirable in situations in which Kalpa is downregulated and/or in which increased activity is likely to have a beneficial effect.
  • inhibition of Kalpa activity is desirable in situations in which the candidate is abnormally upregulated and/or in which decreased activity is likely to have a beneficial effect.
  • the candidate molecules of the present invention, as well as agents or modulators which have a stimulatory or inhibitory effect on Kalpa activity (e.g., gene expression) as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) disorders (e.g., cholesterol disorders such as hypercholesterolemia) associated with aberrant activity.
  • pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug)
  • Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug.
  • Kalpa modulators, including inhibitors and activators, identified according to the methods in the section titled "Drug Screening Assays" can be further tested for their ability to ameliorate a disorder related to cholesterol regulation in a suitable animal model of disease.
  • Kalpa modulators, preferably modulators (e.g., inhibitors or activators) of Kalpa activity, will be useful in the treatment of disorders of cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, triglycerides regulation and/or LDL/HDL ratio regulation. Said modulators will preferably also be useful in modulating expression of the LDL receptor. Accordingly, the present invention provides a method for the prophylaxis and/or treatment of a disorder related to cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, triglycerides regulation, LDL/HDL ratio regulation and/or LDL receptor expression or activity.
  • the present invention provides a method for the prophylaxis and/or treatment of a disorder including for example hyperlipidaemia, hypercholesterolaemia, hypertriglyceridaemia, heart disease, coronary artery disease, myocardial infarct, and lipid related metabolic disorders such as the dysmetabolic syndrome, Obesity and type II Diabetes.
  • a disorder to be treated is characterized by aberrant protein or nucleic acid expression and is a cellular, metabolic-related disorder, e.g., a cholesterol related disorder or cholesterol homeostasis disorder.
  • the invention also provides methods of the prophylaxis and/or treatment of a disorder comprising modulating (e.g. activating or inhibiting) the activity of the ubiquitin-proteasome pathway which metabolically regulates intra-cellular degradation of apoB, or by modulating the expression of the LDLR gene which regulates the uptake of LDL.
  • Type 2 diabetes mellitus is an increasingly common disorder of carbohydrate and lipid metabolism. Type 2 diabetes is now a major global health problem that affects over 124 million individuals worldwide. In the United States, type 2 diabetes affects 90% of the 15.6 million persons with diabetes, of which approximately one half remain undiagnosed. In addition, type 2 diabetes, which is normally associated with older adults, is becoming more common in children and adolescents. 800,000 new cases are identified each year. Patients with non-insulin-dependent diabetes have an increased incidence of ischaemic heart disease (IHD) when compared with nondiabetic subjects. In addition, they have a worse prognosis after their first myocardial infarction (MI).
  • the metabolic syndrome or syndrome X is characterized by the association of various cardiovascular risk factors (among which impaired glucose tolerance, arterial hypertension and dyslipidaemias), all closely linked to insulin resistance which is indeed the core of the syndrome.
  • the first unifying definition for the metabolic syndrome was proposed by WHO in 1998.
  • patients with type 2 diabetes mellitus with impaired glucose tolerance have the syndrome if they fulfill two of the criteria: hypertension, dyslipidaemia, obesity/abdominal obesity and microalbuminuria.
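The criteria count described above can be sketched as a small check. This is only an illustration under stated assumptions: the `Patient` record and its field names are hypothetical, "two of the criteria" is read here as "at least two", and no clinical cut-off values are implied.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record; field names are illustrative assumptions.
    glucose_disorder: bool   # type 2 diabetes / impaired glucose tolerance
    hypertension: bool
    dyslipidaemia: bool
    obesity: bool            # obesity or abdominal obesity
    microalbuminuria: bool


def has_metabolic_syndrome(p: Patient, min_criteria: int = 2) -> bool:
    """Return True when the glucose disorder is present together with
    at least `min_criteria` of the four additional criteria."""
    if not p.glucose_disorder:
        return False
    met = sum([p.hypertension, p.dyslipidaemia, p.obesity, p.microalbuminuria])
    return met >= min_criteria
```

For example, a patient with the glucose disorder plus hypertension and obesity satisfies the check, while one with hypertension alone does not.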
  • presence of the dysmetabolic syndrome is associated with reduced survival, particularly because of increased cardiovascular mortality.
  • the metabolic syndrome most likely results from interplay between several genes and an affluent environment.
  • a few candidate genes encoding proteins of glucose, insulin and lipid metabolism, lipolytic cascade, fatty acid intestinal absorption, glucocorticoid metabolism, haemostasis and blood pressure, have been associated with a clustering of metabolic abnormalities, although the functional significance of these associations remains to be established.
  • genetic polymorphisms such as those detected at several lipoprotein metabolism loci, can modulate the relationships between different components of the metabolic syndrome.
  • a growing understanding of the genetic architecture of the metabolic syndrome may help in the prevention of this condition.
  • Obesity can be defined as BMI > 95th percentile for age and sex from large surveys that were carried out in the past. Using these cut points, over 10% of all children and adolescents are obese, and another 10% are overweight (BMI > 85th percentile). Obesity is a risk factor for many chronic diseases, including glucose intolerance, lipid disorders, hypertension, and coronary heart disease. Increased BMI results from a cumulative positive energy balance and is favored by both genetic and environmental factors. The prevalence of obesity increases rapidly in developed and developing countries, emphasizing the urgent need to identify new pharmacological and dietary intervention points to reduce excess body fat.
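The percentile cut points above can be expressed as a simple classifier. This is a minimal sketch: the function name and the returned labels are assumptions, and the BMI-for-age percentile itself is assumed to have been looked up beforehand from a reference survey.

```python
def classify_bmi_percentile(percentile: float) -> str:
    """Classify a BMI-for-age-and-sex percentile using the cut points
    quoted in the text: > 95th is obese, > 85th is overweight."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    if percentile > 95:
        return "obese"
    if percentile > 85:
        return "overweight"
    return "not overweight"
```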
  • An "individual" treated by the methods of this invention is a vertebrate, particularly a mammal (including model animals of human disease, farm animals, sport animals, and pets), and typically a human.
  • “Treatment” refers to clinical intervention in an attempt to alter the natural course of the individual being treated, and may be performed either for prophylaxis or during the course of clinical pathology. Desirable effects include preventing occurrence or recurrence of disease, alleviation of symptoms, diminishment of any direct or indirect pathological consequences of the disease, such as hyperresponsiveness, inflammation, or necrosis, lowering the rate of disease progression, amelioration or palliation of the disease state, and remission or improved prognosis.
  • the "pathology" associated with a disease condition is anything that compromises the well-being, normal physiology, or quality of life of the affected individual.
  • Treatment is performed by administering an effective amount of a Kalpa modulator, preferably a Kalpa inhibitor or activator.
  • An "effective amount" is an amount sufficient to effect a beneficial or desired clinical result, and can be administered in one or more doses.
  • the criteria for assessing response to therapeutic modalities employing the lipid compositions of this invention are dictated by the specific condition, measured according to standard medical procedures appropriate for the condition.
  • the active compounds can be incorporated into pharmaceutical compositions suitable for administration.
  • Such compositions typically comprise a pharmaceutically acceptable carrier.
  • "pharmaceutically acceptable carrier" is intended to include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration.
  • the use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active compound, use thereof in the compositions is contemplated. Supplementary active compounds can also be incorporated into the compositions.
  • a pharmaceutical composition of the invention is formulated to be compatible with its intended route of administration.
  • routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration.
  • Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide.
  • the parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.
  • compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion.
  • suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS).
  • the composition must be sterile and should be fluid to the extent that easy syringability exists. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi.
  • the carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, liquid polyethylene glycol, and the like), and suitable mixtures thereof.
  • the proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants.
  • Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like.
  • isotonic agents, for example, sugars, polyalcohols such as mannitol and sorbitol, or sodium chloride, can be included in the composition.
  • Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.
  • the active compound is a protein, peptide or anti-Kalpa antibody
  • sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization.
  • dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above.
  • in the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.
  • Oral compositions generally include an inert diluent or an edible carrier. They can be enclosed in gelatin capsules or compressed into tablets.
  • the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules.
  • the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.
  • Systemic administration can also be by transmucosal or transdermal means.
  • penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives.
  • Transmucosal administration can be accomplished through the use of nasal sprays or suppositories.
  • the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art. Most preferably, active compound is delivered to a subject by intravenous injection.
  • the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems.
  • Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art.
  • the materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc.
  • Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers.
  • Dosage unit form refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.
  • the specifications for the dosage unit forms of the invention are dictated by and directly dependent on the unique characteristics of the active compound and the particular therapeutic effect to be achieved, and the limitations inherent in the art of compounding such an active compound for the treatment of individuals.
  • Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population).
  • the dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50.
  • Compounds which exhibit large therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
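The therapeutic index defined above is a plain ratio, which the following sketch makes explicit. The function name is an assumption, and the doses in the usage note are hypothetical illustration values rather than data from this document; both doses must be in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Therapeutic index as the ratio LD50/ED50 (same dose units for both)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive doses")
    return ld50 / ed50
```

With hypothetical values of LD50 = 500 mg/kg and ED50 = 25 mg/kg, `therapeutic_index(500.0, 25.0)` gives 20.0. A larger ratio corresponds to a wider margin between the therapeutic and the toxic dose.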
  • the data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans.
  • the dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity.
  • the dosage may vary within this range depending upon the dosage form employed and the route of administration utilized.
  • the therapeutically effective dose can be estimated initially from cell culture assays.
  • a dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture.
  • levels in plasma may be measured, for example, by high performance liquid chromatography.
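One simple way to estimate an IC50 from cell culture data is to interpolate between the two tested concentrations that bracket the half-maximal response. The sketch below is an assumption for illustration, not the method of this document: it takes responses already normalised to a fraction (0 to 1) of maximal inhibition, concentrations sorted in ascending order, and interpolates linearly on a log10 concentration scale. Curve fitting (for example a four-parameter logistic model) is a common alternative.

```python
import math
from typing import Sequence


def estimate_ic50(concentrations: Sequence[float],
                  inhibition: Sequence[float]) -> float:
    """Estimate the concentration giving half-maximal (0.5) inhibition.

    `concentrations` must be positive and ascending; `inhibition` holds the
    matching responses as fractions of the maximal inhibition.
    """
    if len(concentrations) != len(inhibition):
        raise ValueError("inputs must have the same length")
    points = list(zip(concentrations, inhibition))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 0.5 <= y_hi and y_hi != y_lo:
            # Fraction of the way from the lower to the upper point.
            frac = (0.5 - y_lo) / (y_hi - y_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    raise ValueError("responses do not cross half-maximal inhibition")
```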
  • the pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.
  • the nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics; and in drug screening and methods of treatment (e.g., therapeutic and prophylactic) as further described herein. Activities of the various Kalpa are as described herein.
  • the isolated nucleic acid molecules of the invention can be used, for example, to express Kalpa (e.g., via a recombinant expression vector in a host cell or in gene therapy applications), to detect Kalpa mRNA (e.g., in a biological sample) or a genetic alteration in a Kalpa gene, and to modulate Kalpa activity, as described further below.
  • the Kalpa can be used to treat disorders characterized by insufficient or excessive production of a Kalpa or Kalpa target molecules.
  • the Kalpa can be used to screen for naturally occurring Kalpa target molecules, to screen for drugs or compounds which modulate, preferably inhibit Kalpa activity, as well as to treat disorders characterized by insufficient or excessive production of Kalpa or production of Kalpa forms which have decreased or aberrant activity compared to Kalpa wild type protein.
  • the anti-Kalpa antibodies of the invention can be used to detect and isolate Kalpa, regulate the bioavailability of Kalpa, and modulate Kalpa activity.
  • disorders in which the diagnostic and prognostic method may be useful include disorders related to cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, triglycerides regulation, LDL/HDL ratio regulation, and LDL receptor expression or activity.
  • the present invention provides a method for the prophylaxis and/or treatment of a disorder including for example hyperlipidaemia, hypercholesterolaemia, hypertriglyceridaemia, heart disease, coronary artery disease, myocardial infarct, and lipid related metabolic disorders such as the dysmetabolic syndrome, Obesity and Diabetes type II.
  • one embodiment of the present invention involves a method of use (e.g., a diagnostic assay, prognostic assay, or a prophylactic/therapeutic method of treatment) wherein a molecule of the present invention (e.g., a Kalpa polypeptide, Kalpa nucleic acid, or most preferably a Kalpa inhibitor or activator) is used, for example, to diagnose, prognose and/or treat a disease and/or condition in which any of the aforementioned Kalpa activities is indicated.
  • the present invention involves a method of use (e.g., a diagnostic assay, prognostic assay, or a prophylactic/therapeutic method of treatment) wherein a molecule of the present invention (e.g., a Kalpa polypeptide, Kalpa nucleic acid, or a Kalpa inhibitor or activator) is used, for example, for the diagnosis, prognosis, and/or treatment of subjects, preferably a human subject, in which any of the aforementioned activities is pathologically perturbed.
  • the methods of use involve administering to a subject, preferably a human subject, a molecule of the present invention (e.g., a Kalpa polypeptide, Kalpa nucleic acid, or a Kalpa inhibitor or activator) for the diagnosis, prognosis, and/or therapeutic treatment.
  • the methods of use involve administering to a human subject a molecule of the present invention (e.g., a Kalpa polypeptide, Kalpa nucleic acid, or a Kalpa inhibitor or activator).
  • the invention encompasses a method of determining whether a Kalpa polypeptide is expressed within a biological sample comprising: a) contacting said biological sample with: i) a polynucleotide that hybridizes under stringent conditions to a Kalpa nucleic acid; or ii) a detectable polypeptide that selectively binds to a Kalpa polypeptide; and b) detecting the presence or absence of hybridization between said polynucleotide and an RNA species within said sample, or the presence or absence of binding of said detectable polypeptide to a polypeptide within said sample.
  • a detection of said hybridization or of said binding indicates that said Kalpa is expressed within said sample.
  • the polynucleotide is a primer, and wherein said hybridization is detected by detecting the presence of an amplification product comprising said primer sequence, or the detectable polypeptide is an antibody.
  • Also envisioned is a method of determining whether a mammal, preferably human, has an elevated or reduced level of Kalpa expression comprising: a) providing a biological sample from said mammal; and b) comparing the amount of a Kalpa polypeptide or of a Kalpa RNA species encoding a Kalpa polypeptide within said biological sample with a level detected in or expected from a control sample.
  • An increased amount of said Kalpa polypeptide or said Kalpa RNA species within said biological sample compared to said level detected in or expected from said control sample indicates that said mammal has an elevated level of Kalpa expression.
  • a decreased amount of said Kalpa polypeptide or said Kalpa RNA species within said biological sample compared to said level detected in or expected from said control sample indicates that said mammal has a reduced level of Kalpa expression.
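The comparison against a control level can be sketched as follows. The function name, the returned labels, and the `tolerance` parameter (a relative band within which the sample is treated as unchanged) are assumptions added for illustration. The text itself only distinguishes increased from decreased amounts.

```python
def classify_expression(sample_level: float,
                        control_level: float,
                        tolerance: float = 0.0) -> str:
    """Compare a measured Kalpa polypeptide or RNA amount with a control level.

    Returns "elevated", "reduced", or "unchanged". `tolerance` is a
    hypothetical relative margin, e.g. 0.1 for +/-10% around the control.
    """
    if control_level <= 0:
        raise ValueError("control level must be positive")
    ratio = sample_level / control_level
    if ratio > 1 + tolerance:
        return "elevated"
    if ratio < 1 - tolerance:
        return "reduced"
    return "unchanged"
```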
  • Gene Regions Associated with Genetic Disease: Differences in the DNA sequences between individuals affected and unaffected with a disease associated with the Kalpa gene can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease or phenotype (e.g., low LDL level phenotype). Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.
  • Predictive Medicine:
  • the present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual prophylactically. Accordingly, one aspect of the present invention relates to diagnostic assays for determining Kalpa protein and/or nucleic acid expression as well as activity, in the context of a biological sample (e.g., blood, serum, cells, tissue) to thereby determine whether an individual is afflicted with a disease or disorder, or is at risk of developing a disorder, associated with aberrant expression or activity.
  • the invention also provides for prognostic (or predictive) assays for determining whether an individual is at risk of developing a disorder associated with a Kalpa protein, nucleic acid expression or activity. For example, mutations in the Kalpa gene can be assayed in a biological sample. Such assays can be used for prognostic or predictive purpose to thereby prophylactically treat an individual prior to the onset of a disorder characterized by or associated with the Kalpa protein, nucleic acid expression or activity.
  • An exemplary method for detecting the presence or absence of the Kalpa protein or nucleic acid in a biological sample involves obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting Kalpa protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes Kalpa protein such that the presence of Kalpa protein or nucleic acid is detected in the biological sample.
  • a preferred agent for detecting Kalpa mRNA or genomic DNA is a labeled nucleic acid probe capable of hybridizing to Kalpa mRNA or genomic DNA.
  • the nucleic acid probe can be, for example, a human nucleic acid, or a portion thereof, such as an oligonucleotide of at least 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to Kalpa mRNA or genomic DNA.
  • Other suitable probes for use in the diagnostic assays of the invention are described herein.
  • a preferred agent for detecting the Kalpa protein is an antibody capable of binding to the Kalpa protein, preferably an antibody with a detectable label.
  • Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab')2) can be used.
  • the term "labeled", with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with another reagent that is directly labeled.
  • Examples of indirect labeling include detection of a primary antibody using a fluorescently labeled secondary antibody and end-labeling of a DNA probe with biotin such that it can be detected with fluorescently labeled streptavidin.
  • biological sample is intended to include tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. That is, the detection method of the invention can be used to detect candidate mRNA, protein, or genomic DNA in a biological sample in vitro as well as in vivo.
  • in vitro techniques for detection of candidate mRNA include Northern hybridizations and in situ hybridizations.
  • In vitro techniques for detection of the candidate protein include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence.
  • In vitro techniques for detection of candidate genomic DNA include Southern hybridizations.
  • in vivo techniques for detection of the Kalpa protein include introducing into a subject a labeled anti-Kalpa antibody.
  • the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques.
  • the biological sample contains protein molecules from the test subject.
  • the biological sample can contain mRNA molecules from the test subject or genomic DNA molecules from the test subject.
  • a preferred biological sample is a serum sample isolated by conventional means from a subject.
  • the methods further involve obtaining a control biological sample from a control subject, contacting the control sample with a compound or agent capable of detecting the Kalpa protein, mRNA, or genomic DNA, such that the presence of Kalpa protein, mRNA or genomic DNA is detected in the biological sample, and comparing the presence of Kalpa protein, mRNA or genomic DNA in the control sample with the presence of Kalpa protein, mRNA or genomic DNA in the test sample.
  • the invention also encompasses kits for detecting the presence of the Kalpa protein, mRNA, or genomic DNA in a biological sample.
  • the kit can comprise a labeled compound or agent capable of detecting Kalpa protein or mRNA in a biological sample; means for determining the amount of Kalpa protein or mRNA in the sample; and means for comparing the amount of Kalpa protein, mRNA, or genomic DNA in the sample with a standard.
  • the compound or agent can be packaged in a suitable container.
  • the kit can further comprise instructions for using the kit to detect Kalpa protein or nucleic acid.
  • the diagnostic methods described herein can furthermore be utilized to identify subjects having or at risk of developing a disease, disorder or trait associated with aberrant expression or activity of the Kalpa gene or protein.
  • the assays described herein such as the preceding diagnostic assays or the following assays, can be utilized to identify a subject having or at risk of developing a disorder associated with the Kalpa protein, nucleic acid expression or activity.
  • the present invention provides a method for identifying a disease or disorder associated with aberrant Kalpa expression or activity in which a test sample is obtained from a subject and Kalpa protein or nucleic acid (e.g., mRNA, genomic DNA) is detected, wherein the presence of Kalpa protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant expression or activity.
  • a test sample refers to a biological sample obtained from a subject of interest.
  • a test sample can be a biological fluid (e.g., serum), cell sample, or tissue.
  • the prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant Kalpa expression or activity.
  • the present invention provides methods for determining whether a subject can be effectively treated with an agent for a disorder associated with aberrant Kalpa expression or activity in which a test sample is obtained and Kalpa protein or nucleic acid expression or activity is detected.
  • the methods of the invention can also be used to detect genetic alterations in the Kalpa gene, thereby determining if a subject with the altered gene is at risk for a disorder associated with the Kalpa gene.
  • the methods include detecting, in a sample of cells from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a Kalpa protein, or the mis-expression of the Kalpa gene.
  • such genetic alterations can be detected by ascertaining the existence of at least one of: 1) a deletion of one or more nucleotides from the Kalpa gene; 2) an addition of one or more nucleotides to the Kalpa gene; 3) a substitution of one or more nucleotides of the Kalpa gene; 4) a chromosomal rearrangement of the Kalpa gene; 5) an alteration in the level of a messenger RNA transcript of the Kalpa gene; 6) aberrant modification of the Kalpa gene, such as of the methylation pattern of the genomic DNA; 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of the Kalpa gene; 8) a non-wild type level of the Kalpa protein; 9) allelic loss of the Kalpa gene; and 10) inappropriate post-translational modification of the Kalpa protein.
  • a preferred biological sample is a tissue or serum sample isolated by conventional means from a subject, e.g., a liver tissue sample.
  • detection of the alteration involves the use of a probe/primer in a polymerase chain reaction (PCR), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., Landegren et al. (1988) Science 241:1077-1080; and Nakazawa et al. (1994) Proc. Natl. Acad. Sci.
  • This method can include the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to the Kalpa gene under conditions such that hybridization and amplification of the Kalpa gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein.
  • mutations in the Kalpa gene from a sample cell can be identified by alterations in restriction enzyme cleavage patterns.
  • sample and control DNA are isolated, amplified (optionally), digested with one or more restriction endonucleases, and fragment length sizes are determined by gel electrophoresis and compared. Differences in fragment length sizes between sample and control DNA indicate mutations in the sample DNA.
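The restriction-pattern comparison can be illustrated with a minimal in-silico sketch. The function names and the toy sequences are assumptions made for illustration only; the default enzyme shown is EcoRI, which recognizes GAATTC and cuts after the first G. A real assay compares band sizes on a gel rather than known sequences.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Return the sorted fragment lengths produced by cutting at every site.

    cut_offset is the number of bases of the recognition site that stay on
    the upstream fragment (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1
    bounds = [0] + cuts + [len(sequence)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


def fragments_differ(sample: str, control: str, site: str = "GAATTC",
                     cut_offset: int = 1) -> bool:
    """True when sample and control give different fragment-length patterns."""
    return digest(sample, site, cut_offset) != digest(control, site, cut_offset)


# Toy sequences (hypothetical): the sample carries an A>G change that
# destroys the second EcoRI site present in the control.
control_dna = "AAAAGAATTCTTTTTTGAATTCCC"
sample_dna = "AAAAGAATTCTTTTTTGAGTTCCC"
```

In this toy case the control yields three fragments and the sample two, so `fragments_differ(sample_dna, control_dna)` flags the sample as altered at a restriction site.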
  • sequence specific ribozymes can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
  • genetic mutations in the Kalpa gene can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to high density arrays containing hundreds or thousands of oligonucleotide probes (Cronin, M.T. et al. (1996) Human Mutation 7: 244-255; Kozal, M.J. et al. (1996) Nature Medicine 2: 753-759).
  • genetic mutations in the Kalpa gene can be identified in two dimensional arrays containing light-generated DNA probes as described in Cronin, M.T. et al. supra.
  • any of a variety of sequencing reactions known in the art can be used to directly sequence the Kalpa gene and detect mutations by comparing the sequence of the sample with the corresponding wild-type (control) sequence.
  • Examples of sequencing reactions include those based on techniques developed by Maxam and Gilbert ((1977) Proc. Natl. Acad. Sci. USA 74:560) or Sanger ((1977) Proc. Natl. Acad. Sci. USA 74:5463). It is also contemplated that any of a variety of automated sequencing procedures can be utilized when performing the diagnostic assays.
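Comparing a sample sequence against the wild-type (control) sequence can be sketched as follows. This is a simplified illustration that assumes the two reads are already aligned and of equal length, so it reports substitutions only; `find_substitutions` is a hypothetical helper name, and insertions or deletions would require a real alignment step.

```python
def find_substitutions(sample: str, wild_type: str) -> list[tuple[int, str, str]]:
    """List (1-based position, wild-type base, sample base) for each mismatch."""
    if len(sample) != len(wild_type):
        raise ValueError("sequences must be pre-aligned and of equal length")
    return [
        (pos, wt_base, sample_base)
        for pos, (wt_base, sample_base) in enumerate(zip(wild_type, sample), start=1)
        if wt_base != sample_base
    ]
```

An empty list means that no substitution was found relative to the control over the compared region.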
  • the methods described herein may be performed, for example, by utilizing prepackaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving the candidate gene.
  • any cell type or tissue in which the Kalpa gene is expressed may be utilized in the prognostic assays described herein. Expression of the Kalpa gene is further discussed in the examples below.
  • Monitoring the influence of agents (e.g., drugs or compounds) on the expression or activity of the candidate protein can be applied not only in basic drug screening, but also in clinical trials.
  • the effectiveness of an agent determined by a screening assay as described herein to increase candidate gene expression, protein levels, or upregulate activity can be monitored in clinical trials of subjects exhibiting decreased gene expression, protein levels, or downregulated activity.
  • the effectiveness of an agent determined by a screening assay to decrease Kalpa gene expression, protein levels, or downregulate activity can be monitored in clinical trials of subjects exhibiting increased candidate gene expression, protein levels, or upregulated activity.
  • the expression or activity of the candidate gene, and preferably, other genes that have been implicated in a disorder can be used as a "read out" or markers of the phenotype of a particular cell.
  • LDLR expression is used as a marker of phenotype, wherein a compound preferably increases LDLR expression.
  • cells can be isolated and RNA prepared and analyzed for the levels of expression of Kalpa and of other genes implicated in the associated disorder, preferably LDLR.
  • the levels of gene expression (i.e., a gene expression pattern) can be quantified by Northern blot analysis or RT-PCR, or alternatively by measuring the amount of protein produced, or by measuring the levels of activity of the candidate gene or other genes.
  • the gene expression pattern can serve as a marker, indicative of the physiological response of the cells to the agent. Accordingly, this response state may be determined before, and at various points during treatment of the individual with the agent.
  • the present invention provides a method for monitoring the effectiveness of treatment of a subject with an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate identified by the screening assays described herein) comprising the steps of (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression of the candidate protein, mRNA, or genomic DNA in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression or activity of the protein, mRNA, or genomic DNA in the post-administration samples; (v) comparing the level of expression or activity of the protein, mRNA, or genomic DNA in the pre-administration sample with the protein, mRNA, or genomic DNA in the post-administration sample or samples; and (vi) altering the administration of the agent to the subject accordingly.
  • increased administration of the agent may be desirable to increase the expression or activity of the Kalpa to higher levels than detected, i.e., to increase the effectiveness of the agent.
  • decreased administration of the agent may be desirable to decrease expression or activity of Kalpa to lower levels than detected, i.e. to decrease the effectiveness of the agent.
  • expression or activity may be used as an indicator of the effectiveness of an agent, even in the absence of an observable phenotypic response.
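The comparison of pre- and post-administration levels in the monitoring method can be sketched as a relative-change calculation. The function names and the 10% threshold are illustrative assumptions, not values taken from the description; the units of the expression or activity measurement are left to the assay.

```python
def relative_changes(pre_level: float, post_levels: list[float]) -> list[float]:
    """Relative change of each post-administration level versus baseline."""
    if pre_level <= 0:
        raise ValueError("pre-administration level must be positive")
    return [(post - pre_level) / pre_level for post in post_levels]


def classify_response(change: float, threshold: float = 0.10) -> str:
    """Label a relative change; the threshold is a hypothetical cut-off."""
    if change >= threshold:
        return "increased"
    if change <= -threshold:
        return "decreased"
    return "unchanged"
```

For example, a baseline of 2.0 followed by measurements of 3.0, 1.0 and 2.1 would be labeled increased, decreased and unchanged respectively, which is the information step (vi) would act on.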
  • the polymorphisms of the present invention can also be used to develop diagnostics tests capable of identifying individuals who express a detectable trait as the result of a specific genotype or individuals whose genotype places them at risk of developing a detectable trait at a subsequent time.
  • the trait analyzed using the present diagnostics may be any detectable trait, including a disease involving cholesterol regulation, a response to an agent acting on cholesterol regulation or side effects to an agent acting on cholesterol regulation.
  • the diagnostic techniques of the present invention may employ a variety of methodologies to determine whether a test subject has a polymorphism pattern associated with an increased risk of developing a detectable trait or whether the individual suffers from a detectable trait as a result of a particular mutation, including methods which enable the analysis of individual chromosomes for haplotyping, such as family studies, single sperm DNA analysis or somatic hybrids.
  • the present invention provides diagnostic methods to determine whether an individual is at risk of developing a disease or suffers from a disease resulting from a mutation or a polymorphism in a candidate gene of the present invention.
  • the present invention also provides methods to determine whether an individual is likely to respond positively to an agent acting on cholesterol-related disorder or whether an individual is at risk of developing an adverse side effect to an agent acting on cholesterol-related disorder.
  • These methods involve obtaining a nucleic acid sample from the individual and determining whether the nucleic acid sample contains at least one allele or at least one polymorphism indicative of a risk of developing the trait or indicative that the individual expresses the trait as a result of possessing a particular candidate gene polymorphism or mutation (trait-causing allele).
  • a nucleic acid sample is obtained from the individual and this sample is genotyped using methods described above in "Methods of Genotyping an Individual for Polymorphisms".
  • the diagnostics may be based on a single polymorphism or on a group of polymorphisms.
  • a nucleic acid sample is obtained from the test subject and the polymorphism pattern of one or more of the polymorphisms (for example a polymorphism of SEQ ID Nos 1, 2 or 3 or 5 to 23, or a polymorphism listed in Table 3) is determined.
  • a PCR amplification is conducted on the nucleic acid sample to amplify regions in which polymorphisms associated with a detectable phenotype have been identified.
  • the amplification products are sequenced to determine whether the individual possesses one or more polymorphisms associated with a detectable phenotype.
  • a preferred set of primers includes primers derived from the sequences described in SEQ ID Nos. 5 to 23.
  • the nucleic acid sample is subjected to microsequencing reactions as described above to determine whether the individual possesses one or more polymorphisms associated with a detectable phenotype resulting from a mutation or a polymorphism in a candidate gene.
  • the nucleic acid sample is contacted with one or more allele specific oligonucleotide probes which specifically hybridize to one or more candidate gene alleles associated with a detectable phenotype.
  • the present invention provides methods of determining whether an individual is at risk of developing cholesterol-related disorder or whether said individual suffers from a cholesterol-related disorder, comprising: a) genotyping said individual for at least one Kalpa-related polymorphism; and b) correlating the result of step a) with a risk of developing cholesterol-related disorder.
  • said Kalpa-related polymorphism is selected from the group consisting of polymorphisms of a Kalpa gene.
  • said Kalpa-related polymorphism is selected from the polymorphisms described in Table 3. In large part because of the risk of complications such as atherosclerosis, the detection of susceptibility to cholesterol-related disorders in individuals is very important.
  • the invention concerns a method for the treatment of cholesterol-related disorders, or a related disorder comprising the following steps: - selecting an individual whose DNA comprises alleles of a Kalpa-related marker or group of polymorphisms associated with a cholesterol-related disorder;
  • the present invention concerns a method for the treatment of cholesterol-related disorder comprising the following steps: - selecting an individual whose DNA comprises alleles of a polymorphism or of a group of Kalpa-related polymorphisms associated with a cholesterol-related disorder;
  • a preventive treatment of cholesterol-related disorder to said individual; - following up said individual for the appearance and the development of cholesterol-related disorder symptoms; and optionally - administering a treatment acting against said cholesterol-related disorder or against symptoms thereof to said individual at the appropriate stage of the disease.
  • the present invention also concerns a method for the treatment of a cholesterol-related disorder comprising the following steps: - selecting an individual whose DNA comprises alleles of a polymorphism or of a group of Kalpa-related polymorphisms associated with the gravity of a cholesterol-related disorder or of the symptoms thereof; and
  • the invention also concerns a method for the treatment of a cholesterol-related disorder in a selected population of individuals.
  • the method comprises:
  • a "positive response" to a medicament can be defined as comprising a reduction of the symptoms related to the disease.
  • a "negative response" to a medicament can be defined as comprising either a lack of positive response to the medicament which does not lead to a symptom reduction or which leads to a side-effect observed following administration of the medicament.
  • the invention also relates to a method of determining whether a subject is likely to respond positively to treatment with a medicament.
  • the method comprises identifying a first population of individuals who respond positively to said medicament and a second population of individuals who respond negatively to said medicament.
  • One or more polymorphisms are identified in the first population which are associated with a positive response to said medicament, or one or more polymorphisms are identified in the second population which are associated with a negative response to said medicament.
  • the polymorphisms may be identified using the techniques described herein.
  • a DNA sample is then obtained from the subject to be tested.
  • the DNA sample is analyzed to determine whether it comprises alleles of one or more polymorphisms associated with a positive response to treatment with the medicament and/or alleles of one or more polymorphisms associated with a negative response to treatment with the medicament.
  • the medicament may be administered to the subject in a clinical trial if the DNA sample contains alleles of one or more polymorphisms associated with a positive response to treatment with the medicament and/or if the DNA sample lacks alleles of one or more polymorphisms associated with a negative response to treatment with the medicament.
  • the medicament is a drug acting against a cholesterol-related disorder.
  • the evaluation of drug efficacy may be conducted in a population of individuals likely to respond favorably to the medicament.
  • Another aspect of the invention is a method of using a medicament comprising obtaining a DNA sample from a subject, determining whether the DNA sample contains alleles of one or more polymorphisms associated with a positive response to the medicament and/or whether the DNA sample contains alleles of one or more polymorphisms associated with a negative response to the medicament, and administering the medicament to the subject if the DNA sample contains alleles of one or more polymorphisms associated with a positive response to the medicament and/or if the DNA sample lacks alleles of one or more polymorphisms associated with a negative response to the medicament.
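The administration criterion can be written out as a small decision rule. This is a sketch under stated assumptions: allele identifiers such as "marker1:A" are hypothetical labels, "lacks alleles" is read here as carrying none of the negative-response alleles, and the `require_both` switch is an illustrative way to choose between the "and" and "or" readings of the criterion.

```python
def meets_criterion(subject_alleles: set[str],
                    positive_alleles: set[str],
                    negative_alleles: set[str],
                    require_both: bool = False) -> bool:
    """Apply the positive-response / negative-response allele criterion."""
    has_positive = bool(subject_alleles & positive_alleles)
    lacks_negative = not (subject_alleles & negative_alleles)
    if require_both:
        return has_positive and lacks_negative
    return has_positive or lacks_negative


# Hypothetical allele sets used only to exercise the rule.
POSITIVE = {"marker1:A"}
NEGATIVE = {"marker2:G"}
```

A subject carrying both "marker1:A" and "marker2:G" satisfies the "or" reading but not the stricter "and" reading, which shows why the choice between the two should be fixed before a trial population is selected.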
  • the invention also concerns a method for the clinical testing of a medicament, preferably a medicament acting against a cholesterol-related disorder or symptoms thereof.
  • the method comprises the following steps: - administering a medicament, preferably a medicament susceptible of acting against a cholesterol-related disorder or symptoms thereof to a heterogeneous population of individuals,
  • Kalpa may be involved in the SCAP/SREBP pathway which mediates LDL receptor expression
  • individuals might differ in Kalpa levels and therefore in their LDL receptor levels.
  • Measurement of Kalpa and/or downstream proteins involved in LDL receptor regulation can then be used to explain differential responses within human population to high cholesterol diets.
  • a method of determining the level of LDL receptor expression in an individual by examining the level of Kalpa expression in said individual.
  • methods of assessing expression or activity of a downstream protein comprising examining Kalpa activity or expression.
  • said downstream protein is an LDL receptor, SREBP, SCAP, S1P or S2P protein.
  • Further preferred methods comprise determining the allele present at a Kalpa-related polymorphism, and correlating said allele with a change in expression of a Kalpa protein, expression of an LDL receptor, SREBP, SCAP, S1P or S2P protein, or with blood or cellular cholesterol, HDL, VLDL or LDL levels.
  • Polymorphisms of the Kalpa gene of the invention offer a number of important advantages over other genetic markers such as RFLP (Restriction fragment length polymorphism) and VNTR (Variable Number of Tandem Repeats) markers.
  • the first generation of markers were RFLPs, which are variations that modify the length of a restriction fragment. But methods used to identify and to type RFLPs are relatively wasteful of materials, effort, and time.
  • the second generation of genetic markers were VNTRs, which can be categorized as either minisatellites or microsatellites. Minisatellites are tandemly repeated DNA sequences present in units of 5-50 repeats which are distributed along regions of the human chromosomes ranging from 0.1 to 20 kilobases in length. Since they present many possible alleles, their informative content is very high. Minisatellites are scored by performing Southern blots to identify the number of tandem repeats present in a nucleic acid sample from the individual being tested. However, there are only 10^4 potential VNTRs that can be typed by Southern blotting. Moreover, both RFLP and VNTR markers are costly and time-consuming to develop and assay in large numbers.
  • Single nucleotide polymorphisms can be used in the same manner as RFLPs and VNTRs but offer several advantages.
  • Single nucleotide polymorphisms are densely spaced in the human genome and represent the most frequent type of variation. An estimated number of more than 10^7 sites are scattered along the 3x10^9 base pairs of the human genome.
  • single nucleotide polymorphisms occur at a greater frequency and with greater uniformity than RFLP or VNTR markers, which means that there is a greater probability that such a marker will be found in close proximity to a genetic locus of interest.
  • Single nucleotide polymorphisms are less variable than VNTR markers but are mutationally more stable.
  • polymorphisms of the present invention are often easier to distinguish and can therefore be typed easily on a routine basis.
  • Polymorphisms have single nucleotide based alleles and they have only two common alleles, which allows highly parallel detection and automated scoring.
  • the polymorphisms of the present invention offer the possibility of rapid, high-throughput genotyping of a large number of individuals.
  • Polymorphisms are densely spaced in the genome, sufficiently informative and can be assayed in large numbers. The combined effects of these advantages make polymorphisms extremely valuable in genetic studies. Polymorphisms can be used in linkage studies in families, in allele sharing methods, in linkage disequilibrium studies in populations, and in association studies of case-control populations. An important aspect of the present invention is that polymorphisms allow association studies to be performed to identify genes involved in complex traits. Association studies examine the frequency of marker alleles in unrelated case- and control-populations and are generally employed in the detection of polygenic or sporadic traits. Association studies may be conducted within the general population and are not limited to studies performed on related individuals in affected families (linkage studies).
  • Biallelic markers in different genes can be screened in parallel for direct association with disease or response to a treatment.
  • This multiple gene approach is a powerful tool for a variety of human genetic studies as it provides the necessary statistical power to examine the synergistic effect of multiple genetic factors on a particular phenotype, drug response, sporadic trait, or disease state with a complex genetic etiology.
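A case-control association test on a single biallelic marker can be sketched with a Pearson chi-square on the 2x2 table of allele counts. The test choice, the function name and the example counts are assumptions for illustration; the description does not prescribe a particular statistic, and a real study would also need correction for multiple markers.

```python
import math


def allele_association_chi2(case_a: int, case_b: int,
                            control_a: int, control_b: int) -> tuple[float, float]:
    """Pearson chi-square (1 degree of freedom) and p-value for a 2x2 allele table.

    Rows are cases and controls; columns are counts of allele A and allele B.
    """
    n = case_a + case_b + control_a + control_b
    row_cases = case_a + case_b
    row_controls = control_a + control_b
    col_a = case_a + control_a
    col_b = case_b + control_b
    denominator = row_cases * row_controls * col_a * col_b
    if denominator == 0:
        raise ValueError("every row and column must contain at least one allele")
    chi2 = n * (case_a * control_b - case_b * control_a) ** 2 / denominator
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

With hypothetical counts of 60 A and 40 B alleles in cases against 40 A and 60 B in controls, the statistic is 8.0, which corresponds to a p-value below 0.01.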
  • the genomic DNA samples from which the polymorphisms of the present invention are generated are preferably obtained from unrelated individuals corresponding to a heterogeneous population of known ethnic background.
  • the number of individuals from whom DNA samples are obtained can vary substantially, preferably from about 10 to about 1000, more preferably from about 50 to about 200 individuals.
  • DNA samples are collected from at least about 100 individuals in order to have sufficient polymorphic diversity in a given population to identify as many markers as possible and to generate statistically significant results.
  • test samples include biological samples, which can be tested by the methods of the present invention described herein, and include human and animal body fluids such as whole blood, serum, plasma, cerebrospinal fluid, urine, lymph fluids, and various external secretions of the respiratory, intestinal and genitourinary tracts, tears, saliva, milk, white blood cells, myelomas and the like; biological fluids such as cell culture supernatants; fixed tissue specimens including tumor and non-tumor tissue and lymph node tissues; bone marrow aspirates and fixed cell specimens.
  • the preferred source of genomic DNA used in the present invention is from peripheral venous blood of each donor. Techniques to prepare genomic DNA from biological samples are well known to the skilled technician. A person skilled in the art can choose to amplify pooled or unpooled DNA samples.
  • DNA samples can be pooled or unpooled for the amplification step.
  • DNA amplification techniques are well known to those skilled in the art. Various methods to amplify DNA fragments carrying polymorphisms are further described herein. The PCR technology is the preferred amplification technique used to identify new polymorphisms.
  • polymorphisms are identified using genomic sequence information generated by the inventors.
  • Genomic DNA fragments, such as the inserts of the BAC clones described above, are sequenced and used to design primers for the amplification of 500 bp fragments. These 500 bp fragments are amplified from genomic DNA and are scanned for polymorphisms.
  • Primers may be designed using the OSP software (Hillier L. and Green P., Methods Appl. 1: 124-8, 1991). All primers may contain, upstream of the specific target bases, a common oligonucleotide tail that serves as a sequencing primer. Those skilled in the art are familiar with primer extensions, which can be used for these purposes.
  • genomic sequences of candidate genes are available in public databases allowing direct screening for biallelic markers.
  • Preferred primers, useful for the amplification of genomic sequences encoding the candidate genes focus on promoters, exons and splice sites of the genes. A polymorphism present in these functional regions of the gene has a higher probability to be a causal mutation.
  • the amplified DNA is subjected to automated dideoxy terminator sequencing reactions using a dye-primer cycle sequencing protocol.
  • the products of the sequencing reactions are run on sequencing gels and the sequences are determined using gel image analysis.
  • the polymorphism search is based on the presence of superimposed peaks in the electrophoresis pattern resulting from different bases occurring at the same position. Because each dideoxy terminator is labeled with a different fluorescent molecule, the two peaks corresponding to a biallelic site present distinct colors corresponding to two different nucleotides at the same position on the sequence. However, the presence of two peaks can be an artifact due to background noise. To exclude such an artifact, the two DNA strands are sequenced and a comparison between the peaks is carried out. In order to be registered as a polymorphic sequence, the polymorphism has to be detected on both strands.
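The both-strand requirement can be sketched as a filter over candidate double-peak calls. The data layout is an assumption: each strand's calls are given as a mapping from a position on the forward-strand coordinate system to the pair of bases read on that strand, and the reverse-strand bases are complemented before comparison. The function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def confirmed_polymorphisms(forward_calls: dict[int, tuple[str, str]],
                            reverse_calls: dict[int, tuple[str, str]]
                            ) -> dict[int, tuple[str, ...]]:
    """Keep only double-peak sites seen on both strands with matching bases."""
    confirmed = {}
    for position, forward_bases in forward_calls.items():
        reverse_bases = reverse_calls.get(position)
        if reverse_bases is None:
            continue  # seen on one strand only: treated as a possible artifact
        reverse_as_forward = {COMPLEMENT[base] for base in reverse_bases}
        if set(forward_bases) == reverse_as_forward:
            confirmed[position] = tuple(sorted(forward_bases))
    return confirmed
```

A site called A/G on the forward strand and T/C on the reverse strand is retained, whereas a site seen on only one strand is discarded as possible background noise.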
  • the above procedure permits those amplification products which contain polymorphisms to be identified.
  • the detection limit for the frequency of biallelic polymorphisms detected by sequencing pools of 100 individuals is approximately 0.1 for the minor allele, as verified by sequencing pools of known allelic frequencies.
  • more than 90% of the biallelic polymorphisms detected by the pooling method have a frequency for the minor allele higher than 0.25. Therefore, the polymorphisms selected by this method have a frequency of at least 0.1 for the minor allele and less than 0.9 for the major allele.
  • preferably at least 0.2 for the minor allele and less than 0.8 for the major allele, more preferably at least 0.3 for the minor allele and less than 0.7 for the major allele, thus a heterozygosity rate higher than 0.18, preferably higher than 0.32, more preferably higher than 0.42.
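The heterozygosity rates quoted above follow from the minor allele frequencies if one assumes a biallelic marker in Hardy-Weinberg proportions, where the expected heterozygosity is 2pq with q the minor allele frequency and p = 1 - q. That assumption is not stated in the text, and the function name is illustrative.

```python
def expected_heterozygosity(minor_allele_frequency: float) -> float:
    """Expected heterozygosity 2pq for a biallelic marker (Hardy-Weinberg)."""
    q = minor_allele_frequency
    if not 0.0 <= q <= 0.5:
        raise ValueError("minor allele frequency must lie between 0 and 0.5")
    return 2.0 * q * (1.0 - q)
```

Minor allele frequencies of 0.1, 0.2 and 0.3 give approximately 0.18, 0.32 and 0.42, matching the three thresholds in the passage.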
  • when polymorphisms are detected by sequencing individual DNA samples, the frequency of the minor allele of such a polymorphism may be less than 0.1.
  • the markers carried by the same fragment of genomic DNA need not necessarily be ordered with respect to one another within the genomic fragment to conduct association studies. However, in some embodiments of the present invention, the order of polymorphisms carried by the same fragment of genomic DNA is determined.
  • the polymorphisms are evaluated for their usefulness as genetic markers by validating that both alleles are present in a population. Validation of the biallelic markers is accomplished by genotyping a group of individuals by a method of the invention and demonstrating that both alleles are present.
  • Microsequencing is a preferred method of genotyping alleles.
  • the validation by genotyping step may be performed on individual samples derived from each individual in the group or by genotyping a pooled sample derived from more than one individual.
  • the group can be as small as one individual if that individual is heterozygous for the allele in question.
  • the group contains at least three individuals, more preferably the group contains five or six individuals, so that a single validation test will be more likely to result in the validation of more of the polymorphisms that are being tested. It should be noted, however, that when the validation test is performed on a small group it may result in a false negative result if as a result of sampling error none of the individuals tested carries one of the two alleles.
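The sampling-error risk in a small validation group can be quantified with a short sketch. It assumes that the 2n chromosomes of n unrelated individuals are independent draws from the population, which is an idealization, and the function name is hypothetical.

```python
def validation_false_negative_probability(minor_allele_frequency: float,
                                          n_individuals: int) -> float:
    """Probability that a group shows only one of the two alleles.

    This is the chance that all 2n sampled chromosomes carry the major
    allele, plus the (much smaller) chance that all carry the minor allele.
    """
    if n_individuals < 1:
        raise ValueError("at least one individual is required")
    q = minor_allele_frequency
    chromosomes = 2 * n_individuals
    return (1.0 - q) ** chromosomes + q ** chromosomes
```

For a minor allele frequency of 0.1, the false negative probability is about 0.53 with three individuals and about 0.28 with six, which illustrates why failure to validate in a small group is weak evidence of an artifact.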
  • the validation process is less useful in demonstrating that a particular initial result is an artifact, than it is at demonstrating that there is a bona fide polymorphism at a particular position in a sequence.
  • For an indication of whether a particular polymorphism has been validated, see Figure 2.
  • genotyping, haplotyping, association, and interaction study methods of the invention may optionally be performed solely with validated biallelic markers.
  • the validated polymorphisms are further evaluated for their usefulness as genetic markers by determining the frequency of the least common allele at the polymorphism site.
  • the determination of the least common allele is accomplished by genotyping a group of individuals by a method of the invention and demonstrating that both alleles are present. This determination of frequency by genotyping step may be performed on individual samples derived from each individual in the group or by genotyping a pooled sample derived from more than one individual.
  • the group must be large enough to be representative of the population as a whole.
  • the group contains at least 20 individuals, more preferably the group contains at least 50 individuals, most preferably the group contains at least 100 individuals.
  • a polymorphism wherein the frequency of the less common allele is 30% or more is termed a "high quality polymorphism."
  • All of the genotyping, haplotyping, association, and interaction study methods of the invention may optionally be performed solely with high quality polymorphisms.
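Estimating the frequency of the least common allele from individual genotypes, and applying the 30% criterion, can be sketched as follows. Genotypes are represented here as two-character strings such as "AG", which is an assumed encoding, and the function names are illustrative.

```python
from collections import Counter


def least_common_allele(genotypes: list[str]) -> tuple[str, float]:
    """Return the least common allele and its frequency in the group."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    if len(counts) != 2:
        raise ValueError("expected exactly two alleles (biallelic marker)")
    total = sum(counts.values())
    allele, count = min(counts.items(), key=lambda item: item[1])
    return allele, count / total


def is_high_quality(genotypes: list[str], threshold: float = 0.30) -> bool:
    """True when the less common allele reaches the 30% criterion."""
    return least_common_allele(genotypes)[1] >= threshold
```

In a hypothetical group of 20 individuals with 9 AA, 8 AG and 3 GG genotypes, allele G has a frequency of 14/40 = 0.35, so the marker would meet the criterion.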
  • Methods are provided to genotype a biological sample for one or more polymorphisms of the present invention, all of which may be performed in vitro.
  • Such methods of genotyping comprise determining the identity of a nucleotide at a Kalpa-related polymorphism by any method known in the art. These methods find use in genotyping case-control populations in association studies as well as individuals in the context of detection of alleles of biallelic markers which are known to be associated with a given trait, in which case both copies of the polymorphism present in the individual's genome are determined so that an individual may be classified as homozygous or heterozygous for a particular allele.
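Once both copies of a polymorphism have been determined, the classification for a particular allele is a simple count. The sketch below assumes the genotype is given as a pair of single-letter alleles; the function name and the "non-carrier" label are illustrative.

```python
def zygosity(genotype: tuple[str, str], allele: str) -> str:
    """Classify an individual for one allele from the two determined copies."""
    first, second = genotype
    copies = (first == allele) + (second == allele)
    return {0: "non-carrier", 1: "heterozygous", 2: "homozygous"}[copies]
```

For example, a genotype of ("A", "G") is heterozygous for allele A, while ("A", "A") is homozygous for it.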
  • These genotyping methods can be performed on nucleic acid samples derived from a single individual or on pooled DNA samples.
  • Genotyping can be performed using similar methods as those described above for the identification of the polymorphisms, or using other genotyping methods such as those further described below.
  • the comparison of sequences of amplified genomic fragments from different individuals is used to identify new polymorphisms whereas microsequencing is used for genotyping known polymorphisms in diagnostic and association study applications.
  • A. Source of DNA for Genotyping Any source of nucleic acids, in purified or non-purified form, can be utilized as the starting nucleic acid, provided it contains or is suspected of containing the specific nucleic acid sequence desired. DNA or RNA may be extracted from cells, tissues, body fluids and the like as described above in II. A. While nucleic acids for use in the genotyping methods of the invention can be derived from any mammalian source, the test subjects and individuals from which nucleic acid samples are taken are generally understood to be human.
  • Methods and polynucleotides are provided to amplify a segment of nucleotides comprising one or more polymorphisms of the present invention. It will be appreciated that amplification of DNA fragments comprising polymorphisms may be used in various methods and for various purposes and is not restricted to genotyping. Nevertheless, many genotyping methods, although not all, require the previous amplification of the DNA region carrying the polymorphism of interest. Such methods specifically increase the concentration or total number of sequences that span the polymorphism or include that site and sequences located either distal or proximal to it. Diagnostic assays may also rely on amplification of DNA segments carrying a polymorphism of the present invention.
  • Amplification of DNA may be achieved by any method known in the art.
  • Amplification methods which can be utilized herein include but are not limited to Ligase Chain Reaction (LCR) as described in EP A 320 308 and EP A 439 182, Gap LCR (Wolcott, M. J., Clin. Microbiol. Rev. 5: 370-386), the so-called "NASBA" or "3SR" technique described in Guatelli J. C et al. (Proc. Natl. Acad. Sci. USA 87: 1874-1878, 1990) and in Compton J.
  • LCR and Gap LCR are exponential amplification techniques; both depend on DNA ligase to join adjacent primers annealed to a DNA molecule.
  • probe pairs are used which include two primary (first and second) and two secondary (third and fourth) probes, all of which are employed in molar excess to target.
  • the first probe hybridizes to a first segment of the target strand and the second probe hybridizes to a second segment of the target strand, the first and second segments being contiguous so that the primary probes abut one another in 5'phosphate-3'hydroxyl relationship, and so that a ligase can covalently fuse or ligate the two probes into a fused product.
  • a third (secondary) probe can hybridize to a portion of the first probe and a fourth (secondary) probe can hybridize to a portion of the second probe in a similar abutting fashion.
  • the secondary probes also will hybridize to the target complement in the first instance.
  • the third and fourth probes which can be ligated to form a complementary, secondary ligated product. It is important to realize that the ligated products are functionally equivalent to either the target or its complement. By repeated cycles of hybridization and ligation, amplification of the target sequence is achieved.
  • a method for multiplex LCR has also been described (WO 9320227).
  • Gap LCR (GLCR) is a version of LCR where the probes are not adjacent but are separated by 2 to 3 bases.
  • AGLCR is a modification of GLCR that allows the amplification of RNA.
  • PCR technology is the preferred amplification technique used in the present invention.
  • a variety of PCR techniques are familiar to those skilled in the art. For a review of PCR technology, see Molecular Cloning to Genetic Engineering, White, B. A., Ed., in Methods in Molecular Biology 67, Humana Press, Totowa (1997) and the publication entitled "PCR Methods and Applications" (1991, Cold Spring Harbor Laboratory Press).
  • PCR primers on either side of the nucleic acid sequences to be amplified are added to a suitably prepared nucleic acid sample along with dNTPs and a thermostable polymerase such as Taq polymerase, Pfu polymerase, or Vent polymerase.
  • the nucleic acid in the sample is denatured and the PCR primers are specifically hybridized to complementary nucleic acid sequences in the sample.
  • the hybridized primers are extended. Thereafter, another cycle of denaturation, hybridization, and extension is initiated. The cycles are repeated multiple times to produce an amplified fragment containing the nucleic acid sequence between the primer sites.
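By way of illustration only, the denaturation, hybridization and extension cycle described above can be sketched in silico. The following Python sketch is not part of the disclosed methods: the function names, the exact-match primer search and the idealized doubling per cycle are all assumptions made for the example.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def in_silico_pcr(template: str, forward: str, reverse: str):
    """Return the fragment bounded by the two primer sites, or None.

    Assumption: primers hybridize only at perfectly matching sites and the
    first such site is used. `template` is one strand written 5'->3'; the
    reverse primer anneals to it, so its site on `template` reads as the
    primer's reverse complement.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


def copies_after(cycles: int, initial: int = 1, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of cycles.

    efficiency = 1.0 means every fragment is copied in every cycle.
    """
    return initial * (1 + efficiency) ** cycles
```

For instance, `in_silico_pcr(template, forward, reverse)` returns the amplified fragment containing the nucleic acid sequence between the primer sites, and `copies_after(30)` gives the idealized yield of 30 cycles from a single starting molecule.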
  • PCR has further been described in several patents including US Patents 4,683,195; 4,683,202 and 4,965,188.
  • the identification of polymorphisms as described above allows the design of appropriate oligonucleotides, which can be used as primers to amplify DNA fragments comprising the polymorphisms of the present invention.
  • Amplification can be performed using the primers initially used to discover new biallelic markers which are described herein or any set of primers allowing the amplification of a DNA fragment comprising a polymorphism of the present invention.
  • Primers can be prepared by any suitable method, for example direct chemical synthesis by a method such as the phosphodiester method of Narang S. A. et al. (Methods Enzymol.
  • the present invention provides primers for amplifying a DNA fragment containing one or more polymorphisms of the present invention.
  • the primers can be obtained from the genomic SEQ ID No 1, or SEQ ID Nos 1001 to 12 or 3, or from sequences from the selected locus as available in databases.
  • the primers are selected to be substantially complementary to the different strands of each specific sequence to be amplified.
  • the length of the primers of the present invention can range from 8 to 100 nucleotides, preferably from 8 to 50, 8 to 30 or more preferably 8 to 25 nucleotides. Shorter primers tend to lack specificity for a target nucleic acid sequence and generally require cooler temperatures to form sufficiently stable hybrid complexes with the template.
  • the formation of stable hybrids depends on the melting temperature (Tm) of the DNA.
  • the Tm depends on the length of the primer, the ionic strength of the solution and the G+C content.
  • the G+C content of the amplification primers of the present invention preferably ranges between 10 and 75 %, more preferably between 35 and 60 %, and most preferably between 40 and 55 %.
  • the appropriate length for primers under a particular set of assay conditions may be empirically determined by one of skill in the art.
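The length and G+C preferences stated above lend themselves to a simple screening step. The sketch below is illustrative only. The Wallace rule used for the Tm estimate (2 °C per A or T, 4 °C per G or C) is a common rule of thumb for short oligonucleotides and is an assumption of this example, not a formula taken from the present description; it ignores the ionic strength of the solution. The function names and default thresholds (8 to 25 nucleotides, 35 to 60 % G+C) are likewise chosen for the example.

```python
def gc_percent(primer: str) -> float:
    """G+C content of a primer, as a percentage of its length."""
    p = primer.upper()
    return 100.0 * (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer: str) -> int:
    """Rough Tm estimate in degrees C using the Wallace rule (assumption)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def check_primer(primer: str, min_len: int = 8, max_len: int = 25,
                 min_gc: float = 35.0, max_gc: float = 60.0) -> dict:
    """Report whether a primer falls within the chosen length and G+C ranges."""
    return {
        "length_ok": min_len <= len(primer) <= max_len,
        "gc_ok": min_gc <= gc_percent(primer) <= max_gc,
        "tm_estimate_c": wallace_tm(primer),
    }
```

A candidate that fails either check would, in practice, still need to be evaluated empirically under the assay conditions, as noted above.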
  • amplified segments carrying biallelic markers can range in size from at least about 25 bp to 35 kbp. Amplification fragments from 25-3000 bp are typical, fragments from 50-1000 bp are preferred and fragments from 100-600 bp are highly preferred. It will be appreciated that amplification primers for the polymorphisms may be any sequence which allows the specific amplification of any DNA fragment carrying the markers.
  • Amplification primers may be labeled or immobilized on a solid support as described in "Polymorphisms and Polynucleotides Comprising Polymorphisms" or "Methods of Genotyping DNA Samples for Polymorphisms". Any method known in the art can be used to identify the nucleotide present at a polymorphism site. Since the polymorphism allele to be detected has been identified and specified in the present invention, detection will prove simple for one of ordinary skill in the art by employing any of a number of techniques. Many genotyping methods require the previous amplification of the DNA region carrying the polymorphism of interest.
  • Another method for determining the identity of the nucleotide present at a particular polymorphic site employs a specialized exonuclease-resistant nucleotide derivative as described in US Patent 4,656,127. Preferred methods involve directly determining the identity of the nucleotide present at a polymorphism site by sequencing assay, enzyme-based mismatch detection assay, or hybridization assay.
  • the term "sequencing assay" is used herein to refer to polymerase extension of duplex primer/template complexes and includes both traditional sequencing and microsequencing.
  • the nucleotide present at a polymorphic site can be determined by sequencing methods.
  • DNA samples are subjected to PCR amplification before sequencing as described above.
  • DNA sequencing methods are well known in the art.
  • the amplified DNA is subjected to automated dideoxy terminator sequencing reactions using a dye-primer cycle sequencing protocol. Sequence analysis allows the identification of the base present at the polymorphism site.
  • a nucleotide at the polymorphic site that is unique to one of the alleles in a target DNA is detected by a single nucleotide primer extension reaction.
  • This method involves appropriate microsequencing primers which hybridize just upstream of a polymorphic base of interest in the target nucleic acid.
  • a polymerase is used to specifically extend the 3' end of the primer with one single ddNTP (chain terminator) complementary to the selected nucleotide at the polymorphic site.
  • microsequencing reactions are carried out using fluorescent ddNTPs and the extended microsequencing primers are analyzed by electrophoresis on ABI 377 sequencing machines to determine the identity of the incorporated nucleotide as described in EP 412 883.
  • capillary electrophoresis can be used in order to process a higher number of assays simultaneously.
  • An example of a typical microsequencing procedure that can be used in the context of the present invention is provided in [Example 3].
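The logic of the single nucleotide primer extension described above can be sketched as follows. This is an illustrative model only and is not the procedure of Example 3: sequences are written on the strand that has the same sequence as the primer (called `sense` here), so that the incorporated terminator carries the same base as `sense` at the polymorphic position. All function names and the default primer length are assumptions of the example.

```python
def design_microsequencing_primer(sense: str, snp_index: int,
                                  primer_len: int = 19) -> str:
    """Primer whose 3' end lies immediately upstream of the polymorphic base."""
    if snp_index < primer_len:
        raise ValueError("not enough upstream sequence for the primer")
    return sense[snp_index - primer_len:snp_index]


def incorporated_terminator(allele_base: str) -> str:
    """Name of the ddNTP added to the primer for a given allele.

    The polymerase adds the ddNTP complementary to the template strand,
    which is the base found on the primer-sense strand at the polymorphism.
    """
    base = allele_base.upper()
    if base not in "ACGT" or len(base) != 1:
        raise ValueError("allele must be one of A, C, G, T")
    return f"dd{base}TP"


def genotype_from_terminators(observed) -> str:
    """Infer a diploid genotype from the set of terminators detected."""
    bases = sorted(name[2] for name in observed)
    if len(bases) == 1:
        return bases[0] * 2          # one signal: homozygous
    if len(bases) == 2:
        return "".join(bases)        # two signals: heterozygous
    raise ValueError("expected one or two distinct terminators")
```

With fluorescent ddNTPs, the two dye signals would stand in for the `observed` set; thresholds for calling a signal present are outside the scope of this sketch.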
  • a homogeneous phase detection method based on fluorescence resonance energy transfer has been described by Chen and Kwok (Nucleic Acids Research 25: 347-353 1997) and Chen et al. (Proc. Natl. Acad. Sci. USA 94/20 216-221, 1997).
  • amplified genomic DNA fragments containing polymorphic sites are incubated with a 5'-fluorescein-labeled primer in the presence of allelic dye-labeled dideoxyribonucleoside triphosphates and a modified Taq polymerase.
  • the dye labeled primer is extended one base by the dye-terminator specific for the allele present on the template.
  • the fluorescence intensities of the two dyes in the reaction mixture are analyzed directly without separation or purification. All these steps can be performed in the same tube and the fluorescence changes can be monitored in real time.
  • the extended primer may be analyzed by MALDI-TOF Mass Spectrometry. The base at the polymorphic site is identified by the mass added onto the microsequencing primer (see Haff L. A. and Smirnov I. P., Genome Research, 7: 378-388, 1997).
  • Microsequencing may be achieved by the established microsequencing method or by developments or derivatives thereof.
  • Alternative methods include several solid-phase microsequencing techniques.
  • the basic microsequencing protocol is the same as described previously, except that the method is conducted as a heterogeneous phase assay, in which the primer or the target molecule is immobilized or captured onto a solid support.
  • oligonucleotides are attached to solid supports or are modified in such ways that permit affinity separation as well as polymerase extension.
  • the 5' ends and internal nucleotides of synthetic oligonucleotides can be modified in a number of different ways to permit different affinity separation approaches, e. g., biotinylation.
  • the oligonucleotides can be separated from the incorporated terminator reagent. This eliminates the need of physical or size separation. More than one oligonucleotide can be separated from the terminator reagent and analyzed simultaneously if more than one affinity group is used. This permits the analysis of several nucleic acid species or more nucleic acid sequence information per extension reaction.
  • the affinity group need not be on the priming oligonucleotide but could alternatively be present on the template. For example, immobilization can be carried out via an interaction between biotinylated DNA and streptavidin-coated microtitration wells or avidin-coated polystyrene particles.
  • oligonucleotides or templates may be attached to a solid support in a high-density format.
  • incorporated ddNTPs can be radiolabeled (Syvanen, Clinica Chimica Acta 226: 225-236, 1994) or linked to fluorescein (Livak and Hainer, Human Mutation 3: 379-385, 1994).
  • the detection of radiolabeled ddNTPs can be achieved through scintillation-based techniques.
  • the detection of fluorescein-linked ddNTPs can be based on the binding of antifluorescein antibody conjugated with alkaline phosphatase, followed by incubation with a chromogenic substrate (such as p-nitrophenyl phosphate).
  • Other possible reporter-detection pairs include: ddNTP linked to dinitrophenyl (DNP) and anti-DNP alkaline phosphatase conjugate (Harju et al., Clin. Chem. 39/11 2282-2287, 1993) or biotinylated ddNTP and horseradish peroxidase-conjugated streptavidin with o-phenylenediamine as a substrate (WO 92/15712).
  • Nyren et al. (Analytical Biochemistry 208: 171-175, 1993) described a method relying on the detection of DNA polymerase activity by an enzymatic luminometric inorganic pyrophosphate detection assay (ELIDA).
  • Pastinen et al. (Genome research 7: 606-614, 1997) describe a method for multiplex detection of single nucleotide polymorphism in which the solid phase minisequencing principle is applied to an oligonucleotide array format. High-density arrays of DNA probes attached to a solid support (DNA chips) are further described herein.
  • the present invention provides polynucleotides and methods to genotype one or more polymorphisms of the present invention by performing a microsequencing assay.
  • microsequencing analysis may be performed for any polymorphism or any combination of polymorphisms of the present invention.
  • the present invention provides polynucleotides and methods to determine the allele of one or more polymorphisms of the present invention in a biological sample, by mismatch detection assays based on polymerases and/or ligases. These assays are based on the specificity of polymerases and ligases. Polymerization reactions place particularly stringent requirements on correct base pairing of the 3' end of the amplification primer, and the joining of two oligonucleotides hybridized to a target DNA sequence is quite sensitive to mismatches close to the ligation site, especially at the 3' end.
  • the term "enzyme-based mismatch detection assay" is used herein to refer to any method of determining the allele of a polymorphism based on the specificity of ligases and polymerases.
  • Allele specific amplification
  • Discrimination between the two alleles of a polymorphism can also be achieved by allele specific amplification, a selective strategy whereby one of the alleles is amplified without amplification of the other allele. This is accomplished by placing a polymorphic base at the 3' end of one of the amplification primers. Because the extension forms from the 3' end of the primer, a mismatch at or near this position has an inhibitory effect on amplification. Therefore, under appropriate amplification conditions, these primers only direct amplification on their complementary allele. Designing the appropriate allele-specific primer and the corresponding assay conditions is well within the ordinary skill in the art.
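The placement of the polymorphic base at the 3' end of an amplification primer can be illustrated with a short sketch. It is a simplified model under stated assumptions: the helper names are hypothetical, sequences are given on the primer-sense strand, and extension is treated as all-or-nothing on a perfect match, whereas real discrimination depends on the amplification conditions.

```python
def allele_specific_primers(sense: str, snp_index: int, alleles: str,
                            primer_len: int = 20) -> dict:
    """Build one primer per allele, each ending (3') on the polymorphic base."""
    start = snp_index - primer_len + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for the primer")
    upstream = sense[start:snp_index]
    return {allele: upstream + allele for allele in alleles}


def primer_extends(primer: str, sample_sense: str, snp_index: int) -> bool:
    """Idealized rule: the primer is extended only if it matches the sample
    perfectly, including the 3'-terminal base at the polymorphic position."""
    start = snp_index - len(primer) + 1
    if start < 0:
        return False
    return sample_sense[start:snp_index + 1] == primer
```

Running both allele-specific primers against a sample then indicates which allele or alleles are present: amplification with one primer only suggests a homozygote, amplification with both a heterozygote.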
  • The Oligonucleotide Ligation Assay (OLA) uses two oligonucleotides which are designed to be capable of hybridizing to abutting sequences of a single strand of a target molecule.
  • One of the oligonucleotides is biotinylated, and the other is detectably labeled. If the precise complementary sequence is found in a target molecule, the oligonucleotides will hybridize such that their termini abut, and create a ligation substrate that can be captured and detected.
  • OLA is capable of detecting polymorphisms and may be advantageously combined with PCR as described by Nickerson D. A. et al. (Proc. Natl. Acad. Sci. U.S.A. 87: 8923-8927, 1990). In this method, PCR is used to achieve the exponential amplification of target DNA, which is then detected using OLA.
  • The ligase chain reaction (LCR), of which Gap LCR (GLCR) is a version, uses two pairs of probes to exponentially amplify a specific target. The sequences of each pair of oligonucleotides are selected to permit the pair to hybridize to abutting sequences of the same strand of the target. Such hybridization forms a substrate for a template-dependent ligase.
  • LCR can be performed with oligonucleotides having the proximal and distal sequences of the same strand of a polymorphism site.
  • either oligonucleotide will be designed to include the polymorphism site.
  • the reaction conditions are selected such that the oligonucleotides can be ligated together only if the target molecule either contains or lacks the specific nucleotide(s) that is complementary to the polymorphism on the oligonucleotide.
  • the oligonucleotides will not include the polymorphism, such that when they hybridize to the target molecule, a "gap" is created as described in WO 90/015. This gap is then "filled" with complementary dNTPs (as mediated by DNA polymerase), or by an additional pair of oligonucleotides. Thus at the end of each cycle, each single strand has a complement capable of serving as a target during the next cycle and exponential allele-specific amplification of the desired sequence is obtained.
  • Ligase/Polymerase-mediated Genetic Bit Analysis is another method for determining the identity of a nucleotide at a preselected site in a nucleic acid molecule (WO 95/21271).
  • This method involves the incorporation of a nucleoside triphosphate that is complementary to the nucleotide present at the preselected site onto the terminus of a primer molecule, and their subsequent ligation to a second oligonucleotide.
  • the reaction is monitored by detecting a specific label attached to the reaction's solid phase or by detection in solution.
  • a preferred method of determining the identity of the nucleotide present at a polymorphism site involves nucleic acid hybridization.
  • the hybridization probes which can be conveniently used in such reactions preferably include the probes defined herein. Any hybridization assay may be used including Southern hybridization, Northern hybridization, dot blot hybridization and solid-phase hybridization (see Sambrook et al., Molecular Cloning-A Laboratory Manual, Second Edition, Cold Spring Harbor Press, N. Y., 1989).
  • Hybridization refers to the formation of a duplex structure by two single stranded nucleic acids.
  • procedures using conditions of high stringency are as follows: Prehybridization of filters containing DNA is carried out for 8 h to overnight at 65 °C in buffer composed of 6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA, and 500 µg/ml denatured salmon sperm DNA. Filters are hybridized for 48 h at 65 °C, the preferred hybridization temperature, in prehybridization mixture containing 100 µg/ml denatured salmon sperm DNA and 5-20 x 10^6 cpm of 32P-labeled probe.
  • the hybridization step can be performed at 65 °C in the presence of SSC buffer, 1 x SSC corresponding to 0.15 M NaCl and 0.05 M Na citrate. Subsequently, filter washes can be done at 37 °C for 1 h in a solution containing 2X SSC, 0.01% PVP, 0.01% Ficoll, and 0.01% BSA, followed by a wash in 0.1X SSC at 50 °C for 45 min. Alternatively, filter washes can be performed in a solution containing 2 x SSC and 0.1% SDS, or 0.5 x SSC and 0.1% SDS, or 0.1 x SSC and 0.1% SDS at 68 °C for 15 minute intervals. Following the wash steps, the hybridized probes are detectable by autoradiography.
  • procedures using conditions of intermediate stringency are as follows: Filters containing DNA are prehybridized, and then hybridized at a temperature of 60 °C in the presence of a 5 x SSC buffer and labeled probe. Subsequently, filter washes are performed in a solution containing 2x SSC at 50 °C and the hybridized probes are detectable by autoradiography.
  • Other conditions of high and intermediate stringency which may be used are well known in the art and as cited in Sambrook et al. (Molecular Cloning-A Laboratory Manual, Second Edition, Cold Spring Harbor Press, N. Y., 1989) and Ausubel et al. (Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley Interscience, N.
  • Although hybridizations can be performed in solution, it is preferred to employ a solid-phase hybridization assay.
  • the target DNA comprising a biallelic marker of the present invention may be amplified prior to the hybridization reaction.
  • the presence of a specific allele in the sample is determined by detecting the presence or the absence of stable hybrid duplexes formed between the probe and the target DNA.
  • the detection of hybrid duplexes can be carried out by a number of methods.
  • Various detection assay formats are well known which utilize detectable labels bound to either the target or the probe to enable detection of the hybrid duplexes.
  • hybridization duplexes are separated from unhybridized nucleic acids and the labels bound to the duplexes are then detected.
  • wash steps may be employed to wash away excess target DNA or probe.
  • Standard heterogeneous assay formats are suitable for detecting the hybrids using the labels present on the primers and probes.
  • the TaqMan assay takes advantage of the 5' nuclease activity of Taq DNA polymerase to digest a DNA probe annealed specifically to the accumulating amplification product.
  • TaqMan probes are labeled with a donor-acceptor dye pair that interacts via fluorescence energy transfer. Cleavage of the TaqMan probe by the advancing polymerase during amplification dissociates the donor dye from the quenching acceptor dye, greatly increasing the donor fluorescence.
  • molecular beacons are used for allele discriminations.
  • Molecular beacons are hairpin-shaped oligonucleotide probes that report the presence of specific nucleic acids in homogeneous solutions. When they bind to their targets they undergo a conformational reorganization that restores the fluorescence of an internally quenched fluorophore (Tyagi et al., Nature Biotechnology, 16: 49-53, 1998).
  • the polynucleotides provided herein can be used in hybridization assays for the detection of polymorphism alleles in biological samples. These probes are characterized in that they preferably comprise between 8 and 50 nucleotides, and in that they are sufficiently complementary to a sequence comprising a polymorphism of the present invention to hybridize thereto and preferably sufficiently specific to be able to discriminate the targeted sequence for only one nucleotide variation.
  • the GC content in the probes of the invention usually ranges between 10 and 75 %, preferably between 35 and 60 %, and more preferably between 40 and 55 %.
  • the length of these probes can range from 10, 15, 20, or 30 to at least 100 nucleotides, preferably from 10 to 50, more preferably from 18 to 35 nucleotides.
  • a particularly preferred probe is 25 nucleotides in length.
  • the polymorphism is within 4 nucleotides of the center of the polynucleotide probe.
  • the polymorphism is at the center of said polynucleotide. Shorter probes may lack specificity for a target nucleic acid sequence and generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. Longer probes are expensive to produce and can sometimes self-hybridize to form hairpin structures. Methods for the synthesis of oligonucleotide probes have been described above and can be applied to the probes of the present invention.
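As an illustration of the probe geometry described above, the sketch below builds one allele-specific probe per allele with the polymorphism at the center of a 25-nucleotide probe. The function name and its parameters are assumptions made for the example; it does not reproduce any probe sequence of the invention.

```python
def centered_allele_probes(sense: str, snp_index: int, alleles: str,
                           length: int = 25) -> dict:
    """Return {allele: probe} with the polymorphic base at the probe center.

    For an odd `length`, equal flanks are taken on both sides of the
    polymorphism; the flanks are copied from `sense` unchanged.
    """
    half = length // 2
    left_start = snp_index - half
    right_end = snp_index + 1 + (length - half - 1)
    if left_start < 0 or right_end > len(sense):
        raise ValueError("not enough flanking sequence for this probe length")
    left = sense[left_start:snp_index]
    right = sense[snp_index + 1:right_end]
    return {allele: left + allele + right for allele in alleles}
```

The resulting probes differ at exactly one position, which is what allows a hybridization assay to discriminate the two alleles under suitably stringent conditions.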
  • the probes of the present invention are labeled or immobilized on a solid support. Labels and solid supports are further described herein.
  • Detection probes are generally nucleic acid sequences or uncharged nucleic acid analogs such as, for example, peptide nucleic acids which are disclosed in International Patent Application WO 92/20702, or morpholino analogs which are described in U. S. Patents Numbered 5,034,506 and 5,142,047.
  • the probe may have to be rendered "non-extendable" in that additional dNTPs cannot be added to the probe.
  • analogs usually are non-extendable and nucleic acid probes can be rendered non-extendable by modifying the 3' end of the probe such that the hydroxyl group is no longer capable of participating in elongation.
  • the 3' end of the probe can be functionalized with the capture or detection label to thereby consume or otherwise block the hydroxyl group.
  • the probes of the present invention are useful for a number of purposes. They can be used in Southern hybridization to genomic DNA or Northern hybridization to mRNA. The probes can also be used to detect PCR amplification products. By assaying the hybridization to an allele specific probe, one can detect the presence or absence of a polymorphism allele in a given sample. High-throughput parallel hybridizations in array format are specifically encompassed within "hybridization assays" and are described below.
  • Hybridization to addressable arrays of oligonucleotides
  • Hybridization assays based on oligonucleotide arrays rely on the differences in hybridization stability of short oligonucleotides to perfectly matched and mismatched target sequence variants. Efficient access to polymorphism information is obtained through a basic structure comprising high-density arrays of oligonucleotide probes attached to a solid support (the chip) at selected positions.
  • Each DNA chip can contain thousands to millions of individual synthetic DNA probes arranged in a grid-like pattern and miniaturized to the size of a dime.
  • the chip technology has already been applied with success in numerous cases. For example, the screening of mutations has been undertaken in the BRCA1 gene, in S. cerevisiae mutant strains, and in the protease gene of HIV-1 virus (Hacia et al., Nature Genetics, 14 (4): 441-447, 1996; Shoemaker et al., Nature Genetics, 14 (4): 450-456, 1996; Kozal et al., Nature Medicine, 2: 753-759, 1996).
  • Chips of various formats for use in detecting biallelic polymorphisms can be produced on a customized basis by Affymetrix (GeneChip™), Hyseq (HyChip and HyGnostics), and Protogene Laboratories.
  • EP785280 describes a tiling strategy for the detection of single nucleotide polymorphisms. Briefly, arrays may generally be "tiled" for a large number of specific polymorphisms.
  • By "tiling" is generally meant the synthesis of a defined set of oligonucleotide probes which is made up of a sequence complementary to the target sequence of interest, as well as preselected variations of that sequence, e.g., substitution of one or more given positions with one or more members of the basis set of monomers, i.e. nucleotides. Tiling strategies are further described in PCT application No. WO 95/11995.
  • arrays are tiled for a number of specific, identified polymorphism sequences.
  • the array is tiled to include a number of detection blocks, each detection block being specific for a specific polymorphism or a set of polymorphisms.
  • a detection block may be tiled to include a number of probes, which span the sequence segment that includes a specific polymorphism. To ensure probes that are complementary to each allele, the probes are synthesized in pairs differing at the polymorphism. In addition to the probes differing at the polymorphic base, monosubstituted probes are also generally tiled within the detection block. These monosubstituted probes have bases at and up to a certain number of bases in either direction from the polymorphism, substituted with the remaining nucleotides (selected from A, T, G, C and U).
  • the probes in a tiled detection block will include substitutions of the sequence positions up to and including those that are 5 bases away from the polymorphism.
  • the monosubstituted probes provide internal controls for the tiled array, to distinguish actual hybridization from artefactual cross-hybridization.
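The composition of a tiled detection block, as described above, can be sketched in a few lines. The example is an illustrative reading of the tiling strategy under stated assumptions: it uses the four DNA bases only, a 25-nucleotide probe centered on the polymorphism, and a window of 5 bases on either side; the function name and the dictionary layout are hypothetical.

```python
BASES = "ACGT"  # assumption: DNA probes only (U is not considered here)


def detection_block(sense: str, snp_index: int, alleles: str,
                    length: int = 25, window: int = 5) -> list:
    """List the probes of one detection block.

    For each allele the block holds the perfectly matched probe
    (offset None) and every monosubstituted probe obtained by replacing
    one position within `window` bases of the polymorphism by each of
    the three remaining bases.
    """
    half = length // 2
    start = snp_index - half
    if start < 0 or start + length > len(sense):
        raise ValueError("not enough flanking sequence for this probe length")
    block = []
    for allele in alleles:
        ref = sense[start:snp_index] + allele + sense[snp_index + 1:start + length]
        block.append({"allele": allele, "offset": None, "probe": ref})
        for offset in range(-window, window + 1):
            pos = half + offset
            for base in BASES:
                if base != ref[pos]:
                    probe = ref[:pos] + base + ref[pos + 1:]
                    block.append({"allele": allele, "offset": offset,
                                  "probe": probe})
    return block
```

With two alleles and a window of 5 this yields 2 x (1 + 11 x 3) = 68 entries. Note that a substitution at offset 0 can regenerate the other allele's perfectly matched probe, so an actual layout would presumably remove such duplicates.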
  • Upon completion of hybridization with the target sequence and washing of the array, the array is scanned to determine the position on the array to which the target sequence hybridizes.
  • the hybridization data from the scanned array is then analyzed to identify which allele or alleles of the polymorphism are present in the sample. Hybridization and scanning may be carried out as described in PCT application No. WO 92/10092 and WO 95/11995 and US patent No. 5,424,186.
  • the chips may comprise an array of nucleic acid sequences of fragments of about 15 nucleotides in length.
  • the chip may comprise an array including at least one of the sequences selected from the group consisting of SEQ ID Nos. 1, 2 or 3 or 5 to 23, and the sequences complementary thereto, or a fragment thereof at least about 8 consecutive nucleotides, preferably 10, 15, 20, more preferably 25, 30, 40, 49, or 50 consecutive nucleotides.
  • the chip may comprise an array of at least 2, 3, 4, 5, 6, 7, 8 or more of these polynucleotides of the invention.
  • Solid supports and polynucleotides of the present invention attached to solid supports are further described herein in "Polymorphisms and Polynucleotides Comprising Polymorphisms".
  • Another technique which may be used to analyze polymorphisms includes multicomponent integrated systems, which miniaturize and compartmentalize processes such as PCR and capillary electrophoresis reactions in a single functional device.
  • An example of such technique is disclosed in US patent 5,589,136, which describes the integration of PCR amplification and capillary electrophoresis in chips.
  • Integrated systems can be envisaged mainly when microfluidic systems are used. These systems comprise a pattern of microchannels designed onto a glass, silicon, quartz, or plastic wafer included on a microchip. The movements of the samples are controlled by electric, electroosmotic or hydrostatic forces applied across different areas of the microchip.
  • the microfluidic system may integrate nucleic acid amplification, microsequencing, capillary electrophoresis and a detection method such as laser-induced fluorescence detection.
  • the polymorphisms may be used in parametric and non-parametric linkage analysis methods.
  • the polymorphisms of the present invention are used to identify genes associated with detectable traits using association studies, an approach which does not require the use of affected families and which permits the identification of genes associated with complex and sporadic traits.
  • the genetic analysis using the polymorphisms of the present invention may be conducted on any scale.
  • the whole set of polymorphisms of the present invention or any subset of polymorphisms of the present invention may be used.
  • a subset of polymorphisms corresponding to one or several candidate genes of the present invention may be used.
  • a subset of polymorphisms corresponding to candidate genes from a given pathway of cholesterol regulation may be used, for example the ubiquitin-proteasome pathway which metabolically regulates intra-cellular degradation of apoB.
  • any set of genetic markers including a polymorphism of the present invention may be used in studies relating to cholesterol regulation.
  • Linkage analysis is based upon establishing a correlation between the transmission of genetic markers and that of a specific trait throughout generations within a family.
  • the aim of linkage analysis is to detect marker loci that show cosegregation with a trait of interest in pedigrees.
  • When data are available from successive generations there is the opportunity to study the degree of linkage between pairs of loci.
  • Estimates of the recombination fraction enable loci to be ordered and placed onto a genetic map. With loci that are genetic markers, a genetic map can be established, and then the strength of linkage between markers and traits can be calculated and used to indicate the relative positions of markers and genes affecting those traits (Weir, 1996).
  • the classical method for linkage analysis is the logarithm of odds (lod) score method (see Morton, 1955; Ott, 1991). Calculation of lod scores requires specification of the mode of inheritance for the disease (parametric method).
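A minimal numerical sketch of the lod score is given below for orientation. It rests on a strong simplifying assumption that is not part of the present description: all meioses are fully informative and phase-known, so that the likelihood at recombination fraction θ is θ^k (1 - θ)^(n - k) for k recombinants among n meioses, and the lod score is the base-10 logarithm of its ratio to the likelihood at θ = 0.5 (no linkage). The function names and the grid of θ values are choices made for the example.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """lod score Z(theta) for k recombinants among n phase-known meioses."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)   # likelihood under free recombination
    return log_l_theta - log_l_null


def max_lod(recombinants: int, meioses: int):
    """Scan theta from 0.01 to 0.50 and return (best_theta, best_lod)."""
    grid = [i / 100 for i in range(1, 51)]
    best = max(grid, key=lambda t: lod_score(recombinants, meioses, t))
    return best, lod_score(recombinants, meioses, best)
```

For one recombinant among ten meioses the score peaks near θ = 0.1. Real pedigrees require a full likelihood computed under the specified mode of inheritance, which this sketch does not attempt.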
  • the length of the candidate region identified using linkage analysis is between 2 and 20Mb.
  • Linkage analysis has been successfully applied to map simple genetic traits that show clear Mendelian inheritance patterns and which have a high penetrance (i.e., the ratio between the number of trait positive carriers of allele a and the total number of a carriers in the population).
  • parametric linkage analysis suffers from a variety of drawbacks. First, it is limited by its reliance on the choice of a genetic model suitable for each studied trait. Furthermore, as already mentioned, the resolution attainable using linkage analysis is limited, and complementary studies are required to refine the analysis of the typical 2Mb to 20Mb regions initially identified through linkage analysis. In addition, parametric linkage analysis approaches have proven difficult when applied to complex genetic traits, such as those due to the combined action of multiple genes and/or environmental factors.
  • The advantage of non-parametric methods for linkage analysis is that they do not require specification of the mode of inheritance for the disease; they therefore tend to be more useful for the analysis of complex traits.
  • In non-parametric methods, one tries to prove that the inheritance pattern of a chromosomal region is not consistent with random Mendelian segregation by showing that affected relatives inherit identical copies of the region more often than expected by chance. Affected relatives should show excess "allele sharing" even in the presence of incomplete penetrance and polygenic inheritance.
  • the degree of agreement at a marker locus in two individuals can be measured either by the number of alleles identical by state (IBS) or by the number of alleles identical by descent (IBD).
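As a small illustration of the identical-by-state measure, the sketch below counts how many alleles two individuals share at one marker. The genotype representation (an unordered pair of allele labels) and the function name are assumptions made for the example. Identity by descent cannot be computed this way, since it needs pedigree information.

```python
from collections import Counter


def ibs_count(genotype_a, genotype_b):
    """Number of alleles (0, 1 or 2) shared identical by state at one marker.

    Each genotype is an unordered pair of allele labels, e.g. ("A", "G").
    """
    # Multiset intersection keeps, per allele, the smaller of the two counts
    shared = Counter(genotype_a) & Counter(genotype_b)
    return sum(shared.values())
```

For example, `ibs_count(("A", "G"), ("A", "A"))` returns 1, because only one copy of allele A can be matched between the two individuals.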
  • polymorphisms of the present invention may be used in both parametric and non-parametric linkage analysis.
  • polymorphisms may be used in non-parametric methods which allow the mapping of genes involved in complex traits.
  • the polymorphisms of the present invention may be used in both IBD and IBS methods to map genes affecting a complex trait. In such studies, taking advantage of the high density of polymorphisms, several adjacent polymorphism loci may be pooled to achieve the efficiency attained by multi-allelic markers (Zhao et al., 1998).
  • the present invention comprises methods for identifying if a Kalpa gene is associated with a detectable trait using the polymorphisms of the present invention.
  • the present invention comprises methods to detect an association between a polymorphism allele or a polymorphism haplotype and a trait. Further, the invention comprises methods to identify a trait causing allele in linkage disequilibrium with any polymorphism allele of the present invention.
  • the polymorphisms of the present invention are used to perform candidate gene association studies.
  • the candidate gene analysis clearly provides a short-cut approach to the identification of genes and gene polymorphisms related to a particular trait when some information concerning the biology of the trait is available.
  • the polymorphisms of the present invention may be incorporated in any map of genetic markers of the human genome in order to perform genome-wide association studies.
  • the polymorphisms of the present invention may further be incorporated in any map of a specific candidate region of the genome (a specific chromosome or a specific chromosomal segment for example).
  • association studies may be conducted within the general population and are not limited to studies performed on related individuals in affected families. Association studies are extremely valuable as they permit the analysis of sporadic or multifactor traits. Moreover, association studies represent a powerful method for fine-scale mapping enabling much finer mapping of trait causing alleles than linkage studies. Studies based on pedigrees often only narrow the location of the trait causing allele. Association studies using the polymorphisms of the present invention can therefore be used to refine the location of a trait causing allele in a candidate region identified by Linkage Analysis methods.
  • a candidate gene such as a candidate gene of the present invention
  • polymorphisms of the present invention can be used to demonstrate that a candidate gene is associated with a trait. Such uses are specifically contemplated in the present invention.
  • Allelic frequencies of the polymorphisms in a population can be determined using one of the methods described above under the heading "Methods for genotyping an individual for polymorphisms", or any genotyping procedure suitable for this intended purpose.
  • Genotyping pooled samples or individual samples can determine the frequency of a polymorphism allele in a population.
  • One way to reduce the number of genotypings required is to use pooled samples.
  • a major obstacle in using pooled samples is the limited accuracy and reproducibility with which DNA concentrations can be determined when setting up the pools.
  • Genotyping individual samples provides higher sensitivity, reproducibility and accuracy, and is the preferred method used in the present invention.
  • each individual is genotyped separately and simple gene counting is applied to determine the frequency of an allele of a polymorphism or of a genotype in a given population.
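Gene counting on individually genotyped samples can be sketched as follows. The input format (one unordered allele pair per diploid individual) and the function name are assumptions made for illustration.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Estimate allele frequencies at one polymorphism by simple gene counting.

    `genotypes` holds one pair of alleles per diploid individual,
    e.g. [("A", "A"), ("A", "G")]; every allele copy counts once.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())  # equals 2N for N diploid individuals
    if total == 0:
        raise ValueError("no genotypes supplied")
    return {allele: n / total for allele, n in counts.items()}
```

With four individuals genotyped as A/A, A/G, G/G and A/G, the function returns a frequency of 0.5 for each allele.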
  • the invention also relates to methods of estimating the frequency of an allele in a population comprising: a) genotyping individuals from said population for said polymorphism according to the method of the present invention; b) determining the proportional representation of said polymorphism in said population.
  • the methods of estimating the frequency of an allele in a population of the invention encompass methods with any further limitation described in this disclosure, or those following, specified alone or in any combination; optionally, wherein said Kalpa-related polymorphism is selected from the group consisting of the polymorphisms of Table 3 and SEQ ID NOS 5 to 23, and the complements thereof, or optionally the polymorphisms in linkage disequilibrium therewith; optionally, wherein said Kalpa-related polymorphism is a polymorphism located in a gene selected from the group consisting of the Kalpa genes, and the complements thereof, or optionally the polymorphisms in linkage disequilibrium therewith.
  • determining the frequency of a polymorphism allele in a population may be accomplished by determining the identity of the nucleotides for both copies of said polymorphism present in the genome of each individual in said population and calculating the proportional representation of said nucleotide at said Kalpa-related polymorphism for the population;
  • determining the proportional representation may be accomplished by performing a genotyping method of the invention on a pooled biological sample derived from a representative number of individuals, or each individual, in said population, and calculating the proportional amount of said nucleotide compared with the total.
  • the gametic phase of haplotypes is unknown when diploid individuals are heterozygous at more than one locus.
  • Using genealogical information in families, gametic phase can sometimes be inferred (Perlin et al., 1994).
  • different strategies may be used.
  • One possibility is that the multiple-site heterozygous diploids can be eliminated from the analysis, keeping only the homozygotes and the single-site heterozygote individuals, but this approach might lead to a possible bias in the sample composition and the underestimation of low-frequency haplotypes.
  • single chromosomes can be studied independently, for example, by asymmetric PCR amplification (see Newton et al., 1989; Wu et al., 1989) or by isolation of single chromosomes by limit dilution followed by PCR amplification (see Ruano et al., 1990). Further, a sample may be haplotyped for sufficiently close polymorphisms by double PCR amplification of specific alleles (Sarkar, G. and Sommer S. S., 1991). These approaches are not entirely satisfying because of their technical complexity, the additional cost they entail, their lack of generalization at a large scale, or the possible biases they introduce.
  • an algorithm to infer the phase of PCR-amplified DNA genotypes introduced by Clark, A.G.(1990) may be used. Briefly, the principle is to start filling a preliminary list of haplotypes present in the sample by examining unambiguous individuals, that is, the complete homozygotes and the single-site heterozygotes. Then other individuals in the same sample are screened for the possible occurrence of previously recognized haplotypes. For each positive identification, the complementary haplotype is added to the list of recognized haplotypes, until the phase information for all individuals is either resolved or identified as unresolved.
  • This method assigns a single haplotype to each multiheterozygous individual, whereas several haplotypes are possible when there is more than one heterozygous site.
  • a method based on an expectation-maximization (EM) algorithm (Dempster et al., 1977) leading to maximum-likelihood estimates of haplotype frequencies under the assumption of Hardy-Weinberg proportions (random mating) is used (see Excoffier L. and Slatkin M., 1995).
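The phase-inference principle attributed to Clark can be sketched in a few lines. This is a simplified illustration, not the published implementation: the genotype encoding (a tuple of unordered allele pairs, one per site), the deterministic search order and the function name are all assumptions of the example.

```python
def clark_phase(genotypes):
    """Simplified Clark-style phase inference.

    Each genotype is a tuple of sites; each site is a pair of alleles,
    e.g. (("A", "G"), ("C", "T")). Returns (resolved, unresolved), where
    `resolved` maps individual index -> (haplotype1, haplotype2).
    """
    def is_compatible(hap, genotype):
        return all(allele in site for allele, site in zip(hap, genotype))

    def complement(hap, genotype):
        # The other haplotype implied by `hap` within this genotype
        return tuple(site[1] if site[0] == allele else site[0]
                     for allele, site in zip(hap, genotype))

    known, resolved = set(), {}
    # Step 1: unambiguous individuals (homozygotes, single-site heterozygotes)
    for idx, genotype in enumerate(genotypes):
        if sum(1 for a, b in genotype if a != b) <= 1:
            h1 = tuple(site[0] for site in genotype)
            h2 = tuple(site[1] for site in genotype)
            known.update((h1, h2))
            resolved[idx] = (h1, h2)

    # Step 2: resolve ambiguous individuals against recognized haplotypes
    progress = True
    while progress:
        progress = False
        for idx, genotype in enumerate(genotypes):
            if idx in resolved:
                continue
            for hap in sorted(known):
                if is_compatible(hap, genotype):
                    other = complement(hap, genotype)
                    known.add(other)
                    resolved[idx] = (hap, other)
                    progress = True
                    break
    unresolved = [i for i in range(len(genotypes)) if i not in resolved]
    return resolved, unresolved
```

As the text notes, the result depends on which recognized haplotype is tried first; the sorted order used here is only a way to make the sketch deterministic.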
  • the EM algorithm is a generalized iterative maximum-likelihood approach to estimation that is useful when data are ambiguous and/or incomplete.
  • the EM algorithm is used to resolve heterozygotes into haplotypes. Haplotype estimations are further described below under the heading "Statistical Methods". Any other method known in the art to determine or to estimate the frequency of a haplotype in a population may be used.
  • the invention also encompasses methods of estimating the frequency of a haplotype for a set of polymorphisms in a population, comprising the steps of: a) genotyping at least one Kalpa-related polymorphism according to a method of the invention for each individual in said population; b) genotyping a second polymorphism by determining the identity of the nucleotides at said second polymorphism for both copies of said second polymorphism present in the genome of each individual in said population; and c) applying a haplotype determination method to the identities of the nucleotides determined in steps a) and b) to obtain an estimate of said frequency.
  • the methods of estimating the frequency of a haplotype of the invention encompass methods with any further limitation described in this disclosure, or those following, specified alone or in any combination: optionally, wherein said Kalpa-related polymorphism is selected from the group consisting of the polymorphisms of Table 3, and the complements thereof, or optionally the polymorphisms in linkage disequilibrium therewith; optionally, wherein said Kalpa-related polymorphism is a polymorphism located in a gene selected from the group consisting of the Kalpa genes, and the complements thereof, or optionally the polymorphisms in linkage disequilibrium therewith.
  • said haplotype determination method is performed by asymmetric PCR amplification, double PCR amplification of specific alleles, the Clark algorithm, or an expectation-maximization algorithm.
  • Linkage disequilibrium is the non-random association of alleles at two or more loci and represents a powerful tool for mapping genes involved in disease traits (see Ajioka R.S. et al., 1997). Polymorphisms, because they are densely spaced in the human genome and can be genotyped in greater numbers than other types of genetic markers (such as RFLP or VNTR markers), are particularly useful in genetic analysis based on linkage disequilibrium.
  • When a disease mutation is first introduced into a population (by a new mutation or the immigration of a mutation carrier), it necessarily resides on a single chromosome and thus on a single "background" or "ancestral" haplotype of linked markers. Consequently, there is complete disequilibrium between these markers and the disease mutation: one finds the disease mutation only in the presence of a specific set of marker alleles. Through subsequent generations recombination events occur between the disease mutation and these marker polymorphisms, and the disequilibrium gradually dissipates. The pace of this dissipation is a function of the recombination frequency, so the markers closest to the disease gene will manifest higher levels of disequilibrium than those that are further away.
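The dependence of disequilibrium on recombination frequency can be made concrete with the standard population-genetics decay relation, D_t = D_0 (1 - theta)^t. This relation is not stated in the text; it is assumed here for illustration and holds for a large, randomly mating population with no selection or new mutation.

```python
def ld_after_generations(d0: float, theta: float, generations: int) -> float:
    """Expected disequilibrium after `generations` of random mating.

    Assumes the textbook relation D_t = D_0 * (1 - theta) ** t, where theta
    is the recombination fraction between the marker and the disease locus.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return d0 * (1.0 - theta) ** generations


# A tightly linked marker keeps far more disequilibrium than a distant one
close_marker = ld_after_generations(0.25, 0.001, 100)
distant_marker = ld_after_generations(0.25, 0.05, 100)
```

After the same number of generations, the marker at theta = 0.001 retains most of its initial disequilibrium, while the marker at theta = 0.05 has lost nearly all of it.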
  • the pattern or curve of disequilibrium between disease and marker loci is expected to exhibit a maximum that occurs at the disease locus. Consequently, the amount of linkage disequilibrium between a disease allele and closely linked genetic markers may yield valuable information regarding the location of the disease gene.
  • For fine-scale mapping of a disease locus it is useful to have some knowledge of the patterns of linkage disequilibrium that exist between markers in the studied region. As mentioned above the mapping resolution achieved through the analysis of linkage disequilibrium is much higher than that of linkage studies. The high density of polymorphisms combined with linkage disequilibrium analysis provides powerful tools for fine-scale mapping. Different methods to calculate linkage disequilibrium are described below under the heading "Statistical Methods".
  • The occurrence of pairs of specific alleles at different loci on the same chromosome is not random, and the deviation from random is called linkage disequilibrium.
  • Association studies focus on population frequencies and rely on the phenomenon of linkage disequilibrium. If a specific allele in a given gene is directly involved in causing a particular trait, its frequency will be statistically increased in an affected (trait positive) population, when compared to the frequency in a trait negative population or in a random control population. As a consequence of the existence of linkage disequilibrium, the frequency of all other alleles present in the haplotype carrying the trait-causing allele will also be increased in trait positive individuals compared to trait negative individuals or random controls.
  • Case-control populations can be genotyped for polymorphisms to identify associations that narrowly locate a trait causing allele, as any marker in linkage disequilibrium with one given marker associated with a trait will be associated with the trait. Linkage disequilibrium allows the relative frequencies in case-control populations of a limited number of genetic polymorphisms to be analyzed as an alternative to screening all possible functional polymorphisms in order to find trait-causing alleles. Association studies compare the frequency of marker alleles in unrelated case-control populations, and represent powerful tools for the dissection of complex traits.
  • Population-based association studies do not concern familial inheritance but compare the prevalence of a particular genetic marker, or a set of markers, in case-control populations. They are case-control studies based on comparison of unrelated case (affected or trait positive) individuals and unrelated control (unaffected, trait negative or random) individuals.
  • the control group is composed of unaffected or trait negative individuals.
  • the control group is ethnically matched to the case population.
  • the control group is preferably matched to the case population for the main known confounding factor for the trait under study (for example age-matched for an age-dependent trait).
  • individuals in the two samples are paired in such a way that they are expected to differ only in their disease status.
  • the terms "trait positive population", "case population" and "affected population" are used interchangeably herein.
  • a major step in the choice of case-control populations is the clinical definition of a given trait or phenotype.
  • Any genetic trait may be analyzed by the association method proposed here by carefully selecting the individuals to be included in the trait positive and trait negative phenotypic groups.
  • Four criteria are often useful: clinical phenotype, age at onset, family history and severity.
  • the selection procedure for continuous or quantitative traits involves selecting individuals at opposite ends of the phenotype distribution of the trait under study, so as to include in these trait positive and trait negative populations individuals with non-overlapping phenotypes.
  • case-control populations comprise phenotypically homogeneous populations.
  • Trait positive and trait negative populations comprise phenotypically uniform populations of individuals representing each between 1 and 98%, preferably between 1 and 80%, more preferably between 1 and 50%, and more preferably between 1 and 30%, most preferably between 1 and 20% of the total population under study, and preferably selected among individuals exhibiting non-overlapping phenotypes.
  • the selection of those drastically different but relatively uniform phenotypes enables efficient comparisons in association studies and the possible detection of marked differences at the genetic level, provided that the sample sizes of the populations under study are significant enough.
  • a first group of between 50 and 300 trait positive individuals, preferably about 100 individuals, is recruited according to their phenotypes. A similar number of control individuals are included in such studies.
  • the invention also comprises methods of detecting an association between a genotype and a phenotype, comprising the steps of: a) determining the frequency of at least one Kalpa-related polymorphism in a trait positive population according to a genotyping method of the invention; b) determining the frequency of said Kalpa-related polymorphism in a control population according to a genotyping method of the invention; and c) determining whether a statistically significant association exists between said genotype and said phenotype.
  • said control population may be a trait negative population, or a random population;
  • each of said genotyping steps a) and b) may be performed on a pooled biological sample derived from each of said populations;
  • each of said genotyping of steps a) and b) is performed separately on biological samples derived from each individual in said population or a subsample thereof.
  • the general strategy to perform association studies using polymorphisms derived from a region carrying a candidate gene is to scan two groups of individuals (case-control populations) in order to measure and statistically compare the allele frequencies of the polymorphisms of the present invention in both groups.
  • If a statistically significant association with a trait is identified for at least one of the analyzed polymorphisms, one can assume that either the associated allele is directly responsible for causing the trait (i.e. the associated allele is the trait causing allele) or, more likely, the associated allele is in linkage disequilibrium with the trait causing allele.
  • the specific characteristics of the associated allele with respect to the candidate gene function usually give further insight into the relationship between the associated allele and the trait (causal or in linkage disequilibrium).
  • the trait causing allele can be found by sequencing the vicinity of the associated marker, and performing further association studies with the polymorphisms that are revealed in an iterative manner.
  • association studies are usually run in two successive steps. In a first phase, the frequencies of a reduced number of polymorphisms from the candidate gene are determined in the trait positive and control populations. In a second phase of the analysis, the position of the genetic loci responsible for the given trait is further refined using a higher density of markers from the relevant region. Alternatively, a single phase may be sufficient to establish significant associations.
  • when a chromosome carrying a disease allele first appears in a population as a result of either mutation or migration, the mutant allele necessarily resides on a chromosome having a set of linked markers: the ancestral haplotype.
  • This haplotype can be tracked through populations and its statistical association with a given trait can be analyzed. Complementing single point (allelic) association studies with multi-point association studies also called haplotype studies increases the statistical power of association studies.
  • A haplotype association study allows one to define the frequency and the type of the ancestral carrier haplotype.
  • a haplotype analysis is important in that it increases the statistical power of an analysis involving individual markers.
  • In a haplotype frequency analysis, the frequency of the possible haplotypes based on various combinations of the identified polymorphisms of the invention is determined.
  • the haplotype frequency is then compared for distinct populations of trait positive and control individuals.
  • the number of trait positive individuals which should be subjected to this analysis to obtain statistically significant results usually ranges between 30 and 300, with a preferred number of individuals ranging between 50 and 150. The same considerations apply to the number of unaffected individuals (or random controls) used in the study.
  • the results of this first analysis provide haplotype frequencies in case-control populations; for each evaluated haplotype frequency, a p-value and an odds ratio are calculated. If a statistically significant association is found, the relative risk for an individual carrying the given haplotype of being affected with the trait under study can be approximated.
  • An additional embodiment of the present invention encompasses methods of detecting an association between a haplotype and a phenotype, comprising the steps of: a) estimating the frequency of at least one haplotype in a trait positive population, according to a method of the invention for estimating the frequency of a haplotype; b) estimating the frequency of said haplotype in a control population, according to a method of the invention for estimating the frequency of a haplotype; and c) determining whether a statistically significant association exists between said haplotype and said phenotype.
  • the methods of detecting an association between a haplotype and a phenotype of the invention encompass methods with any further limitation described in this disclosure, or those following: optionally, wherein said Kalpa-related polymorphism is a polymorphism located in a sequence of SEQ ID Nos 1 or 2 or 3, the complements thereof, or optionally the polymorphisms in linkage disequilibrium therewith; optionally, said control population is a trait negative population, or a random population.
  • said method comprises the additional steps of determining the phenotype in said trait positive and said control populations prior to step c).
  • the polymorphisms of the present invention may also be used to identify patterns of polymorphisms associated with detectable traits resulting from polygenic interactions.
  • the analysis of genetic interaction between alleles at unlinked loci requires individual genotyping using the techniques described herein.
  • the analysis of allelic interaction among a selected set of polymorphisms with appropriate level of statistical significance can be considered as a haplotype analysis.
  • Interaction analysis comprises stratifying the case-control populations with respect to a given haplotype for the first loci and performing a haplotype analysis with the second loci with each subpopulation. Statistical methods used in association studies are further described below.
  • the polymorphisms of the present invention may further be used in TDT (transmission/disequilibrium test).
  • TDT tests for both linkage and association and is not affected by population stratification.
  • TDT requires data for affected individuals and their parents or data from unaffected sibs instead of from parents (see Spielmann S. et al., 1993; Schaid D.J. et al., 1996, Spielmann S. and Ewens W.J., 1998).
  • Such combined tests generally reduce the false-positive errors produced by separate analyses.
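A minimal sketch of the TDT statistic follows. It assumes the usual McNemar-type form for a biallelic marker, (b - c)^2 / (b + c), compared with a chi-square distribution with one degree of freedom, where b and c are the numbers of times heterozygous parents did and did not transmit the allele of interest to affected offspring. The text does not give this formula; the function name and inputs are illustrative.

```python
import math


def tdt_statistic(transmitted: int, untransmitted: int):
    """TDT chi-square (1 df) and p-value from heterozygous-parent transmissions.

    Assumption: statistic = (b - c)**2 / (b + c), with b transmissions and
    c non-transmissions of the allele under test.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Only heterozygous parents contribute, which is why the test is insensitive to population stratification: each transmission is compared with the non-transmitted allele from the same parent.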
  • any method known in the art to test whether a trait and a genotype show a statistically significant correlation may be used.
  • haplotype frequencies can be estimated from the multilocus genotypic data. Any method known to a person skilled in the art can be used to estimate haplotype frequencies (see Lange K., 1997; Weir, B.S., 1996). Preferably, maximum-likelihood haplotype frequencies are computed using an Expectation-Maximization (EM) algorithm (see Dempster et al., 1977; Excoffier L. and Slatkin M., 1995).
  • This procedure is an iterative process aiming at obtaining maximum-likelihood estimates of haplotype frequencies from multi-locus genotype data when the gametic phase is unknown.
  • Haplotype estimations are usually performed by applying the EM algorithm using for example the EM-HAPLO program (Hawley M. E. et al., 1994) or the Arlequin program (Schneider et al., 1997).
  • the EM algorithm is a generalized iterative maximum likelihood approach to estimation and is briefly described below.
  • phenotypes will refer to multi-locus genotypes with unknown haplotypic phase. Genotypes will refer to multi-locus genotypes with known haplotypic phase.
  • the E-M algorithm is composed of the following steps: First, the genotype frequencies are estimated from a set of initial values of haplotype frequencies. These haplotype frequencies are denoted $P_1^{(0)}, P_2^{(0)}, \ldots, P_H^{(0)}$. The initial values for the haplotype frequencies may be obtained from a random number generator or in some other way well known in the art. This step is referred to as the Expectation step. The next step in the method, called the Maximization step, consists of using the estimates for the genotype frequencies to re-calculate the haplotype frequencies. The first iteration haplotype frequency estimates are denoted by $P_1^{(1)}, P_2^{(1)}, \ldots, P_H^{(1)}$. In general, the Expectation step at the $s$-th iteration consists of calculating the probability of placing each phenotype into the different possible genotypes based on the haplotype frequencies of the previous iteration.
  • $\delta_{it}$ is an indicator variable which counts the number of occurrences that haplotype $t$ is present in the $i$-th genotype; it takes on values 0, 1, and 2.
  • the E-M iterations cease when the following criterion has been reached.
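The Expectation and Maximization steps described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the EM-HAPLO or Arlequin code: genotypes are tuples of unordered allele pairs, the starting frequencies are uniform rather than random, Hardy-Weinberg proportions are assumed, and iteration stops when the log-likelihood changes by less than `tol` (a hypothetical default of 1e-7).

```python
import math
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """All unordered haplotype pairs consistent with an unphased genotype."""
    pairs = set()
    # For each site, choose which allele is placed on the first haplotype
    for choice in product((0, 1), repeat=len(genotype)):
        h1 = tuple(site[c] for site, c in zip(genotype, choice))
        h2 = tuple(site[1 - c] for site, c in zip(genotype, choice))
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, tol=1e-7, max_iter=1000):
    """Maximum-likelihood haplotype frequencies from unphased genotypes."""
    expansions = [compatible_pairs(g) for g in genotypes]
    haplotypes = sorted({h for pairs in expansions for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    prev_loglik = -math.inf
    for _ in range(max_iter):
        counts = defaultdict(float)
        loglik = 0.0
        for pairs in expansions:
            # Hardy-Weinberg probability of each candidate phased genotype
            weights = [(1 if h1 == h2 else 2) * freq[h1] * freq[h2]
                       for h1, h2 in pairs]
            total = sum(weights)
            loglik += math.log(total)
            # E-step: split this individual among its candidate genotypes
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: gene counting over the 2N chromosomes
        freq = {h: counts[h] / (2 * len(genotypes)) for h in haplotypes}
        if abs(loglik - prev_loglik) < tol:
            break
        prev_loglik = loglik
    return freq
```

Because the candidate genotypes are enumerated exhaustively, the cost grows as 2^(k-1) per individual with k heterozygous sites, so this form is only practical for small sets of markers.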
  • In practice, linkage disequilibrium between any two genetic positions is measured by applying a statistical association test to haplotype data taken from a population.
  • Linkage disequilibrium (LD) between pairs of polymorphisms can also be calculated for every allele combination ($a_i,a_j$; $a_i,b_j$; $b_i,a_j$; and $b_i,b_j$), according to the maximum-likelihood estimate (MLE) for delta (the composite genotypic disequilibrium coefficient), as described by Weir (Weir B. S., 1996).
  • $n_1 = \sum$ phenotype $(a_i/a_i,\ a_j/a_j)$
  • $n_2 = \sum$ phenotype $(a_i/a_i,\ a_j/b_j)$
  • $n_3 = \sum$ phenotype $(a_i/b_i,\ a_j/a_j)$
  • $n_4 = \sum$ phenotype $(a_i/b_i,\ a_j/b_j)$ and $N$ is the number of individuals in the sample. This formula allows linkage disequilibrium between alleles to be estimated when only genotype, and not haplotype, data are available.
  • Another means of calculating the linkage disequilibrium between markers is as follows. For a couple of polymorphisms, $M_i$ ($a_i/b_i$) and $M_j$ ($a_j/b_j$), fitting the Hardy-Weinberg equilibrium, one can estimate the four possible haplotype frequencies in a given population according to the approach described above.
  • $pr(a_i)$ is the probability of allele $a_i$
  • $pr(a_j)$ is the probability of allele $a_j$
  • $pr(\text{haplotype}(a_i, a_j))$ is estimated as in Equation 3 above.
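Given the three quantities just defined, a pairwise disequilibrium value can be computed as sketched below. The sketch assumes the standard gametic definition D = pr(haplotype(a_i, a_j)) - pr(a_i) pr(a_j), since the equation itself is not reproduced in the text, and it additionally reports the squared correlation r^2, which is an extra not mentioned in the source.

```python
def pairwise_ld(hap_freq: float, freq_ai: float, freq_aj: float):
    """Gametic disequilibrium D and squared correlation r^2 for two markers.

    hap_freq : estimated frequency of the haplotype carrying a_i and a_j
    freq_ai  : frequency of allele a_i at the first marker
    freq_aj  : frequency of allele a_j at the second marker
    """
    for p in (freq_ai, freq_aj):
        if not 0.0 < p < 1.0:
            raise ValueError("both markers must be polymorphic")
    d = hap_freq - freq_ai * freq_aj
    r_squared = d * d / (freq_ai * (1 - freq_ai) * freq_aj * (1 - freq_aj))
    return d, r_squared
```

With allele frequencies of 0.5 at both markers, a haplotype frequency of 0.5 gives D = 0.25 and r^2 = 1 (complete disequilibrium), whereas a haplotype frequency of 0.25 gives D = 0 (equilibrium).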
  • Linkage disequilibrium among a set of polymorphisms having an adequate heterozygosity rate can be determined by genotyping between 50 and 1000 unrelated individuals, preferably between 75 and 200, more preferably around 100.
  • The statistical significance of a correlation between a phenotype and a genotype may be determined by any statistical test known in the art and with any accepted threshold of statistical significance being required. The application of particular methods and thresholds of significance is well within the skill of the ordinary practitioner of the art. Testing for association is performed by determining the frequency of a polymorphism allele in case and control populations and comparing these frequencies with a statistical test to determine if there is a statistically significant difference in frequency which would indicate a correlation between the trait and the polymorphism allele under study.
  • a haplotype analysis is performed by estimating the frequencies of all possible haplotypes for a given set of polymorphisms in case and control populations, and comparing these frequencies with a statistical test to determine if there is a statistically significant correlation between the haplotype and the phenotype (trait) under study.
  • Any statistical tool useful to test for a statistically significant association between a genotype and a phenotype may be used.
  • the statistical test employed is a chi-square test with one degree of freedom. A P-value is calculated (the P-value is the probability that a statistic as large or larger than the observed one would occur by chance).
  • the p value related to a polymorphism association is preferably about $1 \times 10^{-2}$ or less, more preferably about $1 \times 10^{-4}$ or less, for a single polymorphism analysis and about $1 \times 10^{-3}$ or less, still more preferably $1 \times 10^{-6}$ or less and most preferably of about $1 \times 10^{-8}$ or less, for a haplotype analysis involving two or more markers.
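The one-degree-of-freedom chi-square test can be sketched for a 2x2 table of allele counts in cases and controls. The table layout, the absence of a continuity correction and the function name are assumptions of this example.

```python
import math


def allelic_chi_square(case_a: int, case_b: int, control_a: int, control_b: int):
    """Pearson chi-square (1 df, no continuity correction) and its P-value.

    Inputs are allele counts: allele a and allele b in cases, then in controls.
    """
    n = case_a + case_b + control_a + control_b
    row_case, row_control = case_a + case_b, control_a + control_b
    col_a, col_b = case_a + control_a, case_b + control_b
    if min(row_case, row_control, col_a, col_b) == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = (n * (case_a * control_b - case_b * control_a) ** 2
            / (row_case * row_control * col_a * col_b))
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For counts of 60/40 in cases against 40/60 in controls, the statistic is 8.0 and the P-value is about 0.0047.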
  • These values are believed to be applicable to any association studies involving single or multiple marker combinations.
  • the skilled person can use the range of values set forth above as a starting point in order to carry out association studies with polymorphisms of the present invention. In doing so, significant associations between the polymorphisms of the present invention and a trait can be revealed and used for diagnosis and drug screening purposes.
  • genotyping data from case-control individuals are pooled and randomized with respect to the trait phenotype.
  • Each individual's genotyping data are randomly allocated to two groups, which contain the same number of individuals as the case-control populations used to compile the data obtained in the first stage.
  • a second stage haplotype analysis is preferably run on these artificial groups, preferably for the markers included in the haplotype of the first stage analysis showing the highest relative risk coefficient. This experiment is reiterated preferably at least between 100 and 10000 times. The repeated iterations allow the determination of the probability to obtain the tested haplotype by chance.
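The randomization procedure described above amounts to a permutation test, which can be sketched as follows. The statistic used here (absolute difference in haplotype carrier frequency between the two groups), the add-one correction in the returned p-value, and all names are assumptions of the example; in practice the statistic would be the haplotype analysis from the first stage.

```python
import random


def carrier_frequency_difference(group_a, group_b):
    """Absolute difference in carrier frequency (values coded 0 or 1)."""
    return abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))


def permutation_p_value(cases, controls, statistic=carrier_frequency_difference,
                        n_permutations=1000, seed=0):
    """Empirical probability of a statistic at least as large as observed."""
    rng = random.Random(seed)
    observed = statistic(cases, controls)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reallocate individuals to two artificial groups that
        # keep the original case and control sample sizes
        rng.shuffle(pooled)
        if statistic(pooled[:n_cases], pooled[n_cases:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_permutations + 1)
```

The number of permutations bounds the smallest p-value that can be reported, which is one reason the text suggests repeating the experiment between 100 and 10000 times.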
  • In genetic epidemiology, the risk factor is the presence or the absence of a certain allele or haplotype at marker loci.
  • Attributable risk (AR) is the risk attributable to a polymorphism allele or a polymorphism haplotype.
  • $P_E$ is the frequency of exposure to an allele or a haplotype within the population at large; and RR is the relative risk, which is approximated with the odds ratio when the trait under study has a relatively low incidence in the general population.
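The quantities above can be combined in a short sketch. It assumes Levin's standard formula for population attributable risk, AR = P_E (RR - 1) / (P_E (RR - 1) + 1), which is not reproduced in the text, and it uses the odds ratio from a 2x2 carrier table as the approximation of RR mentioned above. Function and argument names are illustrative.

```python
def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Odds ratio from a 2x2 table of carrier status by case/control status."""
    if case_noncarriers == 0 or control_carriers == 0:
        raise ValueError("odds ratio undefined when a denominator cell is zero")
    return (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)


def attributable_risk(exposure_frequency: float, relative_risk: float) -> float:
    """Levin's attributable risk: P_E (RR - 1) / (P_E (RR - 1) + 1)."""
    excess = exposure_frequency * (relative_risk - 1.0)
    return excess / (excess + 1.0)


# 60 of 100 cases and 40 of 100 controls carry the allele or haplotype
rr_estimate = odds_ratio(60, 40, 40, 60)
ar_estimate = attributable_risk(0.4, rr_estimate)
```

In this example the odds ratio is 2.25; taking the control carrier frequency (0.4) as the population exposure frequency gives an attributable risk of one third.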
  • Identification of additional markers in linkage disequilibrium with a given marker involves: (a) amplifying a genomic fragment comprising a first polymorphism from a plurality of individuals; (b) identifying second polymorphisms in the genomic region harboring said first polymorphism; (c) conducting a linkage disequilibrium analysis between said first polymorphism and second polymorphisms; and (d) selecting said second polymorphisms as being in linkage disequilibrium with said first marker. Subcombinations comprising steps (b) and (c) are also contemplated.
  • the present invention then also concerns polymorphisms which are in linkage disequilibrium with the specific biallelic markers shown in Figure 2 and which are expected to present similar characteristics in terms of their respective association with a given trait.
  • the associated candidate gene can be scanned for mutations by comparing the sequences of a selected number of affected individuals and control individuals.
  • functional regions such as exons and splice sites, promoters and other regulatory regions of the candidate gene are scanned for mutations.
  • affected individuals carry the haplotype shown to be associated with the trait and trait negative or control individuals do not carry the haplotype or allele associated with the trait.
  • the mutation detection procedure is essentially similar to that used for SNP identification.
  • the method used to detect such mutations generally comprises the following steps: (a) amplification of a region of the candidate gene comprising a biallelic marker or a group of polymorphisms associated with the trait from DNA samples of affected patients and trait negative controls; (b) sequencing of the amplified region; (c) comparison of DNA sequences from affected trait-positive patients and trait-negative controls; and (d) determination of mutations specific to affected trait-positive patients. Subcombinations which comprise steps (b) and (c) are specifically contemplated.
  • candidate polymorphisms are then verified by screening a larger population of cases and controls by means of any genotyping procedure such as those described herein, preferably using a microsequencing technique in an individual test format. Polymorphisms are considered as candidate mutations when present in cases and controls at frequencies compatible with the expected association results.
  • Candidate polymorphisms and mutations of the Kalpa gene suspected of being responsible for the detectable phenotype, such as low cholesterol or LDL levels, can be confirmed by screening a larger population of affected and unaffected individuals using any of the genotyping procedures described herein. Preferably the microsequencing technique is used. Such polymorphisms are considered as candidate "trait-causing" mutations when they exhibit a statistically significant correlation with the detectable phenotype.
  • the sample comprised three groups: 1) A set of 146 unrelated individuals called 'UIS for Unrelated Individual Set'. Due to the plate configuration
  • A set of 382 nuclear families (some of them related), with in general only the sibship genotyped, called 'FBS' (Family Based Set).
  • a sub-sample named 'FBS-adult' is composed of families whose children are adults aged 20 to 70. This sub-sample is composed of 194 families.
  • the marker coding was the following: the most frequent allele (as identified in the CUS) is coded allele 1, the less frequent allele 2. Allele 1 is used as the reference allele. All the genetic models and the results presented are thus the effect of allele 2 (or genotypes containing 2) relative to allele 1. For the genotypic coding schemes, it is always the genotypes 11 or 11+12 that are taken as references.
  • Factor (F): unconstrained relation between homozygotes 11, heterozygotes 12 and homozygotes 22
  • Co-dominant effect (C)
  • Dominant effect (D)
  • Recessive effect (R)
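The coding schemes listed above can be illustrated with a short sketch. The dominant and recessive codings follow from the stated references (11, or 11+12). The numeric coding used for the co-dominant scheme (count of allele 2) is an assumption, since the text does not spell it out, and the function name is hypothetical.

```python
def encode_genotype(genotype, scheme):
    """Encode a biallelic genotype ('11', '12', '21' or '22') under a coding scheme.

    scheme: 'F' (factor), 'C' (co-dominant), 'D' (dominant) or 'R' (recessive).
    Allele 1 is the reference allele; codes describe the effect of allele 2.
    """
    alleles = sorted(genotype)
    if len(alleles) != 2 or any(a not in "12" for a in alleles):
        raise ValueError("genotype must be two alleles, each '1' or '2'")
    n2 = alleles.count("2")  # number of copies of allele 2
    if scheme == "F":
        return "".join(alleles)        # three unordered levels: '11', '12', '22'
    if scheme == "C":
        return n2                      # assumed additive coding: 0, 1 or 2
    if scheme == "D":
        return 1 if n2 >= 1 else 0     # reference: genotype 11
    if scheme == "R":
        return 1 if n2 == 2 else 0     # reference: genotypes 11 + 12
    raise ValueError("unknown coding scheme")
```

Encoding every individual once per scheme makes it straightforward to fit the same association test under each genetic model.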
  • the analysis of variance is the simplest case, with only one factor to test, which has either 2 levels (D, C, R coding schemes) or 3 levels (Factor coding scheme).
  • the test of association is thus a usual Fisher test (which tests for the importance of one factor in the partition of the overall variance of the trait) and, in the case of the Factor coding scheme, mean tests.
  • the test of association is done using a likelihood ratio test. It is similar in the concept that the correlation between related individuals has to be taken into account as a random component in the model.
  • Two different tests can be constructed: a global association test, which tests for association within and between families, and a within-family association test, which takes into account only within-family comparisons. This latter test is robust against any ethnic stratification in the sample. If an association is present, both tests have to be significant.
  • HW Test of Hardy-Weinberg departure (following a chi square with 1 df).
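A test of Hardy-Weinberg departure with 1 degree of freedom can be sketched as follows. This is an illustrative implementation of the usual Pearson chi-square on genotype counts, not necessarily the exact procedure used in the study; the function name is hypothetical.

```python
import math

def hardy_weinberg_test(n11, n12, n22):
    """Chi-square test (1 df) of departure from Hardy-Weinberg proportions.

    n11, n12, n22: observed counts of genotypes 11, 12 and 22.
    Returns (chi2, p_value).
    """
    n = n11 + n12 + n22
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n11 + n12) / (2 * n)  # frequency of allele 1
    q = 1.0 - p                    # frequency of allele 2
    observed = (n11, n12, n22)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Markers with a small p-value deviate from Hardy-Weinberg proportions, which may point to genotyping problems or population structure.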
  • markers SNP85.3 to SNP39.3 are in near complete to complete linkage disequilibrium (all D' measures are close to 1). Marker Coverage is sufficient in this gene.
  • haplotype 10 stays. Only six haplotypes are thus left: this shows that 1) recombination occurred between SNP2.3 and the rest of the markers, and 2) no recombination is observed between markers SNP85.3 to SNP39.3. This profile of haplotype frequencies explains the LD matrix.
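The D' measure referred to above can be computed from two-marker haplotype counts. The sketch below uses the standard definition of Lewontin's D'; the input layout (a dictionary keyed by two-character haplotypes, alleles coded 1 and 2) and the function name are assumptions made for illustration.

```python
def d_prime(hap_counts):
    """Absolute Lewontin's D' between two biallelic markers.

    hap_counts: maps haplotypes '11', '12', '21', '22' to counts, where the
    first character is the allele at marker A and the second at marker B.
    """
    counts = {h: hap_counts.get(h, 0) for h in ("11", "12", "21", "22")}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no haplotypes supplied")
    f11 = counts["11"] / total
    p_a = (counts["11"] + counts["12"]) / total  # allele 1 at marker A
    p_b = (counts["11"] + counts["21"]) / total  # allele 1 at marker B
    d = f11 - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # at least one marker is monomorphic
    return abs(d) / d_max
```

A value of 1 is obtained whenever at least one of the four possible haplotypes is absent, which is the pattern expected when no recombination has occurred between two markers.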
  • V.1 Single marker association. Association results are presented gene by gene. The adjustment has been done on age, gender and on BMI measurements.
  • C codominant model
  • R recessive model
  • D dominant model
  • Table 9 shows that the family based tests are significant for total cholesterol and LDL-cholesterol with the markers SNP5.3, SNP50.3 and SNP81.3 and, to a lesser extent, with SNP39.3.
  • the results are more significant with SNP5.3 and SNP50.3 (p-values in brackets).
  • Because this test compares differences of genotypes within sibships, it is less powerful than a global test of association, but it is robust to any stratification of the population. So, provided that we are not facing a false positive, the association results cannot be explained by a stratification of the population and reflect a true association.
  • the results reflect the results obtained on the FBS samples.
  • the allele 2 (T) of SNP5.3 lowers LDL-cholesterol.
  • the size of the effect is estimated to be between 15 and 20% of the absolute value of LDL-cholesterol.
  • Example I: Association between chromosome 13q locus polymorphisms and lipid phenotypes. Single nucleotide polymorphisms in genes located in the region defined by markers D13S158 to D13S265 were selected and subjected to a genotype-phenotype association analysis.
  • Five lipid phenotypes were studied, namely total cholesterol (CHOL), HDL-cholesterol (HDL), LDL-cholesterol (LDL), triglycerides (TGRL) and the LDL/HDL ratio (LDL/HDL). Values of these five variables in the two sample sets were representative of what is expected in a population-based study. Twenty-nine markers spanning the region were genotyped and analyzed in both samples.
  • a marker was reported significant if at least one of the sample sets reported a significant result at the level of 0.05.
  • hCT 1640442 a gene encoding SOX21
  • hCT23424 a gene coding for ABCC4
  • hCT21820 a gene with homology to the proteasome i-chain gene
  • hCT 1644904 a gene with homology to EBI2 gene
  • hCT15697 a gene encoding propionyl-CoA carboxylase, subunit alpha
  • Example IB Association between Kalpa locus polymorphisms and lipid phenotypes
  • the DNA from individuals is extracted and tested for the detection of the polymorphisms.
  • 30 ml of peripheral venous blood are taken from each donor in the presence of EDTA.
  • Cells (pellet) are collected after centrifugation for 10 minutes at 2000 rpm.
  • Red cells are lysed by a lysis solution (50 ml final volume: 10 mM Tris pH 7.6; 5 mM MgCl2; 10 mM NaCl).
  • the solution is centrifuged (10 minutes, 2000 rpm) as many times as necessary to eliminate the residual red cells present in the supernatant, after resuspension of the pellet in the lysis solution.
  • the pellet of white cells is lysed overnight at 42°C with 3.7 ml of lysis solution composed of:
  • K-proteinase (2 mg K-proteinase in TE 10-2 / NaCl 0.4 M).
  • 1 ml saturated NaCl (6M) (1/3.5 v/v) is added. After vigorous agitation, the solution is centrifuged for 20 minutes at 10000 rpm.
  • the OD 260 / OD 280 ratio is determined. Only DNA preparations having an OD 260 / OD 280 ratio between 1.8 and 2 are used in the subsequent examples described below.
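The purity criterion above amounts to a simple filter on absorbance readings. A minimal sketch, with a hypothetical function name and the 1.8 to 2 window taken from the text:

```python
def passes_purity_check(od260, od280, low=1.8, high=2.0):
    """Return True when the OD260/OD280 ratio lies within [low, high]."""
    if od280 <= 0:
        raise ValueError("OD280 must be positive")
    ratio = od260 / od280
    return low <= ratio <= high
```

Preparations that fail the check would be excluded before pooling or amplification.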
  • the pool was constituted by mixing equivalent quantities of DNA from each individual.
  • the amplification of specific genomic sequences of the DNA samples of example 1 is carried out on the pool of DNA obtained previously. In addition, 50 individual samples are similarly amplified.
  • Each pair of first primers (about 20 nt in length) is designed using the sequence information of the respective Kalpa gene disclosed herein and the OSP software (Hillier & Green, 1991). After heating at 95°C for 10 min, 40 cycles are performed. Each cycle comprises: 30 sec at 95°C, 54°C for 1 min, and 30 sec at 72°C. For final elongation, 10 min at 72°C ends the amplification. The quantities of the amplification products obtained are determined on 96-well microtiter plates, using a fluorometer and Picogreen as intercalating agent (Molecular Probes).
  • Substantially pure Kalpa protein or polypeptide is obtained.
  • concentration of protein in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms per ml.
  • Monoclonal or polyclonal antibodies to the protein can then be prepared as follows: Monoclonal Antibody Production by Hybridoma Fusion Monoclonal antibody to epitopes in the Kalpa or a portion thereof can be prepared from murine hybridomas according to the classical method of Kohler and Milstein (Nature, 256: 495, 1975) or derivative methods thereof (see Harlow and Lane, Antibodies A Laboratory Manual, Cold Spring Harbor Laboratory, pp. 53-242, 1988).
  • a mouse is repetitively inoculated with a few micrograms of the Kalpa or a portion thereof over a period of a few weeks.
  • the mouse is then sacrificed, and the antibody producing cells of the spleen isolated.
  • the spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media).
  • Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70: 419 (1980). Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Davis, L. et al. Basic Methods in Molecular Biology Elsevier, New York. Section 21-2.
  • Polyclonal antiserum containing antibodies to heterogeneous epitopes in the Kalpa or a portion thereof can be prepared by immunizing a suitable non-human animal with the Kalpa or a portion thereof, which can be unmodified or modified to enhance immunogenicity.
  • a suitable non-human animal, preferably a non-human mammal, is selected, usually a mouse, rat, rabbit, goat, or horse.
  • a crude preparation which has been enriched for Kalpa concentration can be used to generate antibodies.
  • Such proteins, fragments or preparations are introduced into the non-human mammal in the presence of an appropriate adjuvant (e.g. aluminum hydroxide, RIBI, etc.) which is known in the art.
  • the protein, fragment or preparation can be pretreated with an agent which will increase antigenicity. Such agents are known in the art and include, for example, methylated bovine serum albumin (mBSA), bovine serum albumin (BSA), Hepatitis B surface antigen, and keyhole limpet hemocyanin (KLH).
  • Effective polyclonal antibody production is affected by many factors related both to the antigen and the host species. Also, host animals vary in response to site of inoculations and dose, with both inadequate or excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple intradermal sites appear to be most reliable. Techniques for producing and processing polyclonal antisera are known in the art, see for example, Mayer and Walker (1987). An effective immunization protocol for rabbits can be found in Vaitukaitis, J. et al. J. Clin. Endocrinol. Metab. 33: 988-991 (1971).
  • Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology D. Wier (ed) Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 µM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, for example, by Fisher, D., Chap. 42 in: Manual of Clinical
  • High throughput two-hybrid screening assay for drugs that modulate Kalpa/Kalpa-target protein interaction To identify drugs that modulate Kalpa/WD-40 domain-containing protein interactions, a two-hybrid based high throughput screening assay is used.
  • AH 109 yeast cells (Clontech) cotransformed with plasmids pGBKT7- Kalpa and pGADT7- WD-40 domain-containing protein are grown in 384-well plates in selective media lacking Histidine and Adenine, according to manufacturer's instructions (MATCHMAKER two-hybrid system 3, Clontech).
  • Inhibition of Kalpa/WD-40 domain-containing protein binding will therefore inhibit yeast cell growth.
  • a high-throughput screen based on fluorescence polarization is used to monitor the displacement of a fluorescently labelled Kalpa protein from a recombinant glutathione-S-transferase (GST)-Kalpa binding domain of WD-40 domain fusion protein.
  • Assays are carried out essentially as in Degterev et al, Nature Cell Biol. 3: 173-182 (2001) and Dandliker et al, Methods Enzymol. 74: 3-28 (1981).
  • the assay can be calibrated by titrating a Kalpa peptide labelled with Oregon Green with increasing amounts of GST- WD-40 domain protein. Binding of the peptide is accompanied by an increase in polarization (mP, millipolarization).
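The polarization readout mentioned above is conventionally derived from the parallel and perpendicular emission intensities. The sketch below uses the standard definition of polarization in millipolarization units; the instrument correction factor `g_factor` and the function name are assumptions, since the text does not describe the reader settings.

```python
def polarization_mp(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence polarization in millipolarization units (mP).

    mP = 1000 * (I_par - G * I_perp) / (I_par + G * I_perp)
    g_factor is an instrument-specific correction (assumed to be 1.0 here).
    """
    corrected_perp = g_factor * i_perpendicular
    total = i_parallel + corrected_perp
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return 1000.0 * (i_parallel - corrected_perp) / total
```

In a titration such as the one described, the mP value would be expected to rise as more labelled peptide is bound, and to fall again when a test compound displaces it.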
  • the Kalpa peptide preferably a peptide comprising a TPR repeat domain
  • the Kalpa peptide is expressed and purified using a QIAexpressionist kit (Qiagen) according to the manufacturer's instructions. Briefly, the entire Kalpa coding sequence is amplified by PCR using pGBKT7-Kalpa as a template and cloned into the BamHI site of pQE30 vector (Qiagen). The resulting pQE30-HisKalpa plasmid is transformed in E. coli strain M15 (Qiagen).
  • 6xHis-tagged-Kalpa protein is purified from inclusion bodies on a Ni-Agarose column (Qiagen) under denaturing conditions, and the eluate is used for in vitro interaction assays.
  • WD-40 is amplified by PCR and cloned in frame downstream of the Glutathione S-Transferase (GST) ORF, into the BamHI site of the pGEX-2T prokaryotic expression vector (Amersham Pharmacia Biotech).
  • GST-WD-40 fusion protein is expressed in E. coli DH5α (supE44, ΔlacU169 (φ80 lacZΔM15), hsdR17, recA1, endA1, gyrA96, thi-1, relA1) and purified by affinity chromatography with glutathione sepharose according to supplier's instructions (Amersham Pharmacia Biotech).
  • Kalpa peptide is labelled with succinimidyl Oregon
  • Kalpa peptide, 2 µM GST-WD-40 protein, 0.1% bovine gamma-globulin (Sigma) and 1 mM dithiothreitol mixed with PBS, pH 7.2 (Gibco), are added to 384-well black plates (Lab Systems) with Multidrop
  • High throughput chip assay to identify inhibitors of Kalpa/Kalpa target interaction: A chip based binding assay (Degterev et al, Nature Cell Biol. 3: 173-182 (2001)) using unlabelled Kalpa and Kalpa target protein is used to identify molecules capable of interfering with Kalpa and Kalpa-family target interactions, providing high sensitivity and avoiding potential interference from label moieties.
  • the Kalpa binding domain of WD-40 domain containing protein (WD-40 protein) is covalently attached to a surface-enhanced laser desorption/ionization (SELDI) chip, and binding of unlabelled Kalpa protein to immobilized protein in the presence of a test compound is monitored by mass spectrometry.
  • Recombinant Kalpa protein and GST-WD-40 fusion proteins are prepared as described in Example 6. Purified recombinant GST-WD-40 protein is coupled through its primary amine to SELDI chip surfaces derivatized with carbonyldiimidazole (Ciphergen). Kalpa protein is incubated in a total volume of 1 µl for 12 hours at 4 °C in a humidified chamber to allow binding to each spot of the SELDI chip, then washed with alternating high-pH and low-pH buffers (0.1M sodium acetate containing 0.5M NaCl, followed by 0.01 M HEPES, pH 7.3).
  • the samples are embedded in alpha-cyano-4-hydroxycinnamic acid matrix and analysed for mass by matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry. Averages of 100 laser shots at a constant setting are collected over 20 spots in each sample.
  • FRET: fluorescence resonance energy transfer
  • Kalpa protein is fused to cyan fluorescent protein (CFP) and WD-40 domain protein is fused to yellow fluorescent protein (YFP).
  • Vectors containing Kalpa and Kalpa target proteins can be constructed essentially as in Majhan et al (1998).
  • a Kalpa-CFP expression vector is generated by subcloning a Kalpa cDNA into the pECFP-N1 vector (Clontech).
  • a WD-40 domain-YFP expression vector is generated by subcloning a WD-40 domain cDNA into the pEYFP-N1 vector (Clontech).
  • Vectors are cotransfected into HEK-293 cells and cells are treated with test compounds.
  • HEK-293 cells are transfected with Kalpa-CFP and WD-40 domain-YFP expression vectors using LipofectAMINE Plus (Gibco) or TransLT-1 (PanVera). 24 hours later cells are treated with test compounds and incubated for various time periods, preferably up to 48 hours. Cells are harvested in PBS, optionally supplemented with test compound, and fluorescence is determined with a C-60 fluorimeter (PTI) or a Wallac plate reader. Fluorescence in the samples separately expressing Kalpa-CFP and WD-40 domain-YFP is added together and used to estimate the FRET value in the absence of Kalpa/WD-40 domain binding.
  • the extent of FRET between CFP and YFP is determined as the ratio between the fluorescence at 527 nm and that at 475 nm after excitation at 433 nm.
  • the cotransfection of Kalpa protein and WD-40 domain protein results in an increase of FRET ratio over a reference FRET ratio of 1.0 (determined using samples expressing the proteins separately).
  • a change in the FRET ratio upon treatment with a test compound indicates a compound capable of modulating the interaction of the Kalpa protein and the WD-40 domain protein.
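The FRET readout described above can be expressed as a short calculation. This sketch assumes that each sample is summarized by its emission at 527 nm and 475 nm after excitation at 433 nm, and that the reference ratio is obtained by summing the fluorescence of the two separately expressing samples; the function names and the tuple layout are illustrative.

```python
def fret_ratio(f527, f475):
    """Ratio of emission at 527 nm to emission at 475 nm (excitation at 433 nm)."""
    if f475 <= 0:
        raise ValueError("emission at 475 nm must be positive")
    return f527 / f475

def normalized_fret_ratio(cotransfected, cfp_only, yfp_only):
    """FRET ratio of the cotransfected sample relative to a reference of 1.0.

    Each argument is a (f527, f475) tuple. The reference is estimated by adding
    together the fluorescence of the samples expressing each fusion separately.
    """
    reference = fret_ratio(cfp_only[0] + yfp_only[0], cfp_only[1] + yfp_only[1])
    return fret_ratio(*cotransfected) / reference
```

A normalized value above 1.0 would indicate energy transfer between the two fusion proteins; a shift of that value after compound treatment is the signal of interest.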
  • Aldo keto reductase activity is measured using the decrease in absorbance at 340nm as NADPH is consumed.
  • a standard reaction mixture is 135 mM sodium phosphate buffer (pH 6.2-7.2 depending on enzyme), 0.2 mM NADPH, 0.3 M lithium sulfate, 0.5-2.5 µg enzyme and an appropriate level of substrate.
  • the reaction is incubated at 30°C and monitored continuously with a spectrophotometer.
  • Enzyme activity is calculated as ml NADPH consumed/µg of enzyme.
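The conversion from an absorbance decrease at 340 nm to NADPH consumed can be sketched with the Beer-Lambert law. The sketch assumes the commonly used molar extinction coefficient of NADPH at 340 nm (6220 M^-1 cm^-1), a 1 cm path length and a 1 ml reaction volume, none of which are stated in the text, and it reports activity in nmol NADPH per minute per µg of enzyme, which is also an assumption about the intended units.

```python
NADPH_EXTINCTION_340 = 6220.0  # M^-1 cm^-1, commonly used value (assumed here)

def nadph_consumption_rate(delta_a340_per_min, enzyme_ug,
                           volume_ml=1.0, path_cm=1.0):
    """Specific activity in nmol NADPH consumed per minute per µg of enzyme.

    delta_a340_per_min: decrease in absorbance at 340 nm per minute (positive).
    volume_ml and path_cm are assumed assay parameters, not from the source.
    """
    if enzyme_ug <= 0:
        raise ValueError("enzyme amount must be positive")
    # Beer-Lambert: concentration change (mol/L per min) = dA / (epsilon * l)
    molar_rate = delta_a340_per_min / (NADPH_EXTINCTION_340 * path_cm)
    nmol_per_min = molar_rate * (volume_ml / 1000.0) * 1e9
    return nmol_per_min / enzyme_ug
```

For instance, a decrease of 0.0622 absorbance units per minute in a 1 ml reaction containing 1 µg of enzyme corresponds to about 10 nmol NADPH per minute per µg.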
  • the binding of lithocholic acid to the enzymes is directly assessed by determining the unbound fraction of the bile acid at 25°C by ultracentrifugation assay using microconcentrators (Microcon 10; Amicon).
  • a volume of 0.4 ml of binding assay mixture (10 µM enzyme and 10 µM lithocholic acid or 20 µM NADP+ in 0.1 M potassium phosphate, pH 7.4) was filtered by centrifugation at 13000 g for 20 min. As a control, the same volume of the mixture without the enzyme was filtered. The concentration of lithocholic acid in the filtrate was determined enzymatically.
  • the filtrate (0.2 ml) was added to a reaction mixture (1.0 ml) containing 0.1 M glycine/NaOH, pH 10.0 and 0.25 mM NADP+, and the reaction was started by addition of the enzyme (~10 µg), which oxidizes the bile acid. The fluorescence of NADPH was recorded until it was unchanged.
  • RT-PCR and Northern blots were carried out to experimentally characterize the product of the Kalpa gene.
  • Two sets of specific PCR primers targeting the 5' end of the Kalpa cDNA were designed.
  • the RT-PCR was performed on commercially available RNA extracts from human brain and human liver.
  • the first PCR reveals a distinctive band at 530 bp, shown in Fig 3A.
  • the cDNA is detected both in the brain and in the liver.
  • the specificity of the amplification was checked by performing a nested PCR.
  • the second PCR (B' and L') was performed with the second set of primers.
  • Figure 3B shows the result of the nested PCR confirming the specificity of the reverse transcription.
  • the sequencing of the first PCR product (530 bp) was carried out, and the consensus sequence was aligned on the annotated cDNA Kalpa sequence. The match is perfect indicating that the product is specific to the Kalpa gene region.
  • This 530bp PCR product will be used as a probe for Northern blot analysis.
  • the labeled probe was purified using a Sephadex G50 column (Pharmacia) to remove the unincorporated nucleotides.
  • the two blots were washed two times for 15 min with 2xSSC, 0.05% SDS at room temperature, and two times with 0.1xSSC, 0.1% SDS at 65°C for 30 min.
  • the blots were exposed to Hyperfilm (Amersham) for 2 days at -70°C.
  • the northern analysis reveals one Kalpa transcript at 3.63kb. There appears to be a tissue-specific distribution of the band. The most intense signal is detected in the skeletal muscle. The signals detected in the kidney, liver and brain are quite intense, while the signals detected in the placenta, spleen, small intestine, tongue, thyroid, stomach, spinal cord and prostate are less intense. The signals detected in the brain and liver (blot 1 lanes 1 and
  • TPR-proteins include Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase.
  • Summary 3 ATP-binding cassette, sub-family C, member 4
  • Summary 4 ABC_membrane: ABC transporter transmembrane region. This family represents a unit of six transmembrane helices.
  • ABC transporters form a large family of proteins responsible for translocation of a variety of compounds across biological membranes. ABC transporters are the largest family of proteins in many completely sequenced bacteria. ABC transporters are composed of two copies of this domain and two copies of a transmembrane domain pfam00664. These four domains may belong to a single polypeptide, or belong in different polypeptide chains.
  • Summary 6 ATPases associated with a variety of cellular activities
  • Summary 7 SRY (sex determining region Y) - box 21, SRY-box21
  • Summary 8 HMG (High Mobility group) box
  • HMG_box Summary 9 HSP90-like protein

Abstract

The invention relates to a gene and the encoded protein involved in cholesterol lowering, and their use in diagnostics, treatment of disease, and in the identification of molecules for the treatment and prevention of disease. The present invention thus discloses methods of screening for the identification of molecules useful in the treatment of heart disease, coronary artery disease, myocardial infarct, and lipid related metabolic disorders such as the dysmetabolic syndrome, Obesity and Diabetes type II.

Description

ASSAYS FOR IDENTIFYING CHOLESTEROL-LOWERING MOLECULES
FIELD OF THE INVENTION
The invention relates to a gene and the encoded protein involved in cholesterol-lowering, and their use in diagnostics, treatment of disease, and in the identification of molecules for the treatment and prevention of disease. The present invention thus discloses methods of screening for the identification of molecules useful in the treatment of heart disease, coronary artery disease, myocardial infarct, and lipid related metabolic disorders such as the dysmetabolic syndrome, Obesity and Diabetes type II.
BACKGROUND OF THE INVENTION
Complications of atherosclerosis, such as myocardial infarction, stroke and peripheral vascular disease are a major cause of mortality and morbidity. In addition, the quality of life of millions of people is adversely affected by angina and heart failure caused by coronary heart disease. Hyperlipidaemia has been associated with an increased risk of developing these conditions. For this reason it is desirable to understand the etiology of hyperlipidaemia and to develop effective treatments for this condition. Hyperlipidaemia has been defined as plasma cholesterol and triglyceride levels that exceed "normal" (95th percentile of levels of the general population) levels. However, the ideal cholesterol level is much less than the normal level of the general population. Many people have cholesterol levels above the ideal (hypercholesterolaemia) and are therefore at an elevated risk of coronary artery disease (CAD). It is known that reducing the cholesterol level in such people is very effective in reducing the risk of CAD. Hypertriglyceridaemia may also be involved in atherosclerosis and can, in extreme cases, cause potentially life-threatening pancreatitis. Hyperlipidaemia can arise through a genetic disorder, as a result of other medical conditions or environmental influences, or a combination of these factors.
Several treatment modalities can be beneficial for people with high lipid levels. These include lowering the total cholesterol level, lowering the total triglyceride level and increasing the ratio of high density lipoprotein (HDL) cholesterol to low density lipoprotein (LDL) cholesterol. This latter improvement is important because there is evidence that LDL is proatherogenic and HDL is antiatherogenic, so that increasing the HDL:LDL ratio provides a degree of protection from atherosclerosis and CAD. Elevated serum LDL concentrations are a major cause of coronary atherosclerosis (Grundy et al., 1997, Arch Intern. Med. 157:1177-1184). Support for LDL's fundamental role derives from the discovery of the LDL receptor. Familial hypercholesterolemia (FH) is the most common (frequency 1/500) autosomal-dominant disease affecting lipid metabolism (Brown and Goldstein, 1986, Science 232:34-47); heterozygous affected persons have LDL levels twice normal levels and develop premature coronary disease, whereas homozygous individuals have sixfold-elevated LDL levels and often die of cardiovascular disease at age <20 years.
Cholesterol-synthesis inhibitors have exerted a gratifying effect on the course of atherosclerosis (Gould et al., 1998, Circulation 97:946-952). However, it would be preferable to stimulate endogenous mechanisms that have a similar cholesterol-lowering effect. Hobbs et al. (1989, J. Clin. Invest., 84:656-664) presented strong evidence supporting the notion that a "lipid-lowering" gene exists. They described a family with FH that featured affected members with lower-than-expected LDL concentrations. Knoblauch, H. et al. (Am. J. Hum. Genet., 66:157-166, 2000) recently identified a large Arab family with FH whose FH-affected family members often have normal LDL concentrations and do not manifest atherosclerosis. Through linkage studies, a chromosomal region of 37 cM has thereby been identified that is linked with a cholesterol-lowering effect in this Arab family as well as in a population of healthy twins (Knoblauch et al., 2000).
The present inventors have, by conducting genotype-phenotype association studies and expression studies, narrowed this region and identified the Kalpa gene as demonstrating an association with lowered cholesterol. The inventors have also provided screening assays which can be used to identify molecules for therapeutic treatment and diagnostics for cholesterol-related disorders.
SUMMARY OF THE INVENTION
The invention includes diagnostic and activity assays, and uses in therapeutics, for Kalpa or portions thereof, as well as drug screening assays for identifying, selecting or assessing the activity of molecules capable of modulating cholesterol regulation (CHOL), HDL-cholesterol (HDL) regulation, LDL-cholesterol (LDL) regulation, Triglycerides (TGRL) regulation or LDL/HDL ratio (LDL/HDL) regulation.
The compositions and methods of the invention are useful for example in the diagnosis and treatment of diseases such as heart disease, coronary artery disease, myocardial infarct, and lipid related metabolic disorders such as the dysmetabolic syndrome, Obesity and Diabetes type II.
The present invention thus also relates to nucleic acid molecules, including in particular the complete cDNA sequences encoding Kalpa, portions thereof encoding polypeptides homologous thereto, as well as to polypeptides encoded by the Kalpa gene.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1 depicts a cDNA sequence encoding the Kalpa protein, including the sequence referred to herein as SEQ ID No 3.
UDP_V1 from 1 to 3727nt (19 exons)
ORF from 287 to 2664 (stop to stop)
CDS from 385 to 2664 (methionine to stop)
TM domains (TM from 1 to 10) are yellow boxes
TPR domains (TPR from 1 to 7) are white boxes
Homology with chromosome 5 starts at nt 976
Figure 2 depicts nucleic acid sequences comprising SNPs in and surrounding the Kalpa gene in the human chromosome 13q region, including the sequences referred to herein as SEQ ID Nos 5 to 23.
Figures 3A and 3B show the results of RT-PCR analysis, demonstrating amplification of a specific product in brain and liver. Figure 3C shows the results of Northern blot analysis, demonstrating the existence of one Kalpa transcript at 3.63 kb. There appears to be a tissue-specific distribution of the band. The most intense signal is detected in the skeletal muscle. The signals detected in the kidney, liver and brain are quite intense, while the signals detected in the placenta, spleen, small intestine, tongue, thyroid, stomach, spinal cord and prostate are less intense. The signals detected in the brain and liver (blot 1 lanes 1 and 8) were in agreement with the RT-PCR results.
Polymorphic bases in the SEQ IDs in the sequence listing or in the Figures are indicated using the single letter codes as indicated in Table A below.
Table A
[Table image imgf000005_0001 not reproduced]
BRIEF DESCRIPTION OF THE SEQUENCE LISTING
SEQ ID No 1 is a genomic DNA sequence encoding the human Kalpa protein.
SEQ ID No 2 and SEQ ID No 3 are cDNA sequences encoding the human Kalpa protein.
SEQ ID No 4 is an amino acid sequence of the human Kalpa protein.
SEQ ID Nos 5 to 23 are nucleic acid sequences comprising SNPs in and surrounding the Kalpa gene in the human chromosome 13q region.
DETAILED DESCRIPTION OF THE INVENTION
Overview
The molecular regulation of cellular sterol metabolism has been elucidated by Brown and Goldstein and their colleagues. The LDLR gene promoter contains a sterol response element (SRE) that is required for regulating transcription of the gene encoding LDLR in response to cellular sterol content. Two SRE-binding proteins (SREBP-1 and -2) that contain two transmembrane domains and are localized to the endoplasmic reticulum (ER). Another protein, termed SREBP-cleavage activating protein (SCAP), acts as a chaperone protein that transports the precursor SREBPs from the ER to the Golgi, where two proteases, site 1 and site 2 protease (SIP and S2P), sequentially cleave the SREBPs. The second cleavage liberates the mature SREBP proteins from the membrane, allowing them to enter the nucleus, bind to the SREs of target genes and, along with additional transcription factors, activate gene transcription. Responsiveness of the system to cellular sterol content is accomplished through a 'sterol-sensing domain' in SCAP. When cells are overloaded with sterols, SREBP in the SCAP complex is no longer accessible to SIP cleavage and remains membrane-bound. Conversely, when sterol concentrations are limiting, the current data suggest that the SCAP/SREBP complex can move to a post-ER compartment. At this location, the complex encounters SIP and S2P, leading to the release of soluble SREBP (reviewed by Brown and Goldstein, 1999). The statins inhibit HMGCoA reductase, the rate-limiting enzyme in cholesterol biosynthesis, reducing cellular sterol content and thereby de-repressing the transport of the SCAP-SREBP complex and resulting in the upregulation of LDLR, which in turn transports blood LDL into the liver, reducing blood levels of LDL.
Approximately two-thirds of plasma cholesterol in humans is transported on low-density lipoprotein (LDL) molecules. The concentration of LDL in the bloodstream is strongly correlated with the risk of developing premature heart disease to the extent that drugs are designed to lower serum LDL levels. Drugs that reduce the level of LDL in the bloodstream have been shown in numerous clinical trials to be effective in reducing the risk of developing heart disease. The most notable examples are the "statins" (e.g. Zocor, Simvastatin, Lovastatin, Atorvastatin, Pravastatin), drugs that inhibit the activity of 3-hydroxy-3-methylglutaryl-coenzyme A reductase, an enzyme in the cholesterol biosynthetic pathway. However, people vary in their responsiveness to these drugs. In particular, some patients with severe forms of hypercholesterolemia are not very responsive to statins or to any other known drug therapy.
Three sources contribute to the level of cellular cholesterol: 1) dietary consumption, 2) synthesis in the liver, and 3) de novo synthesis in cells. The cellular level of cholesterol is controlled by both uptake and de novo synthesis. Low-density lipoprotein (LDL) particles are the major form of plasma cholesterol. Most LDL particles arise from the conversion of very low density lipoprotein (VLDL) particles secreted by the liver. LDL particles are thus not directly synthesized. Rather, the liver produces VLDL, which is secreted into the bloodstream. While in the bloodstream, VLDL is converted into LDL. This occurs through the action of lipoprotein lipase (LPL), an enzyme residing on the lumenal surface of the capillary endothelium. LPL catalyzes the hydrolysis of the triglycerides in the VLDL particle, thus shrinking the diameter of the particle and enriching it for cholesterol and cholesterol ester (cholesterol ester is not a substrate for LPL). VLDL also acquires cholesterol ester through the action of cholesterol ester transfer protein (CETP). CETP is in the bloodstream and promotes the transfer of cholesterol ester from HDL to VLDL and the reciprocal transfer of triglyceride from VLDL to HDL. Thus, the actions of LPL and CETP lead to the conversion of a triglyceride-rich particle, VLDL, to a cholesterol-rich particle, LDL.
Excessive secretion of VLDL can lead to high levels of plasma VLDL and/or high levels of plasma LDL. Overproduction of VLDL has been seen as a metabolic consequence of many mutations in the LDL receptor. In addition, a separate metabolic disorder, termed "familial combined hyperlipidemia", also involves the overproduction of VLDL. Consequently, another strategy for dealing with disorders resulting in excessive VLDL (hypertriglyceridemia), excessive LDL (hypercholesterolemia), or both (combined hyperlipidemia) is to interfere with the production and/or secretion of VLDL.
An elevation in serum LDL levels can be caused by diminished clearance of LDL particles from the circulation or by increased production of LDL or both. The clearance of LDL from the circulation is largely mediated by the LDL receptor. Thus, patients with familial hypercholesterolemia, a disease known to be caused by LDL receptor mutations, have LDL levels 8-fold elevated (in the homozygous form) or 2-fold elevated (in the heterozygous form), as compared to patients with normal LDL receptor. This observation provides strong support for the key role of the LDL receptor in LDL metabolism.
Role of Kalpa in cholesterol regulation
The inventors' analysis of polymorphisms on a chromosomal region of 37 cM linked with a cholesterol-lowering effect led to the identification of the Kalpa gene described herein. This 37 cM region, identified in a large Arab family and a population of healthy twins, has been narrowed down to a genetic distance of 12 cM corresponding to 13.6 Mb, between markers D13S265 and D13S158. The invention thus provides methods of treating and preventing disease and of identifying medicaments for the treatment of disease, the methods involving acting on biological pathways and biological functions mediated by the Kalpa gene. The inventors have further characterized genes involved in cholesterol lowering using differential expression analysis, thereby independently identifying the gene Kalpa as a candidate involved in the cholesterol-lowering pathway. Many techniques have been developed to analyze gene expression; they differ widely in convenience, expense, and sensitivity. Techniques based on direct sequence analysis or specific hybridization of probes to microarrays provide the most comprehensive and sensitive analysis of gene expression. However, in both approaches, the sequences to be analyzed must either be known or cloned and processed individually beforehand, usually with the aid of complex robotics systems. This makes it difficult to isolate and/or monitor many potentially important genes that are differentially expressed at low absolute levels against a background of more abundantly expressed genes. The method used by the inventors involved phenotype-specific gene identification designed to identify genes associated (over- or under-expressed) with the metabolic changes that underlie the transition from a "reference" phenotype to an "affected" phenotype. For example, a "reference" phenotype and a related "affected" phenotype can represent a hepatic cell treated and untreated with a drug. In each sample, the total RNA is extracted, the mRNA purified and the cDNA synthesized.
The two cDNAs are then mixed together and submitted to a subtractive hybridization process in order to select those genes which are differentially expressed. The recovered cDNA is inserted in a cloning vector and the corresponding clones are randomly sequenced. The inventors carried out differential expression analysis for the study of the pathogenesis of cardiovascular diseases with a special emphasis on the pharmacogenomics of 3-hydroxy-3-methylglutaryl coenzyme A reductase inhibitors, known as statins. Statins exert the greatest effect on plasma LDL cholesterol. The beneficial effects of statins have been shown in a series of epidemiological primary and secondary prevention studies. In addition to their action on LDL cholesterol, these drugs also increase HDL cholesterol, reduce triglycerides and have a beneficial effect on some of the fundamental mechanisms involved in the development of arteriosclerosis. Kalpa was identified as differentially expressed in cells treated versus untreated with statins, and the chromosomal location of all differentially expressed genes was analysed. Kalpa was found to map to the locus of interest between D13S265 and D13S158.
The inventors further carried out an association study to examine more closely polymorphisms in the Kalpa gene that are associated with the low-cholesterol phenotype. Kalpa polymorphisms were identified, including one frequent coding polymorphism (SNP5.3) in exon 10 of the Kalpa gene, a non-conservative polymorphism (isoleucine to valine) with an allelic frequency of 0.34. In a sample of 382 nuclear families taken from the German population, an association test of LDL-cholesterol adjusted for age, BMI and gender with this coding polymorphism was performed. An association was found using a family-based association method (Abecassis et al., 2001). The p-value associated with the test was 0.003. In an independent cohort of 143 German adults, this association was confirmed using analysis of variance with LDL-cholesterol adjusted for age, BMI and gender, giving a p-value of 0.03. The size of the effect was similar in both populations. These two independent association studies show that the non-conservative coding polymorphism in the Kalpa gene impacts LDL-cholesterol metabolism in the general population. To further characterize Kalpa, a study of the expression profile was carried out. A northern blot analysis was performed on 24 different human tissues. A probe for the Kalpa gene was obtained by RT-PCR experiments and was sequenced in order to confirm the specificity of the probe for the Kalpa gene sequence. Analysis of the results revealed tissue-specific expression of the gene in skeletal muscle, kidney, liver, placenta, brain and thyroid. No expression was detected in many other tissues such as heart, colon, thymus, lung and adrenal gland. The messenger RNA was expected to be approximately 3.8 kb in size, and a single isoform was detected using the probe.
Definitions
As used interchangeably herein, the terms "oligonucleotides", "nucleic acids" and "polynucleotides" include RNA, DNA, or RNA/DNA hybrid sequences of more than one nucleotide in either single chain or duplex form.
As used herein, the terms "nucleic acids" and "nucleic acid molecule" are intended to include DNA molecules (e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and analogs of the DNA or RNA generated using nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA. Throughout the present specification, the expression "nucleotide sequence" may be employed to designate indifferently a polynucleotide or a nucleic acid. More precisely, the expression "nucleotide sequence" encompasses the nucleic material itself and is thus not restricted to the sequence information (i.e. the succession of letters chosen among the four base letters) that biochemically characterizes a specific DNA or RNA molecule. Also used interchangeably herein are the terms "nucleic acids", "oligonucleotides", and "polynucleotides".
An "isolated" nucleic acid molecule is one which is separated from other nucleic acid molecules which are present in the natural source of the nucleic acid. Preferably, an "isolated" nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated Kalpa nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an "isolated" nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized. A nucleic acid molecule of the present invention, e.g., a nucleic acid molecule having the nucleotide sequence of SEQ ID Nos 1 or 2 or 3, or a portion thereof, can be isolated using standard molecular biology techniques and the sequence information provided herein. Using all or a portion of the nucleic acid sequence of SEQ ID Nos 1 or 2 or 3 as a hybridization probe, Kalpa nucleic acid molecules can be isolated using standard hybridization and cloning techniques (e.g., as described in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).
Moreover, a nucleic acid molecule encompassing all or a portion of e.g. SEQ ID Nos 1 or 2 or 3, can be isolated by the polymerase chain reaction (PCR) using synthetic oligonucleotide primers designed based upon the sequence of SEQ ID No 1 or 2 or 3.
A nucleic acid of the invention can be amplified using cDNA, mRNA or alternatively, genomic DNA, as a template and appropriate oligonucleotide primers according to standard PCR amplification techniques. The nucleic acid so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis. Furthermore, oligonucleotides corresponding to Kalpa nucleotide sequences can be prepared by standard synthetic techniques, e.g., using an automated DNA synthesizer.
As used herein, the term "hybridizes to" is intended to describe conditions for moderate stringency or high stringency hybridization, preferably where the hybridization and washing conditions permit nucleotide sequences at least 60% homologous to each other to remain hybridized to each other. Preferably, the conditions are such that sequences at least about 70%, more preferably at least about 80%, even more preferably at least about 85%, 90%, 95% or 98% homologous to each other typically remain hybridized to each other. Stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. A preferred, non-limiting example of stringent hybridization conditions is as follows: the hybridization step is carried out at 65°C in the presence of 6 x SSC buffer, 5 x Denhardt's solution, 0.5% SDS and 100 μg/ml of salmon sperm DNA. The hybridization step is followed by four washing steps:
- two washings of 5 min each, preferably at 65°C in a 2 x SSC and 0.1% SDS buffer;
- one washing of 30 min, preferably at 65°C in a 2 x SSC and 0.1% SDS buffer;
- one washing of 10 min, preferably at 65°C in a 0.1 x SSC and 0.1% SDS buffer,
these hybridization conditions being suitable for a nucleic acid molecule of about 20 nucleotides in length. It will be appreciated that the hybridization conditions described above are to be adapted according to the length of the desired nucleic acid, following techniques well known to the one skilled in the art, for example according to the teachings disclosed in Hames B.D. and Higgins S.J. (1985) Nucleic Acid Hybridization: A Practical Approach. Hames and Higgins Ed., IRL Press, Oxford; and Current Protocols in Molecular Biology (supra). Preferably, an isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to a sequence of SEQ ID No 1 or 2 or 3 corresponds to a naturally-occurring nucleic acid molecule. As used herein, a "naturally-occurring" nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).
To determine the percent homology of two amino acid sequences or of two nucleic acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second amino acid or nucleic acid sequence, and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, even more preferably at least 60%, and even more preferably at least 70%, 80%, 90% or 95% of the length of the reference sequence; preferably at least 100, preferably at least 200, more preferably at least 300, even more preferably at least 400, and even more preferably at least 500, 600, at least 700, at least 800, at least 900, at least 1000, at least 1200, at least 1400, at least 1600, at least 1800, or at least 2000 nucleotides are aligned. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are homologous at that position (i.e., as used herein amino acid or nucleic acid "identity" is equivalent to amino acid or nucleic acid "homology"). The percent homology between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % homology = # of identical positions / total # of positions x 100).
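The percent homology calculation defined above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and it counts every alignment column, gaps included, as a position; the specification does not say how gap columns are to be counted, so that choice is an assumption. The function name and the example sequences are hypothetical.

```python
def percent_homology(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent homology = identical positions / total positions x 100.

    Assumption: inputs are pre-aligned strings of equal length, and every
    alignment column (including gap columns) counts as a position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    # A column is identical when both residues match and neither is a gap.
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical 8-nucleotide alignment with a single mismatch.
    print(percent_homology("ACGTACGT", "ACGTTCGT"))
```

In this hypothetical example, 7 of the 8 aligned positions are identical, so the function returns 87.5.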
The comparison of sequences and determination of percent homology between two sequences can be accomplished using a mathematical algorithm. A preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Karlin and Altschul, 1990, Proc. Natl. Acad. Sci. USA 87:2264-68, modified as in Karlin and Altschul, 1993, Proc. Natl. Acad. Sci. USA 90:5873-77, the disclosures of which are incorporated herein by reference in their entireties. Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) of Altschul, et al., 1990, J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to Kalpa nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to Kalpa molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Research 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov, the disclosures of which are incorporated herein by reference in their entireties. Another preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, CABIOS (1989), the disclosures of which are incorporated herein by reference in their entireties. Such an algorithm is incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used.
The term "polypeptide" refers to a polymer of amino acids without regard to the length of the polymer; thus, peptides, oligopeptides, and proteins are included within the definition of polypeptide. This term also does not specify or exclude post-expression modifications of polypeptides, for example, polypeptides which include the covalent attachment of glycosyl groups, acetyl groups, phosphate groups, lipid groups and the like are expressly encompassed by the term polypeptide. Also included within the definition are polypeptides which contain one or more analogs of an amino acid (including, for example, non-naturally occurring amino acids, amino acids which only occur naturally in an unrelated biological system, modified amino acids from mammalian systems etc.), polypeptides with substituted linkages, as well as other modifications known in the art, both naturally occurring and non-naturally occurring.
An "isolated" or "purified" protein or biologically active portion thereof is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the Kalpa polypeptide, or a biologically active fragment or homologue thereof, is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. The language "substantially free of cellular material" includes preparations of a protein according to the invention (e.g. Kalpa polypeptide, or a biologically active fragment or homologue thereof) in which the protein is separated from cellular components of the cells from which it is isolated or recombinantly produced. In one embodiment, the language "substantially free of cellular material" includes preparations of a protein according to the invention having less than about 30% (by dry weight) of protein other than the Kalpa polypeptide (also referred to herein as a "contaminating protein"), more preferably less than about 20% of protein other than the protein according to the invention, still more preferably less than about 10% of protein other than the protein according to the invention, and most preferably less than about 5% of protein other than the protein according to the invention. When the protein according to the invention or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation.
The language "substantially free of chemical precursors or other chemicals" includes preparations of Kalpa polypeptide, or a biologically active fragment or homologue thereof, in which the protein is separated from chemical precursors or other chemicals which are involved in the synthesis of the protein. In one embodiment, the language "substantially free of chemical precursors or other chemicals" includes preparations of a Kalpa polypeptide having less than about 30% (by dry weight) of chemical precursors or non-Kalpa chemicals, more preferably less than about 20% chemical precursors or non-Kalpa chemicals, still more preferably less than about 10% chemical precursors or non-Kalpa chemicals, and most preferably less than about 5% chemical precursors or non-Kalpa chemicals. The term "recombinant polypeptide" is used herein to refer to polypeptides that have been artificially designed and which comprise at least two polypeptide sequences that are not found as contiguous polypeptide sequences in their initial natural environment, or to refer to polypeptides which have been expressed from a recombinant polynucleotide.
Accordingly, another aspect of the invention pertains to anti-Kalpa antibodies. The term "antibody" as used herein refers to immunoglobulin molecules and immunologically active portions of immunoglobulin molecules, i.e., molecules that contain an antigen binding site which specifically binds (immunoreacts with) an antigen, such as a Kalpa polypeptide, or a biologically active fragment or homologue thereof. Examples of immunologically active portions of immunoglobulin molecules include F(ab) and F(ab')2 fragments which can be generated by treating the antibody with an enzyme such as pepsin. The invention provides polyclonal and monoclonal antibodies that bind a Kalpa polypeptide, or a biologically active fragment or homologue thereof. The term "monoclonal antibody" or "monoclonal antibody composition", as used herein, refers to a population of antibody molecules that contain only one species of an antigen binding site capable of immunoreacting with a particular epitope of a Kalpa polypeptide. A monoclonal antibody composition thus typically displays a single binding affinity for a particular Kalpa polypeptide with which it immunoreacts.
The term "primer" denotes a specific oligonucleotide sequence which is complementary to a target nucleotide sequence and used to hybridize to the target nucleotide sequence. A primer serves as an initiation point for nucleotide polymerization catalyzed by DNA polymerase, RNA polymerase or reverse transcriptase.
The term "probe" denotes a defined nucleic acid segment (or nucleotide analog segment, e.g., polynucleotide as defined herein) which can be used to identify a specific polynucleotide sequence present in samples, said nucleic acid segment comprising a nucleotide sequence complementary to the specific polynucleotide sequence to be identified.
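The complementarity that defines a probe can be illustrated with a short Python sketch. It assumes DNA sequences written 5' to 3' using only the letters A, C, G and T, and it tests for perfect complementarity only; real hybridization tolerates mismatches depending on stringency. The function names and the sequences are hypothetical and are not taken from the Sequence Listing.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the strand that base-pairs with seq, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def probe_matches(probe: str, sample: str) -> bool:
    """True if the probe is perfectly complementary to a stretch of sample.

    Assumption: exact matching only; no mismatches or gaps are tolerated.
    """
    return reverse_complement(probe) in sample.upper()


if __name__ == "__main__":
    target = "ATGGCCATTG"               # hypothetical target sequence
    probe = reverse_complement(target)  # probe complementary to the target
    print(probe)
    print(probe_matches(probe, "TTT" + target + "AAA"))
```

Here the probe generated for the hypothetical target is CAATGGCCAT, and it is reported as matching a sample strand that contains the target.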
The term "cholesterol-related disorder" refers to any condition arising from, influenced by or influencing regulation of cholesterol or cholesterol levels. For example, a "cholesterol-related disorder" may be a condition arising from, influenced by or characterized by abnormal levels of or regulation of cholesterol. By way of example, a "cholesterol-related disorder" includes diseases or disorders arising from, influenced by or influencing HDL-cholesterol (HDL) regulation, LDL-cholesterol (LDL) regulation, triglycerides (TGRL) regulation or LDL/HDL ratio (LDL/HDL) regulation. Included by way of example and not limitation are heart disease, coronary artery disease, myocardial infarct, and lipid-related metabolic disorders such as dysmetabolic syndrome, obesity and type II diabetes. As used herein, "lowering cholesterol" or "cholesterol-lowering" may refer to lowering total cholesterol, lowering HDL-cholesterol, lowering LDL-cholesterol, lowering triglycerides and modulating, preferably lowering, the LDL/HDL ratio.
The terms "trait" and "phenotype" are used interchangeably herein and refer to any visible, detectable or otherwise measurable property of an organism such as symptoms of, or susceptibility to a disease for example.
The term "allele" is used herein to refer to variants of a nucleotide sequence. A biallelic polymorphism has two forms. Typically the first identified allele is designated as the original allele whereas other alleles are designated as alternative alleles. Diploid organisms may be homozygous or heterozygous for an allelic form. The term "heterozygosity rate" is used herein to refer to the incidence of individuals in a population, which are heterozygous at a particular allele. In a biallelic system the heterozygosity rate is on average equal to 2Pa (1-Pa), where Pa is the frequency of the least common allele. In order to be useful in genetic studies a genetic marker should have an adequate level of heterozygosity to allow a reasonable probability that a randomly selected person will be heterozygous.
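The heterozygosity rate formula above can be checked numerically with a minimal Python sketch. It assumes a biallelic system and takes Pa as the frequency of the least common allele, so Pa is restricted to the range 0 to 0.5. The function name is hypothetical; the frequencies 0.20 and 0.30 are thresholds quoted later in this section for biallelic markers, and 0.34 is the allelic frequency reported above for SNP5.3.

```python
def heterozygosity_rate(pa: float) -> float:
    """Expected heterozygosity rate 2 * Pa * (1 - Pa) for a biallelic system.

    pa is the frequency of the least common allele, so 0 <= pa <= 0.5.
    """
    if not 0.0 <= pa <= 0.5:
        raise ValueError("least common allele frequency must lie in [0, 0.5]")
    return 2.0 * pa * (1.0 - pa)


if __name__ == "__main__":
    for pa in (0.20, 0.30, 0.34):
        print(pa, round(heterozygosity_rate(pa), 4))
```

For Pa = 0.20 and Pa = 0.30 the formula gives 0.32 and 0.42, which agree with the heterozygosity rates quoted for those allele frequencies in the definition of a biallelic marker.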
The term "genotype" as used herein refers to the identity of the alleles present in an individual or a sample. In the context of the present invention a genotype preferably refers to the description of the polymorphism alleles present in an individual or a sample. The term "genotyping" a sample or an individual for a polymorphism consists of determining the specific allele or the specific nucleotide carried by an individual at a polymorphism.
The term "mutation" as used herein refers to a difference in DNA sequence between or among different genomes or individuals which has a frequency below 1%.
The term "haplotype" refers to a combination of alleles present in an individual or a sample. In the context of the present invention a haplotype preferably refers to a combination of polymorphism alleles found in a given individual and which may be associated with a phenotype.
The term "polymorphism" as used herein refers to the occurrence of two or more alternative genomic sequences or alleles between or among different genomes or individuals. "Polymorphic" refers to the condition in which two or more variants of a specific genomic sequence can be found in a population. A "polymorphic site" is the locus at which the variation occurs. A "single nucleotide polymorphism" is a single base pair change. Typically a single nucleotide polymorphism is the replacement of one nucleotide by another nucleotide at the polymorphic site. Deletion of a single nucleotide or insertion of a single nucleotide also gives rise to single nucleotide polymorphisms. In the context of the present invention "single nucleotide polymorphism" preferably refers to a single nucleotide substitution. Typically, between different genomes or between different individuals, the polymorphic site may be occupied by two different nucleotides.
The term "polymorphism" includes "biallelic marker", which as used herein refers to a polymorphism having two alleles at a fairly high frequency in the population, preferably a single nucleotide polymorphism (SNP). The terms "biallelic marker" and "biallelic polymorphism" may also be used interchangeably with the term "single nucleotide polymorphism". A "polymorphism allele" refers to the nucleotide variants present at a polymorphic site. Typically the frequency of the less common allele of the polymorphisms of the present invention has been validated to be greater than 1%, preferably the frequency is greater than 10%, more preferably the frequency is at least 20% (i.e. heterozygosity rate of at least 0.32), even more preferably the frequency is at least 30% (i.e. heterozygosity rate of at least 0.42). A polymorphism wherein the frequency of the less common allele is 30% or more is termed a "high quality polymorphism."
A "promoter" refers to a DNA sequence recognized by the synthetic machinery of the cell required to initiate the specific transcription of a gene.
As used herein, the term "operably linked" refers to a linkage of polynucleotide elements in a functional relationship. For instance, a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the coding sequence. More precisely, two DNA molecules (such as a polynucleotide containing a promoter region and a polynucleotide encoding a desired polypeptide or polynucleotide) are said to be "operably linked" if the nature of the linkage between the two polynucleotides does not (1) result in the introduction of a frame-shift mutation or (2) interfere with the ability of the polynucleotide containing the promoter to direct the transcription of the coding polynucleotide.
The term "upstream" is used herein to refer to a location, which is toward the 5' end of the polynucleotide from a specific reference point.
The terms "base paired" and "Watson & Crick base paired" are used interchangeably herein to refer to nucleotides which can be hydrogen bonded to one another by virtue of their sequence identities in a manner like that found in double-helical DNA, with thymine or uracil residues linked to adenine residues by two hydrogen bonds and cytosine and guanine residues linked by three hydrogen bonds.
The term "epitope" refers to an antigenic determinant of a polypeptide. An epitope can comprise as few as 3 amino acids in a spatial conformation which is unique to the epitope. Generally an epitope consists of at least 6 such amino acids, and more usually at least 8-10 such amino acids. Methods for determining the amino acids which make up an epitope include x-ray crystallography, 2-dimensional nuclear magnetic resonance, and epitope mapping, e.g. the Pepscan method described by H. Mario Geysen et al., 1984, Proc. Natl. Acad. Sci. U.S.A. 81:3998-4002; PCT Publication No. WO 84/03564; and PCT Publication No. WO 84/03506.
As used herein the term "Kalpa-related polymorphism" relates to one or a set of polymorphisms in linkage disequilibrium with the Kalpa gene. The term Kalpa-related polymorphisms also encompasses polymorphisms located on SEQ ID No 2 or 3, or preferably the polymorphisms disclosed in Table 3 and as identified in SEQ ID Nos 5 to 23. The preferred Kalpa-related polymorphism alleles of the present invention can each include a Kalpa allele, the allele(s) being described individually or in groups consisting of all the possible combinations of the alleles.
The term "non-genic" is used herein to describe genomic sequences, as well as polynucleotides and primers, which occur outside the nucleotide positions shown in the human Kalpa genomic sequence of SEQ ID No 1. The term "genic" is used herein to describe the Kalpa gene as well as polynucleotides and primers which do occur in the nucleotide positions shown in the human Kalpa genomic sequence of SEQ ID No 1.
The terms "polymorphism described in Figure" and "allele described in Figure" are used herein to refer to any or all alleles which are listed in the allele feature in the appended Sequence Listing for each Sequence ID number referenced.
The Kalpa Gene and Protein
Based on genotype-phenotype association studies and differential expression studies, the inventors have identified a gene involved in the low cholesterol phenotype (Knoblauch et al, 2000). This gene is referred to herein as the Kalpa gene. Furthermore, structural features of the Kalpa protein have been exploited to develop screening assays that can be used in the preparation or screening of medicaments capable of modulating Kalpa activity. Table B summarizes the positions of functional domains on the Kalpa protein.
Table B
[Table B is reproduced as an image in the original publication: imgf000017_0001]
Kalpa is a membrane protein involved in SREBP traffic
The molecular regulation of cellular sterol metabolism has been elucidated by Brown and Goldstein and their colleagues. The LDLR gene promoter contains a sterol response element (SRE) that is required for regulating transcription of the gene encoding LDLR in response to cellular sterol content. Two SRE-binding proteins (SREBP-1 and -2) contain two transmembrane domains and are localized to the endoplasmic reticulum (ER). Another protein, termed SREBP-cleavage activating protein (SCAP), acts as a chaperone protein that transports the precursor SREBPs from the ER to the Golgi, where two proteases, site 1 protease and site 2 protease (S1P and S2P), sequentially cleave the SREBPs. The second cleavage liberates the mature SREBP proteins from the membrane, allowing them to enter the nucleus, bind to the SREs of target genes and, along with additional transcription factors, activate gene transcription. Responsiveness of the system to cellular sterol content is accomplished through a 'sterol-sensing domain' in SCAP. When cells are overloaded with sterols, SREBP in the SCAP complex is no longer accessible to S1P cleavage and remains membrane-bound. Conversely, when sterol concentrations are limiting, the current data suggest that the SCAP/SREBP complex can move to a post-ER compartment. At this location, the complex encounters S1P and S2P, leading to the release of soluble SREBP (reviewed by Brown and Goldstein, Proc. Natl. Acad. Sci. USA., 1999, 96(20): 11041-8).
The statins inhibit HMGCoA reductase, the rate-limiting enzyme in cholesterol biosynthesis, reducing cellular sterol content and thereby de-repressing the transport of the SCAP-SREBP complex and resulting in the upregulation of LDLR, which in turn transports blood LDL into the liver, reducing blood levels of LDL.
The Kalpa protein of the invention contains five transmembrane domains in the N-terminal part of the protein and a TPR motif. According to the present invention, the TPR motif may be involved in protein-protein interaction (e.g. interactions with Kalpa targets or Kalpa-target proteins). It has further been proposed that TPR proteins preferably interact with WD-40 repeat proteins. SCAP contains a WD-40 motif which is known to interact with the membrane-bound SREBP. Thus, in one aspect, the invention provides assays based on interactions of the TPR motif of the Kalpa protein with a WD-40 domain. In more preferred aspects, the TPR motif, and hence the Kalpa protein, interacts with a SCAP polypeptide. Thus, in a preferred aspect, the Kalpa protein of the invention participates in the internalization and the transport of the precursor SREBPs from the ER to the Golgi, where S1P and S2P sequentially cleave the SREBPs and liberate the mature and active SREBP proteins.
The Kalpa protein also contains an aldo-keto reductase (AKR) domain. In further aspects, the assays of the invention involve detecting aldo-keto reductase activity. Aldo-keto reductases (AKRs) form an enzyme superfamily including aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases and dihydrodiol dehydrogenases. They are monomeric NADPH-dependent oxidoreductases, about 320 residues in size, with broad substrate specificities (Bohren et al., 1989, J. Biol. Chem. 264:9547-51; Jez et al., 1997, Biochem. J. 326:625-36). These enzymes exist in cellular cytoplasm as monomeric 34- to 36-kD proteins. The AKRs are found in mammals, amphibians, plants, yeast, protozoa and bacteria. They metabolize a diverse range of substrates, including aliphatic and aromatic aldehydes, monosaccharides, steroids, prostaglandins, polycyclic aromatic hydrocarbons and isoflavonoids. These enzymes catalyze the reduction of carbonyl-containing compounds, such as carbonyl-containing sugars and aromatic compounds, to the corresponding alcohols. The aldo-keto reductase enzymes are structurally very similar. They share a common (α/β)8-barrel three-dimensional fold and have a highly conserved nicotinamide-cofactor-binding pocket. However, they exhibit differences in both substrate and coenzyme specificity. One known reaction catalyzed by a family member, aldose reductase, is the reduction of circulating glucose to sorbitol, a hyperosmotic sugar, which is then further metabolized to fructose by sorbitol dehydrogenase. Under normal conditions, the reduction of glucose to sorbitol is a minor pathway. In hyperglycemic states, however, the accumulation of sorbitol is implicated in the development of diabetic complications (OMIM *103880, aldo-keto reductase family 1, member B1). Members of this enzyme family are also highly expressed in some liver cancers (Cao et al., 1998, J. Biol. Chem. 273:11429-35).
Similarly, the mammalian aldehyde reductases play a significant role in the metabolism of neurotransmitter aldehydes produced by monoamine oxidase, and aldehyde reductase inhibitors may have anti-depressant properties. In addition, the hydroxysteroid dehydrogenases of this superfamily have the potential to act as molecular switches, by converting potent steroid hormones into inactive metabolites, thereby regulating the amount of hormone that can bind and activate nuclear receptors. Another member of the aldo-keto reductase family, the bile acid binding protein (AKR1C2, also named DD2 and 3α-HSD type III), has been isolated from the liver and shown to have high-affinity binding for bile acids (Hara et al., 1996, Biochem. J., 313:373-76). This AKR enzyme may assist in the rapid intracellular transport of bile acids from the sinusoidal to the canalicular pole of the cell.
Interestingly, an aldose reductase-like sequence similar to human and rat aldose reductase has also been reported in the mouse: the mouse vas deferens protein (MVDP), whose expression is regulated by cyclic AMP at the protein and mRNA levels (Aigueperse et al., 1999, J. Endocrinol., 160:147-54). Since aldose reductase is a major reductase for isocaproaldehyde, a product of side-chain cleavage of cholesterol, in human and animal adrenal glands, this may offer a clue to the function of MVDP in adrenals.
Many of the mammalian AKRs are potential therapeutic targets, and structure-based drug design may lead to compounds with the desired specificity and clinical efficacy. The crystallographic structures of aldehyde and aldose reductase are known, as well as those of the aldose reductase homologues FR-1 and CHO reductase. Current research is focused on the structural differences that allow subtle but distinct substrate specificities and on refining the catalytic mechanism through site-directed mutagenesis of active site residues. The invention thus provides methods for the identification of aldo-keto reductase inhibitors and activators. Briefly, compounds to be tested are arrayed in the wells of a multi-well plate in varying concentrations along with an appropriate buffer and substrate. Aldo-keto reductase activity is measured for each well, and the ability of each compound to inhibit the aldo-keto reductase activity can be determined, as well as the dose-response profiles. This assay can also be used to identify molecules which enhance aldo-keto reductase activity. Furthermore, several known aldo-keto reductase inhibitors are available and can be used, including: (a) hexoestrol analogues; (b) 17 beta-oestradiol; (c) phenolphthalein; (d) flufenamic acid; (e) 3,5,3',5'-tetraiodothyropropionic acid analogs.
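By way of illustration only, a dose-response profile from such a multi-well assay might be reduced to an IC50 estimate as sketched below. The function names, the log-linear interpolation method and the numerical values are assumptions introduced for this example and are not taken from the specification; a fitted sigmoidal model could equally be used.

```python
import math

def percent_inhibition(rate, uninhibited_rate):
    """Percent inhibition of enzyme activity relative to a no-compound control well."""
    return 100.0 * (1.0 - rate / uninhibited_rate)

def ic50_by_interpolation(concentrations, rates, uninhibited_rate):
    """Estimate IC50 by log-linear interpolation between the two doses
    that bracket 50% inhibition. Returns None if no pair of adjacent
    doses brackets 50% inhibition."""
    points = sorted(zip(concentrations, rates))
    inhib = [(c, percent_inhibition(r, uninhibited_rate)) for c, r in points]
    for (c_lo, i_lo), (c_hi, i_hi) in zip(inhib, inhib[1:]):
        if i_lo < 50.0 <= i_hi:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None

# Hypothetical values: doses in micromolar, rates in arbitrary activity units.
doses = [0.1, 1.0, 10.0, 100.0]
rates = [95.0, 75.0, 25.0, 5.0]
control_rate = 100.0
ic50 = ic50_by_interpolation(doses, rates, control_rate)
```

A compound that enhances activity would give negative percent inhibition values in this scheme, so the same plate layout could in principle be used to flag activators.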
Examples of functional assays for aldo-keto reductase activity are provided herein. Further examples can be found in the following references, the disclosures of all of which are incorporated herein by reference.
1. Ciaccio, P. J., Jaiswal, A. K., Tew, K. D.: Regulation of human dihydrodiol dehydrogenase by Michael acceptor xenobiotics. J. Biol. Chem., 269: 15558-15562, 1994.
2. Deyashiki, Y., Ogasawara, A., Nakayama, T., Nakanishi, M., Miyabe, Y., Sato, K., Hara, A.: Molecular cloning of two human liver 3-alpha-hydroxysteroid/dihydrodiol dehydrogenase isoenzymes that are identical with chlordecone reductase and bile-acid binder. Biochem. J., 299:545-552, 1994.
3. Hara, A., Matsuura, K., Tamada, Y., Sato, K., Miyabe, Y., Deyashiki, Y., Ishida, N.: Relationship of human liver dihydrodiol dehydrogenases to hepatic bile-acid-binding protein and oxidoreductase of human colon cells. Biochem. J., 313:373-376, 1996.
4. Khanna, M., Qin, K.-N., Klisak, I., Belkin, S., Sparkes, R. S., Cheng, K.-C.: Localization of multiple human dihydrodiol dehydrogenase (DDH1 and DDH2) and chlordecone reductase (CHDR) genes in chromosome 10 by the polymerase chain reaction and fluorescence in situ hybridization. Genomics 25: 588-590, 1995.
5. Qin, K.-N., Khanna, M., Cheng, K.-C.: Structure of a gene coding for human dihydrodiol dehydrogenase/bile acid-binding protein. Gene 149: 357-361, 1994.
6. Qin, K.-N., New, M. I., Cheng, K.-C.: Molecular cloning of multiple cDNAs encoding human enzymes structurally related to 3-alpha-hydroxysteroid dehydrogenase. J. Steroid Biochem. Molec. Biol., 46:673-679, 1993.
7. Stolz, A., Hammond, L., Lou, H., Takikawa, H., Ronk, M., Shively, J. E.: cDNA cloning and expression of the human hepatic bile acid-binding protein: a member of the monomeric reductase gene family. J. Biol. Chem., 268:10448-10457, 1993.
As used interchangeably herein, a "Kalpa activity", "biological activity of Kalpa" or "functional activity of Kalpa" refers to an activity exerted by a Kalpa polypeptide or nucleic acid molecule, or a biologically active fragment or homologue thereof, as determined in vivo or in vitro according to standard techniques. In one embodiment, a Kalpa activity is a direct activity, such as an association with a Kalpa target molecule, a glycosyltransferase activity, preferably an O-linked glycosylation activity (preferably O-GlcNAc transferase activity), and/or aldo-keto reductase activity. As used herein, a "Kalpa target molecule" is a molecule with which a Kalpa binds or interacts in nature, such that a Kalpa-mediated function is achieved. For example, a Kalpa target molecule can be another Kalpa protein or polypeptide which is substantially identical or which shares structural similarity (e.g. forming a dimer or multimer). In another example, a Kalpa target molecule can be a protein molecule not comprising Kalpa, or a non-self molecule, such as a polypeptide containing a WD-40 domain, or more preferably a SCAP protein. Binding or interaction with a Kalpa target molecule or with other targets can be detected, for example, using a two-hybrid-based assay in yeast to find drugs that disrupt interaction of the Kalpa bait with the target prey, or an in vitro interaction assay with recombinant Kalpa and target proteins. Alternatively, a Kalpa activity may be an indirect activity, such as an activity mediated by interaction of the Kalpa with a Kalpa target molecule such that the target molecule modulates a downstream cellular activity (e.g., interaction of a Kalpa molecule with a Kalpa target molecule can modulate the activity of that target molecule on an intracellular signaling pathway). For example, interaction of a Kalpa molecule with a SCAP protein may modulate SREBP activity or localization and/or LDL receptor expression.
In other aspects, a Kalpa activity is detected by assessing any of the following activities: cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, triglyceride regulation, LDL/HDL ratio regulation, LDL-R expression, or any suitable therapeutic endpoint discussed herein in the section titled "Methods of Treatment". Kalpa activity may be assessed either in vitro (cell or non-cell based) or in vivo depending on the assay type and format.
A Kalpa polypeptide, or a biologically active fragment or homologue thereof can be the minimum region of a polypeptide that is necessary and sufficient for activity.
Functional domains of Kalpa are further described herein, and positions in the Kalpa polypeptide of SEQ ID No 4 are listed in Table B. Assays and polypeptides of the invention preferably comprise, consist essentially of or consist of at least one Kalpa functional domain shown in Table B. However, it will be appreciated that a functional Kalpa protein may be only a small portion of the respective protein, from about 10 amino acids to about 15 amino acids, or from about 20 amino acids to about 25 amino acids, or from about 30 amino acids to about 35 amino acids, or from about 40 amino acids to about 45 amino acids, or from about 50 amino acids to about 55 amino acids, or from about 60 amino acids to about 70 amino acids, or from about 80 amino acids to about 90 amino acids, or about 100 amino acids in length. Alternatively, Kalpa or Kalpa polypeptide activity, as defined above, may require a larger portion of the native protein than may be defined by protein-protein interaction, DNA binding, cell assays or by sequence alignment. A portion of a Kalpa-containing polypeptide from about 110 amino acids to about 115 amino acids, or from about 120 amino acids to about 130 amino acids, or from about 140 amino acids to about 150 amino acids, or from about 160 amino acids to about 170 amino acids, or from about 180 amino acids to about 190 amino acids, or from about 200 amino acids to about 250 amino acids, or from about 300 amino acids to about 350 amino acids, or from about 400 amino acids to about 450 amino acids, or from about 500 amino acids to about 700 or about 760 amino acids, of SEQ ID No 4, may be required for function.
As discussed, the invention includes novel protein domains of the Kalpa protein, including methods of screening for modulators of Kalpa which act on the novel domains. The invention thus encompasses a Kalpa polypeptide comprising a polypeptide having at least a Kalpa sequence in the protein or corresponding nucleic acid molecule, preferably a Kalpa sequence corresponding to SEQ ID No 4. A Kalpa member may comprise an amino acid sequence of at least about 25, 30, 34, 40, 45, 50, 60, 70, 80 to 90 amino acid residues in length, of which at least about 50-80%, preferably at least about 60-70%, more preferably at least about 65%, 75% or 90% of the amino acid residues are identical or similar amino acids to a functional domain of Kalpa of SEQ ID No 4.
Identity or similarity may be determined using any desired algorithm, including the algorithms and parameters for determining homology which are described herein. Isolated proteins of the present invention, preferably Kalpa polypeptides, or biologically active fragments or homologues thereof, have an amino acid sequence sufficiently homologous to the respective amino acid sequence of SEQ ID No 4. As used herein, the term "sufficiently homologous" refers to a first amino acid or nucleotide sequence which contains a sufficient or minimum number of identical or equivalent (e.g., an amino acid residue which has a similar side chain) amino acid residues or nucleotides to a second amino acid or nucleotide sequence such that the first and second amino acid or nucleotide sequences share common structural domains or motifs and/or a common functional activity. For example, amino acid or nucleotide sequences which share common structural domains, have at least about 30-40% identity, preferably at least about 40-50% identity, more preferably at least about 50-60%, and even more preferably at least about 60-70%, 70-80%, 80%, 90%, 95%, 97%, 98%, 99% or 99.8% identity across the amino acid sequences of the domains, and contain at least one and preferably two structural domains or motifs, are defined herein as sufficiently homologous. Furthermore, amino acid or nucleotide sequences which share at least about 30%, preferably at least about 40%, more preferably at least about 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99% or 99.8% identity and share a common functional activity are defined herein as sufficiently homologous.
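As a non-limiting illustration of how a percent identity threshold of this kind might be checked, the sketch below computes identity over a pre-aligned pair of sequences. The choice of denominator (aligned columns in which neither sequence has a gap), the function names and the toy sequences are assumptions made for this example; the specification leaves the algorithm open.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

def meets_identity_threshold(aligned_a, aligned_b, threshold_percent):
    """True when the pair reaches the stated minimum percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent

# Hypothetical toy alignment, not a Kalpa sequence.
seq_a = "MKT-LLVAGA"
seq_b = "MKTALLIAGS"
identity = percent_identity(seq_a, seq_b)
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give different percentages for the same alignment, so the convention used should be stated alongside any reported value.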
It will be appreciated that the invention encompasses any of the Kalpa polypeptides, as well as fragments thereof, nucleic acids complementary thereto and nucleic acids capable of hybridizing thereto under stringent conditions.
Genomic Sequences of the Kalpa gene
The genomic sequence of the Kalpa gene for use in accordance with the present invention is provided in SEQ ID No 1. The Kalpa gene genomic sequence comprises exons and introns as well as sequences approximately 10 kb upstream and downstream of the first and last Kalpa exon. The positions of the exons on the respective SEQ ID No for the Kalpa genes are provided in Table 1. Particularly preferred genomic sequences of the Kalpa gene include isolated, purified, or recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of SEQ ID No 1. Preferably said contiguous span comprises at least one single nucleotide polymorphism, or the complements thereof.
The present invention provides Kalpa intron and exon polynucleotide sequences including polymorphisms. Particularly preferred polynucleotides of the present invention include purified, isolated or recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of a sequence of SEQ ID No 1 or the complements thereof, wherein said span includes a polymorphism. Optionally said polymorphism is selected from the polymorphisms at the nucleotide positions on SEQ ID No 1 described in the column titled "Gene_position" in Table 3. This column of Table 3 provides the position of each SNP of SEQ ID Nos 5 to 23 on the respective genomic sequence of SEQ ID No 1. It will be appreciated that either allele specified at position 25 in the respective SEQ ID No of SEQ ID Nos: 5 to 23 may be present at the polymorphic base.
Table 1
Figure imgf000024_0001
The nucleic acids defining the Kalpa gene intronic polynucleotides may be used as oligonucleotide primers or probes in order to detect the presence of a copy of a Kalpa gene in a test sample, or alternatively in order to amplify a target nucleotide sequence within the Kalpa sequences.
The genomic sequence of the Kalpa gene contains regulatory sequences both in the non-coding 5'-flanking region and in the non-coding 3'-flanking region that border the Kalpa transcribed region containing the exons of the gene. The promoter activity of the regulatory regions contained in the Kalpa gene polynucleotide sequence of SEQ ID No 1 can be assessed by any known method. Methods for identifying the polynucleotide fragments of SEQ ID No 1 involved in the regulation of the expression of the Kalpa gene are well known to those skilled in the art (see Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1989). An example of a typical method that can be used involves a recombinant vector carrying a reporter gene and genomic sequences from a Kalpa genomic sequence of SEQ ID No 1. Briefly, the expression of the reporter gene (for example beta galactosidase or chloramphenicol acetyl transferase) is detected when placed under the control of a biologically active polynucleotide fragment. Genomic sequences located upstream of the first exon of a Kalpa gene may be cloned into any suitable promoter reporter vector, such as the pSEAP-Basic, pSEAP-Enhancer, pβgal-Basic, pβgal-Enhancer, or pEGFP-1 Promoter Reporter vectors available from Clontech, or the pGL2-basic or pGL3-basic promoterless luciferase reporter gene vectors from Promega. Each of these promoter reporter vectors includes multiple cloning sites positioned upstream of a reporter gene encoding a readily assayable protein such as secreted alkaline phosphatase, luciferase, beta galactosidase, or green fluorescent protein. The sequences upstream of the first exon of a Kalpa gene are inserted into the cloning sites upstream of the reporter gene in both orientations and introduced into an appropriate host cell. The level of reporter protein is assayed and compared to the level obtained with a vector lacking an insert in the cloning site.
The presence of an elevated expression level in the vector containing the insert with respect to the control vector indicates the presence of a promoter in the insert.
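The comparison between insert-containing and control vectors can be expressed as a fold change, as in the minimal sketch below. The function names, the twofold cut-off and the readings are hypothetical and chosen only for illustration; the specification does not state a numerical threshold for "elevated" expression.

```python
from statistics import mean

def fold_over_control(insert_readings, empty_vector_readings):
    """Ratio of mean reporter signal with the insert to mean signal of the empty vector."""
    return mean(insert_readings) / mean(empty_vector_readings)

def suggests_promoter(insert_readings, empty_vector_readings, min_fold=2.0):
    """Flag an insert as a candidate promoter when reporter signal is elevated.

    min_fold is an arbitrary illustrative cut-off, not a value from the source.
    """
    return fold_over_control(insert_readings, empty_vector_readings) >= min_fold

# Hypothetical replicate readings (arbitrary luminescence units).
empty_vector = [100, 110, 90]
forward_insert = [1200, 1100, 1300]
reverse_insert = [120, 100, 110]

forward_fold = fold_over_control(forward_insert, empty_vector)
reverse_fold = fold_over_control(reverse_insert, empty_vector)
```

Testing the insert in both orientations, as described above, helps separate orientation-dependent promoter activity from other effects of the inserted sequence. In practice replicate variability and transfection efficiency would also need to be accounted for.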
Promoter sequences within the 5' non-coding regions of the Kalpa gene may be further defined by constructing nested 5' and/or 3' deletions using conventional techniques such as Exonuclease III or appropriate restriction endonuclease digestion. The resulting deletion fragments can be inserted into the promoter reporter vector to determine whether the deletion has reduced or obliterated promoter activity, such as described, for example, by Coles et al. (Hum. Mol. Genet., 7: 791-800, 1998). In this way, the boundaries of the promoter can be defined.
The activity and the specificity of the promoter of a Kalpa gene can further be assessed by monitoring the expression level of a detectable polynucleotide operably linked to the respective Kalpa gene promoter in different types of cells and tissues. The detectable polynucleotide may be either a polynucleotide that specifically hybridizes with a predefined oligonucleotide probe, or a polynucleotide encoding a detectable protein, including a Kalpa gene polypeptide or a fragment or a variant thereof. This type of assay is well known to those skilled in the art and is described in US 5,502,176 and US 5,266,488. Polynucleotides carrying the regulatory elements located both at the 5' end and at the 3' end of the Kalpa gene coding region may be advantageously used to control the transcriptional and translational activity of a heterologous polynucleotide of interest, said polynucleotide being heterologous with respect to the Kalpa gene regulatory region.
A "biologically active" fragment of a Kalpa regulatory element, preferably an element comprised in SEQ ID No 1, according to the present invention is a polynucleotide comprising or alternatively consisting of a fragment of said polynucleotide which is functional as a regulatory region for expressing a recombinant polypeptide or a recombinant polynucleotide in a recombinant cell host. For the purpose of the invention, a nucleic acid or polynucleotide is "functional" as a regulatory region for expressing a recombinant polypeptide or a recombinant polynucleotide if said regulatory polynucleotide contains nucleotide sequences which contain transcriptional and translational regulatory information, and such sequences are "operably linked" to nucleotide sequences which encode the desired polypeptide or the desired polynucleotide. The regulatory polynucleotides according to the invention may advantageously be part of a recombinant expression vector that may be used to express a coding sequence in a desired host cell or host organism.
A further object of the invention consists of an isolated polynucleotide comprising: a) a nucleic acid comprising a regulatory nucleotide sequence selected from the group consisting of a nucleotide sequence comprising a polynucleotide of SEQ ID No 1; and b) a polynucleotide encoding a desired polypeptide or a nucleic acid of interest, operably linked to the nucleic acid defined in (a) above.
The polypeptide encoded by the nucleic acid described above may be of various nature or origin, encompassing proteins of prokaryotic or eukaryotic origin. Among the polypeptides expressed under the control of a Kalpa gene regulatory region, there may be cited bacterial, fungal or viral antigens.
The desired nucleic acid encoded by the above described polynucleotide, usually an RNA molecule, may be complementary to a desired coding polynucleotide, for example to the Kalpa gene coding sequence, and thus useful as an antisense polynucleotide.
Such a polynucleotide may be included in a recombinant expression vector in order to express the desired polypeptide or the desired nucleic acid in a host cell or in a host organism.
cDNA Sequences of the Kalpa gene
As mentioned, the invention relates to the use of Kalpa proteins. Kalpa cDNAs for use in accordance with the present invention are provided in SEQ ID No 2 or 3. Figure 2 shows the location on the SEQ ID No 3 cDNA of Kalpa functional domains.
The Open Reading Frame encoding the respective Kalpa proteins spans the nucleotide positions of SEQ ID No 2 or 3 shown in Table 2.
Additional preferred cDNA polynucleotides of the invention include isolated, purified or recombinant polynucleotides comprising a contiguous span of at least 25, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, 500, or 1000 nucleotides from a sequence of SEQ ID No 2 or 3 and the complements thereof. Preferably said contiguous span is a contiguous span selected from the group consisting of the sequences of nucleic acid positions 112 to 2493 of SEQ ID No 2 or of the sequences of nucleic acid positions 287 to 2664 of SEQ ID No 3, and the complements thereof. Additional preferred polynucleotides include isolated, purified or recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 300, 400, 500, or 1000 nucleotides from a sequence of SEQ ID No 2 or 3, wherein said contiguous span comprises one of the alleles at a polymorphic base.
Table 2
Figure imgf000027_0001
In other aspects, a Kalpa cDNA may comprise, consist or consist essentially of a functional domain referred to in Table 4. Table 4 provides the position of the respective domain (referred to in Table 4 as "feature") on the cDNA of the respective gene. Additional preferred cDNA polynucleotides of the invention include isolated, purified or recombinant polynucleotides comprising a contiguous span of at least 25, 30, 35, 40, 50, 75, 100, 150 or 200 nucleotides from a sequence selected from the group consisting of the positions on SEQ ID No 2 or 3 of the functional domains listed in Table 4, and the complements thereof.
The polynucleotide disclosed above that contains the coding sequence of the Kalpa gene of the invention may be expressed in a desired host cell or a desired host organism, when this polynucleotide is placed under the control of suitable expression signals. The expression signals may be either the expression signals contained in the regulatory regions in the Kalpa gene of the invention or may be exogenous regulatory nucleic sequences. Such a polynucleotide, when placed under the suitable expression signals, may also be inserted in a vector for its expression.
For use in accordance with the invention is a purified, isolated, or recombinant nucleic acid comprising the nucleotide sequence of SEQ ID No 2 or 3, complementary sequences thereto, and fragments thereof. The invention also pertains to a purified or isolated nucleic acid comprising a polynucleotide having at least 70%, 80%, 85%, 90% or 95% nucleotide identity with a polynucleotide of SEQ ID No 2 or 3, or advantageously 99% nucleotide identity, preferably 99.5% nucleotide identity and most preferably 99.8% nucleotide identity with a polynucleotide of SEQ ID No 2 or 3, or a sequence complementary thereto or a biologically active fragment thereof. Another object of the invention relates to purified, isolated or recombinant nucleic acids comprising a polynucleotide that hybridizes, under the stringent hybridization conditions defined herein, with a polynucleotide of SEQ ID No 2 or 3, or a sequence complementary thereto or a variant thereof or a biologically active fragment thereof. Also encompassed is a purified, isolated, or recombinant nucleic acid polynucleotide encoding a Kalpa polypeptide, as further described herein.
In another preferred aspect, the invention pertains to purified or isolated nucleic acid molecules that encode a portion or variant of a Kalpa protein, wherein the portion or variant displays a Kalpa activity. Preferably said portion or variant is a portion or variant of a naturally occurring full-length Kalpa comprising, consisting essentially of, or consisting of a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of SEQ ID No 2 or 3, wherein said nucleic acid encodes a portion or variant having a Kalpa activity described herein. In another embodiment, the invention relates to a polynucleotide encoding a portion consisting of 8-20, 20-50, 50-70, 60-100, 100-150, 150-200, 200-250 or 250-350 amino acids of SEQ ID No 4, or a variant thereof, wherein said portion displays a Kalpa activity described herein. A Kalpa variant nucleic acid may, for example, encode a biologically active Kalpa comprising at least 1, 2, 3, 5, 10, 20 or 30 amino acid changes from the respective sequence selected from the group consisting of SEQ ID No 4, or may encode a biologically active Kalpa comprising at least 1%, 2%, 3%, 5%, 8%, 10% or 15% changes in amino acids from the respective sequence of SEQ ID No 4. Also encompassed are nucleic acid molecules which are complementary to Kalpa nucleic acids described herein. Preferably, a complementary nucleic acid is sufficiently complementary to the respective nucleotide sequence shown in SEQ ID No 2 or 3 such that it can hybridize to said nucleotide sequence, thereby forming a stable duplex. Another object of the invention is a purified, isolated, or recombinant nucleic acid encoding a Kalpa polypeptide comprising, consisting essentially of, or consisting of an amino acid sequence selected from the group consisting of SEQ ID No 4, or fragments thereof, wherein the isolated nucleic acid molecule encodes a functional domain of a Kalpa (for example, the domains for which amino acid positions on the respective SEQ ID No are described in Table 4). Preferably said functional domain is a Kalpa target binding region, preferably a protein interaction domain or TPR repeat domain. In preferred embodiments, a Kalpa nucleic acid encodes a Kalpa polypeptide comprising at least two Kalpa functional domains, such as for example a glycosyltransferase domain and a TPR repeat domain, or an aldo-keto reductase domain and a TPR repeat domain, or at least two TPR repeat domains.
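For illustration, the notion of a complementary nucleic acid can be sketched as a reverse-complement operation on a DNA strand. The function names and the short sequences below are hypothetical. Exact reverse-complement matching is a simplifying assumption: whether two real strands form a stable duplex depends on the hybridization conditions and tolerates some mismatches.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]

def is_exact_complement_of_region(probe, target):
    """True when the probe is perfectly complementary to some region of the target."""
    return reverse_complement(probe) in target

# Hypothetical sequences, not taken from the Sequence Listing.
target = "TTATGGCCAA"
probe = "GGCCAT"  # reverse complement of ATGGCC
```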
Particularly preferred nucleic acids of the invention include isolated, purified, or recombinant Kalpa nucleic acids comprising, consisting essentially of, or consisting of a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or 250 nucleotides of a sequence of nucleotide positions coding for the relevant amino acids as given in the SEQ ID No 2 or 3.
A nucleic acid fragment encoding a biologically active portion of a Kalpa can be prepared by isolating a portion of a nucleotide sequence of SEQ ID No 2 or 3, which encodes a polypeptide having a Kalpa biological activity (the biological activities of the Kalpa described herein), expressing the encoded portion of the Kalpa (e.g., by recombinant expression in vitro or in vivo) and assessing the activity of the encoded portion of the Kalpa.
The invention further encompasses nucleic acid molecules that differ from the Kalpa nucleotide sequences of the invention due to degeneracy of the genetic code and encode the same Kalpa, or fragment thereof, of the invention.
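The effect of genetic code degeneracy can be illustrated with the standard genetic code: two different nucleotide sequences may encode the same polypeptide. In the sketch below, the codon table is the standard code, while the function name and the two short coding sequences are hypothetical examples rather than Kalpa sequences.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a DNA coding sequence in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# Two hypothetical sequences that differ at three codon positions.
variant_1 = "ATGGCTCTGAAA"
variant_2 = "ATGGCCTTAAAG"
same_protein = translate(variant_1) == translate(variant_2)
```

Both sequences translate to the same four-residue peptide although they differ at the nucleotide level, which is the situation contemplated in the preceding paragraph.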
Polymorphisms and Polynucleotides Comprising Polymorphisms
The polymorphisms are indicated in SEQ ID Nos 5 to 23 by listing the alternative bases present at the polymorphic base. The polymorphic base is present at nucleotide position 25 in each of the 49mer nucleotide sequences of SEQ ID Nos 5 to 23. Table 3 summarizes the position of each SNP in a Kalpa gene, and provides the position of the SNP in the genomic sequence of the Kalpa gene by referring to the genomic sequence of SEQ ID No 1. As seen from Figure 1, the polymorphism at nucleotide position 1696 of SEQ ID No 3 is an exonic non-conservative polymorphism changing Isoleucine to Valine at amino acid residue 438 of the depicted encoded protein or at amino acid position 472 of SEQ ID No 4.
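The 49mer layout described above, with the polymorphic base at position 25 flanked by 24 nucleotides on each side, can be sketched as follows. The function name, the placeholder genomic string, the SNP position and the alleles are assumptions introduced for this example and do not correspond to entries of Table 3 or the Sequence Listing.

```python
def allelic_49mers(genomic, snp_position, alleles, flank=24):
    """Build one 49mer per allele with the polymorphic base at position 25.

    snp_position is the 1-based position of the SNP on the genomic sequence.
    """
    idx = snp_position - 1  # convert to 0-based index
    if idx - flank < 0 or idx + flank >= len(genomic):
        raise ValueError("not enough flanking sequence around the SNP")
    left = genomic[idx - flank:idx]
    right = genomic[idx + 1:idx + 1 + flank]
    return {allele: left + allele + right for allele in alleles}

# Hypothetical placeholder sequence and SNP.
genomic = "ACGT" * 20
oligos = allelic_49mers(genomic, snp_position=40, alleles=("A", "G"))
```

Each returned sequence has length 49 and differs from the others only at its 25th nucleotide.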
The present invention encompasses polynucleotides for use as primers and probes in the methods of the invention. These polynucleotides may consist of, consist essentially of, or comprise a contiguous span of nucleotides of a sequence from any sequence in the Sequence Listing as well as sequences which are complementary thereto ("complements thereof"). The "contiguous span" may be at least 25, 35, 40, 50, 70, 80, 100, 250, 500 or 1000 nucleotides in length, to the extent that a contiguous span of these lengths is consistent with the lengths of the particular Sequence ID. It should be noted that the polynucleotides of the present invention are not limited to having the exact flanking sequences surrounding the polymorphic bases, which are enumerated in the Sequence Listing. Rather, it will be appreciated that the flanking sequences surrounding the polymorphisms, or any of the primers or probes of the invention which are more distant from the markers, may be lengthened or shortened to any extent compatible with their intended use, and the present invention specifically contemplates such sequences. It will be appreciated that the polynucleotides referred to in the Sequence Listing may be of any length compatible with their intended use. Also, the flanking regions outside of the contiguous span need not be homologous to native flanking sequences which actually occur in human subjects. The addition of any nucleotide sequence which is compatible with the nucleotide's intended use is specifically contemplated. The contiguous span may optionally include the Kalpa-related polymorphism in said sequence. SNPs, or polymorphisms, generally consist of a polymorphism at one single base position. Each polymorphism therefore corresponds to two forms of a polynucleotide sequence which, when compared with one another, present a nucleotide modification at one position. Usually, the nucleotide modification involves the substitution of one nucleotide for another.
Optionally either the original or the alternative allele of the polymoφhisms disclosed in Table 3, or the first or second allele disclosed in SEQ ID Nos 5 to 23 may be specified as being present at the Kalpa-related SNP. The original allele can be obtained the genomic sequences of SEQ ID No 1. Optionally, the SNPs may be specified which consist of more complex polymorphisms including insertions/deletions of at least one nucleotide.
Preferred polynucleotides may consist of, consist essentially of, or comprise a contiguous span of nucleotides of a sequence from SEQ ID Nos 1, 2 or 3, or 5 to 23 as well as sequences which are complementary thereto. The "contiguous span" may be at least 8, 10, 12, 15, 50, 70, 80, 100, 250, 500 or 1000 nucleotides in length, to the extent that a contiguous span of these lengths is consistent with the lengths of the particular Sequence ID. The contiguous span may optionally comprise a polymorphism selected from the group consisting of polymorphisms of Table 3 or present in SEQ ID Nos 5 to 23.
The invention also relates to polynucleotides that hybridize, under conditions of high or intermediate stringency, to a polynucleotide of a sequence from any sequence in the Sequence Listing as well as sequences which are complementary thereto. Preferably such polynucleotides are at least 20, 25, 35, 40, 50, 70, 80, 100, 250, 500 or 1000 nucleotides in length, to the extent that a polynucleotide of these lengths is consistent with the lengths of the particular Sequence ID. Preferred polynucleotides comprise a Kalpa-related polymorphism. Optionally either allele (e.g. the original or the alternative allele) of the polymorphisms disclosed in may be specified as being present at the Kalpa-related polymorphism.
Particularly preferred polynucleotides of the invention include isolated, purified or recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of SEQ ID No 1, wherein said contiguous span comprises at least 1, 2, 3, 4, 5 or 10 of the nucleotide positions of polymorphic bases listed in Table 3, and the complements thereof. Said nucleotide positions may specify either of the alleles indicated in the sequence of SEQ ID NO 5 to 23 as corresponding to the particular polymorphism. Additional preferred polynucleotides of the invention include isolated, purified or recombinant polynucleotides comprising a contiguous span of at least 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides from a sequence of SEQ ID No 2 or 3, wherein said contiguous span comprises at least one Kalpa-related polymorphism. The present invention further embodies isolated, purified, and recombinant polynucleotides which encode polypeptides comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of SEQ ID No 4, wherein said contiguous span comprises at least one Kalpa-related polymorphism.
The primers of the present invention may be designed from the disclosed sequences for any method known in the art. A preferred set of primers is fashioned such that the 3' end of the contiguous span of identity with the sequences of the Sequence Listing is present at the 3' end of the primer. Such a configuration allows the 3' end of the primer to hybridize to a selected nucleic acid sequence and dramatically increases the efficiency of the primer for amplification or sequencing reactions. In a preferred set of primers the contiguous span is found in one of the nucleic sequences described in the Sequence Listing. Allele specific primers may be designed such that a polymorphism is at the 3' end of the contiguous span and the contiguous span is present at the 3' end of the primer. Such allele specific primers tend to selectively prime an amplification or sequencing reaction so long as they are used with a nucleic acid sample that contains one of the two alleles present at a polymorphism. The 3' end of primers of the invention may be located within, or at least 2, 4, 6, 8, 10, 12, 15, 18, 20, 25, 49, 50, 100, 250, 500, or 1000 nucleotides upstream of, a Kalpa-related polymorphism in said sequence, to the extent that this distance is consistent with the particular Sequence ID, or at any other location which is appropriate for their intended use in sequencing, amplification or the location of novel sequences or markers. A preferred set of amplification primers is derived from sequences described in SEQ ID Nos. 5 to 23. Primers with their 3' ends located 1 nucleotide upstream of a Kalpa-related polymorphism have a special utility in microsequencing assays.
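The two primer geometries described above can be sketched as simple string slicing on the strand that carries the polymorphism. This is only an illustration under stated assumptions: the function names and the example sequence are hypothetical, positions are 1-based, and only forward-strand primers are considered.

```python
def microsequencing_primer(seq: str, snp_pos: int, length: int) -> str:
    """Primer whose 3' end lies 1 nucleotide upstream of the polymorphic base."""
    end = snp_pos - 1          # 0-based index of the SNP; slice end is exclusive
    start = end - length
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    return seq[start:end]


def allele_specific_primer(seq: str, snp_pos: int, length: int, allele: str) -> str:
    """Primer whose 3' terminal base is the chosen allele of the polymorphism."""
    end = snp_pos - 1
    start = end - (length - 1)
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    return seq[start:end] + allele


# Made-up 49mer with a placeholder at position 25, for illustration only.
flank = "ACGT" * 6 + "N" + "TGCA" * 6
micro = microsequencing_primer(flank, snp_pos=25, length=20)
specific = allele_specific_primer(flank, snp_pos=25, length=20, allele="G")
```

In the first case the next base added to the primer is the polymorphic base itself; in the second case the primer only matches at its 3' end when the template carries the chosen allele.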
The probes of the present invention may be designed from the disclosed sequences for any method known in the art, particularly methods which allow for testing if a particular sequence or marker disclosed herein is present. A preferred set of probes may be designed for use in the hybridization assays of the invention in any manner known in the art such that they selectively bind to one allele of a polymorphism, but not the other under any particular set of assay conditions. Preferred hybridization probes may consist of, consist essentially of, or comprise a contiguous span which ranges in length from 8, 10, 12, 15, 18 or 20 to 25, 35, 40, 50, 60, 70, or 80 nucleotides, or be specified as being 12, 15, 18, 20, 25, 35, 40, or 50 nucleotides in length and including a Kalpa-related polymorphism of said sequence. Optionally, said polymorphism may be within 6, 5, 4, 3, 2, or 1 nucleotides of the center of the hybridization probe or at the center of said probe.
The location of nucleotides in a polynucleotide with respect to the center of the polynucleotide is described herein in the following manner. When a polynucleotide has an odd number of nucleotides, the nucleotide at an equal distance from the 3' and 5' ends of the polynucleotide is considered to be "at the center" of the polynucleotide, and any nucleotide immediately adjacent to the nucleotide at the center, or the nucleotide at the center itself, is considered to be "within 1 nucleotide of the center." With an odd number of nucleotides in a polynucleotide, any of the five nucleotide positions in the middle of the polynucleotide would be considered to be within 2 nucleotides of the center, and so on. When a polynucleotide has an even number of nucleotides, there would be a bond and not a nucleotide at the center of the polynucleotide. Thus, either of the two central nucleotides would be considered to be "within 1 nucleotide of the center" and any of the four nucleotides in the middle of the polynucleotide would be considered to be "within 2 nucleotides of the center", and so on.
For polymorphisms which involve the substitution, insertion or deletion of 1 or more nucleotides, the polymorphism or allele is "at the center" of a polynucleotide if the difference between the distance from the substituted, inserted, or deleted polynucleotides of the polymorphism and the 3' end of the polynucleotide, and the distance from the substituted, inserted, or deleted polynucleotides of the polymorphism and the 5' end of the polynucleotide is zero or one nucleotide. If this difference is 0 to 3, then the polymorphism is considered to be "within 1 nucleotide of the center." If the difference is 0 to 5, the polymorphism is considered to be "within 2 nucleotides of the center." If the difference is 0 to 7, the polymorphism is considered to be "within 3 nucleotides of the center," and so on.
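The difference rule just stated (0 or 1 is "at the center", 0 to 3 is "within 1", 0 to 5 is "within 2", and so on) reduces to a single integer division. The sketch below is one reading of that rule, with a hypothetical function name and 1-based, inclusive positions. Note that for a single nucleotide in an even-length polynucleotide the rule returns 0 for either central nucleotide, whereas the preceding paragraph words that case as "within 1 nucleotide of the center".

```python
from typing import Optional


def center_offset(length: int, start: int, end: Optional[int] = None) -> int:
    """Smallest k such that the polymorphism spanning start..end is
    'within k nucleotides of the center'; 0 means 'at the center'."""
    if end is None:
        end = start                      # single-base polymorphism
    if not 1 <= start <= end <= length:
        raise ValueError("polymorphism must lie inside the polynucleotide")
    dist_5prime = start - 1              # nucleotides between 5' end and polymorphism
    dist_3prime = length - end           # nucleotides between polymorphism and 3' end
    diff = abs(dist_5prime - dist_3prime)
    # diff 0-1 -> 0, diff 2-3 -> 1, diff 4-5 -> 2, diff 6-7 -> 3, ...
    return diff // 2


def is_within(length: int, start: int, k: int, end: Optional[int] = None) -> bool:
    """True if the polymorphism is within k nucleotides of the center."""
    return center_offset(length, start, end) <= k
```

For example, `center_offset(49, 25)` evaluates to 0, consistent with a polymorphic base at position 25 of a 49mer lying at the center.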
Probes and primers
Any of the polynucleotides of the present invention can be labeled, if desired, by incorporating a label detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include radioactive substances, fluorescent dyes or biotin.
Preferably, polynucleotides are labeled at their 3' and 5' ends. A label can also be used to capture the primer, so as to facilitate the immobilization of either the primer or a primer extension product, such as amplified DNA, on a solid support. A capture label is attached to the primers or probes and can be a specific binding member which forms a binding pair with the solid phase reagent's specific binding member (e.g. biotin and streptavidin). Therefore depending upon the type of label carried by a polynucleotide or a probe, it may be employed to capture or to detect the target DNA. Further, it will be understood that the polynucleotides, primers or probes provided herein may, themselves, serve as the capture label. For example, in the case where a solid phase reagent's binding member is a nucleic acid sequence, it may be selected such that it binds a complementary portion of a primer or probe to thereby immobilize the primer or probe to the solid phase. In cases where a polynucleotide probe itself serves as the binding member, those skilled in the art will recognize that the probe will contain a sequence or "tail" that is not complementary to the target. In the case where a polynucleotide primer itself serves as the capture label, at least a portion of the primer will be free to hybridize with a nucleic acid on a solid phase. DNA labeling techniques are well known to the skilled technician.
Any of the polynucleotides, primers and probes of the present invention can be conveniently immobilized on a solid support. Solid supports are known to those skilled in the art and include the walls of wells of a reaction tray, test tubes, polystyrene beads, magnetic beads, nitrocellulose strips, membranes, microparticles such as latex particles, sheep (or other animal) red blood cells, duracytes and others. The solid support is not critical and can be selected by one skilled in the art. Thus, latex particles, microparticles, magnetic or nonmagnetic beads, membranes, plastic tubes, walls of microtiter wells, glass or silicon chips, sheep (or other suitable animal's) red blood cells and duracytes are all suitable examples. Suitable methods for immobilizing nucleic acids on solid phases include ionic, hydrophobic, covalent interactions and the like. A solid support, as used herein, refers to any material which is insoluble, or can be made insoluble by a subsequent reaction. The solid support can be chosen for its intrinsic ability to attract and immobilize the capture reagent. Alternatively, the solid phase can retain an additional receptor which has the ability to attract and immobilize the capture reagent. The additional receptor can include a charged substance that is oppositely charged with respect to the capture reagent itself or to a charged substance conjugated to the capture reagent. As yet another alternative, the receptor molecule can be any specific binding member which is immobilized upon (attached to) the solid support and which has the ability to immobilize the capture reagent through a specific binding reaction. The receptor molecule enables the indirect binding of the capture reagent to a solid support material before the performance of the assay or during the performance of the assay.
The solid phase thus can be a plastic, derivatized plastic, magnetic or non-magnetic metal, glass or silicon surface of a test tube, microtiter well, sheet, bead, microparticle, chip, sheep (or other suitable animal's) red blood cells, duracytes and other configurations known to those of ordinary skill in the art. The polynucleotides of the invention can be attached to or immobilized on a solid support individually or in groups of at least 2, 5, 8, 10, 12, 15, 20, or 25 distinct polynucleotides of the invention to a single solid support. In addition, polynucleotides other than those of the invention may be attached to the same solid support as one or more polynucleotides of the invention.
Any polynucleotide provided herein may be attached in overlapping areas or at random locations on the solid support. Alternatively the polynucleotides of the invention may be attached in an ordered array wherein each polynucleotide is attached to a distinct region of the solid support which does not overlap with the attachment site of any other polynucleotide. Preferably, such an ordered array of polynucleotides is designed to be "addressable" where the distinct locations are recorded and can be accessed as part of an assay procedure. Addressable polynucleotide arrays typically comprise a plurality of different oligonucleotide probes that are coupled to a surface of a substrate in different known locations. The knowledge of the precise location of each polynucleotide makes these "addressable" arrays particularly useful in hybridization assays. Any addressable array technology known in the art can be employed with the polynucleotides of the invention. One particular embodiment of these polynucleotide arrays is known as the Genechips, and has been generally described in US Patent 5,143,854 and PCT publications WO 90/15070 and WO 92/10092. These arrays may generally be produced using mechanical synthesis methods or light directed synthesis methods, which incorporate a combination of photolithographic methods and solid phase oligonucleotide synthesis (Fodor et al., Science, 251:767-777, 1991). The immobilization of arrays of oligonucleotides on solid supports has been rendered possible by the development of a technology generally identified as "Very Large Scale Immobilized Polymer Synthesis" (VLSIPS) in which, typically, probes are immobilized in a high density array on a solid surface of a chip. Examples of VLSIPS technologies are provided in US Patents 5,143,854 and 5,412,087 and in PCT Publications WO 90/15070, WO 92/10092 and WO 95/11995, which describe methods for forming oligonucleotide arrays through techniques such as light directed synthesis techniques.
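The notion of an "addressable" array, in which each polynucleotide occupies a distinct, recorded location that can be looked up during an assay, can be pictured as a mapping from location to probe. The following is a minimal sketch of that bookkeeping only; the class names, the row/column addressing scheme and the probe identifiers are hypothetical and do not describe any particular array technology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    row: int
    col: int
    probe_id: str      # hypothetical identifier chosen for illustration
    sequence: str


class AddressableArray:
    """Records which probe sits at which (row, col) location."""

    def __init__(self):
        self._by_location = {}

    def attach(self, feature: Feature) -> None:
        key = (feature.row, feature.col)
        if key in self._by_location:
            # Ordered arrays use distinct, non-overlapping attachment sites.
            raise ValueError(f"location {key} is already occupied")
        self._by_location[key] = feature

    def probe_at(self, row: int, col: int) -> Feature:
        return self._by_location[(row, col)]

    def locations_of(self, probe_id: str) -> list:
        return sorted(k for k, f in self._by_location.items()
                      if f.probe_id == probe_id)


array = AddressableArray()
array.attach(Feature(0, 0, "probe_allele_1", "ACGTACGTACGT"))
array.attach(Feature(0, 1, "probe_allele_2", "ACGTACATACGT"))
```

A hybridization signal observed at a given location can then be traced back to the probe, and hence the allele, recorded for that location.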
In designing strategies aimed at providing arrays of nucleotides immobilized on solid supports, further presentation strategies were developed to order and display the oligonucleotide arrays on the chips in an attempt to maximize hybridization patterns and sequence information. Examples of such presentation strategies are disclosed in PCT Publications WO 94/12305, WO 94/11530, WO 97/29212 and WO 97/31256.
Oligonucleotide arrays may comprise at least one of the sequences selected from the group consisting of SEQ ID Nos. 1, 2 or 3 or 5 to 23, and the sequences complementary thereto or a fragment thereof of at least 8, 10, 12, 15, 18, 20, 25, 35, 40, 50, 70, 80, 100, 250, 500 or 1000 consecutive nucleotides, to the extent that fragments of these lengths are consistent with the lengths of the particular Sequence ID, for determining whether a sample contains one or more alleles of the polymorphisms of the present invention. Oligonucleotide arrays may also comprise at least one of the sequences selected from the group consisting of SEQ ID Nos. 1, 2 or 3 or 5 to 23, and the sequences complementary thereto or a fragment thereof of at least 8, 10, 12, 15, 18, 20, 25, 35, 40, 50, 70, 80, 100, 250, 500 or 1000 consecutive nucleotides, to the extent that fragments of these lengths are consistent with the lengths of the particular Sequence ID, for amplifying one or more alleles of the polymorphisms of Table 3. In other embodiments, arrays may also comprise at least one of the sequences selected from the group consisting of SEQ ID Nos. 5 to 23, and the sequences complementary thereto or a fragment thereof of at least 20, 25, 35, 40, 50, 70, 80, 100, 250, 500 or 1000 consecutive nucleotides, to the extent that fragments of these lengths are consistent with the lengths of the particular Sequence ID, for conducting microsequencing analyses to determine whether a sample contains one or more alleles of the polymorphisms of the invention. In still further embodiments, the oligonucleotide array may comprise at least one of the sequences selected from the group consisting of SEQ ID Nos. 1, 2 or 3 or 5 to 23, and the sequences complementary thereto or a fragment thereof of at least 50, 70, 80, 100, 250, 500 or 1000 nucleotides in length, to the extent that fragments of these lengths are consistent with the lengths of the particular Sequence ID, for determining whether a sample contains one or more alleles of the polymorphisms of the present invention.
The present invention also encompasses diagnostic kits comprising one or more polynucleotides of the invention, optionally with a portion or all of the necessary reagents and instructions for genotyping a test subject by determining the identity of a nucleotide at a Kalpa-related polymorphism. The polynucleotides of a kit may optionally be attached to a solid support, or be part of an array or addressable array of polynucleotides. The kit may provide for the determination of the identity of the nucleotide at a marker position by any method known in the art including, but not limited to, a sequencing assay method, a microsequencing assay method, a hybridization assay method, an allele specific amplification method, or a mismatch detection assay based on polymerases and/or ligases.
Optionally such a kit may include instructions for scoring the results of the determination with respect to the test subject's risk of contracting a disease involving cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, triglycerides regulation and/or LDL/HDL ratio regulation, or likely response to an agent acting on cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, triglycerides regulation and/or LDL/HDL ratio regulation, or chances of suffering from side effects to an agent acting on cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, triglycerides regulation and/or LDL/HDL ratio regulation. Examples of such diseases include heart disease, coronary artery disease, myocardial infarct, and lipid related metabolic disorders such as the dysmetabolic syndrome, obesity and type II diabetes.
It should be noted that in the accompanying Sequence Listing, all instances of the symbol "n" in the nucleic acid sequences mean that the nucleotide can be adenine, guanine, cytosine or thymine.
Kalpa Polypeptides
As discussed, Kalpa polypeptides and their use in drug screening assays and therapy for cholesterol lowering are described. Also relating to the invention are polypeptides encoded by the polynucleotides of the invention, as well as fusion polypeptides comprising such polypeptides. The invention relates to Kalpa from humans, including isolated or purified Kalpa consisting of, consisting essentially of, or comprising the sequence of SEQ ID No 4. The invention concerns the polypeptides encoded by a nucleotide sequence of SEQ ID No 2 or 3, a complementary sequence thereof and a fragment thereof. The present invention embodies isolated, purified, and recombinant polypeptides comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, 100, 150, 200, 300 or 500 amino acids, of SEQ ID No 4. In other preferred embodiments the contiguous stretch of amino acids comprises the site of a mutation or functional mutation, including a deletion, addition, swap or truncation of the amino acids in the Kalpa sequence. A Kalpa protein of the invention may comprise, consist or consist essentially of a functional domain referred to in Table 4 or Table B. Table 4 provides the positions of the respective domain (referred to in Table 4 as "feature") on the amino acid sequences of SEQ ID No 4. Polypeptides of the invention include isolated, purified, and recombinant polypeptides comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, 100 amino acids of a sequence selected from the group consisting of amino acid positions 399 to 414 and 501 to 772, or of the amino acid positions of the functional domains listed in Table 4, on SEQ ID No 4.
Also encompassed are Kalpa polypeptides at least 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99% or 99.8% identical to the amino acid sequences of SEQ ID No 4, or to a fragment or functional domain thereof. Preferably the functional domain is a functional domain listed in Table 4.
The Kalpa protein has 10 transmembrane domains. In one aspect, the invention thus encompasses isolated, purified, and recombinant polypeptides comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, 100 amino acids of a sequence selected from the group consisting of transmembrane and extracellular domains of amino acid positions 1 to 129 (outside); 130 to 152 (Tmhelix); 153 to 184 (outside); 185 to 207 (Tmhelix); 208 to 221 (outside); 222 to 239 (Tmhelix); 240 to 245 (outside); 246 to 268 (Tmhelix); 269 to 287 (outside); 288 to 307 (Tmhelix); 308 to 336 (outside); 337 to 356 (Tmhelix); 357 to 370 (outside); 371 to 393 (Tmhelix); 394 to 404 (outside); 405 to 427 (Tmhelix); 428 to 431 (outside); 432 to 451 (Tmhelix); 452 to 459 (outside); 460 to 478 (Tmhelix); and 479 to 760 (outside).
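The segment list above can be held as a small table and checked mechanically. The sketch below copies the ranges and labels exactly as stated in the preceding paragraph; the variable and function names are hypothetical, and the code simply confirms that the segments are contiguous and that ten of them are labeled "Tmhelix".

```python
# (start, end, label) with 1-based inclusive amino acid positions, as listed above.
SEGMENTS = [
    (1, 129, "outside"), (130, 152, "Tmhelix"), (153, 184, "outside"),
    (185, 207, "Tmhelix"), (208, 221, "outside"), (222, 239, "Tmhelix"),
    (240, 245, "outside"), (246, 268, "Tmhelix"), (269, 287, "outside"),
    (288, 307, "Tmhelix"), (308, 336, "outside"), (337, 356, "Tmhelix"),
    (357, 370, "outside"), (371, 393, "Tmhelix"), (394, 404, "outside"),
    (405, 427, "Tmhelix"), (428, 431, "outside"), (432, 451, "Tmhelix"),
    (452, 459, "outside"), (460, 478, "Tmhelix"), (479, 760, "outside"),
]


def count_tm_helices(segments) -> int:
    """Number of segments labeled as transmembrane helices."""
    return sum(1 for _, _, label in segments if label == "Tmhelix")


def is_contiguous(segments) -> bool:
    """True if each segment starts right after the previous one ends."""
    return all(nxt[0] == cur[1] + 1 for cur, nxt in zip(segments, segments[1:]))


def region_at(position: int, segments=SEGMENTS) -> str:
    """Label of the segment containing a given amino acid position."""
    for start, end, label in segments:
        if start <= position <= end:
            return label
    raise ValueError("position outside the listed segments")
```

A lookup such as `region_at(140)` returns the label of the segment containing that residue, which can be convenient when choosing fragment boundaries.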
One aspect of the invention pertains to isolated Kalpa, and biologically active portions thereof, as well as polypeptide fragments suitable for use as immunogens to raise anti-Kalpa antibodies. In one embodiment, native Kalpa can be isolated from cells or tissue sources by an appropriate purification scheme using standard protein purification techniques. In another embodiment, Kalpa are produced by recombinant DNA techniques. Alternative to recombinant expression, a Kalpa or polypeptide can be synthesized chemically using standard peptide synthesis techniques.
Biologically active portions of a Kalpa include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the Kalpa, e.g., an amino acid sequence shown in SEQ ID No 4, which include fewer amino acids than the respective full length Kalpa, and exhibit at least one activity of the Kalpa. The present invention also embodies isolated, purified, and recombinant portions or fragments of a Kalpa polypeptide comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, 100, 150, 200, 300 or 500 amino acids, to the extent that said span is consistent with the particular SEQ ID NO, of a sequence selected from the group consisting of SEQ ID No 4. Also encompassed are Kalpa polypeptides which comprise between 10 and 20, between 20 and 50, between 30 and 60, between 50 and 100, or between 100 and 200 amino acids of a sequence selected from the group consisting of SEQ ID No 4. In other preferred embodiments the contiguous stretch of amino acids comprises the site of a mutation or functional mutation, including a deletion, addition, swap or truncation of the amino acids in the Kalpa sequence. A biologically active Kalpa may, for example, comprise at least 1, 2, 3, 5, 10, 20 or 30 amino acid changes from the sequence of SEQ ID No 4, or may encode a biologically active Kalpa comprising at least 1%, 2%, 3%, 5%, 8%, 10% or 15% changes in amino acids from the sequence of SEQ ID No 4.
Methods of assessing polypeptides, methods for obtaining variant nucleic acids and polypeptides
It will be appreciated that by characterizing the function of Kalpa polypeptides, the invention further provides methods of testing the activity of, or obtaining, functional fragments and variants of Kalpa nucleotide sequences involving providing a variant or modified Kalpa nucleic acid and assessing whether a polypeptide encoded thereby displays a Kalpa activity of the invention. Encompassed is thus a method of assessing the function of a Kalpa polypeptide comprising: (a) providing a Kalpa polypeptide, or a biologically active fragment or homologue thereof; and (b) testing said Kalpa polypeptide, or a biologically active fragment or homologue thereof, for a Kalpa activity. Any suitable format may be used, including cell free, cell-based and in vivo formats. For example, said assay may comprise expressing a Kalpa nucleic acid in a host cell, and observing Kalpa activity in said cell. In another example, a Kalpa polypeptide, or a biologically active fragment or homologue thereof, is introduced to a cell, and a Kalpa activity is observed. Kalpa activity may be any activity as described herein.
In addition to naturally-occurring allelic variants of the Kalpa sequences that may exist in the population, the skilled artisan will appreciate that changes can be introduced by mutation into the nucleotide sequences of SEQ ID NO: 2 or 3, thereby leading to changes in the amino acid sequence of the encoded Kalpa, with or without altering the functional ability of the Kalpa.
Several types of variants are contemplated including 1) one in which one or more of the amino acid residues are substituted with a conserved or non-conserved amino acid residue and such substituted amino acid residue may or may not be one encoded by the genetic code, or 2) one in which one or more of the amino acid residues includes a substituent group, or 3) one in which the mutated Kalpa polypeptide is fused with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol), or 4) one in which the additional amino acids are fused to the mutated Kalpa polypeptide, such as a leader or secretory sequence or a sequence which is employed for purification of the mutated Kalpa polypeptide or a preprotein sequence. Such variants are deemed to be within the scope of those skilled in the art.
For example, nucleotide substitutions leading to amino acid substitutions can be made in the sequence of SEQ ID No 2 or 3 that do not substantially change the biological activity of the protein. An amino acid residue can be altered from the wild-type sequence encoding a Kalpa polypeptide, or a biologically active fragment or homologue thereof, without altering the biological activity. In general, amino acid residues that are conserved among the homologs of the respective Kalpa of the present invention are predicted to be less amenable to alteration. Furthermore, additional conserved amino acid residues may be amino acids that are conserved between the proteins or variants related to the respective Kalpa.
In one aspect, the invention pertains to nucleic acid molecules encoding Kalpa polypeptides, or biologically active fragments or homologues thereof that contain changes in amino acid residues that are not essential for activity. Such Kalpa differ in amino acid sequence from SEQ ID No 4 yet retain biological activity. In one embodiment, the isolated nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the protein comprises an amino acid sequence at least about 60% homologous to an amino acid sequence of SEQ ID NO 4. Preferably, the protein encoded by the nucleic acid molecule is at least about 65-70% homologous to an amino acid sequence of SEQ ID NO 4, more preferably sharing at least about 75-80% identity with an amino acid sequence of SEQ ID NO 4, even more preferably sharing at least about 85%, 90%, 92%, 95%, 97%, 98%, 99% or 99.8% identity with an amino acid sequence selected from the group consisting of SEQ ID NO 4.
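The percentage thresholds above presuppose some way of computing identity between two sequences. The text does not fix a particular method, so the following is only a sketch under one common convention, stated as an assumption: the two sequences have already been aligned (gaps written as "-"), and identity is the number of identical residue pairs divided by the number of alignment columns. The function name and example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: identity = identical residue columns / all alignment columns,
    ignoring columns where both sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no residue columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions (for example dividing by the length of the shorter sequence) give different numbers, so the convention should be stated whenever a percentage is reported.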
In another aspect, the invention pertains to nucleic acid molecules encoding Kalpa that contain changes in amino acid residues that result in increased biological activity, or a modified biological activity. In another aspect, the invention pertains to nucleic acid molecules encoding Kalpa that contain changes in amino acid residues that are essential for a Kalpa activity. Such Kalpa differ in amino acid sequence from SEQ ID NO 4 and display reduced or essentially lack one or more Kalpa biological activities. The invention also encompasses a Kalpa polypeptide, or a biologically active fragment or homologue thereof which may be useful as dominant negative mutant of a Kalpa polypeptide.
An isolated nucleic acid molecule encoding a Kalpa polypeptide, or a biologically active fragment or homologue thereof homologous to a protein of any one of SEQ ID NO 4 can be created by introducing one or more nucleotide substitutions, additions or deletions into the nucleotide sequence of SEQ ID NO 4 such that one or more amino acid substitutions, additions or deletions are introduced into the encoded protein. Mutations can be introduced into any of SEQ ID NO 4, by standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis. For example, conservative amino acid substitutions may be made at one or more predicted non-essential amino acid residues. A "conservative amino acid substitution" is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a Kalpa polypeptide, or a biologically active fragment or homologue thereof may be replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a Kalpa coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for Kalpa biological activity to identify mutants that retain activity. 
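The side chain families listed above can be written down as sets, and a substitution treated as "conservative" when both residues share at least one family. The sketch below uses one-letter amino acid codes and hypothetical names; the family memberships are copied from the paragraph above, including the overlaps between families.

```python
# Side chain families as listed above, in one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),              # lysine, arginine, histidine
    "acidic": set("DE"),              # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(residue_a: str, residue_b: str) -> list:
    """Names of the families that contain both residues."""
    return sorted(name for name, members in SIDE_CHAIN_FAMILIES.items()
                  if residue_a in members and residue_b in members)


def is_conservative(residue_a: str, residue_b: str) -> bool:
    """True if replacing residue_a with residue_b stays within a family."""
    return residue_a != residue_b and bool(shared_families(residue_a, residue_b))
```

Under this reading, `is_conservative("K", "R")` is true (both basic) and `is_conservative("D", "K")` is false (acidic versus basic). Whether a given substitution actually preserves activity still has to be determined in an assay.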
Following mutagenesis of SEQ ID NO 4, the encoded protein can be expressed recombinantly and the activity of the protein can be determined. In a preferred embodiment, a mutant Kalpa polypeptide, or a biologically active fragment or homologue thereof encoded by a Kalpa polypeptide, or a biologically active fragment or homolog thereof can be assayed for a Kalpa activity in any suitable assay, examples of which are provided herein. The invention also provides Kalpa chimeric or fusion proteins. As used herein, a Kalpa
"chimeric protein" or "fusion protein" comprises a Kalpa polypeptide of the invention operatively linked, preferably fused in frame, to a non- Kalpa or non- Kalpa domain polypeptide. In a preferred embodiment, a Kalpa fusion protein comprises at least one biologically active portion of a Kalpa. In another preferred embodiment, a Kalpa fusion protein comprises at least two biologically active portions of a Kalpa. For example, in one embodiment, the fusion protein is a GST- Kalpa fusion protein in which the Kalpa sequences are fused to the C-teπninus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant Kalpa polypeptides. In another embodiment, the fusion protein is a Kalpa containing a heterologous signal sequence at its N-terminus, such as for example to allow for a desired cellular localization in a certain host cell.
The Kalpa fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. Moreover, the Kalpa fusion proteins of the invention can be used as immunogens to produce anti-Kalpa antibodies in a subject, to purify Kalpa ligands and in screening assays to identify molecules which inhibit the interaction of a Kalpa with a Kalpa target molecule.
Furthermore, isolated peptidyl portions of the subject Kalpa can also be obtained by screening peptides recombinantly produced from the corresponding fragment of the nucleic acid encoding such peptides. In addition, fragments can be chemically synthesized using techniques known in the art such as conventional Merrifield solid phase Fmoc or t-Boc chemistry. For example, a Kalpa of the present invention may be arbitrarily divided into fragments of desired length with no overlap of the fragments, or preferably divided into overlapping fragments of a desired length. The fragments can be produced (recombinantly or by chemical synthesis) and tested to identify those peptidyl fragments which can function as either agonists or antagonists of a Kalpa activity, such as by microinjection assays or in vitro protein binding assays. In an illustrative embodiment, peptidyl portions of a Kalpa, such as a Kalpa target binding region, can be tested for Kalpa activity by expression as thioredoxin fusion proteins, each of which contains a discrete fragment of the Kalpa (see, for example, U.S. Patents 5,270,181 and 5,292,646; and PCT publication WO 94/02502, the disclosures of which are incorporated herein by reference).
The present invention also pertains to variants of a Kalpa protein which function as either Kalpa mimetics or as Kalpa inhibitors. Variants of a Kalpa protein can be generated by mutagenesis, e.g., discrete point mutation or truncation of a Kalpa protein. An agonist of a Kalpa protein can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a Kalpa protein. An antagonist of a Kalpa protein can inhibit one or more of the activities of the naturally occurring form of the Kalpa protein by, for example, competitively inhibiting the association of a Kalpa with a Kalpa target molecule. Thus, specific biological effects can be elicited by treatment with a variant of limited function.
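The division of a polypeptide into non-overlapping or overlapping fragments of a desired length, as described above, can be sketched as follows. The function name fragment_sequence and its parameters are hypothetical, and the example sequence is an arbitrary placeholder rather than any part of SEQ ID NO 4.

```python
def fragment_sequence(sequence, length, overlap=0):
    """Split a sequence into fragments of a given length.

    overlap is the number of residues shared by consecutive fragments;
    overlap=0 gives the non-overlapping case. Returns a list of
    (1-based start position, fragment) tuples. The last fragment may be
    shorter than `length` if the sequence does not divide evenly.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0 <= overlap < length:
        raise ValueError("overlap must satisfy 0 <= overlap < length")

    step = length - overlap
    fragments = []
    for start in range(0, len(sequence), step):
        fragments.append((start + 1, sequence[start:start + length]))
        if start + length >= len(sequence):
            break  # the end of the sequence has been covered
    return fragments


# Placeholder 10-residue sequence, for illustration only.
example = "MKTAYIAKQR"
overlapping = fragment_sequence(example, length=4, overlap=2)
tiled = fragment_sequence(example, length=4)
```

Each fragment in the returned list could then be synthesized or expressed and tested individually. Larger overlaps give finer mapping at the cost of more fragments to test.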
In one embodiment, variants of a Kalpa which function as either Kalpa agonists (mimetics) or as Kalpa antagonists can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a Kalpa for Kalpa agonist or antagonist activity. In one embodiment, a variegated library of Kalpa variants is generated by combinatorial mutagenesis at the nucleic acid level and is encoded by a variegated gene library. A variegated library of Kalpa variants can be produced by, for example, enzymatically ligating a mixture of synthetic oligonucleotides into gene sequences such that a degenerate set of potential Kalpa sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage display) containing the set of Kalpa sequences therein. There are a variety of methods which can be used to produce libraries of potential Kalpa variants from a degenerate oligonucleotide sequence. Chemical synthesis of a degenerate gene sequence can be performed in an automatic DNA synthesizer, and the synthetic gene then ligated into an appropriate expression vector. Use of a degenerate set of genes allows for the provision, in one mixture, of all of the sequences encoding the desired set of potential Kalpa sequences.
In addition, libraries of fragments of a Kalpa coding sequence can be used to generate a variegated population of Kalpa fragments for screening and subsequent selection of variants of a Kalpa. In one embodiment, a library of coding sequence fragments can be generated by treating a double stranded PCR fragment of a Kalpa coding sequence with a nuclease under conditions wherein nicking occurs only about once per molecule, denaturing the double stranded DNA, renaturing the DNA to form double stranded DNA which can include sense/antisense pairs from different nicked products, removing single stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the resulting fragment library into an expression vector. By this method, an expression library can be derived which encodes N-terminal, C-terminal and internal fragments of various sizes of the Kalpa.
Modified Kalpa can be used for such purposes as enhancing therapeutic or prophylactic efficacy, or stability (e.g., ex vivo shelf life and resistance to proteolytic degradation in vivo). Such modified peptides, when designed to retain at least one activity of the naturally occurring form of the protein, are considered functional equivalents of the Kalpa described in more detail herein. Such modified peptides can be produced, for instance, by amino acid substitution, deletion, or addition. Whether a change in the amino acid sequence of a peptide results in a functional Kalpa homolog (e.g., functional in the sense that it acts to mimic or antagonize the wild-type form) can be readily determined by assessing the ability of the variant peptide to produce a response in cells in a fashion similar to the wild-type Kalpa or to competitively inhibit such a response. Peptides in which more than one replacement has taken place can readily be tested in the same manner.
This invention further contemplates a method of generating sets of combinatorial mutants of the presently disclosed Kalpa, as well as truncation and fragmentation mutants, and is especially useful for identifying potential variant sequences which are functional in binding to a Kalpa target protein but differ from a wild-type form of the protein by, for example, efficacy, potency and/or intracellular half-life. One purpose for screening such combinatorial libraries is, for example, to isolate novel Kalpa homologs which function as either an agonist or an antagonist of the biological activities of the wild-type protein, or alternatively, possess novel activities altogether. For example, mutagenesis can give rise to Kalpa homologs which have intracellular half-lives dramatically different from that of the corresponding wild-type protein. The altered protein can be rendered either more stable or less stable to proteolytic degradation or other cellular processes which result in destruction of, or otherwise inactivation of, a Kalpa. Such Kalpa homologs, and the genes which encode them, can be utilized to alter the envelope of expression for a particular recombinant Kalpa by modulating the half-life of the recombinant protein. For instance, a short half-life can give rise to more transient biological effects associated with a particular recombinant Kalpa and, when part of an inducible expression system, can allow tighter control of recombinant protein levels within a cell. As above, such proteins, and particularly their recombinant nucleic acid constructs, can be used in gene therapy protocols.
In an illustrative embodiment of this method, the amino acid sequences for a population of Kalpa homologs or other related proteins are aligned, preferably to promote the highest homology possible. Such a population of variants can include, for example, Kalpa homologs from one or more species, or Kalpa homologs from the same species but which differ due to mutation. Amino acids which appear at each position of the aligned sequences are selected to create a degenerate set of combinatorial sequences. There are many ways by which the library of potential Kalpa homologs can be generated from a degenerate oligonucleotide sequence. Chemical synthesis of a degenerate gene sequence can be carried out in an automatic DNA synthesizer, and the synthetic genes then ligated into an appropriate gene for expression. The purpose of a degenerate set of genes is to provide, in one mixture, all of the sequences encoding the desired set of potential Kalpa sequences. The synthesis of degenerate oligonucleotides is well known in the art (see, for example, Narang, SA (1983) Tetrahedron 39:3; Itakura et al. (1981) Recombinant DNA, Proc 3rd Cleveland Sympos. Macromolecules, ed. AG Walton, Amsterdam: Elsevier pp. 273-289; Itakura et al. (1984) Annu. Rev. Biochem. 53:323; Itakura et al. (1984) Science 198:1056; Ike et al. (1983) Nucleic Acid Res. 11:477). Such techniques have been employed in the directed evolution of other proteins (see, for example, Scott et al. (1990) Science 249:386-390; Roberts et al. (1992) PNAS 89:2429-2433; Devlin et al. (1990) Science 249:404-406; Cwirla et al. (1990) PNAS 87:6378-6382; as well as U.S. Patents Nos: 5,223,409, 5,198,346, and 5,096,815). The disclosures of the above references are incorporated herein by reference in their entireties. Alternatively, other forms of mutagenesis can be utilized to generate a combinatorial library, particularly where no other naturally occurring homologs have yet been sequenced.
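As a rough illustration of how a degenerate set of combinatorial sequences follows from an alignment, the sketch below collects the residues observed at each aligned position and counts the resulting combinations. It assumes the sequences are already aligned to equal length, with "-" marking gaps. The function names and the toy four-residue sequences are hypothetical and are not taken from the specification.

```python
from itertools import product
from math import prod


def degenerate_positions(aligned_sequences):
    """For each alignment column, list the distinct residues observed.

    Gap characters ("-") are ignored; a column containing only gaps
    yields an empty list.
    """
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("aligned sequences must have equal length")
    return [
        sorted({seq[i] for seq in aligned_sequences} - {"-"})
        for i in range(length)
    ]


def library_size(positions):
    """Number of distinct sequences encoded by the degenerate set."""
    return prod(len(residues) for residues in positions if residues)


def enumerate_library(positions):
    """List every member of the degenerate set (small libraries only)."""
    columns = [residues for residues in positions if residues]
    return ["".join(combo) for combo in product(*columns)]


# Toy alignment of three hypothetical homolog segments.
positions = degenerate_positions(["MKTA", "MRTA", "MKSA"])
```

In this toy case two positions vary with two residues each, so the degenerate set has four members, one of which (MRSA) is not present among the input sequences. Library size grows multiplicatively with the number of variable positions, which is why full enumeration is practical only for small sets.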
For example, Kalpa homologs (both agonist and antagonist forms) can be generated and isolated from a library by screening using, for example, alanine scanning mutagenesis and the like (Ruf et al. (1994) Biochemistry 33:1565-1572; Wang et al. (1994) J Biol. Chem. 269:3095-3099; Balint et al. (1993) Gene 137:109-118; Grodberg et al. (1993) Eur. J Biochem. 218:597-601; Nagashima et al. (1993) J Biol. Chem. 268:2888-2892; Lowman et al. (1991) Biochemistry 30:10832-10838; and Cunningham et al. (1989) Science 244:1081-1085); by linker scanning mutagenesis (Gustin et al. (1993) Virology 193:653-660; Brown et al. (1992) Mol. Cell Biol. 12:2644-2652; McKnight et al. (1982) Science 232:316); by saturation mutagenesis (Meyers et al. (1986) Science 232:613); by PCR mutagenesis (Leung et al. (1989) Method Cell Mol Biol 1:1-19); or by random mutagenesis (Miller et al. (1992) A Short Course in Bacterial Genetics, CSHL Press, Cold Spring Harbor, NY; and Greener et al. (1994) Strategies in Mol Biol 7:32-34), the disclosures of which are incorporated herein by reference in their entireties.
A wide range of techniques are known in the art for screening gene products of combinatorial libraries made by point mutations, as well as for screening cDNA libraries for gene products having a certain property. Such techniques will be generally adaptable for rapid screening of the gene libraries generated by the combinatorial mutagenesis of Kalpa. The most widely used techniques for screening large gene libraries typically comprise cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates relatively easy isolation of the vector encoding the gene whose product was detected.
Each of the illustrative assays described below is amenable to high throughput analysis as necessary to screen large numbers of degenerate Kalpa sequences created by combinatorial mutagenesis techniques. In one screening assay, the candidate gene products are displayed on the surface of a cell or viral particle, and the ability of particular cells or viral particles to bind a Kalpa target molecule (protein or DNA) via this gene product is detected in a "panning assay". For instance, the gene library can be cloned into the gene for a surface membrane protein of a bacterial cell, and the resulting fusion protein detected by panning (Ladner et al., WO 88/06630; Fuchs et al. (1991) Bio/Technology 9:1370-1371; and Goward et al. (1992) TIBS 18:136-140). In a similar fashion, fluorescently labeled Kalpa target can be used to score for potentially functional Kalpa homologs. Cells can be visually inspected and separated under a fluorescence microscope, or, where the morphology of the cell permits, separated by a fluorescence-activated cell sorter.
In an alternate embodiment, the gene library is expressed as a fusion protein on the surface of a viral particle. For instance, in the filamentous phage system, foreign peptide sequences can be expressed on the surface of infectious phage, thereby conferring two significant benefits. First, since these phage can be applied to affinity matrices at very high concentrations, a large number of phage can be screened at one time. Second, since each infectious phage displays the combinatorial gene product on its surface, if a particular phage is recovered from an affinity matrix in low yield, the phage can be amplified by another round of infection. The group of almost identical E. coli filamentous phages M13, fd, and f1 are most often used in phage display libraries, as either of the phage gIII or gVIII coat proteins can be used to generate fusion proteins without disrupting the ultimate packaging of the viral particle (Ladner et al., PCT publication WO 90/02909; Garrard et al., PCT publication WO 92/09690; Marks et al. (1992) J Biol. Chem. 267:16007-16010; Griffiths et al. (1993) EMBO J 12:725-734; Clackson et al. (1991) Nature 352:624-628; and Barbas et al. (1992) PNAS 89:4457-4461, the disclosures of which are incorporated herein by reference in their entireties). In an illustrative embodiment, the recombinant phage antibody system (RPAS, Pharmacia Catalog number 27-9400-01) can be easily modified for use in expressing Kalpa combinatorial libraries, and the Kalpa phage library can be panned on immobilized Kalpa target molecule (glutathione immobilized Kalpa target-GST fusion proteins or immobilized DNA). Successive rounds of phage amplification and panning can greatly enrich for Kalpa homologs which retain an ability to bind a Kalpa target and which can subsequently be screened further for biological activities in automated assays, in order to distinguish between agonists and antagonists.
The invention also provides for identification and reduction to functional minimal size of the Kalpa domains, particularly a TRP repeat, glycosyl transferase or aldoketo-reductase domain of the subject Kalpa to generate mimetics, e.g. peptide or non-peptide agents, which are able to disrupt binding of a polypeptide of the present invention with a Kalpa target molecule (protein or DNA). Thus, such mutagenic techniques as described above are also useful to map the determinants of Kalpa which participate in protein-protein or protein-DNA interactions involved in, for example, binding to a Kalpa target protein or DNA. To illustrate, the critical residues of a Kalpa which are involved in molecular recognition of the Kalpa target can be determined and used to generate Kalpa target- 13 P-derived peptidomimetics that competitively inhibit binding of the Kalpa to the Kalpa target. By employing, for example, scanning mutagenesis to map the amino acid residues of a particular Kalpa involved in binding a Kalpa target, peptidomimetic compounds can be generated which mimic those residues in binding to a Kalpa target, and which, by inhibiting binding of the Kalpa to the Kalpa target molecule, can interfere with the function of a Kalpa in transcriptional regulation of one or more genes. For instance, non-hydrolyzable peptide analogs of such residues can be generated using retro-inverso peptides (e.g., see U.S. Patents 5,116,947 and 5,219,089; and Pallai et al. (1983) Int J Pept Protein Res 21:84-92), benzodiazepine (e.g., see Freidinger et al. in Peptides: Chemistry and Biology, G.R. Marshall ed., ESCOM Publisher: Leiden, Netherlands, 1988), azepine (e.g., see Huffman et al. in Peptides: Chemistry and Biology, G.R. Marshall ed., ESCOM Publisher: Leiden, Netherlands, 1988), substituted gamma lactam rings (Garvey et al. in Peptides: Chemistry and Biology, G.R. Marshall ed., ESCOM Publisher: Leiden, Netherlands, 1988), keto-methylene pseudopeptides (Ewenson et al. (1986) J Med Chem 29:295; and Ewenson et al. in Peptides: Structure and Function (Proceedings of the 9th American Peptide Symposium) Pierce Chemical Co. Rockland, IL, 1985), β-turn dipeptide cores (Nagai et al. (1985) Tetrahedron Lett 26:647; and Sato et al. (1986) J Chem Soc Perkin Trans 1:1231), and β-aminoalcohols (Gordon et al. (1985) Biochem Biophys Res Commun 126:419; and Dann et al. (1986) Biochem Biophys Res Commun 134:71, the disclosures of which are incorporated herein by reference in their entireties).
An isolated Kalpa, or a portion or fragment thereof, can be used as an immunogen to generate antibodies that bind Kalpa using standard techniques for polyclonal and monoclonal antibody preparation. A full-length Kalpa can be used or, alternatively, the invention provides antigenic peptide fragments of Kalpa for use as immunogens. Any fragment of the Kalpa which contains at least one antigenic determinant may be used to generate antibodies. The antigenic peptide of a Kalpa comprises at least 8 amino acid residues of an amino acid sequence selected from the group consisting of SEQ ID No 4 and encompasses an epitope of a Kalpa such that an antibody raised against the peptide forms a specific immune complex with a Kalpa. Preferably, the antigenic peptide comprises at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.
Preferred epitopes encompassed by the antigenic peptide are regions of a Kalpa that are located on the surface of the protein, e.g., hydrophilic regions. A Kalpa immunogen typically is used to prepare antibodies by immunizing a suitable subject, (e.g., rabbit, goat, mouse or other mammal) with the immunogen. An appropriate immunogenic preparation can contain, for example, recombinantly expressed Kalpa or a chemically synthesized Kalpa polypeptide. The preparation can further include an adjuvant, such as Freund's complete or incomplete adjuvant, or similar immunostimulatory agent. Immunization of a suitable subject with an immunogenic Kalpa preparation induces a polyclonal anti-Kalpa antibody response.
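One crude way to shortlist hydrophilic regions as candidate antigenic peptides is to scan the sequence with a sliding window and rank windows by mean hydropathy. The sketch below uses the Kyte-Doolittle hydropathy scale, which is an assumption made here for illustration and is not a method named in this specification. The function name, the default window of 8 residues (the minimum peptide length given above) and the toy sequence are likewise hypothetical. Low hydropathy is only a rough proxy for surface exposure and does not establish antigenicity.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophilic_windows(sequence, window=8, top=3):
    """Rank fixed-length windows by mean hydropathy, most hydrophilic first.

    Returns up to `top` tuples of (1-based start, peptide, mean score).
    Raises KeyError if the sequence contains a non-standard residue code.
    """
    sequence = sequence.upper()
    if window < 1 or len(sequence) < window:
        return []
    scored = []
    for start in range(len(sequence) - window + 1):
        peptide = sequence[start:start + window]
        mean = sum(KYTE_DOOLITTLE[aa] for aa in peptide) / window
        scored.append((mean, start + 1, peptide))
    scored.sort()  # ascending: lowest (most hydrophilic) mean first
    return [(start, peptide, mean) for mean, start, peptide in scored[:top]]


# Toy sequence: a charged stretch flanked by leucines, for illustration only.
best = hydrophilic_windows("LLLLLLLLKDEKRDEKLLLL", window=8, top=1)
```

In the toy sequence the top-ranked window is the charged stretch beginning at residue 9. Any peptide chosen this way would still need to be tested experimentally as an immunogen.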
Thus, antibody compositions for use in accordance with the invention include either polyclonal or monoclonal antibodies capable of selectively binding, or which selectively bind, to an epitope-containing polypeptide comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, 100, or more than 100 amino acids of an amino acid sequence of a functional domain of a Kalpa of SEQ ID NO 4. The invention also concerns a purified or isolated antibody capable of specifically binding to a mutated or variant Kalpa or to a fragment thereof comprising an epitope of the mutated Kalpa.
Recombinant Expression Vectors and Host Cells
Another aspect of the invention pertains to vectors, preferably expression vectors, containing a nucleic acid encoding a Kalpa (or a portion thereof). As used herein, the term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a "plasmid", which refers to a circular double stranded DNA loop into which additional DNA segments can be ligated. Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as "expression vectors". In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids. In the present specification, "plasmid" and "vector" can be used interchangeably as the plasmid is the most commonly used form of vector. However, the invention is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions. The recombinant expression vectors of the invention comprise a Kalpa nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vectors include one or more regulatory sequences, selected on the basis of the host cells to be used for expression, which are operatively linked to the nucleic acid sequence to be expressed.
Within a recombinant expression vector, "operably linked" is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). The term "regulatory sequence" is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Such regulatory sequences are described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990), the disclosure of which is incorporated herein by reference in its entirety. Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in many types of host cell and those which direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., Kalpa, mutant forms of Kalpa, fusion proteins, or fragments of any of the preceding proteins, etc.).
The recombinant expression vectors of the invention can be designed for expression of Kalpa in prokaryotic or eukaryotic cells. For example, Kalpa can be expressed in bacterial cells such as E. coli, insect cells (using baculovirus expression vectors), yeast cells, or mammalian cells. Suitable host cells are discussed further in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990), the disclosure of which is incorporated herein by reference in its entirety. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase. Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, in fusion expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Enzymes suitable for such cleavage, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.), the disclosures of which are incorporated herein by reference in their entireties, which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein. Purified fusion proteins can be utilized in Kalpa activity assays (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for Kalpa, for example. In a preferred embodiment, a Kalpa fusion protein expressed in a retroviral expression vector of the present invention can be utilized to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six (6) weeks).
Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et al., (1988) Gene 69:301-315) and pET 11d (Studier et al., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 60-89), the disclosures of which are incorporated herein by reference in their entireties. Target gene expression from the pTrc vector relies on host RNA polymerase transcription from a hybrid trp-lac fusion promoter. Target gene expression from the pET 11d vector relies on transcription from a T7 gn10-lac fusion promoter mediated by a coexpressed viral RNA polymerase (T7 gn1). This viral polymerase is supplied by host strains BL21(DE3) or HMS174(DE3) from a resident prophage harboring a T7 gn1 gene under the transcriptional control of the lacUV5 promoter.
One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128, the disclosure of which is incorporated herein by reference in its entirety). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118, the disclosure of which is incorporated herein by reference in its entirety). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
In another embodiment, the Kalpa expression vector is a yeast expression vector. Examples of vectors for expression in yeast S. cerevisiae include pYepSec1 (Baldari, et al., (1987) Embo J. 6:229-234), pMFa (Kurjan and Herskowitz, (1982) Cell 30:933-943), pJRY88 (Schultz et al., (1987) Gene 54:113-123), pYES2 (Invitrogen Corporation, San Diego, Calif.), and picZ (Invitrogen Corp, San Diego, Calif.), the disclosures of which are incorporated herein by reference in their entireties.
Alternatively, Kalpa can be expressed in insect cells using baculovirus expression vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., Sf9 cells) include the pAc series (Smith et al. (1983) Mol. Cell Biol. 3:2156-2165) and the pVL series (Luckow and Summers (1989) Virology 170:31-39), the disclosures of which are incorporated herein by reference in their entireties. In particularly preferred embodiments, Kalpa polypeptides are expressed according to Kamiski et al., Am. J. Physiol. (1998) 275: F79-87, the disclosure of which is incorporated herein by reference in its entirety. In yet another embodiment, a nucleic acid of the invention is expressed in mammalian cells using a mammalian expression vector. Examples of mammalian expression vectors include pCDM8 (Seed, B. (1987) Nature 329:840) and pMT2PC (Kaufman et al. (1987) EMBO J. 6:187-195), the disclosures of which are incorporated herein by reference in their entireties. When used in mammalian cells, the expression vector's control functions are often provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40. For other suitable expression systems for both prokaryotic and eukaryotic cells see chapters 16 and 17 of Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989, the disclosure of which is incorporated herein by reference in its entirety.
In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Tissue-specific regulatory elements are known in the art. Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277, the disclosure of which is incorporated herein by reference in its entirety), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) PNAS 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the alpha-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546), the disclosures of which are incorporated herein by reference in their entireties.
The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. That is, the DNA molecule is operatively linked to a regulatory sequence in a manner which allows for expression (by transcription of the DNA molecule) of an RNA molecule which is antisense to Kalpa mRNA. Regulatory sequences operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the continuous expression of the antisense RNA molecule in a variety of cell types, for instance viral promoters and/or enhancers, or regulatory sequences can be chosen which direct constitutive, tissue specific or cell type specific expression of antisense RNA. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus in which antisense nucleic acids are produced under the control of a high efficiency regulatory region, the activity of which can be determined by the cell type into which the vector is introduced. For a discussion of the regulation of gene expression using antisense genes see Weintraub, H. et al., Antisense RNA as a molecular tool for genetic analysis, Reviews—Trends in Genetics, Vol. 1(1) 1986, the disclosure of which is incorporated herein by reference in its entirety.
Another aspect of the invention pertains to host cells into which a recombinant expression vector of the invention has been introduced. The terms "host cell" and "recombinant host cell" are used interchangeably herein. It is understood that such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.
A host cell can be any prokaryotic or eukaryotic cell. For example, a Kalpa can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells or human cells). Other suitable host cells are known to those skilled in the art, including Xenopus laevis oocytes as further described in the Examples.
Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques. As used herein, the terms "transformation" and "transfection" are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation. Suitable methods for transforming or transfecting host cells can be found in Sambrook, et al. (Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989, the disclosure of which is incorporated herein by reference in its entirety), and other laboratory manuals. For stable transfection of mammalian cells, it is known that, depending upon the expression vector and transfection technique used, only a small fraction of cells may integrate the foreign DNA into their genome. In order to identify and select these integrants, a gene that encodes a selectable marker (e.g., resistance to antibiotics) is generally introduced into the host cells along with the gene of interest. Preferred selectable markers include those which confer resistance to drugs, such as G418, hygromycin and methotrexate. Nucleic acid encoding a selectable marker can be introduced into a host cell on the same vector as that encoding a Kalpa or can be introduced on a separate vector. Cells stably transfected with the introduced nucleic acid can be identified by drug selection (e.g., cells that have incorporated the selectable marker gene will survive, while the other cells die). A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, can be used to produce (i.e., express) a Kalpa. Accordingly, the invention further provides methods for producing a Kalpa using the host cells of the invention.
In one embodiment, the method comprises culturing the host cell of the invention (into which a recombinant expression vector encoding a Kalpa has been introduced) in a suitable medium such that a Kalpa is produced. In another embodiment, the method further comprises isolating a Kalpa from the medium or the host cell.
In another embodiment, the invention encompasses providing a cell capable of expressing a Kalpa, culturing said cell in a suitable medium such that a Kalpa is produced, and isolating or purifying the Kalpa from the medium or cell.
The host cells of the invention can also be used to produce nonhuman transgenic animals. For example, in one embodiment, a host cell of the invention is a fertilized oocyte or an embryonic stem cell into which Kalpa-coding sequences have been introduced. Such host cells can then be used to create non-human transgenic animals in which exogenous Kalpa sequences have been introduced into their genome or homologous recombinant animals in which endogenous Kalpa sequences have been altered. Such animals are useful for studying the function and/or activity of a Kalpa polypeptide or fragment thereof and for identifying and/or evaluating modulators of Kalpa activity. As used herein, a "transgenic animal" is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, etc. A transgene is exogenous DNA which is integrated into the genome of a cell from which a transgenic animal develops and which remains in the genome of the mature animal, thereby directing the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal. As used herein, a "homologous recombinant animal" is a non-human animal, preferably a mammal, more preferably a mouse, in which an endogenous Kalpa gene has been altered by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal. A transgenic animal of the invention can be created by introducing a Kalpa-encoding nucleic acid into the male pronuclei of a fertilized oocyte, e.g., by microinjection or retroviral infection, and allowing the oocyte to develop in a pseudopregnant female foster animal. 
The Kalpa cDNA sequence or a fragment thereof such as a sequence of SEQ ID No 2 or 3 can be introduced as a transgene into the genome of a non-human animal. Alternatively, a non-human homologue of a human Kalpa gene, such as a mouse or rat Kalpa gene, can be used as a transgene. Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a Kalpa transgene to direct expression of a Kalpa to particular cells. Methods for generating transgenic animals via embryo manipulation and microinjection, particularly animals such as mice, have become conventional in the art and are described, for example, in U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al. and in Hogan, B., Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986, the disclosure of which is incorporated herein by reference in its entirety). Similar methods are used for production of other transgenic animals. A transgenic founder animal can be identified based upon the presence of a Kalpa transgene in its genome and/or expression of Kalpa mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a Kalpa can further be bred to other transgenic animals carrying other transgenes.
To create an animal in which a desired nucleic acid has been introduced into the genome via homologous recombination, a vector is prepared which contains at least a portion of a Kalpa gene into which a deletion, addition or substitution has been introduced to thereby alter, e.g., functionally disrupt, the Kalpa gene. The Kalpa gene can be a human gene (e.g., the cDNAs of SEQ ID No 2 or 3), but more preferably, is a non-human homologue of a human Kalpa gene (e.g., a cDNA isolated by stringent hybridization with a nucleotide sequence of SEQ ID No 2 or 3). For example, a mouse Kalpa gene can be used to construct a homologous recombination vector suitable for altering an endogenous Kalpa gene in the mouse genome. In a preferred embodiment, the vector is designed such that, upon homologous recombination, the endogenous Kalpa gene is functionally disrupted (i.e., no longer encodes a functional protein; also referred to as a "knock out" vector). Alternatively, the vector can be designed such that, upon homologous recombination, the endogenous Kalpa gene is mutated or otherwise altered but still encodes functional protein (e.g., the upstream regulatory region can be altered to thereby alter the expression of the endogenous Kalpa). In the homologous recombination vector, the altered portion of the Kalpa gene is flanked at its 5' and 3' ends by additional nucleic acid sequence of the Kalpa gene to allow for homologous recombination to occur between the exogenous Kalpa gene carried by the vector and an endogenous Kalpa gene in an embryonic stem cell. The additional flanking Kalpa nucleic acid sequence is of sufficient length for successful homologous recombination with the endogenous gene. Typically, several kilobases of flanking DNA (both at the 5' and 3' ends) are included in the vector (see e.g., Thomas, K. R. and Capecchi, M. R. (1987) Cell 51:503, the disclosure of which is incorporated herein by reference in its entirety, for a description of homologous recombination vectors). The vector is introduced into an embryonic stem cell line (e.g., by electroporation) and cells in which the introduced Kalpa gene has homologously recombined with the endogenous Kalpa gene are selected (see e.g., Li, E. et al. (1992) Cell 69:915, the disclosure of which is incorporated herein by reference in its entirety). The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to form aggregation chimeras (see e.g., Bradley, A. in Teratocarcinomas and Embryonic Stem Cells. A Practical Approach, E. J. Robertson, ed. (IRL, Oxford, 1987) pp. 113-152, the disclosure of which is incorporated herein by reference in its entirety). A chimeric embryo can then be implanted into a suitable pseudopregnant female foster animal and the embryo brought to term. Progeny harboring the homologously recombined DNA in their germ cells can be used to breed animals in which all cells of the animal contain the homologously recombined DNA by germline transmission of the transgene. Methods for constructing homologous recombination vectors and homologous recombinant animals are described further in Bradley, A. (1991) Current Opinion in Biotechnology 2:823-829 and in PCT International Publication Nos.: WO 90/11354 by Le Mouellec et al.; WO 91/01140 by Smithies et al.; WO 92/0968 by Zijlstra et al.; and WO 93/04169 by Berns et al., the disclosures of which are incorporated herein by reference in their entireties.
In another embodiment, transgenic non-human animals can be produced which contain selected systems which allow for regulated expression of the transgene. One example of such a system is the cre/loxP recombinase system of bacteriophage P1. For a description of the cre/loxP recombinase system, see, e.g., Lakso et al. (1992) PNAS 89:6232-6236, the disclosure of which is incorporated herein by reference in its entirety. Another example of a recombinase system is the FLP recombinase system of Saccharomyces cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355, the disclosure of which is incorporated herein by reference in its entirety). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of "double" transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase.
Drug Screening Assays
The invention provides a method (also referred to herein as a "screening assay") for identifying modulators, i.e., candidate or test compounds or agents (e.g., preferably small molecules, but also peptides, peptidomimetics or other drugs) which bind to Kalpa, have an inhibitory or activating effect on, for example, Kalpa expression or preferably Kalpa activity, or have an inhibitory or activating effect on, for example, the activity of a Kalpa target molecule. In some embodiments small molecules can be generated using combinatorial chemistry or can be obtained from a natural products library. Assays may be cell based or non-cell based assays. Drug screening assays may be binding assays or more preferentially functional assays, as further described.
In preferred embodiments, an assay is a cell-based assay in which a cell which expresses a Kalpa or biologically active portion thereof is contacted with a test compound and the ability of the test compound to inhibit, activate, or increase Kalpa activity determined. Determining the ability of the test compound to inhibit, activate, or increase Kalpa activity can be accomplished by monitoring the bioactivity of the Kalpa or biologically active portion thereof. The cell, for example, can be of mammalian origin, bacterial origin or a yeast cell. For example, in some embodiments, the cell can be a mammalian cell, bacterial cell or yeast cell which has been engineered to lack Kalpa activity or which naturally lacks Kalpa activity.
The invention further encompasses compounds capable of inhibiting or activating Kalpa activity. Preferably, a Kalpa inhibitor or activator is a selective Kalpa inhibitor or activator. In other embodiments, a Kalpa inhibitor is capable of inhibiting or increasing the activity of or binding to more than one (e.g. at least two, three, four) Kalpa-like proteins. Assays of the invention may be used to screen any suitable collection of compounds.
In a preferred embodiment, an inhibitor or activator is capable of inhibiting or increasing a Kalpa activity. Assays may thus be designed according to well known means to detect the Kalpa activities described herein. In other aspects, an inhibitor or activator is capable of modulating cholesterol regulation (CHOL), HDL-cholesterol (HDL) regulation, LDL-cholesterol (LDL) regulation, triglycerides (TGRL) regulation or LDL/HDL ratio (LDL/HDL) regulation. Methods for detecting said endpoints are known in the art.
In further aspects, an inhibitor or activator is capable of modulating (e.g. activating or inhibiting) the expression of the LDL receptor (LDLR). The inhibitor or activator is preferably able to increase the level of expression of LDLR. Preferably the method involves identifying compounds capable of enhancing the expression of the LDLR, preferably involving detecting modulation of Kalpa activity. Any suitable assay endpoint indicating activity of LDLR function or expression can be used. In one embodiment, the invention provides assays for screening candidate or test compounds which are target molecules of a Kalpa or polypeptide or biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds which bind to or modulate the activity of a Kalpa or polypeptide or biologically active portion thereof. The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the 'one-bead one-compound' library method; and synthetic library methods using affinity chromatography selection. The biological library approach is used with peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K. S. (1997) Anticancer Drug Des. 12:145, the disclosure of which is incorporated herein by reference in its entirety).
Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carrell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233, the disclosures of which are incorporated herein by reference in their entireties. Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. '409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390); (Devlin (1990) Science 249:404-406); (Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382); (Felici (1991) J. Mol. Biol. 222:301-310); (Ladner supra.), the disclosures of which are incorporated herein by reference in their entireties.
Determining the ability of the test compound to inhibit or increase Kalpa activity can also be accomplished, for example, by coupling the Kalpa or biologically active portion thereof with a radioisotope or enzymatic label such that binding of the Kalpa or biologically active portion thereof to its cognate target molecule can be determined by detecting the labeled Kalpa or biologically active portion thereof in a complex. For example, compounds (e.g., Kalpa or biologically active portion thereof) can be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product. The labeled molecule is placed in contact with its cognate molecule and the extent of complex formation is measured. For example, the extent of complex formation may be measured by immunoprecipitating the complex or by performing gel electrophoresis.
It is also within the scope of this invention to determine the ability of a compound (e.g., Kalpa or biologically active portion thereof) to interact with its cognate target molecule without the labeling of any of the interactants. For example, a microphysiometer can be used to detect the interaction of a compound with its cognate target molecule without the labeling of either the compound or the target molecule. McConnell, H. M. et al. (1992) Science 257:1906-1912, the disclosure of which is incorporated herein by reference in its entirety. A microphysiometer such as a cytosensor is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between compound and receptor.
In a preferred embodiment, the assay comprises contacting a cell which expresses a Kalpa or biologically active portion thereof, with a target molecule to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to inhibit or increase the activity of the Kalpa or biologically active portion thereof, wherein determining the ability of the test compound to inhibit or increase the activity of the Kalpa or biologically active portion thereof, comprises determining the ability of the test compound to inhibit or increase a biological activity of the Kalpa expressing cell (e.g., determining the ability of the test compound to inhibit or increase transduction or protein:protein interactions). In another preferred embodiment, the assay comprises contacting a cell which is responsive to a Kalpa or biologically active portion thereof, with a Kalpa or biologically active portion thereof, to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to modulate the activity of the Kalpa or biologically active portion thereof, wherein determining the ability of the test compound to modulate the activity of the Kalpa or biologically active portion thereof comprises determining the ability of the test compound to modulate a biological activity of the Kalpa-responsive cell.
In another embodiment, an assay is a cell-based assay comprising contacting a cell expressing a Kalpa target molecule (i.e. a molecule with which Kalpa interacts) with a test compound and determining the ability of the test compound to modulate (e.g. stimulate or inhibit) the activity of the Kalpa target molecule. Determining the ability of the test compound to modulate the activity of a Kalpa target molecule can be accomplished, for example, by determining the ability of the Kalpa to bind to or interact with the Kalpa target molecule. Examples of such cell and non-cell based interaction assays are described in Degterev et al, Nature Cell Biol. 3: 173-182 (2001) and Dandliker et al, Methods Enzymol. 74: 3-28 (1981), the disclosures of which are incorporated herein by reference.
Preferably, cells used in the cellular assays of the invention are cultured HepG2 cells, skin fibroblasts, NIH 3T3 cells or muscle cells. Optionally, the cell is a Kalpa mutant, preferably lacking or having reduced Kalpa protein activity, or lacking or having reduced Kalpa expression.
Determining the ability of the Kalpa to bind to or interact with a Kalpa target molecule can be accomplished by one of the methods described above for determining direct binding. In a preferred embodiment, determining the ability of the Kalpa to bind to or interact with a Kalpa target molecule can be accomplished by determining the activity of the target molecule. For example, the activity of the target molecule can be determined by contacting the target molecule with the Kalpa or a fragment thereof and measuring induction of a cellular second messenger of the target (e.g., intracellular Ca2+, diacylglycerol, IP3, etc.), detecting catalytic/enzymatic activity of the target on an appropriate substrate, detecting the induction of a reporter gene (comprising a target-responsive regulatory element operatively linked to a nucleic acid encoding a detectable marker, e.g., luciferase), or detecting a target-regulated cellular response, for example, signal transduction or protein:protein interactions.
In yet another embodiment, an assay of the present invention is a cell-free assay in which a Kalpa or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the Kalpa or biologically active portion thereof is determined. Binding of the test compound to the Kalpa can be determined either directly or indirectly as described above. In a preferred embodiment, the assay includes contacting the Kalpa or biologically active portion thereof with a known compound which binds a Kalpa protein (e.g., a Kalpa target molecule) to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a Kalpa protein, wherein determining the ability of the test compound to interact with a Kalpa protein comprises determining the ability of the test compound to preferentially bind to a Kalpa protein or biologically active portion thereof as compared to the known compound.
In another embodiment, the assay is a cell-free assay in which a Kalpa protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to modulate (e.g., stimulate or inhibit) the activity of the Kalpa protein or biologically active portion thereof is determined. Determining the ability of the test compound to modulate the activity of a Kalpa protein can be accomplished, for example, by determining the ability of the Kalpa protein to bind to a Kalpa target molecule by one of the methods described above for determining direct binding. Determining the ability of the Kalpa protein to bind to a Kalpa target molecule can also be accomplished using a technology such as real-time Biomolecular Interaction Analysis (BIA). Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705, the disclosures of which are incorporated herein by reference in their entireties. As used herein, "BIA" is a technology for studying biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the optical phenomenon of surface plasmon resonance (SPR) can be used as an indication of real-time reactions between biological molecules.
In an alternative embodiment, determining the ability of the test compound to modulate the activity of a Kalpa protein can be accomplished by determining the ability of the Kalpa protein to further modulate the activity of a downstream effector (e.g., a growth factor mediated signal transduction pathway component) of a Kalpa target molecule. For example, the activity of the effector molecule on an appropriate target can be determined or the binding of the effector to an appropriate target can be determined as previously described.
In yet another embodiment, the cell-free assay involves contacting a Kalpa protein or biologically active portion thereof with a known compound which binds the Kalpa protein to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with the Kalpa, wherein determining the ability of the test compound to interact with the Kalpa protein comprises determining the ability of the Kalpa protein to preferentially bind to or modulate the activity of a Kalpa target molecule. The cell-free assays of the present invention are amenable to use of both soluble and/or membrane-bound forms of isolated proteins (e.g. Kalpa protein or biologically active portions thereof or molecules to which Kalpa targets bind). In the case of cell-free assays in which a membrane-bound form of an isolated protein is used, it may be desirable to utilize a solubilizing agent such that the membrane-bound form of the isolated protein is maintained in solution. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton X-100, Triton X-114, Thesit, Isotridecypoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylamminio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylamminio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.
In more than one embodiment of the above assay methods of the present invention, it may be desirable to immobilize either Kalpa protein or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a Kalpa protein, or interaction of a Kalpa protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtitre plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/Kalpa fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtitre plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or Kalpa protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtitre plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of Kalpa protein binding or activity determined using standard techniques. Other techniques for immobilizing proteins on matrices can also be used in the screening assays of the invention. For example, either a Kalpa protein or a Kalpa target molecule can be immobilized utilizing conjugation of biotin and streptavidin.
Biotinylated Kalpa protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques well known in the art (e.g., biotinylation kit, Pierce Chemicals, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical). Alternatively, antibodies reactive with Kalpa protein or target molecules but which do not interfere with binding of the Kalpa protein to its target molecule can be derivatized to the wells of the plate, and unbound target or Kalpa trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the Kalpa protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the Kalpa protein or target molecule.
In another embodiment, modulating Kalpa activity comprises modulating Kalpa expression. Modulators of Kalpa expression are identified in a method wherein a cell is contacted with a candidate compound and the expression of Kalpa mRNA or protein in the cell is determined. The level of expression of Kalpa mRNA or protein in the presence of the candidate compound is compared to the level of expression of Kalpa mRNA or protein in the absence of the candidate compound. The candidate compound can then be identified as a modulator of Kalpa expression based on this comparison. For example, when expression of Kalpa mRNA or protein is greater (statistically significantly greater) in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of Kalpa mRNA or protein expression. Alternatively, when expression of Kalpa mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of Kalpa mRNA or protein expression. The level of Kalpa mRNA or protein expression in the cells can be determined by methods described herein for detecting Kalpa mRNA or protein.
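The comparison described above can be sketched in code. The following is a minimal illustration only, not the method of the invention: it assumes replicate expression measurements (arbitrary units) with and without the candidate compound, and it uses an exact two-sided permutation test on the difference in means as one possible way to decide "statistically significantly greater" or "less". The function names, the example values and the 0.05 threshold are hypothetical choices made for this sketch.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(control))
    pooled_sum = sum(pooled)
    extreme = 0
    total = 0
    # Enumerate every way of relabelling n_treated of the pooled values as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (pooled_sum - group_sum) / n_control)
        # Small tolerance so the observed labelling always counts despite rounding.
        if diff >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


def classify_compound(treated, control, alpha=0.05):
    """Label a candidate compound from expression levels with and without it."""
    p_value = permutation_p_value(treated, control)
    if p_value >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"


if __name__ == "__main__":
    # Hypothetical replicate measurements (arbitrary units), four wells per condition.
    with_compound = [2.1, 2.3, 2.2, 2.4]
    without_compound = [1.0, 1.1, 0.9, 1.05]
    print(classify_compound(with_compound, without_compound))
```

With four replicates per condition there are 70 possible relabellings, so the smallest attainable two-sided p-value is 2/70; with three replicates per condition the smallest is 2/20, which can never fall below 0.05. The number of replicates therefore limits what this particular test can detect.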
In yet another aspect of the invention, the Kalpa can be used as "bait proteins" in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300, the disclosures of which are incorporated herein by reference in their entireties), to identify other proteins, which bind to or interact with Kalpa ("Kalpa-binding proteins" or "Kalpa-bp") and are involved in Kalpa activity. Such Kalpa-binding proteins are also likely to be involved in the propagation of signals by the Kalpa or Kalpa targets as, for example, downstream elements of a Kalpa-mediated signaling pathway. Alternatively, such Kalpa-binding proteins are likely to be Kalpa inhibitors.
The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a Kalpa or a fragment thereof is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein ("prey" or "sample") is fused to a gene that codes for the activation domain of the known transcription factor. If the "bait" and the "prey" proteins are able to interact, in vivo, forming a Kalpa-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., LacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the Kalpa protein.
As discussed, SREBPs are bound to the ER membrane and nuclear envelope. The NH2-terminal and COOH-terminal domains, each about 500 amino acids in length, project into the cytosol. They are linked by a pair of membrane-spanning sequences that flank a short 31-amino acid hydrophilic loop that projects into the lumen of the ER and nuclear envelope (Brown and Goldstein, Cell, 89:331-340, 1997). When cells are depleted of sterols, Site-1 protease (S1P) cleaves the SREBPs at a leucine-serine bond in the luminal loop, thereby separating the proteins into halves, each with a single membrane spanning sequence (Duncan et al., J. Biol. Chem., 272:12778-12785, 1997). The NH2-terminal half is called the "intermediate" form of SREBP. Next, a second protease, designated Site-2 protease (S2P), cleaves the NH2-terminal intermediate at a leucine-cysteine bond that is located just within the first membrane spanning segment (Duncan et al., J. Biol. Chem., 273:17801-17809, 1998). This liberates the NH2-terminal fragment, which dissociates from the membrane with three hydrophobic residues at its COOH-terminus. This fragment, designated nuclear SREBP (nSREBP), enters the nucleus and activates gene transcription.
The Site-1 cleavage reaction requires the participation of a membrane-bound regulatory protein designated SREBP cleavage-activating protein (SCAP) (Hua et al., Cell, 87:415-426, 1996). SCAP has two domains: a hydrophobic NH2 -terminal membrane domain consisting of eight membrane spanning sequences and a hydrophilic COOH-terminal domain containing five "WD" repeats that projects into the cytosol (Nohturfft et al., J. Biol. Chem., 273:17243-17250. 1998). The COOH-terminal domain of SCAP forms a tight complex with the COOH-terminal domain of the SREBPs (Sakai et al., J. Biol. Chem., 272:20213-20221, 1997; Sakai et al., J. Biol. Chem., 273:5785-5793, 1998). Disruption of this complex by overexpression of truncated dominant negative versions of SCAP or SREBP blocks Site- 1 cleavage of SREBPs, indicating that the SCAP/SREBP complex is absolutely required for cleavage. Moreover, truncated versions of SREBPs, which lack the COOH-terminal domain, fail to fonn complexes with SCAP and fail to undergo Site-1 cleavage (Sakai et al., 1998). Although the SCAP/SREBP complex is created by interactions on the cytosolic side of the membrane, the complex activates SIP, which cuts the SREBPs on the opposite (luminal) side (Sakai et al., 1998).
The Site-1 processing reaction is the target for feedback regulation of lipid biosynthesis and uptake in animal cells. When sterols accumulate in cells, the Site-1 cleavage reaction is blocked (Brown and Goldstein, 1997). The Site-2 cleavage reaction is blocked secondarily since it requires prior cleavage by SIP. The sterol effect appears to be mediated by five of the eight membrane spanning sequences of SCAP, which are designated as the sterol sensor (Hua et al., 1996). Point mutations at two positions within the sterol sensor render SCAP constitutively active and prevent sterol-mediated suppression of Site-1 cleavage (Nohturfft et al., Proc. Natl. Acad. Sci. USA, 93:13709-13714, 1996). Sequences that resemble the sterol-sensing domain are found in three other proteins that are postulated to interact with sterols (Loftus et al, Science, 277:232-235, 1997; Nohturfft et al., 1998).
The human gene for S2P has been described by Rawson et al. (Molecular Cell, 1:47-57, 1997). This gene encodes a unique hydrophobic zinc metalloprotease that cleaves the intermediate forms of SREBPs within their transmembrane sequences. The invention thus provides several methods for identifying or selecting candidates among compounds that modulate Kalpa activity.
In one aspect, there is provided a method of inducing LDL receptor expression through the modulation of Kalpa activity by contacting a cell with a compound that modulates Kalpa activity (e.g. activity of the protein itself and/or expression of the protein), wherein the modulation of said Kalpa protein results in the induction of low density lipoprotein receptor expression.
In another aspect, an indirect Kalpa activity can be measured using means known in the art. In one example, the invention comprises contacting a cell with a candidate Kalpa modulator and measuring LDL uptake in a cell. The binding, internalization and degradation of LDL in fibroblasts can be measured using means known in the art. Examples of suitable methods are provided in Goldstein, J.L., et al. ((1983) Methods Enzymol. 98, 241), and International Patent Publication No WO 99/47566, the disclosures of which are incorporated herein by reference. Preferably, LDL uptake is measured in cultured HepG2 cells, skin fibroblasts, NIH 3T3 cells or muscle cells. In one example, HepG2 cells are seeded in Biocat slides and pretreated with or without 25-hydroxycholesterol for 20 hours. Cells are incubated for 4 hours with 6 μg/ml fluorescent Di-LDL (fluorescent dye 3,3'-dioctadecylindocarbocyanine). Intracellular fluorescent dye is detected by microscopy using rhodamine filters. In another example, primary human and monkey hepatocytes are seeded for 24 hours in collagen-coated plates in William E medium (0.1% FCS, 0.4 μg/ml insulin, 0.1 μM dexamethasone). FACS is also used in parallel to quantify fluorescence.
In further examples, an indirect Kalpa activity can be assessed in an animal model. For example, the invention comprises administering a candidate Kalpa modulator to an animal, preferably a non-human animal, and measuring blood cholesterol levels, preferably the depletion of blood cholesterol levels. In one example, Kalpa itself can be administered, and blood cholesterol levels are measured. The method may comprise providing animals which transiently overexpress Kalpa, and measuring blood cholesterol levels. In preferred aspects, provided is a method of identifying or selecting a candidate compound that modulates Kalpa activity comprising administering a candidate Kalpa modulator to an animal and measuring blood cholesterol levels. Preferably said animal is a non-human animal. The present invention is also directed to a method of identifying or selecting candidate compounds that induce Kalpa-mediated LDL receptor expression. Preferably said candidate compounds are compounds that are known to or suspected of modulating Kalpa activity. The method comprises contacting a cell with a candidate Kalpa modulator and measuring low-density lipoprotein receptor expression. Activation of Kalpa and induction of low-density lipoprotein receptor expression in the presence of the compound is indicative of the compound's ability to induce Kalpa-mediated LDL receptor expression.
In one aspect, Kalpa may be involved in generating active SREBP, which in turn is known to mediate the transcription of several proteins involved in cholesterol regulation, including proteins involved in cholesterol uptake and synthesis. In one aspect, assays comprise contacting a cell with a candidate Kalpa modulator, and detecting SREBP-mediated transcription, or levels of a protein encoded by a SREBP-mediated transcript. Preferably the method comprises detecting LDL receptor expression (e.g. detecting protein and/or mRNA). Examples of assays for the detection of LDL receptor expression are described in International Patent Publication No. WO 94/26922, the disclosure of which is incorporated herein by reference. Detection and/or the quantitation of human LDL receptor can also be carried out using specific monoclonal antibodies. International Patent Publication No WO 01/68710 for example provides monoclonal antibodies, hybridomas producing said antibodies and methods for detecting LDL receptor. LDL receptor can be detected using Mabs that can be used as a pair in an ELISA (Enzyme Linked Immuno Sorbent Assay) for detection of human soluble LDL receptor, or using Mabs for identification of the LDLR in Western Blot analysis.
Methods of detecting SREBP-related expression can be carried out using the methods of International Patent Publication No WO 94/26922. Screening methods are based upon cellular assays in which candidate substances are screened for their ability to stimulate SRE-mediated transcription and gene expression, and particularly, reporter gene expression. The preferred cellular assays comprise preparing a recombinant plasmid including a reporter gene, preferably a CAT gene or luciferase gene, under the transcriptional control of a functional SRE-1 sequence and introducing the plasmid into a recombinant host cell, such as a monkey CV-1 cell. The host cell will also include and express an SREBP protein and optionally a Kalpa protein, either naturally, or due to the presence of a recombinant vector that expresses SREBP and/or Kalpa. The host cell is then cultured under conditions effective to allow expression of the reporter gene, which expression is measured, and then the cell is contacted with the candidate substance and the new level of reporter gene expression is measured. An increase in reporter gene expression in the presence of the candidate substance is indicative of a candidate substance capable of stimulating SRE-mediated transcription.
Still further embodiments of the invention concern methods to assay for candidate substances capable of stimulating SRE-mediated gene transcription in the presence of sterols. These assays may be employed as a first screen, or a second screen to further analyze the properties of candidate substances which tested positive in earlier Kalpa activity or binding assays. The sterol-responsive cellular screening method involves culturing the host cell in the presence of sterols and then adding the candidate substance, wherein an increase in reporter gene expression is indicative of a substance capable of stimulating SRE-mediated transcription even in the presence of sterols. As a practical matter, it is generally preferred to first measure gene expression without sterols, then to add sterols and to measure the "sterol-suppressed expression" and then to add the candidate substance and to test for increased reporter gene expression relative to the sterol-suppressed expression levels.
Other assays contemplated by the inventors involve the co-introduction of SRE-mediated reporter genes along with Kalpa-encoding genes into host cells. These recombinant host constructs are then employed to screen for agents that will act to modulate expression of the reporter gene and/or SREBP gene. The use of the co-introduction approach may have advantages in terms of sensitivity, allowing a response that is readily detectable by automated detection means, such as by FACS or related technology. Moreover, a co-introduction system allows ready manipulation for the identification of selective agents that act specifically, e.g., through modulation of Kalpa levels and/or through modulation of downstream SREBP and LDL receptor modulating function.
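The three-step readout described above (expression without sterols, sterol-suppressed expression, expression after adding the candidate substance) can be summarized numerically. The following is a minimal Python sketch offered only as an illustration; the function name, the dictionary keys and the two-fold hit threshold are assumptions and are not taken from the specification.

```python
def sterol_screen(basal, sterol_suppressed, with_candidate, min_fold=2.0):
    """Summarize one sample of the sterol-suppression reporter screen.

    basal             -- reporter signal measured without sterols
    sterol_suppressed -- reporter signal after sterols are added
    with_candidate    -- reporter signal after the candidate is added
    min_fold          -- hypothetical threshold for calling a hit (assumption)
    """
    if min(basal, sterol_suppressed, with_candidate) <= 0:
        raise ValueError("reporter signals must be positive")
    # Fraction of the sterol-free signal that survives sterol treatment.
    suppression = sterol_suppressed / basal
    # Increase relative to the sterol-suppressed level, as the text recommends.
    fold = with_candidate / sterol_suppressed
    return {
        "suppression": suppression,
        "fold_over_suppressed": fold,
        "hit": fold >= min_fold,
    }


if __name__ == "__main__":
    # Made-up signals in arbitrary luminescence units.
    print(sterol_screen(1000.0, 200.0, 600.0))
```

Comparing against the sterol-suppressed level rather than the sterol-free level keeps a candidate that merely restores basal expression distinguishable from one that has no effect in the presence of sterols.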
In another aspect, Kalpa may be involved in cleavage of the membrane-bound form of SREBP. Provided are assays comprising contacting a cell with a candidate Kalpa modulator, and detecting S1P- or S2P-mediated cleavage, preferably S1P-mediated cleavage. Preferably cleavage of an SREBP-related substrate is measured. In another aspect, provided are assays comprising contacting a cell with a candidate Kalpa modulator, and detecting S1P recruitment or binding. In other aspects, provided are assays comprising contacting a cell with a candidate Kalpa modulator, and detecting levels of membrane-bound or non-membrane-bound or DNA-bound SREBP protein. In yet further assays, methods comprise contacting a cell with a candidate Kalpa modulator, and detecting levels of or localization of SREBP protein. In another embodiment, a Kalpa inhibitor is assessed by screening for modified Site-1 protease cleavage, the method comprising the steps of:
(a) preparing a solution comprising a candidate modulator, Site-1 protease, and a polypeptide target sequence capable of being cleaved by said Site-1 protease;
(b) monitoring the rate of cleavage of said target sequence in the presence of the candidate modulator relative to the rate of cleavage in the absence of the candidate; and
(c) preparing said modulator. In other embodiments, the solution further comprises SREBP cleavage activating protein.
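Step (b) above compares the rate of cleavage with and without the candidate modulator. A minimal Python sketch of one way to make that comparison is given below, assuming that cleaved product is sampled at the same time points in both reactions and that the early time course is approximately linear; the function names and the least-squares slope estimate are illustrative assumptions, not part of the disclosed method.

```python
def initial_rate(times, product):
    """Least-squares slope of product formed versus time (assumes linearity)."""
    if len(times) != len(product) or len(times) < 2:
        raise ValueError("need at least two paired measurements")
    n = len(times)
    mean_t = sum(times) / n
    mean_p = sum(product) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, product))
    return sxy / sxx


def relative_cleavage_rate(times, with_candidate, without_candidate):
    """Rate with the candidate divided by the rate without it.

    Values below 1 suggest inhibition of cleavage; values above 1 suggest
    stimulation.
    """
    control = initial_rate(times, without_candidate)
    if control == 0:
        raise ValueError("control reaction shows no cleavage")
    return initial_rate(times, with_candidate) / control


if __name__ == "__main__":
    minutes = [0, 10, 20, 30]
    control = [0.0, 20.0, 40.0, 60.0]   # made-up product signal, no candidate
    treated = [0.0, 5.0, 10.0, 15.0]    # made-up product signal, with candidate
    print(relative_cleavage_rate(minutes, treated, control))
```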
In another embodiment, a Kalpa inhibitor is assessed by screening for modified Site-1 protease cleavage, the method comprising the steps of:
(a) providing a cell comprising a transgene encoding a fusion protein between a reporter polypeptide and a polypeptide comprising a Site-1 protease target sequence, wherein said transgene is operably linked to a promoter functional in said cell, and wherein said cell comprises Site-1 protease activity;
(b) contacting the cell with a candidate modulator;
(c) monitoring the activity of said Site-1 protease by detecting said reporter polypeptide; and
(d) preparing said modulator.
Optionally, said methods may comprise monitoring Site-2 protease cleavage instead or in addition to Site-1 protease cleavage.
As discussed, Kalpa may interact with SCAP and may be involved in mediating SCAP-SREBP interactions. The methods of the invention thus further include assays that comprise contacting a cell with a candidate Kalpa modulator, and detecting binding of Kalpa to a SCAP protein (e.g. levels of), or cellular localization of SCAP protein. In further embodiments, the invention provides a method comprising contacting a cell with a candidate Kalpa modulator, and detecting a SCAP-SREBP complex.
The invention also provides cells overexpressing Kalpa, or lacking or having diminished Kalpa activity. Cells lacking Kalpa activity or having diminished Kalpa activity (e.g. 'knockdown', generated using mRNA interference methods for example) would provide a valuable means for studying the LDL receptor and SREBP pathway. In other aspects, the in vivo effects of Kalpa on LDL and cholesterol metabolism can be studied using animals that transiently overexpress hepatic Kalpa as a result of infection by intravenous infusion with a recombinant, replication-defective adenovirus. In the adenovirus, the Kalpa cDNA is under the control of the cytomegalovirus (CMV) immediate early enhancer/promoter. Controls can include mice infected with a replication-defective adenovirus lacking a cDNA transgene. Kalpa expression can be determined by immunofluorescence microscopy and by immunoblotting. Kalpa expression can then be monitored in the livers of adenovirus-treated animals over the course of several days. Levels and localization of the expression in the liver can then be assessed.
The effects of hepatic Kalpa overexpression on plasma cholesterol levels can be determined using known methods. Total cholesterol and plasma cholesterol can be measured. Fast protein liquid chromatography (FPLC) analysis of plasma can be performed to determine specifically the effects of hepatic Kalpa overexpression on the different classes of lipoproteins. Methods include determining the portion of cholesterol contained in the HDL, VLDL and LDL fractions.
In another example, an LDL clearance study is performed. Mice are infused with either the control virus or with Kalpa adenovirus. Several days following virus infusion, when transgene expression levels are maximal, mice are infused with 125I-labeled LDL. Plasma samples are obtained at various time points, and the amount of 125I remaining in the plasma is determined.
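The clearance data described above (125I remaining in plasma at successive time points) can be reduced to a fraction-remaining curve and an apparent half-life. The Python sketch below is illustrative only: it assumes a single-exponential decay, which is a simplification that real plasma clearance curves may not satisfy, and the function names are hypothetical.

```python
import math


def fraction_remaining(counts):
    """Express each plasma count as a fraction of the first sample."""
    if not counts or counts[0] <= 0:
        raise ValueError("first count must be positive")
    return [c / counts[0] for c in counts]


def apparent_half_life(times, counts):
    """Half-life from a log-linear least-squares fit.

    Assumes counts = A * exp(-k * t); this single-exponential model is an
    assumption made for illustration.
    """
    if len(times) != len(counts) or len(times) < 2:
        raise ValueError("need at least two paired measurements")
    logs = [math.log(c) for c in counts]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    slope = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs)) / sxx
    if slope >= 0:
        raise ValueError("counts do not decrease over time")
    return math.log(2) / -slope


if __name__ == "__main__":
    hours = [0, 2, 4, 8]
    # Made-up counts that halve every 4 hours.
    counts = [1000.0 * 0.5 ** (t / 4) for t in hours]
    print(fraction_remaining(counts))
    print(apparent_half_life(hours, counts))
```

Under this model, a shorter apparent half-life in animals receiving the Kalpa adenovirus than in animals receiving the control virus would be consistent with faster LDL clearance.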
Active Compounds
This invention further pertains to novel agents identified by the above-described screening assays and to processes for producing such agents by use of these assays. Accordingly, in one embodiment, the present invention includes a compound or agent obtainable by a method comprising the steps of any one of the aforementioned screening assays (e.g., cell-based assays or cell-free assays). For example, in one embodiment, the invention includes a compound or agent obtainable by a method comprising contacting a cell which expresses a Kalpa target molecule with a test compound and determining the ability of the test compound to bind to, or modulate the activity of, the Kalpa target molecule. In another embodiment, the invention includes a compound or agent obtainable by a method comprising contacting a cell which expresses a Kalpa target molecule with a Kalpa or biologically active portion thereof, to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with, or modulate the activity of, the Kalpa target molecule. In another embodiment, the invention includes a compound or agent obtainable by a method comprising contacting a Kalpa or biologically active portion thereof with a test compound and determining the ability of the test compound to bind to, or modulate (e.g., stimulate or inhibit) the activity of, the Kalpa or biologically active portion thereof. In yet another embodiment, the present invention includes a compound or agent obtainable by a method comprising contacting a Kalpa or biologically active portion thereof with a known compound which binds the Kalpa to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with, or modulate the activity of, the Kalpa. Accordingly, it is within the scope of this invention to further use an agent identified as described herein in an appropriate animal model.
For example, an agent identified as described herein (e.g., a Kalpa modulating agent, an antisense Kalpa nucleic acid molecule, a Kalpa-specific antibody, or a Kalpa-binding partner) can be used in an animal model to determine the efficacy, toxicity, or side effects of treatment with such an agent. Alternatively, an agent identified as described herein can be used in an animal model to determine the mechanism of action of such an agent. Furthermore, this invention pertains to uses of novel agents identified by the above-described screening assays for treatments as described herein.
The present invention also pertains to uses of novel agents identified by the above-described screening assays for diagnoses, prognoses, and treatments as described herein. Accordingly, it is within the scope of the present invention to use such agents in the design, formulation, synthesis, manufacture, and/or production of a drug or pharmaceutical composition for use in diagnosis, prognosis, or treatment, as described herein. For example, in one embodiment, the present invention includes a method of synthesizing or producing a drug or pharmaceutical composition by reference to the structure and/or properties of a compound obtainable by one of the above-described screening assays. For example, a drug or pharmaceutical composition can be synthesized based on the structure and/or properties of a compound obtained by a method in which a cell which expresses a Kalpa target molecule is contacted with a test compound and the ability of the test compound to bind to, or modulate the activity of, the Kalpa target molecule is determined. In another exemplary embodiment, the present invention includes a method of synthesizing or producing a drug or pharmaceutical composition based on the structure and/or properties of a compound obtainable by a method in which a Kalpa or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to, or modulate (e.g., stimulate or inhibit) the activity of, the Kalpa or biologically active portion thereof is determined.
Methods Of Treatment using Kalpa Modulators
Another aspect of the invention pertains to methods of modulating Kalpa expression or activity for therapeutic and/or prophylactic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell or Kalpa protein with an agent that modulates one or more of the activities of Kalpa. An agent that modulates Kalpa protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of the candidate protein (e.g., a phosphorylation or cleavage substrate), an antibody, an agonist or antagonist, a peptidomimetic of an agonist or antagonist, or other small molecule.
In one embodiment, the agent stimulates one or more Kalpa activities. Examples of such stimulatory agents include active Kalpa protein and a nucleic acid molecule encoding the candidate gene that has been introduced into the cell. In another embodiment, the agent inhibits one or more of the candidate activities. Examples of such inhibitory agents include antisense nucleic acid molecules, and antibodies. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant expression or activity of the candidate protein or nucleic acid molecule.
In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., upregulates or downregulates) Kalpa expression or activity. In another embodiment, the method involves administering the Kalpa protein or nucleic acid molecule as therapy to compensate for reduced or aberrant Kalpa expression or activity. Stimulation of Kalpa activity is desirable in situations in which Kalpa is abnormally downregulated and/or in which increased activity is likely to have a beneficial effect. Likewise, inhibition of Kalpa activity is desirable in situations in which Kalpa is abnormally upregulated and/or in which decreased activity is likely to have a beneficial effect.
The candidate molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on Kalpa activity (e.g. gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) disorders (e.g., cholesterol disorders such as hypercholesterolemia) associated with aberrant activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a candidate molecule or modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a candidate molecule or modulator. Kalpa modulators, including inhibitors and activators, identified according to the methods in the section titled "Drug Screening Assays" can be further tested for their ability to ameliorate a disorder related to cholesterol regulation in a suitable animal model of disease.
Kalpa modulators, preferably modulators (e.g. inhibitors or activators) of Kalpa activity will be useful in the treatment of cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, triglycerides regulation and/or LDL/HDL ratio regulation. Said modulators will preferably also be useful in modulating expression of the LDL receptor. Accordingly, the present invention provides a method for the prophylaxis and/or treatment of a disorder related to cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, triglycerides regulation, LDL/HDL ratio regulation and/or LDL receptor expression or activity. The present invention provides a method for the prophylaxis and/or treatment of a disorder including for example hyperlipidaemia, hypercholesterolaemia, hypertriglyceridaemia, heart disease, coronary artery disease, myocardial infarct, and lipid-related metabolic disorders such as the dysmetabolic syndrome, obesity and type II diabetes. It will be appreciated that any suitable means of inhibiting or activating Kalpa activity can be used. Described herein for example are small molecule compounds, active recombinant Kalpa and dominant negative Kalpa, antibodies specifically binding Kalpa, Kalpa antisense molecules and Kalpa-encoding gene therapy vectors. In a preferred embodiment, a disorder to be treated is characterized by aberrant protein or nucleic acid expression and is a cellular, metabolic-related disorder, e.g., a cholesterol-related disorder or cholesterol homeostasis disorder.
In further aspects, the invention also provides methods for the prophylaxis and/or treatment of a disorder comprising modulating (e.g. activating or inhibiting) the activity of the ubiquitin-proteasome pathway which metabolically regulates intracellular degradation of apoB, or by modulating the expression of the LDLR gene which regulates the uptake of LDL.
Type 2 diabetes mellitus
Type 2 diabetes mellitus is an increasingly common disorder of carbohydrate and lipid metabolism. Type 2 diabetes is now a major global health problem that affects over 124 million individuals worldwide. In the United States, type 2 diabetes affects 90% of the 15.6 million persons with diabetes, of whom approximately one half remain undiagnosed. In addition, type 2 diabetes, which is normally associated with older adults, is becoming more common in children and adolescents. 800,000 new cases are identified each year. Patients with non-insulin-dependent diabetes have an increased incidence of ischaemic heart disease (IHD) when compared with nondiabetic subjects. In addition, they have a worse prognosis after their first myocardial infarction (MI). Abnormalities in insulin and glucose metabolism do not seem to entirely account for the high frequency of cardiovascular disease in patients with type 2 diabetes mellitus. An important additional factor may be dyslipidemia. There are a variety of environmental and genetic factors that seem to mediate the development of type 2 diabetes.
Dysmetabolic syndrome
The metabolic syndrome or syndrome X is characterized by the association of various cardiovascular risk factors (among which impaired glucose tolerance, arterial hypertension and dyslipidaemias), all closely linked to insulin resistance, which is indeed the core of the syndrome. The first unifying definition for the metabolic syndrome was proposed by WHO in 1998. According to this definition, patients with type 2 diabetes mellitus or impaired glucose tolerance have the syndrome if they fulfill two of the criteria: hypertension, dyslipidaemia, obesity/abdominal obesity and microalbuminuria. Importantly, presence of the dysmetabolic syndrome is associated with reduced survival, particularly because of increased cardiovascular mortality.
The metabolic syndrome most likely results from interplay between several genes and an affluent environment. A few candidate genes, encoding proteins of glucose, insulin and lipid metabolism, lipolytic cascade, fatty acid intestinal absorption, glucocorticoid metabolism, haemostasis and blood pressure, have been associated with a clustering of metabolic abnormalities, although the functional significance of these associations remains to be established. Furthermore, genetic polymorphisms, such as those detected at several lipoprotein metabolism loci, can modulate the relationships between different components of the metabolic syndrome. A growing understanding of the genetic architecture of the metabolic syndrome may help in the prevention of this condition.
Obesity
Obesity can be defined as BMI > 95th percentile for age and sex from large surveys that were carried out in the past. Using these cut points, over 10% of all children and adolescents are obese, and another 10% are overweight (BMI > 85th percentile). Obesity is a risk factor for many chronic diseases, including glucose intolerance, lipid disorders, hypertension, and coronary heart disease. Increased BMI results from a cumulative positive energy balance and is favored by both genetic and environmental factors. The prevalence of obesity increases rapidly in developed and developing countries, emphasizing the urgent need to identify new pharmacological and dietary intervention points to reduce excess body fat.
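The percentile cut points stated above can be written out as a small worked example. The Python sketch below is illustrative only: BMI is computed with the standard formula (weight in kilograms divided by the square of height in metres), while the age- and sex-specific percentile is taken as an input because the reference survey tables are not reproduced here. The function names and category labels are assumptions.

```python
def body_mass_index(weight_kg, height_m):
    """Standard BMI: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def classify_by_percentile(bmi_percentile):
    """Apply the cut points given in the text to an age/sex BMI percentile.

    The percentile itself must come from a reference survey table, which is
    outside the scope of this sketch.
    """
    if not 0 <= bmi_percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    if bmi_percentile > 95:
        return "obese"
    if bmi_percentile > 85:
        return "overweight"
    return "not overweight"


if __name__ == "__main__":
    print(round(body_mass_index(70.0, 1.75), 2))
    print(classify_by_percentile(97))
```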
An "individual" treated by the methods of this invention is a vertebrate, particularly a mammal (including model animals of human disease, farm animals, sport animals, and pets), and typically a human. "Treatment" refers to clinical intervention in an attempt to alter the natural course of the individual being treated, and may be performed either for prophylaxis or during the course of clinical pathology. Desirable effects include preventing occurrence or recurrence of disease, alleviation of symptoms, diminishment of any direct or indirect pathological consequences of the disease, such as hyperresponsiveness, inflammation, or necrosis, lowering the rate of disease progression, amelioration or palliation of the disease state, and remission or improved prognosis. The "pathology" associated with a disease condition is anything that compromises the well-being, normal physiology, or quality of life of the affected individual.
Treatment is performed by administering an effective amount of a Kalpa modulator, preferably a Kalpa inhibitor or activator. An "effective amount" is an amount sufficient to effect a beneficial or desired clinical result, and can be administered in one or more doses.
The criteria for assessing response to therapeutic modalities employing the lipid compositions of this invention are dictated by the specific condition, measured according to standard medical procedures appropriate for the condition.
Pharmaceutical Compositions
Compounds capable of activating or inhibiting Kalpa activity, preferably small molecules but also including peptides, Kalpa nucleic acid molecules, Kalpa, and anti-Kalpa antibodies (also referred to herein as "active compounds") of the invention can be incorporated into pharmaceutical compositions suitable for administration. Such compositions typically comprise a pharmaceutically acceptable carrier. As used herein the language "pharmaceutically acceptable carrier" is intended to include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. The use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active compound, use thereof in the compositions is contemplated. Supplementary active compounds can also be incorporated into the compositions.
A pharmaceutical composition of the invention is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.
Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyetheylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as manitol, sorbitol, sodium chloride in the composition. Prolonged absoφtion of the injectable compositions can be brought about by including in the composition an agent which delays absoφtion, for example, aluminum monostearate and gelatin. Where the active compound is a protein, peptide or anti-Kalpa antibody, sterile injectable solutions can be prepared by incoφorating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. 
Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof. Oral compositions generally include an inert diluent or an edible carrier. They can be enclosed in gelatin capsules or compressed into tablets. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules. For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer. Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art. Most preferably, the active compound is delivered to a subject by intravenous injection.
In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811, the disclosure of which is incorporated herein by reference in its entirety. It is especially advantageous to formulate oral or preferably parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier. The specification for the dosage unit forms of the invention are dictated by and directly dependent on the unique characteristics of the active compound and the particular therapeutic effect to be achieved, and the limitations inherent in the art of compounding such an active compound for the treatment of individuals.
Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit large therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
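The therapeutic index defined above is a simple ratio. The following minimal Python sketch illustrates the calculation; the function name and the dose values are hypothetical and are given for illustration only, not taken from any experiment described herein.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50.

    Both doses must be positive and expressed in the same units
    (for example mg/kg); the ratio itself is unitless.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses (mg/kg), chosen only to illustrate the ratio.
print(therapeutic_index(ld50=500.0, ed50=25.0))  # prints 20.0
```

A larger value indicates a wider margin between the therapeutically effective dose and the lethal dose.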
The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography. The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.
Diagnostic and Prognostic Uses
The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics; and in drug screening and methods of treatment (e.g., therapeutic and prophylactic) as further described herein. Activities of the various Kalpa are as described herein. The isolated nucleic acid molecules of the invention can be used, for example, to express Kalpa (e.g., via a recombinant expression vector in a host cell or in gene therapy applications), to detect Kalpa mRNA (e.g., in a biological sample) or a genetic alteration in a Kalpa gene, and to modulate Kalpa activity, as described further below. The Kalpa can be used to treat disorders characterized by insufficient or excessive production of a Kalpa or Kalpa target molecules. In addition, the Kalpa can be used to screen for naturally occurring Kalpa target molecules, to screen for drugs or compounds which modulate, preferably inhibit Kalpa activity, as well as to treat disorders characterized by insufficient or excessive production of Kalpa or production of Kalpa forms which have decreased or aberrant activity compared to Kalpa wild type protein. Moreover, the anti-Kalpa antibodies of the invention can be used to detect and isolate Kalpa, regulate the bioavailability of Kalpa, and modulate Kalpa activity.
Disorders in which the diagnostic and prognostic method may be useful include disorders related to cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, triglycerides regulation, LDL/HDL ratio regulation, and LDL receptor expression or activity. The present invention provides a method for the prophylaxis and/or treatment of a disorder including for example hyperlipidaemia, hypercholesterolaemia, hypertriglyceridaemia, heart disease, coronary artery disease, myocardial infarct, and lipid related metabolic disorders such as the dysmetabolic syndrome, Obesity and Diabetes type II.
Accordingly one embodiment of the present invention involves a method of use (e.g., a diagnostic assay, prognostic assay, or a prophylactic/therapeutic method of treatment) wherein a molecule of the present invention (e.g., a Kalpa polypeptide, Kalpa nucleic acid, or most preferably a Kalpa inhibitor or activator) is used, for example, to diagnose, prognose and/or treat a disease and/or condition in which any of the aforementioned Kalpa activities is indicated. In another embodiment, the present invention involves a method of use (e.g., a diagnostic assay, prognostic assay, or a prophylactic/therapeutic method of treatment) wherein a molecule of the present invention (e.g., a Kalpa polypeptide, Kalpa nucleic acid, or a Kalpa inhibitor or activator) is used, for example, for the diagnosis, prognosis, and/or treatment of subjects, preferably a human subject, in which any of the aforementioned activities is pathologically perturbed. In a preferred embodiment, the methods of use (e.g., diagnostic assays, prognostic assays, or prophylactic/therapeutic methods of treatment) involve administering to a subject, preferably a human subject, a molecule of the present invention (e.g., a Kalpa polypeptide, Kalpa nucleic acid, or a Kalpa inhibitor or activator) for the diagnosis, prognosis, and/or therapeutic treatment. In another embodiment, the methods of use (e.g., diagnostic assays, prognostic assays, or prophylactic/therapeutic methods of treatment) involve administering to a human subject a molecule of the present invention (e.g., a Kalpa polypeptide, Kalpa nucleic acid, or a Kalpa inhibitor or activator).
For example, the invention encompasses a method of determining whether a Kalpa polypeptide is expressed within a biological sample comprising: a) contacting said biological sample with: ii) a polynucleotide that hybridizes under stringent conditions to a Kalpa nucleic acid; or iii) a detectable polypeptide that selectively binds to a Kalpa polypeptide; and b) detecting the presence or absence of hybridization between said polynucleotide and an RNA species within said sample, or the presence or absence of binding of said detectable polypeptide to a polypeptide within said sample. A detection of said hybridization or of said binding indicates that said Kalpa is expressed within said sample. Preferably, the polynucleotide is a primer, and wherein said hybridization is detected by detecting the presence of an amplification product comprising said primer sequence, or the detectable polypeptide is an antibody.
Also envisioned is a method of determining whether a mammal, preferably human, has an elevated or reduced level of Kalpa expression, comprising: a) providing a biological sample from said mammal; and b) comparing the amount of a Kalpa polypeptide or of a Kalpa RNA species encoding a Kalpa polypeptide within said biological sample with a level detected in or expected from a control sample. An increased amount of said Kalpa polypeptide or said Kalpa RNA species within said biological sample compared to said level detected in or expected from said control sample indicates that said mammal has an elevated level of Kalpa expression, and wherein a decreased amount of said Kalpa polypeptide or said Kalpa RNA species within said biological sample compared to said level detected in or expected from said control sample indicates that said mammal has a reduced level of Kalpa expression.
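The comparison in step b) can be sketched as follows. This is an illustrative Python example only: the function name, the `tolerance` parameter (a band around the control level within which no change is called) and the numeric levels are assumptions that do not appear in the method as described.

```python
def classify_expression(sample_level: float, control_level: float,
                        tolerance: float = 0.0) -> str:
    """Compare a sample's Kalpa level with a control level.

    `tolerance` is a hypothetical fractional band around the control;
    differences inside the band are reported as "unchanged".
    """
    if control_level <= 0:
        raise ValueError("control level must be positive")
    ratio = sample_level / control_level
    if ratio > 1 + tolerance:
        return "elevated"
    if ratio < 1 - tolerance:
        return "reduced"
    return "unchanged"


# Hypothetical measurements in arbitrary units.
print(classify_expression(150.0, 100.0, tolerance=0.2))  # prints elevated
print(classify_expression(60.0, 100.0, tolerance=0.2))   # prints reduced
```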
Gene Regions Associated with Genetic Disease: Differences in the DNA sequences between individuals affected and unaffected with a disease associated with the Kalpa gene can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease or phenotype (e.g. low LDL level phenotype). Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.
Predictive Medicine:
The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual prophylactically. Accordingly, one aspect of the present invention relates to diagnostic assays for determining Kalpa protein and/or nucleic acid expression as well as activity, in the context of a biological sample (e.g., blood, serum, cells, tissue) to thereby determine whether an individual is afflicted with a disease or disorder, or is at risk of developing a disorder, associated with aberrant expression or activity. The invention also provides for prognostic (or predictive) assays for determining whether an individual is at risk of developing a disorder associated with a Kalpa protein, nucleic acid expression or activity. For example, mutations in the Kalpa gene can be assayed in a biological sample. Such assays can be used for prognostic or predictive purpose to thereby prophylactically treat an individual prior to the onset of a disorder characterized by or associated with the Kalpa protein, nucleic acid expression or activity.
Diagnostic Assays
An exemplary method for detecting the presence or absence of the Kalpa protein or nucleic acid in a biological sample involves obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting Kalpa protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes Kalpa protein such that the presence of Kalpa protein or nucleic acid is detected in the biological sample. A preferred agent for detecting Kalpa mRNA or genomic DNA is a labeled nucleic acid probe capable of hybridizing to Kalpa mRNA or genomic DNA. The nucleic acid probe can be, for example, a human nucleic acid, or a portion thereof, such as an oligonucleotide of at least 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to Kalpa mRNA or genomic DNA. Other suitable probes for use in the diagnostic assays of the invention are described herein.
A preferred agent for detecting the Kalpa protein is an antibody capable of binding to the Kalpa protein, preferably an antibody with a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab')2) can be used. The term "labeled", with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with another reagent that is directly labeled. Examples of indirect labeling include detection of a primary antibody using a fluorescently labeled secondary antibody and end-labeling of a DNA probe with biotin such that it can be detected with fluorescently labeled streptavidin. The term "biological sample" is intended to include tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. That is, the detection method of the invention can be used to detect candidate mRNA, protein, or genomic DNA in a biological sample in vitro as well as in vivo. For example, in vitro techniques for detection of candidate mRNA include Northern hybridizations and in situ hybridizations. In vitro techniques for detection of the candidate protein include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence. In vitro techniques for detection of candidate genomic DNA include Southern hybridizations. Furthermore, in vivo techniques for detection of the Kalpa protein include introducing into a subject a labeled anti-Kalpa antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In one embodiment, the biological sample contains protein molecules from the test subject.
Alternatively, the biological sample can contain mRNA molecules from the test subject or genomic DNA molecules from the test subject. A preferred biological sample is a serum sample isolated by conventional means from a subject.
In another embodiment, the methods further involve obtaining a control biological sample from a control subject, contacting the control sample with a compound or agent capable of detecting the Kalpa protein, mRNA, or genomic DNA, such that the presence of Kalpa protein, mRNA or genomic DNA is detected in the biological sample, and comparing the presence of Kalpa protein, mRNA or genomic DNA in the control sample with the presence of Kalpa protein, mRNA or genomic DNA in the test sample. The invention also encompasses kits for detecting the presence of the Kalpa protein, mRNA, or genomic DNA in a biological sample. For example, the kit can comprise a labeled compound or agent capable of detecting Kalpa protein or mRNA in a biological sample; means for determining the amount of Kalpa protein or mRNA in the sample; and means for comparing the amount of Kalpa protein, mRNA, or genomic DNA in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect Kalpa protein or nucleic acid.
The diagnostic methods described herein can furthermore be utilized to identify subjects having or at risk of developing a disease, disorder or trait associated with aberrant expression or activity of the Kalpa gene or protein. For example, the assays described herein, such as the preceding diagnostic assays or the following assays, can be utilized to identify a subject having or at risk of developing a disorder associated with the Kalpa protein, nucleic acid expression or activity. Thus, the present invention provides a method for identifying a disease or disorder associated with aberrant Kalpa expression or activity in which a test sample is obtained from a subject and Kalpa protein or nucleic acid (e.g., mRNA, genomic DNA) is detected, wherein the presence of Kalpa protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant expression or activity. As used herein, a "test sample" refers to a biological sample obtained from a subject of interest. For example, a test sample can be a biological fluid (e.g., serum), cell sample, or tissue.
Furthermore, the prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant Kalpa expression or activity. Thus, the present invention provides methods for determining whether a subject can be effectively treated with an agent for a disorder associated with aberrant Kalpa expression or activity in which a test sample is obtained and Kalpa protein or nucleic acid expression or activity is detected.
The methods of the invention can also be used to detect genetic alterations in the Kalpa gene, thereby determining if a subject with the altered gene is at risk for a disorder associated with the Kalpa gene. In preferred embodiments, the methods include detecting, in a sample of cells from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a Kalpa protein, or the mis-expression of the Kalpa gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from the Kalpa gene; 2) an addition of one or more nucleotides to the Kalpa gene; 3) a substitution of one or more nucleotides of the Kalpa gene; 4) a chromosomal rearrangement of the Kalpa gene; 5) an alteration in the level of a messenger RNA transcript of the Kalpa gene; 6) aberrant modification of the Kalpa gene, such as of the methylation pattern of the genomic DNA; 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of the Kalpa gene; 8) a non-wild type level of the Kalpa protein; 9) allelic loss of the Kalpa gene; and 10) inappropriate post-translational modification of the Kalpa protein. As described herein, there are a large number of assay techniques known in the art, which can be used for detecting alterations in the Kalpa gene. A preferred biological sample is a tissue or serum sample isolated by conventional means from a subject, e.g., a liver tissue sample. In certain embodiments, detection of the alteration involves the use of a probe/primer in a polymerase chain reaction (PCR), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., Landegren et al. (1988) Science 241:1077-1080; and Nakazawa et al. (1994) Proc. Natl. Acad. Sci. USA 91:360-364), the latter of which can be particularly useful for detecting point mutations in the Kalpa gene (see Abravaya et al. (1995) Nucleic Acids Res. 23:675-682). This method can include the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to the Kalpa gene under conditions such that hybridization and amplification of the Kalpa gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein.
In an alternative embodiment, mutations in the Kalpa gene from a sample cell can be identified by alterations in restriction enzyme cleavage patterns. For example, sample and control DNA is isolated, amplified (optionally), digested with one or more restriction endonucleases, and fragment length sizes are determined by gel electrophoresis and compared. Differences in fragment length sizes between sample and control DNA indicates mutations in the sample DNA. Moreover, the use of sequence specific ribozymes can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
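The fragment-length comparison described above can be illustrated in silico. The following Python sketch is a simplified model only: the sequences are invented, the `digest` helper is hypothetical, and the recognition site GAATTC with a cut one base into the site is used merely as a familiar example of a restriction endonuclease site.

```python
from typing import List


def digest(sequence: str, site: str, cut_offset: int) -> List[int]:
    """Return sorted fragment lengths after cutting a linear sequence.

    The sequence is cut `cut_offset` bases into every occurrence of
    `site`; zero-length fragments are skipped.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        if cut > start:
            fragments.append(cut - start)
            start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    return sorted(fragments)


# Invented sequences: the sample carries a substitution (A to G) that
# destroys the second recognition site present in the control.
control = "AAGAATTCTTTTGAATTCCC"
sample = "AAGAATTCTTTTGAGTTCCC"

print(digest(control, "GAATTC", 1))  # prints [3, 7, 10]
print(digest(sample, "GAATTC", 1))   # prints [3, 17]
```

In this toy example the loss of a cleavage site merges two control fragments into one longer fragment, which is the kind of difference that gel electrophoresis would reveal.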
In other embodiments, genetic mutations in the Kalpa gene can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to high density arrays containing hundreds or thousands of oligonucleotide probes (Cronin, M.T. et al. (1996) Human Mutation 7: 244-255; Kozal, M.J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in the Kalpa gene can be identified in two dimensional arrays containing light-generated DNA probes as described in Cronin, M.T. et al. supra.
In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the Kalpa gene and detect mutations by comparing the sequence of the sample with the corresponding wild-type (control) sequence. Examples of sequencing reactions include those based on techniques developed by Maxam and Gilbert ((1977) Proc. Natl. Acad. Sci. USA 74:560) or Sanger ((1977) Proc. Natl. Acad. Sci. USA 74:5463). It is also contemplated that any of a variety of automated sequencing procedures can be utilized when performing the diagnostic assays.
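Once a sample sequence has been obtained, the comparison with the wild-type sequence can be sketched as below. This Python example is illustrative only: it assumes the two sequences are already aligned and of equal length, it reports substitutions but not insertions or deletions, and the function name and sequences are hypothetical.

```python
from typing import List, Tuple


def find_substitutions(sample: str, wild_type: str) -> List[Tuple[int, str, str]]:
    """List (position, wild-type base, sample base) for each mismatch.

    Positions are 1-based. The sequences are assumed to be pre-aligned
    and of equal length.
    """
    if len(sample) != len(wild_type):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, w, s)
        for i, (w, s) in enumerate(zip(wild_type, sample))
        if w != s
    ]


# Invented sequences for illustration.
print(find_substitutions(sample="ATGCCT", wild_type="ATGACT"))
# prints [(4, 'A', 'C')]
```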
The methods described herein may be performed, for example, by utilizing prepackaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving the candidate gene. Furthermore, any cell type or tissue in which the Kalpa gene is expressed may be utilized in the prognostic assays described herein. Expression of the Kalpa gene is further discussed in the examples below.
Monitoring of Effects During Clinical Trials
Monitoring the influence of agents (e.g., drugs or compounds) on the expression or activity of the candidate protein can be applied not only in basic drug screening, but also in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase candidate gene expression, protein levels, or upregulate activity, can be monitored in clinical trials of subjects exhibiting decreased gene expression, protein levels, or downregulated activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease Kalpa gene expression, protein levels, or downregulate activity, can be monitored in clinical trials of subjects exhibiting increased candidate gene expression, protein levels, or upregulated activity. In such clinical trials, the expression or activity of the candidate gene, and preferably, other genes that have been implicated in a disorder can be used as a "read out" or markers of the phenotype of a particular cell. In a preferred aspect, LDLR expression is used as a marker of phenotype, wherein a compound preferably increases LDLR expression.
For example, to study the effect of agents on a cholesterol disorder associated with Kalpa, in a clinical trial, cells can be isolated and RNA prepared and analyzed for the levels of expression of Kalpa and other genes implicated such as preferably LDLR in the associated disorder, respectively. The levels of gene expression (i.e., a gene expression pattern) can be quantified by Northern blot analysis or RT-PCR, or alternatively by measuring the amount of protein produced, or by measuring the levels of activity of the candidate gene or other genes. In this way, the gene expression pattern can serve as a marker, indicative of the physiological response of the cells to the agent. Accordingly, this response state may be determined before, and at various points during treatment of the individual with the agent.
In a preferred embodiment, the present invention provides a method for monitoring the effectiveness of treatment of a subject with an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate identified by the screening assays described herein) comprising the steps of (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression of the candidate protein, mRNA, or genomic DNA in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression or activity of the protein, mRNA, or genomic DNA in the post-administration samples; (v) comparing the level of expression or activity of the protein, mRNA, or genomic DNA in the pre-administration sample with the protein, mRNA, or genomic DNA in the post-administration sample or samples; and (vi) altering the administration of the agent to the subject accordingly. For example, increased administration of the agent may be desirable to increase the expression or activity of the Kalpa to higher levels than detected, i.e., to increase the effectiveness of the agent. Alternatively, decreased administration of the agent may be desirable to decrease expression or activity of Kalpa to lower levels than detected, i.e. to decrease the effectiveness of the agent. According to such an embodiment, expression or activity may be used as an indicator of the effectiveness of an agent, even in the absence of an observable phenotypic response.
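The comparison in step (v) can be sketched as a fold change of each post-administration level relative to the pre-administration level. The Python example below is an illustration under stated assumptions: the function name and the expression values are hypothetical, and the decision in step (vi) is left to the practitioner rather than encoded.

```python
from typing import List, Sequence


def fold_changes(pre_level: float, post_levels: Sequence[float]) -> List[float]:
    """Return each post-administration level divided by the
    pre-administration level (values above 1 indicate an increase)."""
    if pre_level <= 0:
        raise ValueError("pre-administration level must be positive")
    return [post / pre_level for post in post_levels]


# Hypothetical expression levels in arbitrary units: one
# pre-administration sample and three post-administration samples.
print(fold_changes(2.0, [2.0, 3.0, 5.0]))  # prints [1.0, 1.5, 2.5]
```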
Polymorphisms of the Invention in Methods of Genetic Diagnostics
The polymorphisms of the present invention can also be used to develop diagnostics tests capable of identifying individuals who express a detectable trait as the result of a specific genotype or individuals whose genotype places them at risk of developing a detectable trait at a subsequent time.
The trait analyzed using the present diagnostics may be any detectable trait, including a disease involving cholesterol regulation, a response to an agent acting on cholesterol regulation or side effects to an agent acting on cholesterol regulation.
The diagnostic techniques of the present invention may employ a variety of methodologies to determine whether a test subject has a polymorphism pattern associated with an increased risk of developing a detectable trait or whether the individual suffers from a detectable trait as a result of a particular mutation, including methods which enable the analysis of individual chromosomes for haplotyping, such as family studies, single sperm DNA analysis or somatic hybrids.
The present invention provides diagnostic methods to determine whether an individual is at risk of developing a disease or suffers from a disease resulting from a mutation or a polymorphism in a candidate gene of the present invention. The present invention also provides methods to determine whether an individual is likely to respond positively to an agent acting on cholesterol-related disorder or whether an individual is at risk of developing an adverse side effect to an agent acting on cholesterol-related disorder.
These methods involve obtaining a nucleic acid sample from the individual and determining whether the nucleic acid sample contains at least one allele or at least one polymorphism indicative of a risk of developing the trait or indicative that the individual expresses the trait as a result of possessing a particular candidate gene polymorphism or mutation (trait-causing allele).
Preferably, in such diagnostic methods, a nucleic acid sample is obtained from the individual and this sample is genotyped using methods described above in "Methods of Genotyping an Individual for Polymorphisms". The diagnostics may be based on a single polymorphism or on a group of polymorphisms.
In each of these methods, a nucleic acid sample is obtained from the test subject and the polymorphism pattern of one or more of the polymorphisms (for example a polymorphism of SEQ ID Nos 1, 2 or 3 or 5 to 23, or a polymorphism listed in Table 3) is determined.
In one embodiment, a PCR amplification is conducted on the nucleic acid sample to amplify regions in which polymorphisms associated with a detectable phenotype have been identified. The amplification products are sequenced to determine whether the individual possesses one or more polymorphisms associated with a detectable phenotype. A preferred set of primers includes primers derived from the sequences described in SEQ ID Nos. 5 to 23. Alternatively, the nucleic acid sample is subjected to microsequencing reactions as described above to determine whether the individual possesses one or more polymorphisms associated with a detectable phenotype resulting from a mutation or a polymorphism in a candidate gene. In another embodiment, the nucleic acid sample is contacted with one or more allele specific oligonucleotide probes which specifically hybridize to one or more candidate gene alleles associated with a detectable phenotype.
The present invention provides methods of determining whether an individual is at risk of developing cholesterol-related disorder or whether said individual suffers from a cholesterol- related disorder, comprising: a) genotyping said individual for at least one Kalpa-related polymoφhism; and b) correlating the result of step a) with a risk of developing cholesterol- related disorder. In a preferred embodiment, said Kalpa-related polymoφhism is selected from the group consisting of polymorphisms of a Kalpa gene. Preferably, said Kalpa-related polymorphism is selected from the polymorphisms described in Table 3. In large part because of the risk of complications such as atherosclerosis, the detection of susceptibility to cholesterol-related disorders in individuals is very important. Consequently, the invention concerns a method for the treatment of cholesterol-related disorders, or a related disorder comprising the following steps: - selecting an individual whose DNA comprises alleles of a Kalpa-related marker or group of polymorphisms associated with a cholesterol-related disorder;
- following up said individual for the appearance (and optionally the development) of the symptoms related to cholesterol-related disorder; and
- administering a treatment acting against cholesterol-related disorder or against symptoms thereof to said individual at an appropriate stage of the disease.
Another embodiment of the present invention comprises a method for the treatment of a cholesterol-related disorder comprising the following steps:
- selecting an individual whose DNA comprises alleles of a polymorphism or of a group of Kalpa-related polymorphisms associated with a cholesterol-related disorder; and
- administering a preventive treatment of said cholesterol-related disorder to said individual.
In a further embodiment, the present invention concerns a method for the treatment of a cholesterol-related disorder comprising the following steps:
- selecting an individual whose DNA comprises alleles of a polymorphism or of a group of Kalpa-related polymorphisms associated with a cholesterol-related disorder;
- administering a preventive treatment of the cholesterol-related disorder to said individual;
- following up said individual for the appearance and the development of symptoms of the cholesterol-related disorder; and optionally
- administering a treatment acting against said cholesterol-related disorder or against symptoms thereof to said individual at the appropriate stage of the disease.
For use in the determination of the course of treatment of an individual suffering from disease, the present invention also concerns a method for the treatment of a cholesterol-related disorder comprising the following steps:
- selecting an individual whose DNA comprises alleles of a polymorphism or of a group of Kalpa-related polymorphisms associated with the gravity of a cholesterol-related disorder or of the symptoms thereof; and
- administering a treatment acting against said cholesterol-related disorder or symptoms thereof to said individual.
The invention also concerns a method for the treatment of a cholesterol-related disorder in a selected population of individuals. The method comprises:
- selecting an individual suffering from cholesterol-related disorder and whose DNA comprises alleles of a polymorphism or of a group of Kalpa-related polymorphisms associated with a positive response to treatment with an effective amount of a medicament acting against said cholesterol-related disorder or symptoms thereof,
- and/or whose DNA does not comprise alleles of a polymorphism or of a group of Kalpa-related polymorphisms associated with a negative response to treatment with said medicament; and
- administering at suitable intervals an effective amount of said medicament to said selected individual.
In the context of the present invention, a "positive response" to a medicament can be defined as comprising a reduction of the symptoms related to the disease. In the context of the present invention, a "negative response" to a medicament can be defined as comprising either a lack of positive response to the medicament which does not lead to a symptom reduction or which leads to a side-effect observed following administration of the medicament.
The invention also relates to a method of determining whether a subject is likely to respond positively to treatment with a medicament. The method comprises identifying a first population of individuals who respond positively to said medicament and a second population of individuals who respond negatively to said medicament. One or more polymorphisms are identified in the first population which are associated with a positive response to said medicament, or one or more polymorphisms are identified in the second population which are associated with a negative response to said medicament. The polymorphisms may be identified using the techniques described herein.
A DNA sample is then obtained from the subject to be tested. The DNA sample is analyzed to determine whether it comprises alleles of one or more polymorphisms associated with a positive response to treatment with the medicament and/or alleles of one or more polymorphisms associated with a negative response to treatment with the medicament. In some embodiments, the medicament may be administered to the subject in a clinical trial if the DNA sample contains alleles of one or more polymorphisms associated with a positive response to treatment with the medicament and/or if the DNA sample lacks alleles of one or more polymorphisms associated with a negative response to treatment with the medicament. In preferred embodiments, the medicament is a drug acting against a cholesterol-related disorder.
Using the method of the present invention, the evaluation of drug efficacy may be conducted in a population of individuals likely to respond favorably to the medicament.
Another aspect of the invention is a method of using a medicament comprising obtaining a DNA sample from a subject, determining whether the DNA sample contains alleles of one or more polymorphisms associated with a positive response to the medicament and/or whether the DNA sample contains alleles of one or more polymorphisms associated with a negative response to the medicament, and administering the medicament to the subject if the DNA sample contains alleles of one or more polymorphisms associated with a positive response to the medicament and/or if the DNA sample lacks alleles of one or more polymorphisms associated with a negative response to the medicament.
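The selection rule in the preceding paragraph can be sketched as a small decision function. This is only an illustrative sketch: the function name, the `(marker, allele)` tuple representation, and the `require_both` switch (which chooses between the "and" and the "or" reading of "and/or") are assumptions introduced here, not part of the described method.

```python
def should_administer(sample_alleles, positive_alleles, negative_alleles,
                      require_both=False):
    """Decide whether the medicament would be administered.

    sample_alleles:   set of (marker, allele) pairs found in the subject's DNA
    positive_alleles: pairs associated with a positive response
    negative_alleles: pairs associated with a negative response
    require_both:     hypothetical switch; True applies the "and" reading,
                      False applies the "or" reading of "and/or"
    """
    has_positive = bool(sample_alleles & positive_alleles)
    lacks_negative = not (sample_alleles & negative_alleles)
    if require_both:
        return has_positive and lacks_negative
    return has_positive or lacks_negative


# Hypothetical marker names, for illustration only.
positive = {("marker1", "A")}
negative = {("marker2", "T")}
print(should_administer({("marker1", "A")}, positive, negative))
print(should_administer({("marker2", "T")}, positive, negative))
```

Under the stricter reading (`require_both=True`), a subject carrying both a positive-response allele and a negative-response allele would not be selected.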
The invention also concerns a method for the clinical testing of a medicament, preferably a medicament acting against a cholesterol-related disorder or symptoms thereof. The method comprises the following steps:
- administering a medicament, preferably a medicament capable of acting against a cholesterol-related disorder or symptoms thereof, to a heterogeneous population of individuals,
- identifying a first population of individuals who respond positively to said medicament and a second population of individuals who respond negatively to said medicament,
- identifying polymorphisms in said first population which are associated with a positive response to said medicament,
- selecting individuals whose DNA comprises polymorphisms associated with a positive response to said medicament, and
- administering said medicament to said individuals.
Furthermore, since Kalpa may be involved in the SCAP/SREBP pathway which mediates LDL receptor expression, individuals might differ in Kalpa levels and therefore in their LDL receptor levels. Measurement of Kalpa and/or downstream proteins involved in LDL receptor regulation can then be used to explain differential responses within the human population to high-cholesterol diets. Thus, in the present invention, there is provided a method of determining the level of LDL receptor expression in an individual by examining the level of Kalpa expression in said individual. Also provided are methods of assessing expression or activity of a downstream protein comprising examining Kalpa activity or expression. Preferably said downstream protein is an LDL receptor, SREBP, SCAP, S1P or S2P protein. Further preferred methods comprise determining the allele present at a Kalpa-related polymorphism, and correlating said allele with a change in expression of a Kalpa protein, expression of an LDL receptor, SREBP, SCAP, S1P or S2P protein, or with blood or cellular cholesterol, HDL, VLDL or LDL levels.
Polymorphisms of the Kalpa gene of the invention
The polymorphisms of the Kalpa gene disclosed herein offer a number of important advantages over other genetic markers such as RFLP (Restriction Fragment Length Polymorphism) and VNTR (Variable Number of Tandem Repeats) markers.
The first generation of markers were RFLPs, which are variations that modify the length of a restriction fragment. But methods used to identify and to type RFLPs are relatively wasteful of materials, effort, and time. The second generation of genetic markers were VNTRs, which can be categorized as either minisatellites or microsatellites. Minisatellites are tandemly repeated DNA sequences present in units of 5-50 repeats which are distributed along regions of the human chromosomes ranging from 0.1 to 20 kilobases in length. Since they present many possible alleles, their informative content is very high. Minisatellites are scored by performing Southern blots to identify the number of tandem repeats present in a nucleic acid sample from the individual being tested. However, there are only 10^4 potential VNTRs that can be typed by Southern blotting. Moreover, both RFLP and VNTR markers are costly and time-consuming to develop and assay in large numbers.
Single nucleotide polymorphisms can be used in the same manner as RFLPs and VNTRs but offer several advantages. Single nucleotide polymorphisms are densely spaced in the human genome and represent the most frequent type of variation. An estimated number of more than 10^7 sites are scattered along the 3x10^9 base pairs of the human genome.
Therefore, single nucleotide polymorphisms occur at a greater frequency and with greater uniformity than RFLP or VNTR markers, which means that there is a greater probability that such a marker will be found in close proximity to a genetic locus of interest. Single nucleotide polymorphisms are less variable than VNTR markers but are mutationally more stable.
Also, the different forms of a characterized single nucleotide polymorphism, such as the polymorphisms of the present invention, are often easier to distinguish and can therefore be typed easily on a routine basis. Polymorphisms have single nucleotide based alleles and they have only two common alleles, which allows highly parallel detection and automated scoring.
The polymorphisms of the present invention offer the possibility of rapid, high-throughput genotyping of a large number of individuals.
Polymorphisms are densely spaced in the genome, sufficiently informative and can be assayed in large numbers. The combined effects of these advantages make polymorphisms extremely valuable in genetic studies. Polymorphisms can be used in linkage studies in families, in allele sharing methods, in linkage disequilibrium studies in populations, and in association studies of case-control populations. An important aspect of the present invention is that polymorphisms allow association studies to be performed to identify genes involved in complex traits. Association studies examine the frequency of marker alleles in unrelated case and control populations and are generally employed in the detection of polygenic or sporadic traits. Association studies may be conducted within the general population and are not limited to studies performed on related individuals in affected families (linkage studies). Biallelic markers in different genes can be screened in parallel for direct association with disease or response to a treatment. This multiple gene approach is a powerful tool for a variety of human genetic studies as it provides the necessary statistical power to examine the synergistic effect of multiple genetic factors on a particular phenotype, drug response, sporadic trait, or disease state with a complex genetic etiology.
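As an illustration of how allele frequencies in case and control populations might be compared in an association study, the following minimal sketch computes a Pearson chi-square statistic on a 2x2 table of allele counts. The choice of this particular statistic, the function name, and the example counts are assumptions made here for illustration; the text above does not prescribe a specific test.

```python
def allele_chi_square(case_a, case_b, control_a, control_b):
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    for a 2x2 table of allele counts:

                 allele A   allele B
        cases    case_a     case_b
        controls control_a  control_b
    """
    n = case_a + case_b + control_a + control_b
    row_cases = case_a + case_b
    row_controls = control_a + control_b
    col_a = case_a + control_a
    col_b = case_b + control_b
    if 0 in (row_cases, row_controls, col_a, col_b):
        raise ValueError("every row and column must contain at least one count")
    numerator = n * (case_a * control_b - case_b * control_a) ** 2
    return numerator / (row_cases * row_controls * col_a * col_b)


# Hypothetical counts: 100 chromosomes typed in each population.
statistic = allele_chi_square(60, 40, 40, 60)
print(statistic)  # compare with 3.841, the 5% critical value for 1 d.f.
```

A statistic above the critical value would suggest that the allele frequencies differ between the two populations; it does not by itself establish a causal role for the marker.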
Methods for De Novo Identification of Polymorphisms
A. Genomic DNA samples
The genomic DNA samples from which the polymorphisms of the present invention are generated are preferably obtained from unrelated individuals corresponding to a heterogeneous population of known ethnic background. The number of individuals from whom DNA samples are obtained can vary substantially, preferably from about 10 to about 1000, more preferably from about 50 to about 200 individuals. Usually, DNA samples are collected from at least about 100 individuals in order to have sufficient polymorphic diversity in a given population to identify as many markers as possible and to generate statistically significant results.
As for the source of the genomic DNA to be subjected to analysis, any test sample can be foreseen without any particular limitation. These test samples include biological samples, which can be tested by the methods of the present invention described herein, and include human and animal body fluids such as whole blood, serum, plasma, cerebrospinal fluid, urine, lymph fluids, and various external secretions of the respiratory, intestinal and genitourinary tracts, tears, saliva, milk, white blood cells, myelomas and the like; biological fluids such as cell culture supernatants; fixed tissue specimens including tumor and non-tumor tissue and lymph node tissues; bone marrow aspirates and fixed cell specimens. The preferred source of genomic DNA used in the present invention is the peripheral venous blood of each donor. Techniques to prepare genomic DNA from biological samples are well known to the skilled technician. A person skilled in the art can choose to amplify pooled or unpooled DNA samples.
B. DNA Amplification
The identification of polymorphisms in a sample of genomic DNA may be facilitated through the use of DNA amplification methods. DNA samples can be pooled or unpooled for the amplification step. DNA amplification techniques are well known to those skilled in the art. Various methods to amplify DNA fragments carrying polymorphisms are further described herein. The PCR technology is the preferred amplification technique used to identify new polymorphisms.
In a first embodiment, polymorphisms are identified using genomic sequence information generated by the inventors. Genomic DNA fragments, such as the inserts of the BAC clones described above, are sequenced and used to design primers for the amplification of 500 bp fragments. These 500 bp fragments are amplified from genomic DNA and are scanned for polymorphisms. Primers may be designed using the OSP software (Hillier L. and Green P., PCR Methods Appl. 1: 124-8, 1991). All primers may contain, upstream of the specific target bases, a common oligonucleotide tail that serves as a sequencing primer. Those skilled in the art are familiar with primer extensions, which can be used for these purposes.
In another embodiment of the invention, genomic sequences of candidate genes are available in public databases, allowing direct screening for biallelic markers. Preferred primers, useful for the amplification of genomic sequences encoding the candidate genes, focus on promoters, exons and splice sites of the genes. A polymorphism present in these functional regions of the gene has a higher probability of being a causal mutation.
C. Sequencing Of Amplified Genomic DNA And Identification Of Single Nucleotide Polymorphisms
The amplification products generated as described above are then sequenced using any method known and available to the skilled technician. Methods for sequencing DNA using either the dideoxy-mediated method (Sanger method) or the Maxam-Gilbert method are widely known to those of ordinary skill in the art. Such methods are for example disclosed in Maniatis et al. (Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, 2nd Edition, 1989). Alternative approaches include hybridization to high-density DNA probe arrays as described in Chee et al. (Science 274: 610, 1996).
Preferably, the amplified DNA is subjected to automated dideoxy terminator sequencing reactions using a dye-primer cycle sequencing protocol. The products of the sequencing reactions are run on sequencing gels and the sequences are determined using gel image analysis. The polymorphism search is based on the presence of superimposed peaks in the electrophoresis pattern resulting from different bases occurring at the same position. Because each dideoxy terminator is labeled with a different fluorescent molecule, the two peaks corresponding to a biallelic site present distinct colors corresponding to two different nucleotides at the same position on the sequence. However, the presence of two peaks can be an artifact due to background noise. To exclude such an artifact, the two DNA strands are sequenced and a comparison between the peaks is carried out. In order to be registered as a polymorphic sequence, the polymorphism has to be detected on both strands.
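The both-strand confirmation described above can be sketched as follows. This is a simplified illustration, not the gel image analysis actually used: the peak-height dictionaries, the function names, and the 0.3 secondary-peak ratio threshold are all assumptions introduced here.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def bases_above_threshold(peaks, min_ratio=0.3):
    """Return the bases whose peak height is at least min_ratio times the
    tallest peak at this position. `peaks` maps base -> peak height."""
    tallest = max(peaks.values())
    if tallest <= 0:
        return set()
    return {base for base, height in peaks.items()
            if height >= min_ratio * tallest}


def confirmed_biallelic(forward_peaks, reverse_peaks, min_ratio=0.3):
    """Return the two alleles (forward-strand orientation) if superimposed
    peaks are seen on both strands and agree; otherwise return None."""
    forward = bases_above_threshold(forward_peaks, min_ratio)
    # Reverse-strand calls are complemented to compare in the same orientation.
    reverse = {COMPLEMENT[b]
               for b in bases_above_threshold(reverse_peaks, min_ratio)}
    if len(forward) == 2 and forward == reverse:
        return frozenset(forward)
    return None


# Hypothetical peak heights at one position.
fwd = {"A": 100, "G": 80, "C": 5, "T": 3}
rev = {"T": 90, "C": 85, "A": 4, "G": 2}
print(confirmed_biallelic(fwd, rev))
```

A double peak that appears on only one strand is treated as background noise and rejected, in line with the rule that the polymorphism has to be detected on both strands.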
The above procedure permits those amplification products which contain polymorphisms to be identified. The detection limit for the frequency of biallelic polymorphisms detected by sequencing pools of 100 individuals is approximately 0.1 for the minor allele, as verified by sequencing pools of known allelic frequencies. However, more than 90% of the biallelic polymorphisms detected by the pooling method have a frequency for the minor allele higher than 0.25. Therefore, the polymorphisms selected by this method have a frequency of at least 0.1 for the minor allele and less than 0.9 for the major allele, preferably at least 0.2 for the minor allele and less than 0.8 for the major allele, more preferably at least 0.3 for the minor allele and less than 0.7 for the major allele, and thus a heterozygosity rate higher than 0.18, preferably higher than 0.32, more preferably higher than 0.42.
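The heterozygosity rates quoted above follow from the minor allele frequencies if one assumes Hardy-Weinberg proportions, under which the expected heterozygosity of a biallelic site is 2q(1 - q) for minor allele frequency q. The short sketch below (the function name is chosen here for illustration) reproduces the three figures.

```python
def expected_heterozygosity(minor_allele_freq):
    """Expected heterozygosity 2q(1 - q) of a biallelic marker,
    assuming Hardy-Weinberg proportions."""
    q = minor_allele_freq
    if not 0.0 <= q <= 0.5:
        raise ValueError("minor allele frequency must lie between 0 and 0.5")
    return 2.0 * q * (1.0 - q)


for q in (0.1, 0.2, 0.3):
    print(q, round(expected_heterozygosity(q), 2))
```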
In another embodiment, polymorphisms are detected by sequencing individual DNA samples; the frequency of the minor allele of such a polymorphism may be less than 0.1.
The markers carried by the same fragment of genomic DNA, such as the insert in a BAC clone, need not necessarily be ordered with respect to one another within the genomic fragment to conduct association studies. However, in some embodiments of the present invention, the order of polymorphisms carried by the same fragment of genomic DNA is determined.
D. Validation of the Polymorphisms of the Present Invention
The polymorphisms are evaluated for their usefulness as genetic markers by validating that both alleles are present in a population. Validation of the biallelic markers is accomplished by genotyping a group of individuals by a method of the invention and demonstrating that both alleles are present.
Microsequencing is a preferred method of genotyping alleles. The validation by genotyping step may be performed on individual samples derived from each individual in the group or by genotyping a pooled sample derived from more than one individual. The group can be as small as one individual if that individual is heterozygous for the allele in question. Preferably the group contains at least three individuals, more preferably the group contains five or six individuals, so that a single validation test will be more likely to result in the validation of more of the polymorphisms that are being tested. It should be noted, however, that when the validation test is performed on a small group it may result in a false negative result if, as a result of sampling error, none of the individuals tested carries one of the two alleles. Thus, the validation process is less useful in demonstrating that a particular initial result is an artifact than it is at demonstrating that there is a bona fide polymorphism at a particular position in a sequence. For an indication of whether a particular polymorphism has been validated see Figure 2.
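The risk of a false negative in a small validation group can be illustrated numerically. The sketch below assumes unrelated diploid individuals sampled at random from a population in Hardy-Weinberg equilibrium, so that the 2n sampled chromosomes are independent; under that assumption the chance that one of the two alleles is entirely absent from the group is (1 - q)^(2n) + q^(2n) for minor allele frequency q. The function name is hypothetical.

```python
def false_negative_probability(minor_allele_freq, n_individuals):
    """Probability that a group of n diploid individuals carries only one of
    the two alleles, assuming 2n independently sampled chromosomes."""
    q = minor_allele_freq
    chromosomes = 2 * n_individuals
    return (1.0 - q) ** chromosomes + q ** chromosomes


# Group sizes mentioned in the text, for a minor allele frequency of 0.3.
for n in (1, 3, 6):
    print(n, round(false_negative_probability(0.3, n), 4))
```

For a single individual and q = 0.5 the result is 0.5, which matches the observation that a group of one suffices only if that individual happens to be heterozygous.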
All of the genotyping, haplotyping, association, and interaction study methods of the invention may optionally be performed solely with validated biallelic markers.
E. Evaluation of the Frequency of the Polymorphisms of the Present Invention
The validated polymorphisms are further evaluated for their usefulness as genetic markers by determining the frequency of the least common allele at the polymorphism site. The frequency of the least common allele is determined by genotyping a group of individuals by a method of the invention and demonstrating that both alleles are present. This determination of frequency by genotyping step may be performed on individual samples derived from each individual in the group or by genotyping a pooled sample derived from more than one individual. The group must be large enough to be representative of the population as a whole. Preferably the group contains at least 20 individuals, more preferably the group contains at least 50 individuals, most preferably the group contains at least 100 individuals. Of course, the larger the group, the greater the accuracy of the frequency determination because of reduced sampling error. A polymorphism wherein the frequency of the less common allele is 30% or more is termed a "high quality polymorphism."
All of the genotyping, haplotyping, association, and interaction study methods of the invention may optionally be performed solely with high quality polymorphisms.
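A minimal sketch of the frequency evaluation: the minor allele frequency is estimated from genotype counts, a binomial standard error indicates how sampling error shrinks with group size, and the 30% threshold from the text flags a "high quality polymorphism". The function names and the example counts are assumptions made here; the binomial standard error additionally assumes independently sampled chromosomes.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Estimate the frequency of the less common allele from counts of
    AA homozygotes, AB heterozygotes and BB homozygotes."""
    total = 2 * (n_aa + n_ab + n_bb)
    if total == 0:
        raise ValueError("no genotypes supplied")
    count_a = 2 * n_aa + n_ab
    count_b = total - count_a
    return min(count_a, count_b) / total


def standard_error(freq, n_individuals):
    """Binomial standard error of an allele frequency estimated from
    2 * n_individuals chromosomes."""
    return math.sqrt(freq * (1.0 - freq) / (2 * n_individuals))


def is_high_quality(maf, threshold=0.30):
    """True if the less common allele has a frequency of 30% or more."""
    return maf >= threshold


# Hypothetical genotype counts for a group of 100 individuals.
maf = minor_allele_frequency(49, 42, 9)
print(maf, standard_error(maf, 100), is_high_quality(maf))
```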
Methods Of Genotyping an Individual for Polymorphisms
Methods are provided to genotype a biological sample for one or more polymorphisms of the present invention, all of which may be performed in vitro. Such methods of genotyping comprise determining the identity of a nucleotide at a Kalpa-related polymorphism by any method known in the art. These methods find use in genotyping case-control populations in association studies as well as individuals in the context of detection of alleles of biallelic markers which are known to be associated with a given trait, in which case both copies of the polymorphism present in the individual's genome are determined so that an individual may be classified as homozygous or heterozygous for a particular allele. These genotyping methods can be performed on nucleic acid samples derived from a single individual or on pooled DNA samples.
Genotyping can be performed using similar methods as those described above for the identification of the polymorphisms, or using other genotyping methods such as those further described below. In preferred embodiments, the comparison of sequences of amplified genomic fragments from different individuals is used to identify new polymorphisms whereas microsequencing is used for genotyping known polymorphisms in diagnostic and association study applications.
A. Source of DNA for Genotyping
Any source of nucleic acids, in purified or non-purified form, can be utilized as the starting nucleic acid, provided it contains or is suspected of containing the specific nucleic acid sequence desired. DNA or RNA may be extracted from cells, tissues, body fluids and the like as described above in II. A. While nucleic acids for use in the genotyping methods of the invention can be derived from any mammalian source, the test subjects and individuals from which nucleic acid samples are taken are generally understood to be human.
B. Amplification Of DNA Fragments Comprising Polymorphisms
Methods and polynucleotides are provided to amplify a segment of nucleotides comprising one or more polymorphisms of the present invention. It will be appreciated that amplification of DNA fragments comprising polymorphisms may be used in various methods and for various purposes and is not restricted to genotyping. Nevertheless, many genotyping methods, although not all, require the previous amplification of the DNA region carrying the polymorphism of interest. Such methods specifically increase the concentration or total number of sequences that span the polymorphism or include that site and sequences located either distal or proximal to it. Diagnostic assays may also rely on amplification of DNA segments carrying a polymorphism of the present invention.
Amplification of DNA may be achieved by any method known in the art, such as the established PCR (polymerase chain reaction) method, developments thereof, or alternatives. Amplification methods which can be utilized herein include but are not limited to Ligase Chain Reaction (LCR) as described in EP A 320 308 and EP A 439 182, Gap LCR (Wolcott, M. J., Clin. Microbiol. Rev. 5: 370-386), the so-called "NASBA" or "3SR" technique described in Guatelli J. C. et al. (Proc. Natl. Acad. Sci. USA 87: 1874-1878, 1990) and in Compton J. (Nature 350: 91-92, 1991), Q-beta amplification as described in European Patent Application no 4544610, strand displacement amplification as described in Walker et al. (Clin. Chem. 42: 9-13, 1996) and EP A 684 315, and target mediated amplification as described in PCT Publication WO 9322461.
LCR and Gap LCR are exponential amplification techniques; both depend on DNA ligase to join adjacent primers annealed to a DNA molecule. In Ligase Chain Reaction (LCR), probe pairs are used which include two primary (first and second) and two secondary (third and fourth) probes, all of which are employed in molar excess to target. The first probe hybridizes to a first segment of the target strand and the second probe hybridizes to a second segment of the target strand, the first and second segments being contiguous so that the primary probes abut one another in a 5' phosphate-3' hydroxyl relationship, and so that a ligase can covalently fuse or ligate the two probes into a fused product. In addition, a third (secondary) probe can hybridize to a portion of the first probe and a fourth (secondary) probe can hybridize to a portion of the second probe in a similar abutting fashion. Of course, if the target is initially double stranded, the secondary probes also will hybridize to the target complement in the first instance. Once the ligated strand of primary probes is separated from the target strand, it will hybridize with the third and fourth probes, which can be ligated to form a complementary, secondary ligated product. It is important to realize that the ligated products are functionally equivalent to either the target or its complement. By repeated cycles of hybridization and ligation, amplification of the target sequence is achieved. A method for multiplex LCR has also been described (WO 9320227). Gap LCR (GLCR) is a version of LCR where the probes are not adjacent but are separated by 2 to 3 bases.
For amplification of mRNAs, it is within the scope of the present invention to reverse transcribe mRNA into cDNA followed by polymerase chain reaction (RT-PCR); or, to use a single enzyme for both steps as described in U. S. Patent No. 5,322,770 or, to use Asymmetric Gap LCR (RT-AGLCR) as described by Marshall R. L. et al. (PCR Methods and Applications 4: 80-84, 1994). AGLCR is a modification of GLCR that allows the amplification of RNA.
Some of these amplification methods are particularly suited for the detection of single nucleotide polymorphisms and allow the simultaneous amplification of a target sequence and the identification of the polymorphic nucleotide, as further described in IIIC. The PCR technology is the preferred amplification technique used in the present invention. A variety of PCR techniques are familiar to those skilled in the art. For a review of PCR technology, see Molecular Cloning to Genetic Engineering, White, B. A. Ed., in Methods in Molecular Biology 67: Humana Press, Totowa (1997) and the publication entitled "PCR Methods and Applications" (1991, Cold Spring Harbor Laboratory Press). In each of these PCR procedures, PCR primers on either side of the nucleic acid sequences to be amplified are added to a suitably prepared nucleic acid sample along with dNTPs and a thermostable polymerase such as Taq polymerase, Pfu polymerase, or Vent polymerase. The nucleic acid in the sample is denatured and the PCR primers are specifically hybridized to complementary nucleic acid sequences in the sample. The hybridized primers are extended. Thereafter, another cycle of denaturation, hybridization, and extension is initiated. The cycles are repeated multiple times to produce an amplified fragment containing the nucleic acid sequence between the primer sites. PCR has further been described in several patents including US Patents 4,683,195; 4,683,202 and 4,965,188. The identification of polymorphisms as described above allows the design of appropriate oligonucleotides, which can be used as primers to amplify DNA fragments comprising the polymorphisms of the present invention. Amplification can be performed using the primers initially used to discover new biallelic markers which are described herein or any set of primers allowing the amplification of a DNA fragment comprising a polymorphism of the present invention. Primers can be prepared by any suitable method.
For example, primers may be prepared by direct chemical synthesis using a method such as the phosphotriester method of Narang S. A. et al. (Methods Enzymol. 68: 90-98, 1979), the phosphodiester method of Brown E. L. et al. (Methods Enzymol. 68: 109-151, 1979), the diethylphosphoramidite method of Beaucage et al. (Tetrahedron Lett. 22: 1859-1862, 1981) and the solid support method described in EP 0 707 592.
In some embodiments the present invention provides primers for amplifying a DNA fragment containing one or more polymorphisms of the present invention. It will be appreciated that the primers can be obtained from the genomic SEQ ID No 1, or SEQ ID Nos 1001 to 12 or 3, or from sequences from the selected locus as available in databases. The primers are selected to be substantially complementary to the different strands of each specific sequence to be amplified. The length of the primers of the present invention can range from 8 to 100 nucleotides, preferably from 8 to 50, 8 to 30 or more preferably 8 to 25 nucleotides. Shorter primers tend to lack specificity for a target nucleic acid sequence and generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. Longer primers are expensive to produce and can sometimes self-hybridize to form hairpin structures. The formation of stable hybrids depends on the melting temperature (Tm) of the DNA. The Tm depends on the length of the primer, the ionic strength of the solution and the G+C content. The higher the G+C content of the primer, the higher the melting temperature, because G:C pairs are held by three H bonds whereas A:T pairs have only two. The G+C content of the amplification primers of the present invention preferably ranges between 10 and 75%, more preferably between 35 and 60%, and most preferably between 40 and 55%. The appropriate length for primers under a particular set of assay conditions may be empirically determined by one of skill in the art. The spacing of the primers determines the length of the segment to be amplified. In the context of the present invention amplified segments carrying biallelic markers can range in size from at least about 25 bp to 35 kbp. Amplification fragments from 25-3000 bp are typical, fragments from 50-1000 bp are preferred and fragments from 100-600 bp are highly preferred.
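The primer criteria above can be checked with a few lines of code. The sketch below computes G+C content and tests a primer against the most preferred length (8 to 25 nucleotides) and G+C (40 to 55%) ranges from the text. The melting temperature estimate uses the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is a swapped-in approximation for short oligonucleotides that ignores ionic strength; it is not a method given in this document. Function names and the example primer are hypothetical.

```python
def gc_percent(primer):
    """Percentage of G and C bases in a primer sequence."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer):
    """Rough melting temperature (deg C) by the Wallace rule:
    2 degrees per A/T and 4 degrees per G/C. Approximation only."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def in_preferred_ranges(primer, min_len=8, max_len=25, min_gc=40.0, max_gc=55.0):
    """Check the most preferred length and G+C ranges quoted in the text."""
    return (min_len <= len(primer) <= max_len
            and min_gc <= gc_percent(primer) <= max_gc)


primer = "ATGCATGCATGCATGCATGC"  # hypothetical 20-mer, not from any SEQ ID
print(gc_percent(primer), wallace_tm(primer), in_preferred_ranges(primer))
```

The higher weight given to G and C in the rule reflects the same reasoning as the text: G:C pairs contribute three hydrogen bonds and A:T pairs only two.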
It will be appreciated that amplification primers for the polymorphisms may be any sequence which allows the specific amplification of any DNA fragment carrying the markers. Amplification primers may be labeled or immobilized on a solid support as described in "Polymorphisms and Polynucleotides Comprising Polymorphisms" or "Methods of Genotyping DNA Samples for Polymorphisms". Any method known in the art can be used to identify the nucleotide present at a polymorphism site. Since the polymorphism allele to be detected has been identified and specified in the present invention, detection will prove simple for one of ordinary skill in the art by employing any of a number of techniques. Many genotyping methods require the previous amplification of the DNA region carrying the polymorphism of interest. While the amplification of target or signal is often preferred at present, ultrasensitive detection methods which do not require amplification are also encompassed by the present genotyping methods. Methods well known to those skilled in the art that can be used to detect biallelic polymorphisms include conventional dot blot analyses, single strand conformational polymorphism analysis (SSCP) described by Orita et al. (Proc. Natl. Acad. Sci. U.S.A. 86: 2766-2770, 1989), denaturing gradient gel electrophoresis (DGGE), heteroduplex analysis, mismatch cleavage detection, and other conventional techniques as described in Sheffield, V. C. et al. (Proc. Natl. Acad. Sci. USA 49: 699-706, 1991), White et al. (Genomics 12: 301-306, 1992), Grompe, M. et al. (Proc. Natl. Acad. Sci. USA 86: 5855-5892, 1989) and Grompe, M. (Nature Genetics 5: 111-117, 1993). Another method for determining the identity of the nucleotide present at a particular polymorphic site employs a specialized exonuclease-resistant nucleotide derivative as described in US Patent 4,656,127.
Preferred methods involve directly determining the identity of the nucleotide present at a polymorphism site by sequencing assay, enzyme-based mismatch detection assay, or hybridization assay. The following is a description of some preferred methods. A highly preferred method is the microsequencing technique. The term "sequencing assay" is used herein to refer to polymerase extension of duplex primer/template complexes and includes both traditional sequencing and microsequencing.
1. Sequencing assays
The nucleotide present at a polymorphic site can be determined by sequencing methods.
In a preferred embodiment, DNA samples are subjected to PCR amplification before sequencing as described above. DNA sequencing methods are well known in the art. Preferably, the amplified DNA is subjected to automated dideoxy terminator sequencing reactions using a dye-primer cycle sequencing protocol. Sequence analysis allows the identification of the base present at the polymorphism site.
2. Microsequencing assays
In microsequencing methods, a nucleotide at the polymorphic site that is unique to one of the alleles in a target DNA is detected by a single nucleotide primer extension reaction. This method involves appropriate microsequencing primers which hybridize just upstream of a polymorphic base of interest in the target nucleic acid. A polymerase is used to specifically extend the 3' end of the primer with one single ddNTP (chain terminator) complementary to the selected nucleotide at the polymorphic site. Next, the identity of the incorporated nucleotide is determined in any suitable way. Typically, microsequencing reactions are carried out using fluorescent ddNTPs and the extended microsequencing primers are analyzed by electrophoresis on ABI 377 sequencing machines to determine the identity of the incorporated nucleotide as described in EP 412 883. Alternatively, capillary electrophoresis can be used in order to process a higher number of assays simultaneously. An example of a typical microsequencing procedure that can be used in the context of the present invention is provided in [Example 3].
Different approaches can be used to detect the nucleotide added to the microsequencing primer.
A homogeneous phase detection method based on fluorescence resonance energy transfer has been described by Chen and Kwok (Nucleic Acids Research 25: 347-353 1997) and Chen et al. (Proc. Natl. Acad. Sci. USA 94/20 216-221, 1997). In this method amplified genomic DNA fragments containing polymorphic sites are incubated with a 5'-fluorescein-labeled primer in the presence of allelic dye-labeled dideoxyribonucleoside triphosphates and a modified Taq polymerase. The dye labeled primer is extended one base by the dye-terminator specific for the allele present on the template.
At the end of the genotyping reaction, the fluorescence intensities of the two dyes in the reaction mixture are analyzed directly without separation or purification. All these steps can be performed in the same tube and the fluorescence changes can be monitored in real time. Alternatively, the extended primer may be analyzed by MALDI-TOF Mass Spectrometry. The base at the polymorphic site is identified by the mass added onto the microsequencing primer (see Haff L. A. and Smirnov I. P., Genome Research, 7: 378-388, 1997).
Microsequencing may be achieved by the established microsequencing method or by developments or derivatives thereof. Alternative methods include several solid-phase microsequencing techniques. The basic microsequencing protocol is the same as described previously, except that the method is conducted as a heterogeneous phase assay, in which the primer or the target molecule is immobilized or captured onto a solid support. To simplify the primer separation and the terminal nucleotide addition analysis, oligonucleotides are attached to solid supports or are modified in such ways that permit affinity separation as well as polymerase extension. The 5' ends and internal nucleotides of synthetic oligonucleotides can be modified in a number of different ways to permit different affinity separation approaches, e.g., biotinylation. If a single affinity group is used on the oligonucleotides, the oligonucleotides can be separated from the incorporated terminator reagent. This eliminates the need for physical or size separation. More than one oligonucleotide can be separated from the terminator reagent and analyzed simultaneously if more than one affinity group is used. This permits the analysis of several nucleic acid species or more nucleic acid sequence information per extension reaction. The affinity group need not be on the priming oligonucleotide but could alternatively be present on the template. For example, immobilization can be carried out via an interaction between biotinylated DNA and streptavidin-coated microtitration wells or avidin-coated polystyrene particles. In the same manner oligonucleotides or templates may be attached to a solid support in a high-density format. In such solid phase microsequencing reactions, incorporated ddNTPs can be radiolabeled (Syvanen, Clinica Chimica Acta 226: 225-236, 1994) or linked to fluorescein (Livak and Hainer, Human Mutation 3: 379-385, 1994).
The detection of radiolabeled ddNTPs can be achieved through scintillation-based techniques. The detection of fluorescein-linked ddNTPs can be based on the binding of antifluorescein antibody conjugated with alkaline phosphatase, followed by incubation with a chromogenic substrate (such as p-nitrophenyl phosphate). Other possible reporter-detection pairs include: ddNTP linked to dinitrophenyl (DNP) and anti-DNP alkaline phosphatase conjugate (Harju et al., Clin. Chem. 39/11 2282-2287, 1993) or biotinylated ddNTP and horseradish peroxidase-conjugated streptavidin with o-phenylenediamine as a substrate (WO 92/15712). As yet another alternative solid-phase microsequencing procedure, Nyren et al. (Analytical Biochemistry 208: 171-175, 1993) described a method relying on the detection of DNA polymerase activity by an enzymatic luminometric inorganic pyrophosphate detection assay (ELIDA). Pastinen et al. (Genome Research 7: 606-614, 1997) describe a method for multiplex detection of single nucleotide polymorphism in which the solid phase minisequencing principle is applied to an oligonucleotide array format. High-density arrays of DNA probes attached to a solid support (DNA chips) are further described herein.
In one aspect the present invention provides polynucleotides and methods to genotype one or more polymorphisms of the present invention by performing a microsequencing assay. Similarly, it will be appreciated that microsequencing analysis may be performed for any polymorphism or any combination of polymorphisms of the present invention.
3. Mismatch detection assays based on polymerases and ligases
In one aspect the present invention provides polynucleotides and methods to determine the allele of one or more polymorphisms of the present invention in a biological sample, by mismatch detection assays based on polymerases and/or ligases. These assays are based on the specificity of polymerases and ligases. Polymerization reactions place particularly stringent requirements on correct base pairing of the 3' end of the amplification primer, and the joining of two oligonucleotides hybridized to a target DNA sequence is quite sensitive to mismatches close to the ligation site, especially at the 3' end. The term "enzyme based mismatch detection assay" is used herein to refer to any method of determining the allele of a polymorphism based on the specificity of ligases and polymerases.
Preferred methods are described below. Methods, primers and various parameters to amplify DNA fragments comprising polymorphisms of the present invention are further described above in the section titled "Probes and Primers".
Allele specific amplification
Discrimination between the two alleles of a polymorphism can also be achieved by allele specific amplification, a selective strategy whereby one of the alleles is amplified without amplification of the other allele. This is accomplished by placing a polymorphic base at the 3' end of one of the amplification primers. Because extension proceeds from the 3' end of the primer, a mismatch at or near this position has an inhibitory effect on amplification. Therefore, under appropriate amplification conditions, these primers only direct amplification on their complementary allele. Designing the appropriate allele-specific primer and the corresponding assay conditions is well within the ordinary skill in the art.
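The placement of the polymorphic base at the 3' end of an allele-specific primer can be sketched as follows. This is an illustrative helper only; the function name, the default primer length of 20 nucleotides and the example sequences are assumptions, and real primer design would also consider Tm, secondary structure and assay conditions.

```python
def allele_specific_primers(upstream: str, alleles: tuple, length: int = 20) -> dict:
    """Build one forward primer per allele.

    `upstream` is the sequence immediately 5' of the polymorphic base on the
    primer strand. Each primer consists of the last (length - 1) upstream
    bases followed by the allele base, so the polymorphic base is the
    3'-terminal nucleotide of the primer.
    """
    if length < 2:
        raise ValueError("primer length must be at least 2")
    if len(upstream) < length - 1:
        raise ValueError("upstream flank is too short for the requested length")
    stem = upstream[-(length - 1):]
    return {allele: stem + allele for allele in alleles}


# Hypothetical flank and alleles, for illustration only.
primers = allele_specific_primers("ACGTACGTAC", ("A", "G"), length=5)
```

Each returned primer differs from the other only at its last base, which is the position where a mismatch most strongly inhibits extension.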
Ligation/amplification based methods
The "Oligonucleotide Ligation Assay" (OLA) uses two oligonucleotides which are designed to be capable of hybridizing to abutting sequences of a single strand of a target molecule. One of the oligonucleotides is biotinylated, and the other is detectably labeled. If the precise complementary sequence is found in a target molecule, the oligonucleotides will hybridize such that their termini abut, and create a ligation substrate that can be captured and detected. OLA is capable of detecting polymorphisms and may be advantageously combined with PCR as described by Nickerson D. A. et al. (Proc. Natl. Acad. Sci. U.S.A. 87: 8923-8927, 1990). In this method, PCR is used to achieve the exponential amplification of target DNA, which is then detected using OLA.
Other methods which are particularly suited for the detection of polymorphisms include LCR (ligase chain reaction) and Gap LCR (GLCR), which are described above in III. B. As mentioned above, LCR uses two pairs of probes to exponentially amplify a specific target. The sequences of each pair of oligonucleotides are selected to permit the pair to hybridize to abutting sequences of the same strand of the target. Such hybridization forms a substrate for a template-dependent ligase. In accordance with the present invention, LCR can be performed with oligonucleotides having the proximal and distal sequences of the same strand of a polymorphism site. In one embodiment, either oligonucleotide will be designed to include the polymorphism site. In such an embodiment, the reaction conditions are selected such that the oligonucleotides can be ligated together only if the target molecule either contains or lacks the specific nucleotide(s) that is complementary to the polymorphism on the oligonucleotide.
In an alternative embodiment, the oligonucleotides will not include the polymorphism, such that when they hybridize to the target molecule, a "gap" is created as described in WO 90/015. This gap is then "filled" with complementary dNTPs (as mediated by DNA polymerase), or by an additional pair of oligonucleotides. Thus at the end of each cycle, each single strand has a complement capable of serving as a target during the next cycle and exponential allele-specific amplification of the desired sequence is obtained.
Ligase/Polymerase-mediated Genetic Bit Analysis is another method for determining the identity of a nucleotide at a preselected site in a nucleic acid molecule (WO 95/21271). This method involves the incorporation of a nucleoside triphosphate that is complementary to the nucleotide present at the preselected site onto the terminus of a primer molecule, and their subsequent ligation to a second oligonucleotide. The reaction is monitored by detecting a specific label attached to the reaction's solid phase or by detection in solution.
4. Hybridization assay methods
A preferred method of determining the identity of the nucleotide present at a polymorphism site involves nucleic acid hybridization. The hybridization probes, which can be conveniently used in such reactions, preferably include the probes defined herein. Any hybridization assay may be used including Southern hybridization, Northern hybridization, dot blot hybridization and solid-phase hybridization (see Sambrook et al., Molecular Cloning-A Laboratory Manual, Second Edition, Cold Spring Harbor Press, N. Y., 1989).
Hybridization refers to the formation of a duplex structure by two single-stranded nucleic acids. By way of example and not limitation, procedures using conditions of high stringency are as follows: Prehybridization of filters containing DNA is carried out for 8 h to overnight at 65 °C in buffer composed of 6x SSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA, and 500 µg/ml denatured salmon sperm DNA. Filters are hybridized for 48 h at 65 °C, the preferred hybridization temperature, in prehybridization mixture containing 100 µg/ml denatured salmon sperm DNA and 5-20 x 10^6 cpm of 32P-labeled probe. Alternatively, the hybridization step can be performed at 65 °C in the presence of SSC buffer, 1x SSC corresponding to 0.15 M NaCl and 0.05 M Na citrate. Subsequently, filter washes can be done at 37 °C for 1 h in a solution containing 2x SSC, 0.01% PVP, 0.01% Ficoll, and 0.01% BSA, followed by a wash in 0.1x SSC at 50 °C for 45 min. Alternatively, filter washes can be performed in a solution containing 2x SSC and 0.1% SDS, or 0.5x SSC and 0.1% SDS, or 0.1x SSC and 0.1% SDS at 68 °C for 15 minute intervals. Following the wash steps, the hybridized probes are detectable by autoradiography. By way of example and not limitation, procedures using conditions of intermediate stringency are as follows: Filters containing DNA are prehybridized, and then hybridized at a temperature of 60 °C in the presence of a 5x SSC buffer and labeled probe. Subsequently, filter washes are performed in a solution containing 2x SSC at 50 °C and the hybridized probes are detectable by autoradiography. Other conditions of high and intermediate stringency which may be used are well known in the art and as cited in Sambrook et al. (Molecular Cloning-A Laboratory Manual, Second Edition, Cold Spring Harbor Press, N. Y., 1989) and Ausubel et al. (Current Protocols in Molecular Biology, Green Publishing Associates and Wiley Interscience, N. Y., 1989).
Although such hybridizations can be performed in solution, it is preferred to employ a solid-phase hybridization assay. The target DNA comprising a biallelic marker of the present invention may be amplified prior to the hybridization reaction. The presence of a specific allele in the sample is determined by detecting the presence or the absence of stable hybrid duplexes formed between the probe and the target DNA. The detection of hybrid duplexes can be carried out by a number of methods. Various detection assay formats are well known which utilize detectable labels bound to either the target or the probe to enable detection of the hybrid duplexes. Typically, hybridization duplexes are separated from unhybridized nucleic acids and the labels bound to the duplexes are then detected. Those skilled in the art will recognize that wash steps may be employed to wash away excess target DNA or probe. Standard heterogeneous assay formats are suitable for detecting the hybrids using the labels present on the primers and probes.
Two recently developed assays allow hybridization-based allele discrimination with no need for separations or washes (see Landegren U. et al., Genome Research, 8: 769-776, 1998). The TaqMan assay takes advantage of the 5' nuclease activity of Taq DNA polymerase to digest a DNA probe annealed specifically to the accumulating amplification product. TaqMan probes are labeled with a donor-acceptor dye pair that interacts via fluorescence energy transfer. Cleavage of the TaqMan probe by the advancing polymerase during amplification dissociates the donor dye from the quenching acceptor dye, greatly increasing the donor fluorescence. All reagents necessary to detect two allelic variants can be assembled at the beginning of the reaction and the results are monitored in real time (see Livak et al., Nature Genetics, 9: 341-342, 1995). In an alternative homogeneous hybridization-based procedure, molecular beacons are used for allele discriminations. Molecular beacons are hairpin-shaped oligonucleotide probes that report the presence of specific nucleic acids in homogeneous solutions. When they bind to their targets they undergo a conformational reorganization that restores the fluorescence of an internally quenched fluorophore (Tyagi et al., Nature Biotechnology, 16: 49-53, 1998).
The polynucleotides provided herein can be used in hybridization assays for the detection of polymorphism alleles in biological samples. These probes are characterized in that they preferably comprise between 8 and 50 nucleotides, and in that they are sufficiently complementary to a sequence comprising a polymorphism of the present invention to hybridize thereto and preferably sufficiently specific to be able to discriminate the targeted sequence for only one nucleotide variation. The GC content in the probes of the invention usually ranges between 10 and 75 %, preferably between 35 and 60 %, and more preferably between 40 and 55 %. The length of these probes can range from 10, 15, 20, or 30 to at least 100 nucleotides, preferably from 10 to 50, more preferably from 18 to 35 nucleotides.
A particularly preferred probe is 25 nucleotides in length. Preferably the polymorphism is within 4 nucleotides of the center of the polynucleotide probe. In particularly preferred probes the polymorphism is at the center of said polynucleotide. Shorter probes may lack specificity for a target nucleic acid sequence and generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. Longer probes are expensive to produce and can sometimes self-hybridize to form hairpin structures. Methods for the synthesis of oligonucleotide probes have been described above and can be applied to the probes of the present invention. Preferably the probes of the present invention are labeled or immobilized on a solid support. Labels and solid supports are further described herein.
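The probe geometry described above (a 25-nucleotide probe with the polymorphism at, or within 4 nucleotides of, its center) can be sketched as below. The helper names and the example sequence are hypothetical; the sketch only extracts and checks coordinates and does not assess hybridization behaviour.

```python
def centered_probe(sequence: str, snp_index: int, length: int = 25) -> str:
    """Return a probe of odd `length` with the polymorphic base at its center.

    `snp_index` is the 0-based position of the polymorphic base in `sequence`.
    """
    if length % 2 == 0:
        raise ValueError("use an odd length so that a single central base exists")
    half = length // 2
    start, end = snp_index - half, snp_index + half + 1
    if start < 0 or end > len(sequence):
        raise ValueError("not enough flanking sequence around the polymorphism")
    return sequence[start:end]


def within_center(probe_length: int, snp_pos_in_probe: int, max_offset: int = 4) -> bool:
    """True if the polymorphism lies within `max_offset` nt of the probe center."""
    center = (probe_length - 1) / 2
    return abs(snp_pos_in_probe - center) <= max_offset


# Hypothetical 41-nt segment with the polymorphic base (G) at index 20.
segment = "C" * 20 + "G" + "T" * 20
probe = centered_probe(segment, snp_index=20)
```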
Detection probes are generally nucleic acid sequences or uncharged nucleic acid analogs such as, for example, peptide nucleic acids, which are disclosed in International Patent Application WO 92/20702, or morpholino analogs, which are described in U.S. Patents Numbered 5,034,506 and 5,142,047. The probe may have to be rendered "nonextendable" in that additional dNTPs cannot be added to the probe. In and of themselves analogs usually are non-extendable, and nucleic acid probes can be rendered non-extendable by modifying the 3' end of the probe such that the hydroxyl group is no longer capable of participating in elongation. For example, the 3' end of the probe can be functionalized with the capture or detection label to thereby consume or otherwise block the hydroxyl group. The probes of the present invention are useful for a number of purposes. They can be used in Southern hybridization to genomic DNA or Northern hybridization to mRNA. The probes can also be used to detect PCR amplification products. By assaying the hybridization to an allele specific probe, one can detect the presence or absence of a polymorphism allele in a given sample. High-throughput parallel hybridizations in array format are specifically encompassed within "hybridization assays" and are described below.
Hybridization to addressable arrays of oligonucleotides
Hybridization assays based on oligonucleotide arrays rely on the differences in hybridization stability of short oligonucleotides to perfectly matched and mismatched target sequence variants. Efficient access to polymorphism information is obtained through a basic structure comprising high-density arrays of oligonucleotide probes attached to a solid support (the chip) at selected positions.
Each DNA chip can contain thousands to millions of individual synthetic DNA probes arranged in a grid-like pattern and miniaturized to the size of a dime. The chip technology has already been applied with success in numerous cases. For example, the screening of mutations has been undertaken in the BRCA1 gene, in S. cerevisiae mutant strains, and in the protease gene of HIV-1 virus (Hacia et al., Nature Genetics, 14 (4): 441-447, 1996; Shoemaker et al., Nature Genetics, 14 (4): 450-456, 1996; Kozal et al., Nature Medicine, 2: 753-759, 1996). Chips of various formats for use in detecting biallelic polymorphisms can be produced on a customized basis by Affymetrix (GeneChip™), Hyseq (HyChip and HyGnostics), and Protogene Laboratories.
In general, these methods employ arrays of oligonucleotide probes that are complementary to target nucleic acid sequence segments from an individual, which target sequences include a polymorphic marker. EP785280 describes a tiling strategy for the detection of single nucleotide polymorphisms. Briefly, arrays may generally be "tiled" for a large number of specific polymorphisms.
By "tiling" is generally meant the synthesis of a defined set of oligonucleotide probes which is made up of a sequence complementary to the target sequence of interest, as well as preselected variations of that sequence, e.g., substitution of one or more given positions with one or more members of the basis set of monomers, i.e. nucleotides. Tiling strategies are further described in PCT application No. WO 95/11995. In a particular aspect, arrays are tiled for a number of specific, identified polymorphism sequences. In particular the array is tiled to include a number of detection blocks, each detection block being specific for a specific polymorphism or a set of polymorphisms. For example, a detection block may be tiled to include a number of probes, which span the sequence segment that includes a specific polymorphism. To ensure probes that are complementary to each allele, the probes are synthesized in pairs differing at the polymorphism. In addition to the probes differing at the polymorphic base, monosubstituted probes are also generally tiled within the detection block. These monosubstituted probes have bases at and up to a certain number of bases in either direction from the polymorphism, substituted with the remaining nucleotides (selected from A, T, G, C and U). Typically the probes in a tiled detection block will include substitutions of the sequence positions up to and including those that are 5 bases away from the polymorphism. The monosubstituted probes provide internal controls for the tiled array, to distinguish actual hybridization from artefactual cross-hybridization. Upon completion of hybridization with the target sequence and washing of the array, the array is scanned to determine the position on the array to which the target sequence hybridizes. The hybridization data from the scanned array is then analyzed to identify which allele or alleles of the polymorphism are present in the sample.
Hybridization and scanning may be carried out as described in PCT applications Nos. WO 92/10092 and WO 95/11995 and US patent No. 5,424,186.
Thus, in some embodiments, the chips may comprise an array of nucleic acid sequences of fragments of about 15 nucleotides in length. In further embodiments, the chip may comprise an array including at least one of the sequences selected from the group consisting of SEQ ID Nos. 1, 2 or 3 or 5 to 23, and the sequences complementary thereto, or a fragment thereof of at least about 8 consecutive nucleotides, preferably 10, 15, 20, more preferably 25, 30, 40, 49, or 50 consecutive nucleotides. In some embodiments, the chip may comprise an array of at least 2, 3, 4, 5, 6, 7, 8 or more of these polynucleotides of the invention. Solid supports and polynucleotides of the present invention attached to solid supports are further described herein in "Polymorphisms and Polynucleotides Comprising Polymorphisms".
5. Integrated Systems
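The composition of one tiled detection block, as described above, can be sketched as a pair of allele probes plus monosubstituted control probes covering the positions up to a given distance from the polymorphism. The function name and example sequence are hypothetical, the alphabet is restricted to the four DNA bases as a simplifying assumption, and the sketch works on the segment sequence itself rather than on its complement.

```python
BASES = "ACGT"  # simplifying assumption: DNA probes only


def tile_detection_block(segment: str, snp_pos: int, alleles: tuple, window: int = 5):
    """Return (allele_probes, control_probes) for one detection block.

    allele_probes: one probe per allele, differing only at `snp_pos`.
    control_probes: probes carrying a single substitution at any position
    within `window` bases of the polymorphism (inclusive), excluding any
    sequence that is itself one of the allele probes.
    """
    allele_probes = {a: segment[:snp_pos] + a + segment[snp_pos + 1:] for a in alleles}
    lo = max(0, snp_pos - window)
    hi = min(len(segment) - 1, snp_pos + window)
    controls = set()
    for probe in allele_probes.values():
        for pos in range(lo, hi + 1):
            for base in BASES:
                if base != probe[pos]:
                    controls.add(probe[:pos] + base + probe[pos + 1:])
    controls -= set(allele_probes.values())
    return allele_probes, sorted(controls)


# Hypothetical 11-nt segment with a C/T polymorphism at index 5.
allele_probes, control_probes = tile_detection_block("AAAAACAAAAA", 5, ("C", "T"), window=2)
```

With the default `window=5` the block covers the substitutions "up to and including those that are 5 bases away" mentioned in the text; the smaller window in the example just keeps the output short.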
Another technique, which may be used to analyze polymorphisms, includes multicomponent integrated systems, which miniaturize and compartmentalize processes such as PCR and capillary electrophoresis reactions in a single functional device. An example of such technique is disclosed in US patent 5,589,136, which describes the integration of PCR amplification and capillary electrophoresis in chips.
Integrated systems can be envisaged mainly when microfluidic systems are used. These systems comprise a pattern of microchannels designed onto a glass, silicon, quartz, or plastic wafer included on a microchip. The movements of the samples are controlled by electric, electroosmotic or hydrostatic forces applied across different areas of the microchip. For genotyping polymorphisms, the microfluidic system may integrate nucleic acid amplification, microsequencing, capillary electrophoresis and a detection method such as laser-induced fluorescence detection.
Methods of Genetic Analysis Using the Polymorphisms of the Present Invention
Different methods are available for the genetic analysis of complex traits (see Lander and Schork, Science, 265, 2037-2048, 1994). The search for disease-susceptibility genes is conducted using two main methods: the linkage approach in which evidence is sought for cosegregation between a locus and a putative trait locus using family studies, and the association approach in which evidence is sought for a statistically significant association between an allele and a trait or a trait causing allele (Khoury J. et al, Fundamentals of Genetic Epidemiology, Oxford University Press, NY, 1993). In general, the polymorphisms of the present invention find use in any method known in the art to demonstrate a statistically significant correlation between a genotype and a phenotype. The polymorphisms may be used in parametric and non-parametric linkage analysis methods. Preferably, the polymorphisms of the present invention are used to identify genes associated with detectable traits using association studies, an approach which does not require the use of affected families and which permits the identification of genes associated with complex and sporadic traits.
The genetic analysis using the polymorphisms of the present invention may be conducted on any scale. The whole set of polymorphisms of the present invention or any subset of polymorphisms of the present invention may be used. In some embodiments a subset of polymorphisms corresponding to one or several candidate genes of the present invention may be used. In other embodiments a subset of polymorphisms corresponding to candidate genes from a given pathway of cholesterol regulation may be used, for example the ubiquitin-proteasome pathway which metabolically regulates intra-cellular degradation of apoB. Further, any set of genetic markers including a polymorphism of the present invention may be used in studies relating to cholesterol regulation. A set of biallelic polymorphisms that could be used as genetic markers in combination with the polymorphisms of the present invention has been described in WO 98/465. As mentioned above, it should be noted that the polymorphisms of the present invention may be included in any complete or partial genetic map of the human genome. These different uses are specifically contemplated in the present invention and claims.
A. Linkage Analysis
Linkage analysis is based upon establishing a correlation between the transmission of genetic markers and that of a specific trait throughout generations within a family. Thus, the aim of linkage analysis is to detect marker loci that show cosegregation with a trait of interest in pedigrees.
Parametric Methods
When data are available from successive generations there is the opportunity to study the degree of linkage between pairs of loci. Estimates of the recombination fraction enable loci to be ordered and placed onto a genetic map. With loci that are genetic markers, a genetic map can be established, and then the strength of linkage between markers and traits can be calculated and used to indicate the relative positions of markers and genes affecting those traits (Weir, 1996). The classical method for linkage analysis is the logarithm of odds (lod) score method (see Morton, 1955; Ott, 1991). Calculation of lod scores requires specification of the mode of inheritance for the disease (parametric method). Generally, the length of the candidate region identified using linkage analysis is between 2 and 20Mb. Once a candidate region is identified as described above, analysis of recombinant individuals using additional markers allows further delineation of the candidate region. Linkage analysis studies have generally relied on the use of a maximum of 5,000 microsatellite markers, thus limiting the maximum theoretical attainable resolution of linkage analysis to about 600 kb on average.
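As an illustration of the lod score method mentioned above, the following sketch computes a two-point lod score for the simplest textbook case: phase-known, fully informative meioses, where the likelihood of a recombination fraction theta is theta^k (1 - theta)^(n - k) for k recombinants among n meioses, compared against free recombination (theta = 0.5). This simplified model and the function names are assumptions made for illustration; real pedigree analysis requires the full inheritance model referred to in the text.

```python
import math


def lod_score(theta: float, n_meioses: int, n_recombinants: int) -> float:
    """Two-point lod score Z(theta) = log10[L(theta) / L(0.5)] for
    phase-known, fully informative meioses (simplifying assumption)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= n_recombinants <= n_meioses:
        raise ValueError("recombinants must be between 0 and the number of meioses")
    non_recombinants = n_meioses - n_recombinants
    likelihood = theta ** n_recombinants * (1.0 - theta) ** non_recombinants
    if likelihood == 0.0:
        return float("-inf")  # theta = 0 is excluded once a recombinant is seen
    return math.log10(likelihood) - n_meioses * math.log10(0.5)


def best_lod(n_meioses: int, n_recombinants: int, steps: int = 50):
    """Scan theta over a grid on [0, 0.5]; return (max lod, theta at max)."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max((lod_score(t, n_meioses, n_recombinants), t) for t in grid)
```

For instance, `best_lod(10, 2)` scans the grid for ten meioses with two recombinants. On the resolution figure quoted above: assuming a genome of roughly 3,000 Mb, 5,000 evenly spaced markers give an average spacing of 3,000 Mb / 5,000 = 600 kb.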
Linkage analysis has been successfully applied to map simple genetic traits that show clear Mendelian inheritance patterns and which have a high penetrance (i.e., the ratio between the number of trait positive carriers of allele a and the total number of a carriers in the population). However, parametric linkage analysis suffers from a variety of drawbacks. First, it is limited by its reliance on the choice of a genetic model suitable for each studied trait. Furthermore, as already mentioned, the resolution attainable using linkage analysis is limited, and complementary studies are required to refine the analysis of the typical 2Mb to 20Mb regions initially identified through linkage analysis. In addition, parametric linkage analysis approaches have proven difficult when applied to complex genetic traits, such as those due to the combined action of multiple genes and/or environmental factors. It is very difficult to model these factors adequately in a lod score analysis. In such cases, too large an effort and cost are needed to recruit the adequate number of affected families required for applying linkage analysis to these situations, as recently discussed by Risch, N. and Merikangas, K. (1996).
Non-Parametric Methods
The advantage of the so-called non-parametric methods for linkage analysis is that they do not require specification of the mode of inheritance for the disease; they therefore tend to be more useful for the analysis of complex traits. In non-parametric methods, one tries to prove that the inheritance pattern of a chromosomal region is not consistent with random Mendelian segregation by showing that affected relatives inherit identical copies of the region more often than expected by chance. Affected relatives should show excess "allele sharing" even in the presence of incomplete penetrance and polygenic inheritance. In non-parametric linkage analysis the degree of agreement at a marker locus in two individuals can be measured either by the number of alleles identical by state (IBS) or by the number of alleles identical by descent (IBD). Affected sib pair analysis is a well-known special case and is the simplest form of these methods.
The polymorphisms of the present invention may be used in both parametric and non-parametric linkage analysis. Preferably polymorphisms may be used in non-parametric methods which allow the mapping of genes involved in complex traits. The polymorphisms of the present invention may be used in both IBD- and IBS-methods to map genes affecting a complex trait. In such studies, taking advantage of the high density of polymorphisms, several adjacent polymorphism loci may be pooled to achieve the efficiency attained by multi-allelic markers (Zhao et al., 1998).
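The identity-by-state measure mentioned above can be sketched for a single marker as the number of alleles (0, 1 or 2) that two diploid genotypes have in common. The function name is hypothetical. Identity by descent is not computed here, since it depends on pedigree information that cannot be derived from the two genotypes alone.

```python
def ibs_count(genotype_1: tuple, genotype_2: tuple) -> int:
    """Number of alleles shared identical by state (0, 1 or 2) between
    two unordered diploid genotypes at one marker locus."""
    a, b = genotype_1
    c, d = genotype_2
    direct = (a == c) + (b == d)    # pair the alleles in the order given
    crossed = (a == d) + (b == c)   # pair the alleles in the swapped order
    return max(direct, crossed)
```

Taking the better of the two pairings makes the count independent of the order in which the alleles of each genotype are written.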
Population Association Studies
The present invention comprises methods for identifying if a Kalpa gene is associated with a detectable trait using the polymorphisms of the present invention. In one embodiment the present invention comprises methods to detect an association between a polymorphism allele or a polymorphism haplotype and a trait. Further, the invention comprises methods to identify a trait causing allele in linkage disequilibrium with any polymorphism allele of the present invention.
As described above, alternative approaches can be employed to perform association studies: genome-wide association studies, candidate region association studies and candidate gene association studies. In a preferred embodiment, the polymorphisms of the present invention are used to perform candidate gene association studies. The candidate gene analysis clearly provides a short-cut approach to the identification of genes and gene polymorphisms related to a particular trait when some information concerning the biology of the trait is available. Further, the polymorphisms of the present invention may be incorporated in any map of genetic markers of the human genome in order to perform genome-wide association studies. The polymorphisms of the present invention may further be incorporated in any map of a specific candidate region of the genome (a specific chromosome or a specific chromosomal segment for example).
As mentioned above, association studies may be conducted within the general population and are not limited to studies performed on related individuals in affected families. Association studies are extremely valuable as they permit the analysis of sporadic or multifactor traits. Moreover, association studies represent a powerful method for fine-scale mapping enabling much finer mapping of trait causing alleles than linkage studies. Studies based on pedigrees often only narrow the location of the trait causing allele. Association studies using the polymorphisms of the present invention can therefore be used to refine the location of a trait causing allele in a candidate region identified by Linkage Analysis methods. Moreover, once a chromosome segment of interest has been identified, the presence of a candidate gene such as a candidate gene of the present invention, in the region of interest can provide a shortcut to the identification of the trait causing allele. Polymorphisms of the present invention can be used to demonstrate that a candidate gene is associated with a trait. Such uses are specifically contemplated in the present invention.
Determining the frequency of a polymorphism allele or of a polymorphism haplotype in a population
Association studies explore the relationships among frequencies for sets of alleles between loci.
Determining The Frequency Of An Allele In A Population
Allelic frequencies of the polymorphisms in a population can be determined using one of the methods described above under the heading "Methods for genotyping an individual for polymorphisms", or any genotyping procedure suitable for this intended purpose. The frequency of a polymorphism allele in a population can be determined by genotyping pooled samples or individual samples. One way to reduce the number of genotypings required is to use pooled samples. A major obstacle in using pooled samples is the accuracy and reproducibility with which DNA concentrations can be determined when setting up the pools. Genotyping individual samples provides higher sensitivity, reproducibility and accuracy and is the preferred method used in the present invention. Preferably, each individual is genotyped separately and simple gene counting is applied to determine the frequency of an allele of a polymorphism or of a genotype in a given population.
The invention also relates to methods of estimating the frequency of an allele in a population comprising: a) genotyping individuals from said population for said polymorphism according to the method of the present invention; b) determining the proportional representation of said polymorphism in said population. In addition, the methods of estimating the frequency of an allele in a population of the invention encompass methods with any further limitation described in this disclosure, or those following, specified alone or in any combination: optionally, wherein said Kalpa-related polymorphism is selected from the group consisting of the polymorphisms of Table 3 and SEQ ID NOS 5 to 23, and the complements thereof, or optionally the polymorphisms in linkage disequilibrium therewith; optionally, wherein said Kalpa-related polymorphism is a polymorphism located in a gene selected from the group consisting of the Kalpa genes, and the complements thereof, or optionally the polymorphisms in linkage disequilibrium therewith. Optionally, determining the frequency of a polymorphism allele in a population may be accomplished by determining the identity of the nucleotides for both copies of said polymorphism present in the genome of each individual in said population and calculating the proportional representation of said nucleotide at said Kalpa-related polymorphism for the population; optionally, determining the proportional representation may be accomplished by performing a genotyping method of the invention on a pooled biological sample derived from a representative number of individuals, or each individual, in said population, and calculating the proportional amount of said nucleotide compared with the total.
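By way of illustration only, simple gene counting on individually genotyped samples may be sketched as follows. The function name, the representation of a genotype as a pair of allele labels, and the five-individual sample are hypothetical and are not taken from the present disclosure.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Estimate allele frequencies at one polymorphism by gene counting.

    genotypes: iterable of 2-tuples, one per diploid individual,
    e.g. ("A", "G"). Each individual contributes two allele copies.
    """
    counts = Counter()
    for allele_1, allele_2 in genotypes:
        counts[allele_1] += 1
        counts[allele_2] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genotypes supplied")
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical sample: five individuals typed at one biallelic site.
sample = [("A", "A"), ("A", "G"), ("G", "G"), ("A", "G"), ("A", "A")]
freqs = allele_frequencies(sample)
```

In this sample allele A is counted six times among ten allele copies.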
Determining The Frequency Of A Haplotype In A Population
The gametic phase of haplotypes is unknown when diploid individuals are heterozygous at more than one locus. Using genealogical information in families, gametic phase can sometimes be inferred (Perlin et al., 1994). When no genealogical information is available different strategies may be used. One possibility is that the multiple-site heterozygous diploids can be eliminated from the analysis, keeping only the homozygotes and the single-site heterozygote individuals, but this approach might lead to a possible bias in the sample composition and the underestimation of low-frequency haplotypes. Another possibility is that single chromosomes can be studied independently, for example, by asymmetric PCR amplification (see Newton et al., 1989; Wu et al., 1989) or by isolation of single chromosome by limit dilution followed by PCR amplification (see Ruano et al., 1990). Further, a sample may be haplotyped for sufficiently close polymorphisms by double PCR amplification of specific alleles (Sarkar, G. and Sommer S. S., 1991). These approaches are not entirely satisfying either because of their technical complexity, the additional cost they entail, their lack of generalization at a large scale, or the possible biases they introduce. To overcome these difficulties, an algorithm to infer the phase of PCR-amplified DNA genotypes introduced by Clark, A.G. (1990) may be used. Briefly, the principle is to start filling a preliminary list of haplotypes present in the sample by examining unambiguous individuals, that is, the complete homozygotes and the single-site heterozygotes. Then other individuals in the same sample are screened for the possible occurrence of previously recognized haplotypes. For each positive identification, the complementary haplotype is added to the list of recognized haplotypes, until the phase information for all individuals is either resolved or identified as unresolved.
This method assigns a single haplotype to each multiheterozygous individual, whereas several haplotypes are possible when there is more than one heterozygous site. Alternatively, one can use methods estimating haplotype frequencies in a population without assigning haplotypes to each individual. Preferably, a method based on an expectation-maximization (EM) algorithm (Dempster et al., 1977) leading to maximum-likelihood estimates of haplotype frequencies under the assumption of Hardy-Weinberg proportions (random mating) is used (see Excoffier L. and Slatkin M., 1995). The EM algorithm is a generalized iterative maximum-likelihood approach to estimation that is useful when data are ambiguous and/or incomplete. The EM algorithm is used to resolve heterozygotes into haplotypes. Haplotype estimations are further described below under the heading "Statistical Methods." Any other method known in the art to determine or to estimate the frequency of a haplotype in a population may be used.
The invention also encompasses methods of estimating the frequency of a haplotype for a set of polymorphisms in a population, comprising the steps of: a) genotyping at least one Kalpa-related polymorphism according to a method of the invention for each individual in said population; b) genotyping a second polymorphism by determining the identity of the nucleotides at said second polymorphism for both copies of said second polymorphism present in the genome of each individual in said population; and c) applying a haplotype determination method to the identities of the nucleotides determined in steps a) and b) to obtain an estimate of said frequency. In addition, the methods of estimating the frequency of a haplotype of the invention encompass methods with any further limitation described in this disclosure, or those following, specified alone or in any combination: optionally, wherein said Kalpa-related polymorphism is selected from the group consisting of the polymorphisms of Table 3, and the complements thereof, or optionally the polymorphisms in linkage disequilibrium therewith; optionally, wherein said Kalpa-related polymorphism is a polymorphism located in a gene selected from the group consisting of the Kalpa genes, and the complements thereof, or optionally the polymorphisms in linkage disequilibrium therewith. Optionally, said haplotype determination method is performed by asymmetric PCR amplification, double PCR amplification of specific alleles, the Clark algorithm, or an expectation-maximization algorithm.
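The phase-inference principle just outlined may be sketched, in simplified form, as follows. This is a minimal illustration and not the published implementation: the function names, the representation of a multi-locus genotype as a tuple of allele pairs, and the five-individual sample are all assumptions made for the example, and the result depends on the order in which individuals and recognized haplotypes are examined.

```python
def resolve_unambiguous(genotype):
    """Return both haplotypes when at most one site is heterozygous, else None."""
    het_sites = [i for i, (a, b) in enumerate(genotype) if a != b]
    if len(het_sites) > 1:
        return None
    return tuple(a for a, _ in genotype), tuple(b for _, b in genotype)


def complement(genotype, haplotype):
    """Haplotype that pairs with `haplotype` to give `genotype`, or None."""
    other = []
    for (a, b), h in zip(genotype, haplotype):
        if h == a:
            other.append(b)
        elif h == b:
            other.append(a)
        else:
            return None  # haplotype is incompatible at this site
    return tuple(other)


def clark_phase(genotypes):
    """Assign haplotype pairs by repeatedly matching recognized haplotypes."""
    known, resolved = [], {}
    # Step 1: seed the list with homozygotes and single-site heterozygotes.
    for idx, g in enumerate(genotypes):
        pair = resolve_unambiguous(g)
        if pair is not None:
            resolved[idx] = pair
            for h in pair:
                if h not in known:
                    known.append(h)
    # Step 2: screen the remaining individuals until nothing new is resolved.
    progress = True
    while progress:
        progress = False
        for idx, g in enumerate(genotypes):
            if idx in resolved:
                continue
            for h in known:
                other = complement(g, h)
                if other is not None:
                    resolved[idx] = (h, other)
                    if other not in known:
                        known.append(other)
                    progress = True
                    break
    unresolved = [i for i in range(len(genotypes)) if i not in resolved]
    return resolved, unresolved


# Hypothetical three-site genotypes for five individuals.
sample = [
    (("A", "A"), ("C", "C"), ("G", "G")),
    (("A", "T"), ("C", "C"), ("G", "G")),
    (("A", "T"), ("C", "G"), ("G", "G")),
    (("A", "T"), ("C", "G"), ("G", "T")),
    (("G", "C"), ("T", "A"), ("G", "G")),
]
resolved, unresolved = clark_phase(sample)
```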
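As a hedged illustration of the expectation-maximization approach, the following sketch estimates haplotype frequencies from unphased multi-locus genotypes under Hardy-Weinberg proportions. It is not the EM-HAPLO or Arlequin implementation; the function names, the uniform starting frequencies, the iteration cap and the ten-individual sample are assumptions chosen for the example.

```python
import math
from collections import Counter
from itertools import product


def compatible_pairs(genotype):
    """All unordered haplotype pairs consistent with an unphased genotype."""
    pairs = set()
    for choice in product((0, 1), repeat=len(genotype)):
        hap_1 = tuple(site[c] for site, c in zip(genotype, choice))
        hap_2 = tuple(site[1 - c] for site, c in zip(genotype, choice))
        pairs.add(tuple(sorted((hap_1, hap_2))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, tol=1e-7, max_iter=1000):
    """Maximum-likelihood haplotype frequencies by EM (gene counting)."""
    phenotypes = Counter(genotypes)  # number of individuals per phenotype
    n_total = sum(phenotypes.values())
    expansions = {g: compatible_pairs(g) for g in phenotypes}
    haplotypes = sorted({h for ps in expansions.values() for p in ps for h in p})
    # Assumed starting point: equal frequencies for all candidate haplotypes.
    freqs = {h: 1.0 / len(haplotypes) for h in haplotypes}
    prev_loglik = None
    for _ in range(max_iter):
        counts = dict.fromkeys(haplotypes, 0.0)
        loglik = 0.0
        for g, n_j in phenotypes.items():
            # Genotype probabilities under random mating.
            probs = [freqs[h1] ** 2 if h1 == h2 else 2 * freqs[h1] * freqs[h2]
                     for h1, h2 in expansions[g]]
            p_j = sum(probs)
            loglik += n_j * math.log(p_j)
            # Expectation: share the n_j individuals among possible genotypes.
            for (h1, h2), p in zip(expansions[g], probs):
                weight = n_j * p / p_j
                counts[h1] += weight
                counts[h2] += weight
        # Maximization: re-estimate haplotype frequencies by gene counting.
        freqs = {h: c / (2 * n_total) for h, c in counts.items()}
        if prev_loglik is not None and abs(loglik - prev_loglik) < tol:
            break
        prev_loglik = loglik
    return freqs


# Hypothetical two-site sample: eight unambiguous individuals and two
# double heterozygotes whose phase is unknown.
sample = ([(("A", "A"), ("C", "C"))] * 4
          + [(("G", "G"), ("T", "T"))] * 4
          + [(("A", "G"), ("C", "T"))] * 2)
estimates = em_haplotype_frequencies(sample)
```

In this sample the double heterozygotes are attributed almost entirely to the two haplotypes already seen in the homozygotes, so the estimates for those two haplotypes approach one half each.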
Linkage Disequilibrium Analysis
Linkage disequilibrium is the non-random association of alleles at two or more loci and represents a powerful tool for mapping genes involved in disease traits (see Ajioka R.S. et al., 1997). Polymorphisms, because they are densely spaced in the human genome and can be genotyped in greater numbers than other types of genetic markers (such as RFLP or VNTR markers), are particularly useful in genetic analysis based on linkage disequilibrium.
When a disease mutation is first introduced into a population (by a new mutation or the immigration of a mutation carrier), it necessarily resides on a single chromosome and thus on a single "background" or "ancestral" haplotype of linked markers. Consequently, there is complete disequilibrium between these markers and the disease mutation: one finds the disease mutation only in the presence of a specific set of marker alleles. Through subsequent generations recombination events occur between the disease mutation and these marker polymorphisms, and the disequilibrium gradually dissipates. The pace of this dissipation is a function of the recombination frequency, so the markers closest to the disease gene will manifest higher levels of disequilibrium than those that are further away. When not broken up by recombination, "ancestral" haplotypes and linkage disequilibrium between marker alleles at different loci can be tracked not only through pedigrees but also through populations. Linkage disequilibrium is usually seen as an association between one specific allele at one locus and another specific allele at a second locus.
The pattern or curve of disequilibrium between disease and marker loci is expected to exhibit a maximum that occurs at the disease locus. Consequently, the amount of linkage disequilibrium between a disease allele and closely linked genetic markers may yield valuable information regarding the location of the disease gene. For fine-scale mapping of a disease locus, it is useful to have some knowledge of the patterns of linkage disequilibrium that exist between markers in the studied region. As mentioned above the mapping resolution achieved through the analysis of linkage disequilibrium is much higher than that of linkage studies. The high density of polymorphisms combined with linkage disequilibrium analysis provides powerful tools for fine-scale mapping. Different methods to calculate linkage disequilibrium are described below under the heading "Statistical Methods".
Population-Based Case-Control Studies Of Trait-Marker Associations
As mentioned above, the occurrence of pairs of specific alleles at different loci on the same chromosome is not random and the deviation from random is called linkage disequilibrium. Association studies focus on population frequencies and rely on the phenomenon of linkage disequilibrium. If a specific allele in a given gene is directly involved in causing a particular trait, its frequency will be statistically increased in an affected (trait positive) population, when compared to the frequency in a trait negative population or in a random control population. As a consequence of the existence of linkage disequilibrium, the frequency of all other alleles present in the haplotype carrying the trait-causing allele will also be increased in trait positive individuals compared to trait negative individuals or random controls. Therefore, association between the trait and any allele (specifically a polymorphism allele) in linkage disequilibrium with the trait-causing allele will suffice to suggest the presence of a trait-related gene in that particular region. Case-control populations can be genotyped for polymorphisms to identify associations that narrowly locate a trait causing allele. Because any marker in linkage disequilibrium with a given marker associated with a trait will itself be associated with the trait, linkage disequilibrium allows the relative frequencies in case-control populations of a limited number of genetic polymorphisms to be analyzed as an alternative to screening all possible functional polymorphisms in order to find trait-causing alleles. Association studies compare the frequency of marker alleles in unrelated case-control populations, and represent powerful tools for the dissection of complex traits.
Case-Control Populations (Inclusion Criteria)
Population-based association studies do not concern familial inheritance but compare the prevalence of a particular genetic marker, or a set of markers, in case-control populations. They are case-control studies based on comparison of unrelated case (affected or trait positive) individuals and unrelated control (unaffected, trait negative or random) individuals. Preferably the control group is composed of unaffected or trait negative individuals. Further, the control group is ethnically matched to the case population. Moreover, the control group is preferably matched to the case population for the main known confounding factor for the trait under study (for example age-matched for an age-dependent trait). Ideally, individuals in the two samples are paired in such a way that they are expected to differ only in their disease status. The terms "trait positive population", "case population" and "affected population" are used interchangeably herein.
An important step in the dissection of complex traits using association studies is the choice of case-control populations (see Lander and Schork, 1994). A major step in the choice of case-control populations is the clinical definition of a given trait or phenotype. Any genetic trait may be analyzed by the association method proposed here by carefully selecting the individuals to be included in the trait positive and trait negative phenotypic groups. Four criteria are often useful: clinical phenotype, age at onset, family history and severity. The selection procedure for continuous or quantitative traits (such as blood pressure for example) involves selecting individuals at opposite ends of the phenotype distribution of the trait under study, so as to include in these trait positive and trait negative populations individuals with non-overlapping phenotypes. Preferably, case-control populations comprise phenotypically homogeneous populations. Trait positive and trait negative populations comprise phenotypically uniform populations of individuals representing each between 1 and 98%, preferably between 1 and 80%, more preferably between 1 and 50%, and more preferably between 1 and 30%, most preferably between 1 and 20% of the total population under study, and preferably selected among individuals exhibiting non-overlapping phenotypes. The clearer the difference between the two trait phenotypes, the greater the probability of detecting an association with polymorphisms. The selection of those drastically different but relatively uniform phenotypes enables efficient comparisons in association studies and the possible detection of marked differences at the genetic level, provided that the sample sizes of the populations under study are significant enough.
In preferred embodiments, a first group of between 50 and 300 trait positive individuals, preferably about 100 individuals, are recruited according to their phenotypes. A similar number of control individuals are included in such studies.
Association Analysis
The invention also comprises methods of detecting an association between a genotype and a phenotype, comprising the steps of: a) determining the frequency of at least one Kalpa-related polymorphism in a trait positive population according to a genotyping method of the invention; b) determining the frequency of said Kalpa-related polymorphism in a control population according to a genotyping method of the invention; and c) determining whether a statistically significant association exists between said genotype and said phenotype. Optionally, said control population may be a trait negative population, or a random population; optionally, each of said genotyping steps a) and b) may be performed on a pooled biological sample derived from each of said populations; optionally, each of said genotyping steps a) and b) is performed separately on biological samples derived from each individual in said population or a subsample thereof. The general strategy to perform association studies using polymorphisms derived from a region carrying a candidate gene is to scan two groups of individuals (case-control populations) in order to measure and statistically compare the allele frequencies of the polymorphisms of the present invention in both groups.
If a statistically significant association with a trait is identified for at least one of the analyzed polymorphisms, one can assume that: either the associated allele is directly responsible for causing the trait (i.e. the associated allele is the trait causing allele), or more likely the associated allele is in linkage disequilibrium with the trait causing allele. The specific characteristics of the associated allele with respect to the candidate gene function usually give further insight into the relationship between the associated allele and the trait (causal or in linkage disequilibrium). If the evidence indicates that the associated allele within the candidate gene is most probably not the trait causing allele but is in linkage disequilibrium with the real trait causing allele, then the trait causing allele can be found by sequencing the vicinity of the associated marker, and performing further association studies with the polymorphisms that are revealed in an iterative manner.
Association studies are usually run in two successive steps. In a first phase, the frequencies of a reduced number of polymorphisms from the candidate gene are determined in the trait positive and control populations. In a second phase of the analysis, the position of the genetic loci responsible for the given trait is further refined using a higher density of markers from the relevant region. Alternatively, a single phase may be sufficient to establish significant associations.
Haplotype Analysis
As described above, when a chromosome carrying a disease allele first appears in a population as a result of either mutation or migration, the mutant allele necessarily resides on a chromosome having a set of linked markers: the ancestral haplotype. This haplotype can be tracked through populations and its statistical association with a given trait can be analyzed. Complementing single point (allelic) association studies with multi-point association studies also called haplotype studies increases the statistical power of association studies. Thus, a haplotype association study allows one to define the frequency and the type of the ancestral carrier haplotype. A haplotype analysis is important in that it increases the statistical power of an analysis involving individual markers.
In a first stage of a haplotype frequency analysis, the frequency of the possible haplotypes based on various combinations of the identified polymorphisms of the invention is determined. The haplotype frequency is then compared for distinct populations of trait positive and control individuals. The number of trait positive individuals which should be subjected to this analysis to obtain statistically significant results usually ranges between 30 and 300, with a preferred number of individuals ranging between 50 and 150. The same considerations apply to the number of unaffected individuals (or random control) used in the study. The results of this first analysis provide haplotype frequencies in case-control populations; for each evaluated haplotype frequency a p-value and an odds ratio are calculated. If a statistically significant association is found, the relative risk for an individual carrying the given haplotype of being affected with the trait under study can be approximated.
An additional embodiment of the present invention encompasses methods of detecting an association between a haplotype and a phenotype, comprising the steps of: a) estimating the frequency of at least one haplotype in a trait positive population, according to a method of the invention for estimating the frequency of a haplotype; b) estimating the frequency of said haplotype in a control population, according to a method of the invention for estimating the frequency of a haplotype; and c) determining whether a statistically significant association exists between said haplotype and said phenotype. In addition, the methods of detecting an association between a haplotype and a phenotype of the invention encompass methods with any further limitation described in this disclosure, or those following: optionally, wherein said Kalpa-related polymorphism is a polymorphism located in a sequence of SEQ ID Nos 1 or 2 or 3, the complements thereof, or optionally the polymorphisms in linkage disequilibrium therewith; optionally, said control population is a trait negative population, or a random population. Optionally, said method comprises the additional steps of determining the phenotype in said trait positive and said control populations prior to step c).
Interaction Analysis
The polymorphisms of the present invention may also be used to identify patterns of polymorphisms associated with detectable traits resulting from polygenic interactions. The analysis of genetic interaction between alleles at unlinked loci requires individual genotyping using the techniques described herein. The analysis of allelic interaction among a selected set of polymorphisms with appropriate level of statistical significance can be considered as a haplotype analysis. Interaction analysis comprises stratifying the case-control populations with respect to a given haplotype at the first locus and performing a haplotype analysis at the second locus within each subpopulation. Statistical methods used in association studies are further described below.
Testing For Linkage In The Presence Of Association
The polymorphisms of the present invention may further be used in TDT (transmission/disequilibrium test). TDT tests for both linkage and association and is not affected by population stratification. TDT requires data for affected individuals and their parents or data from unaffected sibs instead of from parents (see Spielmann S. et al., 1993; Schaid D.J. et al., 1996, Spielmann S. and Ewens W.J., 1998). Such combined tests generally reduce the false-positive errors produced by separate analyses.
Statistical methods
In general, any method known in the art to test whether a trait and a genotype show a statistically significant correlation may be used.
1) Methods In Linkage Analysis
Statistical methods and computer programs useful for linkage analysis are well-known to those skilled in the art (see Terwilliger J.D. and Ott J., 1994; Ott J., 1991).
2) Methods To Estimate Haplotype Frequencies In A Population
As described above, when genotypes are scored, it is often not possible to distinguish heterozygotes so that haplotype frequencies cannot be easily inferred. When the gametic phase is not known, haplotype frequencies can be estimated from the multilocus genotypic data. Any method known to a person skilled in the art can be used to estimate haplotype frequencies (see Lange K., 1997; Weir, B.S., 1996). Preferably, maximum-likelihood haplotype frequencies are computed using an Expectation-Maximization (EM) algorithm (see Dempster et al., 1977; Excoffier L. and Slatkin M., 1995). This procedure is an iterative process aiming at obtaining maximum-likelihood estimates of haplotype frequencies from multi-locus genotype data when the gametic phase is unknown. Haplotype estimations are usually performed by applying the EM algorithm using for example the EM-HAPLO program (Hawley M. E. et al., 1994) or the Arlequin program (Schneider et al., 1997). The EM algorithm is a generalized iterative maximum likelihood approach to estimation and is briefly described below.
Please note that in the present section, "Methods To Estimate Haplotype Frequencies In A Population," phenotypes will refer to multi-locus genotypes with unknown haplotypic phase. Genotypes will refer to multi-locus genotypes with known haplotypic phase.
Suppose one has a sample of N unrelated individuals typed for K markers. The data observed are the unknown-phase K-locus phenotypes, which can be categorized into F different phenotypes. Further, suppose that we have H possible haplotypes (in the case of K polymorphisms, the maximum number of possible haplotypes is $H = 2^K$). For phenotype j with $c_j$ possible genotypes, we have:
$$P_j = \sum_{i=1}^{c_j} P(\mathrm{genotype}(i)) = \sum_{i=1}^{c_j} P(h_k, h_l)$$
Equation 1
where $P_j$ is the probability of the j-th phenotype, and $P(h_k, h_l)$ is the probability of the i-th genotype composed of haplotypes $h_k$ and $h_l$. Under random mating (i.e. Hardy-Weinberg Equilibrium), $P(h_k, h_l)$ is expressed as:
$$P(h_k, h_l) = P(h_k)^2 \ \text{for } h_k = h_l, \qquad P(h_k, h_l) = 2P(h_k)P(h_l) \ \text{for } h_k \neq h_l$$
Equation 2
The E-M algorithm is composed of the following steps: First, the genotype frequencies are estimated from a set of initial values of haplotype frequencies. These haplotype frequencies are denoted $P_1^{(0)}, P_2^{(0)}, P_3^{(0)}, \ldots, P_H^{(0)}$. The initial values for the haplotype frequencies may be obtained from a random number generator or in some other way well known in the art. This step is referred to as the Expectation step. The next step in the method, called the Maximization step, consists of using the estimates for the genotype frequencies to re-calculate the haplotype frequencies. The first iteration haplotype frequency estimates are denoted by $P_1^{(1)}, P_2^{(1)}, P_3^{(1)}, \ldots, P_H^{(1)}$. In general, the Expectation step at the s-th iteration consists of calculating the probability of placing each phenotype into the different possible genotypes based on the haplotype frequencies of the previous iteration:
$$P_j(h_k, h_l)^{(s)} = \frac{n_j}{N} \cdot \frac{P(h_k, h_l)^{(s)}}{P_j^{(s)}}$$
Equation 3
where $n_j$ is the number of individuals with the j-th phenotype and $P_j(h_k, h_l)^{(s)}$ is the probability of genotype $h_k, h_l$ in phenotype j. In the Maximization step, which is equivalent to the gene-counting method (Smith, Ann. Hum. Genet., 21:254-276, 1957), the haplotype frequencies are re-estimated based on the genotype estimates:
$$P_t^{(s+1)} = \frac{1}{2} \sum_{j=1}^{F} \sum_{i=1}^{c_j} \delta_{it} \, P_j(h_k, h_l)^{(s)}$$
Equation 4
Here, $\delta_{it}$ is an indicator variable which counts the number of occurrences of haplotype t in the i-th genotype; it takes on values 0, 1, and 2. The E-M iterations cease when the following criterion has been reached. Using Maximum Likelihood Estimation (MLE) theory, one assumes that the phenotypes j are distributed multinomially. At each iteration s, one can compute the likelihood function L. Convergence is achieved when the difference of the log-likelihood between two consecutive iterations is less than some small number, preferably $10^{-7}$.
3) Methods To Calculate Linkage Disequilibrium Between Markers
A number of methods can be used to calculate linkage disequilibrium between any two genetic positions; in practice, linkage disequilibrium is measured by applying a statistical association test to haplotype data taken from a population.
Linkage disequilibrium between any pair of polymorphisms comprising at least one of the polymorphisms of the present invention ($M_i$, $M_j$) having alleles ($a_i$/$b_i$) at marker $M_i$ and alleles ($a_j$/$b_j$) at marker $M_j$ can be calculated for every allele combination ($a_i,a_j$; $a_i,b_j$; $b_i,a_j$ and $b_i,b_j$), according to the Piazza formula:
$$\Delta_{a_i a_j} = \sqrt{\theta_4} - \sqrt{(\theta_4 + \theta_3)(\theta_4 + \theta_2)}$$
where:
$\theta_4$ = - - = frequency of genotypes not having allele $a_i$ at $M_i$ and not having allele $a_j$ at $M_j$
$\theta_3$ = - + = frequency of genotypes not having allele $a_i$ at $M_i$ and having allele $a_j$ at $M_j$
$\theta_2$ = + - = frequency of genotypes having allele $a_i$ at $M_i$ and not having allele $a_j$ at $M_j$
Linkage disequilibrium (LD) between pairs of polymorphisms ($M_i$, $M_j$) can also be calculated for every allele combination ($a_i,a_j$; $a_i,b_j$; $b_i,a_j$ and $b_i,b_j$), according to the maximum-likelihood estimate (MLE) for delta (the composite genotypic disequilibrium coefficient), as described by Weir (Weir B. S., 1996). The MLE for the composite linkage disequilibrium is:
$$D_{a_i a_j} = \frac{2n_1 + n_2 + n_3 + n_4/2}{N} - 2\,\mathrm{pr}(a_i)\,\mathrm{pr}(a_j)$$
where $n_1$ is the number of individuals of phenotype ($a_i/a_i$, $a_j/a_j$), $n_2$ the number of phenotype ($a_i/a_i$, $a_j/b_j$), $n_3$ the number of phenotype ($a_i/b_i$, $a_j/a_j$), $n_4$ the number of phenotype ($a_i/b_i$, $a_j/b_j$), and N is the number of individuals in the sample. This formula allows linkage disequilibrium between alleles to be estimated when only genotype, and not haplotype, data are available.
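The composite disequilibrium estimate may be illustrated by the following sketch, which counts the four genotype classes directly from unphased two-marker genotypes. The function name, the data layout (one pair of allele tuples per individual) and the two small samples are hypothetical.

```python
def composite_ld(individuals, a_i, a_j):
    """Composite disequilibrium for alleles a_i (marker i) and a_j (marker j).

    individuals: list of (genotype_i, genotype_j), each genotype being a
    2-tuple of allele labels; gametic phase is not required.
    """
    n = len(individuals)
    if n == 0:
        raise ValueError("no individuals supplied")
    n1 = n2 = n3 = n4 = 0
    copies_i = copies_j = 0
    for g_i, g_j in individuals:
        c_i = g_i.count(a_i)  # copies of a_i carried at marker i (0, 1 or 2)
        c_j = g_j.count(a_j)  # copies of a_j carried at marker j (0, 1 or 2)
        copies_i += c_i
        copies_j += c_j
        if c_i == 2 and c_j == 2:
            n1 += 1           # (a_i/a_i, a_j/a_j)
        elif c_i == 2 and c_j == 1:
            n2 += 1           # (a_i/a_i, a_j/b_j)
        elif c_i == 1 and c_j == 2:
            n3 += 1           # (a_i/b_i, a_j/a_j)
        elif c_i == 1 and c_j == 1:
            n4 += 1           # (a_i/b_i, a_j/b_j)
    pr_ai = copies_i / (2 * n)
    pr_aj = copies_j / (2 * n)
    return (2 * n1 + n2 + n3 + n4 / 2) / n - 2 * pr_ai * pr_aj


# Hypothetical samples: alleles A/G at the first marker, C/T at the second.
associated = [(("A", "A"), ("C", "C"))] * 2 + [(("G", "G"), ("T", "T"))] * 2
independent = [
    (("A", "A"), ("C", "C")),
    (("A", "A"), ("T", "T")),
    (("G", "G"), ("C", "C")),
    (("G", "G"), ("T", "T")),
]
d_associated = composite_ld(associated, "A", "C")
d_independent = composite_ld(independent, "A", "C")
```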
Another means of calculating the linkage disequilibrium between markers is as follows. For a pair of polymorphisms, $M_i$ ($a_i$/$b_i$) and $M_j$ ($a_j$/$b_j$), fitting the Hardy-Weinberg equilibrium, one can estimate the four possible haplotype frequencies in a given population according to the approach described above.
The estimation of gametic disequilibrium between $a_i$ and $a_j$ is simply:
$$D_{a_i a_j} = \mathrm{pr}(\mathrm{haplotype}(a_i, a_j)) - \mathrm{pr}(a_i)\,\mathrm{pr}(a_j)$$
where $\mathrm{pr}(a_i)$ is the probability of allele $a_i$ and $\mathrm{pr}(a_j)$ is the probability of allele $a_j$, and where $\mathrm{pr}(\mathrm{haplotype}(a_i, a_j))$ is estimated as in Equation 3 above. For a pair of polymorphisms only one measure of disequilibrium is necessary to describe the association between $M_i$ and $M_j$.
Then a normalized value of the above is calculated as follows:
$$D'_{a_i a_j} = \frac{D_{a_i a_j}}{\max\left(-\mathrm{pr}(a_i)\,\mathrm{pr}(a_j),\ -\mathrm{pr}(b_i)\,\mathrm{pr}(b_j)\right)} \quad \text{with } D_{a_i a_j} < 0$$
$$D'_{a_i a_j} = \frac{D_{a_i a_j}}{\min\left(\mathrm{pr}(b_i)\,\mathrm{pr}(a_j),\ \mathrm{pr}(a_i)\,\mathrm{pr}(b_j)\right)} \quad \text{with } D_{a_i a_j} > 0$$
The skilled person will readily appreciate that other linkage disequilibrium calculation methods can be used.
Linkage disequilibrium among a set of polymorphisms having an adequate heterozygosity rate can be determined by genotyping between 50 and 1000 unrelated individuals, preferably between 75 and 200, more preferably around 100.
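For illustration, the gametic disequilibrium and its normalized value may be computed from an estimated haplotype frequency and the two allele frequencies as sketched below. The function names and numerical inputs are hypothetical. The sketch assumes biallelic markers, so that pr(b) = 1 - pr(a), and it assumes the conventional Lewontin normalization, in which a positive D is divided by the smaller of pr(b_i)pr(a_j) and pr(a_i)pr(b_j).

```python
def gametic_disequilibrium(pr_haplotype, pr_ai, pr_aj):
    """D = pr(haplotype(a_i, a_j)) - pr(a_i) * pr(a_j)."""
    return pr_haplotype - pr_ai * pr_aj


def normalized_disequilibrium(pr_haplotype, pr_ai, pr_aj):
    """D' = D divided by its bound, given the allele frequencies."""
    d = gametic_disequilibrium(pr_haplotype, pr_ai, pr_aj)
    pr_bi, pr_bj = 1.0 - pr_ai, 1.0 - pr_aj  # biallelic markers assumed
    if d < 0:
        # Negative D is bounded below by the larger of the two negative products.
        bound = max(-pr_ai * pr_aj, -pr_bi * pr_bj)
    else:
        # Assumption: conventional bound for positive D (smaller product).
        bound = min(pr_bi * pr_aj, pr_ai * pr_bj)
    if bound == 0:
        return 0.0  # a monomorphic marker carries no disequilibrium
    return d / bound


# Hypothetical inputs: both alleles at frequency 0.5.
d_complete = normalized_disequilibrium(0.5, 0.5, 0.5)  # alleles always together
d_partial = normalized_disequilibrium(0.3, 0.5, 0.5)   # weak positive association
d_negative = normalized_disequilibrium(0.1, 0.5, 0.5)  # alleles tend to avoid each other
```

With these definitions a negative D divided by its negative bound yields a positive normalized value, as in the first of the two expressions given above.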
4) Testing For Association
The statistical significance of a correlation between a phenotype and a genotype, in this case an allele at a polymorphism or a haplotype made up of such alleles, may be determined by any statistical test known in the art and with any accepted threshold of statistical significance being required. The application of particular methods and thresholds of significance is well within the skill of the ordinary practitioner of the art. Testing for association is performed by determining the frequency of a polymorphism allele in case and control populations and comparing these frequencies with a statistical test to determine if there is a statistically significant difference in frequency which would indicate a correlation between the trait and the polymorphism allele under study. Similarly, a haplotype analysis is performed by estimating the frequencies of all possible haplotypes for a given set of polymorphisms in case and control populations, and comparing these frequencies with a statistical test to determine if there is a statistically significant correlation between the haplotype and the phenotype (trait) under study. Any statistical tool useful to test for a statistically significant association between a genotype and a phenotype may be used. Preferably the statistical test employed is a chi-square test with one degree of freedom. A P-value is calculated (the P-value is the probability that a statistic as large or larger than the observed one would occur by chance).
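A chi-square test with one degree of freedom on allele counts in case and control populations may be sketched as follows. The function name and the counts are hypothetical; the P-value uses the identity that, for one degree of freedom, the upper tail probability of the chi-square distribution equals erfc(sqrt(x/2)), and no continuity correction is applied.

```python
import math


def chi_square_2x2(case_a, case_b, control_a, control_b):
    """Pearson chi-square (1 degree of freedom) for a 2 x 2 allele-count table.

    Rows are cases and controls; columns are counts of allele a and allele b.
    Returns (chi-square statistic, P-value).
    """
    n = case_a + case_b + control_a + control_b
    row_1, row_2 = case_a + case_b, control_a + control_b
    col_1, col_2 = case_a + control_a, case_b + control_b
    denominator = row_1 * row_2 * col_1 * col_2
    if denominator == 0:
        raise ValueError("every row and column must contain observations")
    chi2 = n * (case_a * control_b - case_b * control_a) ** 2 / denominator
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail, 1 degree of freedom
    return chi2, p_value


# Hypothetical allele counts: 100 cases and 100 controls, two alleles each.
chi2, p_value = chi_square_2x2(case_a=120, case_b=80, control_a=90, control_b=110)
```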
Statistical Significance
In preferred embodiments, for significance for diagnosis purposes, either as a positive basis for further diagnostic tests or as a preliminary starting point for early preventive therapy, the p value related to a polymorphism association is preferably about $1 \times 10^{-2}$ or less, more preferably about $1 \times 10^{-4}$ or less, for a single polymorphism analysis and about $1 \times 10^{-3}$ or less, still more preferably $1 \times 10^{-6}$ or less and most preferably about $1 \times 10^{-8}$ or less, for a haplotype analysis involving two or more markers. These values are believed to be applicable to any association studies involving single or multiple marker combinations. The skilled person can use the range of values set forth above as a starting point in order to carry out association studies with polymorphisms of the present invention. In doing so, significant associations between the polymorphisms of the present invention and a trait can be revealed and used for diagnosis and drug screening purposes.
Phenotypic Permutation
In order to confirm the statistical significance of the first stage haplotype analysis described above, it might be suitable to perform further analyses in which genotyping data from case-control individuals are pooled and randomized with respect to the trait phenotype. Each individual's genotyping data are randomly allocated to one of two groups, which contain the same number of individuals as the case-control populations used to compile the data obtained in the first stage. A second stage haplotype analysis is preferably run on these artificial groups, preferably for the markers included in the haplotype of the first stage analysis showing the highest relative risk coefficient. This experiment is preferably reiterated between 100 and 10000 times. The repeated iterations allow the determination of the probability of obtaining the tested haplotype by chance.
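The permutation procedure can be sketched as follows. This is a simplified illustration, not the haplotype analysis itself: the statistic is passed in as a function, and the example statistic (difference in carrier frequency, with carriers coded 1 and non-carriers 0) is an assumption chosen for brevity. The add-one correction in the returned value is likewise a common convention and is not stated in the text.

```python
import random


def permutation_p_value(case_values, control_values, statistic,
                        n_iter=1000, seed=0):
    """Estimate how often a statistic at least as large as the observed
    one arises when phenotype labels are randomly reassigned."""
    rng = random.Random(seed)
    observed = statistic(case_values, control_values)
    pooled = list(case_values) + list(control_values)
    n_cases = len(case_values)
    hits = 0
    for _ in range(n_iter):
        # Shuffle, then cut into artificial groups of the original sizes.
        rng.shuffle(pooled)
        if statistic(pooled[:n_cases], pooled[n_cases:]) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)


def carrier_frequency_difference(cases, controls):
    """Hypothetical statistic: absolute difference in carrier frequency."""
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))
```

With `n_iter` set between 100 and 10000, the returned value approximates the probability of obtaining the observed result by chance.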
Assessment Of Statistical Association
To address the problem of false positives, a similar analysis may be performed with the same case-control populations in random genomic regions.
5) Evaluation Of Risk Factors
The association between a risk factor (in genetic epidemiology the risk factor is the presence or the absence of a certain allele or haplotype at marker loci) and a disease is measured by the odds ratio (OR) and by the relative risk (RR). If P(R+) is the probability of developing the disease for individuals with the risk factor R and P(R-) is the probability for individuals without the risk factor, then the relative risk is simply the ratio of the two probabilities, that is:
RR = P(R+) / P(R-)
In case-control studies, direct measures of the relative risk cannot be obtained because of the sampling design. However, the odds ratio allows a good approximation of the relative risk for low-incidence diseases and can be calculated:
OR = (F+ / (1 - F+)) / (F- / (1 - F-))
F+ is the frequency of the exposure to the risk factor in cases and F- is the frequency of the exposure to the risk factor in controls. F+ and F- are calculated using the allelic or haplotype frequencies of the study and further depend on the underlying genetic model (dominant, recessive, additive...). One can further estimate the attributable risk (AR), which describes the proportion of individuals in a population exhibiting a trait due to a given risk factor. This measure is important in quantifying the role of a specific factor in disease etiology and in terms of the public health impact of a risk factor. The public health relevance of this measure lies in estimating the proportion of cases of disease in the population that could be prevented if the exposure of interest were absent. AR is determined as follows:
AR = PE (RR - 1) / (PE (RR - 1) + 1)
AR is the risk attributable to a polymorphism allele or a polymorphism haplotype; PE is the frequency of exposure to an allele or a haplotype within the population at large; and RR is the relative risk, which is approximated with the odds ratio when the trait under study has a relatively low incidence in the general population.
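The odds ratio and attributable risk formulas above translate directly into code. The sketch below is illustrative only; the function and parameter names are hypothetical, and the numbers in the usage note are invented for the arithmetic, not taken from the study.

```python
def odds_ratio(f_cases, f_controls):
    """OR = (F+ / (1 - F+)) / (F- / (1 - F-)).

    f_cases: exposure frequency in cases (F+), strictly between 0 and 1.
    f_controls: exposure frequency in controls (F-), strictly between 0 and 1.
    """
    return (f_cases / (1 - f_cases)) / (f_controls / (1 - f_controls))


def attributable_risk(p_exposed, relative_risk):
    """AR = PE (RR - 1) / (PE (RR - 1) + 1).

    p_exposed: exposure frequency in the population at large (PE).
    relative_risk: RR, or the odds ratio as its approximation
    for a low-incidence trait.
    """
    excess = p_exposed * (relative_risk - 1)
    return excess / (excess + 1)
```

With an exposure frequency of 0.5 in cases and 0.25 in controls, the odds ratio is 3; using that value as RR with PE = 0.5 gives an attributable risk of 0.5.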
Identification of polymorphisms in linkage disequilibrium with the polymorphisms of the invention
Once a first polymorphism has been identified in a genomic region of interest, the practitioner of ordinary skill in the art, using the teachings of the present invention, can easily identify additional polymorphisms in linkage disequilibrium with this first marker. As mentioned before, any marker in linkage disequilibrium with a first marker associated with a trait will be associated with the trait.
Therefore, once an association has been demonstrated between a given polymorphism and a trait, the discovery of additional polymorphisms associated with this trait is of great interest in order to increase the density of polymorphisms in this particular region. The causal gene or mutation will be found in the vicinity of the marker or set of markers showing the highest correlation with the trait.
Identification of additional markers in linkage disequilibrium with a given marker involves: (a) amplifying a genomic fragment comprising a first polymorphism from a plurality of individuals; (b) identifying second polymorphisms in the genomic region harboring said first polymorphism; (c) conducting a linkage disequilibrium analysis between said first polymorphism and second polymorphisms; and (d) selecting said second polymorphisms as being in linkage disequilibrium with said first marker. Subcombinations comprising steps (b) and (c) are also contemplated.
Methods to identify polymorphisms and to conduct linkage disequilibrium analysis are described herein and can be carried out by the skilled person without undue experimentation.
The present invention then also concerns polymorphisms which are in linkage disequilibrium with the specific biallelic markers shown in Figure 2 and which are expected to present similar characteristics in terms of their respective association with a given trait.
Identification of Functional Mutations
Once a positive association is confirmed with a polymorphism of the present invention, the associated candidate gene can be scanned for mutations by comparing the sequences of a selected number of affected individuals and control individuals. In a preferred embodiment, functional regions such as exons and splice sites, promoters and other regulatory regions of the candidate gene are scanned for mutations. Preferably, affected individuals carry the haplotype shown to be associated with the trait, and trait negative or control individuals do not carry the haplotype or allele associated with the trait.
The mutation detection procedure is essentially similar to that used for SNP identification. The method used to detect such mutations generally comprises the following steps: (a) amplification of a region of the candidate gene comprising a biallelic marker or a group of polymorphisms associated with the trait from DNA samples of affected patients and trait negative controls; (b) sequencing of the amplified region; (c) comparison of DNA sequences from affected trait-positive patients and trait-negative controls; and (d) determination of mutations specific to affected trait-positive patients. Subcombinations which comprise steps (b) and (c) are specifically contemplated.
It is preferred that candidate polymorphisms then be verified by screening a larger population of cases and controls by means of any genotyping procedure such as those described herein, preferably using a microsequencing technique in an individual test format. Polymorphisms are considered as candidate mutations when present in cases and controls at frequencies compatible with the expected association results.
Candidate polymorphisms and mutations of the Kalpa gene suspected of being responsible for the detectable phenotype, such as low cholesterol or LDL levels, can be confirmed by screening a larger population of affected and unaffected individuals using any of the genotyping procedures described herein. Preferably the microsequencing technique is used. Such polymorphisms are considered as candidate "trait-causing" mutations when they exhibit a statistically significant correlation with the detectable phenotype.
EXAMPLES
Several of the methods of the present invention are described in the following examples, which are offered by way of illustration and not by way of limitation. Many other modifications and variations of the invention as herein set forth can be made without departing from the spirit and scope thereof and therefore only such limitations should be imposed as are indicated by the appended claims.
Example 1
Association between Kalpa and lipid phenotypes
Sample definition
The sample comprised three groups:
1) A set of 146 unrelated individuals called 'UIS' for 'Unrelated Individual Set'. Due to the plate configuration
2) A set of 382 nuclear families (some of them related), in general with only the sibship genotyped, called 'FBS' for 'Family Based Set'. A sub-sample named 'FBS-adult' is composed of the families whose children are adults between 20 and 70 years of age. This sub-sample is composed of 194 families.
3) A set composed of the UIS and one sib from each sibship, called 'CUS' for 'Complete Unrelated Set'. This sample is composed of 505 individuals. A sub-sample named 'CUS adults' is composed of the adult individuals of the CUS (between 20 and 70 years of age). This sub-sample is composed of 340 individuals.
Statistical analysis
For all the analyses the marker coding was the following: the most frequent allele (as identified in the CUS) is coded allele 1, the less frequent allele 2. Allele 1 is used as the reference allele. All the genetic models and the results presented are thus the effect of allele 2 (or genotypes containing 2) relative to allele 1. For genotypic coding schemes, it is always the genotypes 11 or 11 + 12 that are taken as references.
For the UIS and the CUS a variance analysis was performed using 4 different models:
o Factor (unconstrained relation between homozygotes 11, heterozygotes 12 and homozygotes 22), referred to hereafter as F,
o Co-dominant effect (C),
o Dominant effect (D),
o Recessive effect (R).
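The four coding schemes can be illustrated with a small helper that turns a genotype into the value entered in the variance analysis. This is a sketch under assumptions: genotypes are written as the strings "11", "12" and "22", the co-dominant model is coded here as the number of copies of allele 2, and the function name is hypothetical. The text does not specify the exact numerical coding used.

```python
def code_genotype(genotype, model):
    """Code a genotype ("11", "12" or "22") under a genetic model.

    F: unconstrained factor, the genotype itself is the level.
    C: co-dominant, assumed here to be the count of allele 2 (0, 1, 2).
    D: dominant, 1 if at least one copy of allele 2 (reference: 11).
    R: recessive, 1 only for homozygotes 22 (reference: 11 + 12).
    """
    copies_of_allele_2 = genotype.count("2")
    if model == "F":
        return genotype
    if model == "C":
        return copies_of_allele_2
    if model == "D":
        return 1 if copies_of_allele_2 >= 1 else 0
    if model == "R":
        return 1 if copies_of_allele_2 == 2 else 0
    raise ValueError(f"unknown model: {model}")
```

Under this coding the reference level is always genotype 11 (models F, C, D) or the pooled genotypes 11 + 12 (model R), as stated above.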
In the UIS and CUS the effect of one polymorphism was analysed by variance analysis.
Four variables were tested corresponding to the four physiological variables total cholesterol, HDL-cholesterol, LDL-cholesterol and triglycerides.
Because the distributions of these variables in the population are not Gaussian, the analysis was performed with the natural logarithm of the measures. The effect of the polymorphism on the physiological variables was adjusted for the effects of gender, age and BMI, to rule out putative interaction with confounding factors.
The analysis of variance is the simplest case, with only one factor to test, which has either 2 levels (D, C, R coding schemes) or 3 levels (Factor coding scheme). The test of association is thus a usual Fisher test (which tests for the importance of one factor in the partition of the overall variance of the trait) and, in the case of the Factor coding scheme, mean tests.
In the family based sample, a familial based association test was performed only under an additive model, using the approach developed by Abecasis (Abecasis, G.R. (2001) Extent and distribution of linkage disequilibrium in three genomic regions. Am. J. Hum. Genet., 68, 191-197; and Abecasis, G.R., Cardon, L.R., and Cookson, W.O. (2000) A general test of association for quantitative traits in nuclear families. Am. J. Hum. Genet., 66, 279-92) and the related software QTDT. This approach is a variance component model. The familial resemblance of individuals is decomposed according to a biometrical model which can take into account a shared environmental/genetic effect (Vg). This term explains the covariance between sibs within a family.
The test of association is done using a likelihood ratio test. It is similar in concept, in that the correlation between related individuals has to be taken into account as a random component in the model.
Two different tests can be constructed: a global association test, which tests for association within and between families, and a within-family association test, which takes into account only within-family comparisons. This latter test is robust against any ethnic stratification in the sample. If an association is present, both tests have to be significant.
Analyses were undertaken with the logarithm of the four physiological variables, and adjusted for gender, age and BMI effects.
TABLE 5: Genomic information on the markers analysed
Figure imgf000127_0001
Figure imgf000127_0002
Footnote:
Pol.: polymorphism; Missing values (%): total % of missing genotypes;
Extreme CT values (%): % of points that reached the limit number of cycles of PCR without exceeding the fluorescence threshold; Allele 1 (2): most (less) frequent allele in the CUS; Allele frequency: frequency of allele 2;
HW : Test of Hardy-Weinberg departure (following a chi square with 1 df).
All the markers from the third wave are frequent and are all in Hardy-Weinberg equilibrium.
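A Hardy-Weinberg departure test of the kind referred to in the footnote can be sketched as below: expected genotype counts are derived from the observed allele frequency and compared with the observed counts by a chi-square statistic with 1 df. The function name is hypothetical, and the exact implementation used for Table 5 is not described in the text.

```python
import math


def hardy_weinberg_test(n11, n12, n22):
    """Chi-square test (1 df) of departure from Hardy-Weinberg proportions.

    n11, n12, n22: observed genotype counts. The marker must be
    polymorphic (both alleles present), otherwise an expected count is zero.
    """
    n = n11 + n12 + n22
    p = (2 * n11 + n12) / (2 * n)  # frequency of allele 1
    q = 1 - p                      # frequency of allele 2
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n11, n12, n22)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A marker is considered in equilibrium when the P-value is above the chosen significance threshold.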
Linkage disequilibrium results
Linkage disequilibrium between markers has been calculated using the ldmax software (Abecasis 2001). The following tables present the LD results obtained gene by gene. The upper part of each table (in gray) presents Lewontin's D' value with the associated p-value in brackets. The lower part presents the genomic distance (in kb) between the SNPs.
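For reference, Lewontin's D' for two biallelic markers can be computed from one haplotype frequency and the two allele frequencies, as in the sketch below. This is the textbook definition, not the ldmax implementation (which also has to estimate haplotype frequencies from unphased genotypes); the function name is hypothetical.

```python
def lewontin_d_prime(p_ab, p_a, p_b):
    """Signed Lewontin's D' for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    # D' is undefined for a monomorphic marker; return 0.0 in that case.
    return d / d_max if d_max > 0 else 0.0
```

An absolute value close to 1 corresponds to the near complete linkage disequilibrium reported in the tables.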
TABLE 7: Kalpa
Figure imgf000128_0001
In Kalpa, markers SNP85.3 to SNP39.3 are in near complete to complete linkage disequilibrium (all D' measures are close to 1). Marker Coverage is sufficient in this gene.
TABLE 8: Haplotype diversity of KALPA
Figure imgf000129_0001
". " are reported when the allele of the considered marker is identical to the allele present in the most frequent haplotype
Eleven haplotypes are inferred. Under the hypothesis of no recombination, 8 haplotypes would have been observed.
If only the 6 markers from SNP85.3 to SNP39.3 are considered, Hap1 and Hap4 are grouped, as are Hap2 and Hap6, Hap3 and Hap8, Hap5 and Hap11, and Hap7 and Hap9, while haplotype 10 stays alone. Only six haplotypes are thus left: this shows that 1) recombination occurred between SNP2.3 and the rest of the markers, and 2) no recombination is observed between markers SNP85.3 to SNP39.3. This profile of haplotype frequencies explains the LD matrix.
Association results
V.1 - Single marker association
Association results are presented gene by gene. The adjustment has been done on age, gender and BMI measurements.
TABLE 9: KALPA gene
Figure imgf000130_0001
NS: not significant
C : codominant model, R: recessive model, D: dominant model.
Table 9 shows that the family based tests are significant for total cholesterol and LDL-cholesterol with the markers SNP5.3, SNP50.3 and SNP81.3 and, to a lesser extent, with SNP39.3. In the family based analysis, when performing a within-family comparison only, the results are more significant with SNP5.3 and SNP50.3 (p-values in brackets). This test, because it compares differences of genotypes within sibships, is less powerful than a global test of association, but it is robust to any stratification of the population. So, provided that we are not facing a false positive, the association results cannot be explained by a stratification of the population and reflect a true association.
The results of the within-family association were also checked by permutations. The results were identical, which means that the significance of the test cannot be falsely attributed to violations of the assumptions underlying the use of a chi-square. It should also be underlined that these results remain equally significant when the analysis is restricted to the sub-sample of adults. That is to say, the effect stays the same in different age ranges.
As SNP5.3 is an exonic non conservative polymorphism (changing Isoleucine->Valine) and as it gives the most significant results, this polymorphism is likely to be the functional polymorphism.
In the CUS sample, the results reflect the results obtained on the FBS samples. Allele 2 (T) of SNP5.3 lowers LDL-cholesterol. The size of the effect is estimated to be between 15 and 20% of the absolute value of LDL-cholesterol.
Example 1A: Association between chromosome 13q locus polymorphisms and lipid phenotypes
Single nucleotide polymorphisms in genes located in the region defined by markers D13S158 to D13S265 were selected and subjected to a genotype-phenotype association analysis.
A set of 382 nuclear families and 146 unrelated individuals was genotyped. All individuals were selected from the INFOGEN population database, recruited via a genetic fieldworking procedure (Schuster, H., et al (1998) Kidney Int 53:1449-54), and are of German origin. Both sample sets were ascertained without specific inclusion criteria. The median age of the unrelated sample was higher (median = 41.7, ranging from 0 to 82) than that of the family based sample (median = 27.9, ranging from 4 to 80). Women were over-represented in both samples, 61% and 55% respectively in the unrelated set and the family based set.
Five lipid phenotypes were studied, namely total cholesterol (CHOL), HDL-cholesterol (HDL), LDL-cholesterol (LDL), triglycerides (TGRL) and the LDL/HDL ratio (LDL/HDL). Values of these five variables in the two sample sets were representative of what is expected in a population-based study. Twenty-nine markers spanning the region were genotyped and analyzed in both samples.
Statistical analysis in the unrelated sample set was done by analysis of variance. Different genetic models were used, namely co-dominant (C), dominant (D) or recessive (R). Family based association tests were performed using the approach developed by Abecasis, et al, (2000) Am J Hum Genet. 66:279-92. Significant results for six markers are shown in Table 5 below.
A marker was reported as significant if at least one of the sample sets gave a significant result at the level of 0.05.
Table 5
Figure imgf000132_0001
Legend: (NS): Not significant; (C): codominant; (D): dominant; (R): Recessive
Significant associations were found with SNPs located close to or within hCT1640442 (a gene encoding SOX21), hCT23424 (a gene coding for ABCC4), hCT21820 (a gene with homology to the proteasome i-chain gene), hCT1644904 (a gene with homology to the EBI2 gene) and hCT15697 (a gene encoding propionyl-CoA carboxylase, subunit alpha).
The results indicate that these genes are involved in the regulation of lipid phenotypes and might thus also be involved in lipid related metabolic disorders, such as the metabolic syndrome, obesity and type II diabetes.
Example 1B: Association between Kalpa locus polymorphisms and lipid phenotypes
One frequent coding polymorphism (SNP5.3) in exon 10 of the Kalpa gene is a non conservative polymorphism (Isoleucine to Valine) with an allelic frequency of 0.34. In a sample of 382 nuclear families taken from the German population, we performed an association test of LDL-cholesterol, adjusted for age, BMI and gender, with the coding polymorphism. An association was found using the familial based association method (Abecasis et al., 2001). The p-value associated with the test was 0.003. In an independent cohort of 143 German adults this association was confirmed using analysis of variance with LDL-cholesterol adjusted for age, BMI and gender, giving a p-value of 0.03. The size of the effect was similar in both populations. These two independent association studies show that the non-conservative coding polymorphism in the Kalpa gene impacts LDL-cholesterol metabolism in the general population.
Example 2
De Novo Identification Of Polymorphisms
The DNA from individuals is extracted and tested for the detection of the polymorphisms.
30 ml of peripheral venous blood are taken from each donor in the presence of EDTA. Cells (pellet) are collected after centrifugation for 10 minutes at 2000 rpm. Red cells are lysed by a lysis solution (50 ml final volume: 10 mM Tris pH 7.6; 5 mM MgCl2; 10 mM NaCl). The solution is centrifuged (10 minutes, 2000 rpm) as many times as necessary to eliminate the residual red cells present in the supernatant, after resuspension of the pellet in the lysis solution. The pellet of white cells is lysed overnight at 42°C with 3.7 ml of lysis solution composed of:
- 3 ml TE 10-2 (Tris-HCl 10 mM, EDTA 2 mM) / NaCl 0.4 M
- 200 μl SDS 10%
- 500 μl proteinase K (2 mg proteinase K in TE 10-2 / NaCl 0.4 M).
For the extraction of proteins, 1 ml saturated NaCl (6 M) (1/3.5 v/v) is added. After vigorous agitation, the solution is centrifuged for 20 minutes at 10000 rpm.
For the precipitation of DNA, 2 to 3 volumes of 100% ethanol are added to the previous supernatant, and the solution is centrifuged for 30 minutes at 2000 rpm. The DNA solution is rinsed three times with 70% ethanol to eliminate salts, and centrifuged for 20 minutes at 2000 rpm. The pellet is dried at 37°C, and resuspended in 1 ml TE 10-1 or 1 ml water. The DNA concentration is evaluated by measuring the OD at 260 nm (1 unit OD = 50 μg/ml DNA).
To determine the presence of proteins in the DNA solution, the OD 260 / OD 280 ratio is determined. Only DNA preparations having an OD 260 / OD 280 ratio between 1.8 and 2 are used in the subsequent examples described below.
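The two spectrophotometric checks reduce to simple arithmetic, sketched below. The conversion factor (1 OD unit at 260 nm = 50 μg/ml DNA) and the acceptance window (1.8 to 2) come from the text; the function names and the optional dilution factor are illustrative assumptions.

```python
def dna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """DNA concentration from OD at 260 nm (1 OD unit = 50 ug/ml DNA).

    dilution_factor is a hypothetical parameter for samples that were
    diluted before measurement; use 1.0 for an undiluted sample.
    """
    return od260 * 50.0 * dilution_factor


def passes_purity_check(od260, od280, low=1.8, high=2.0):
    """True when the OD 260 / OD 280 ratio lies between 1.8 and 2."""
    return low <= od260 / od280 <= high
```

For instance, a 10-fold diluted sample reading 0.5 at 260 nm corresponds to 250 μg/ml in the stock solution.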
The pool was constituted by mixing equivalent quantities of DNA from each individual.
Example 3
Genotyping of Polymorphisms
The amplification of specific genomic sequences of the DNA samples of example 1 is carried out on the pool of DNA obtained previously. In addition, 50 individual samples are similarly amplified.
PCR assays were performed using the following protocol:
Final volume: 25 μl
DNA: 2 ng/μl
MgCl2: 2 mM
dNTP (each): 200 μM
Primer (each): 2.9 ng/μl
AmpliTaq Gold DNA polymerase: 0.05 unit/μl
PCR buffer (10x = 0.1 M Tris-HCl pH 8.3, 0.5 M KCl): 1x
Each pair of first primers (about 20 nt in length) is designed using the sequence information of the respective Kalpa gene disclosed herein and the OSP software (Hillier & Green, 1991). After heating at 95°C for 10 min, 40 cycles are performed. Each cycle comprises: 30 sec at 95°C, 1 min at 54°C, and 30 sec at 72°C. For final elongation, 10 min at 72°C ends the amplification. The quantities of the amplification products obtained are determined on 96-well microtiter plates, using a fluorometer and Picogreen as intercalating agent (Molecular Probes).
Example 4
Preparation of antibody compositions
Substantially pure Kalpa protein or polypeptide is obtained. The concentration of protein in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms per ml. Monoclonal or polyclonal antibodies to the protein can then be prepared as follows:
Monoclonal Antibody Production by Hybridoma Fusion
Monoclonal antibody to epitopes in the Kalpa or a portion thereof can be prepared from murine hybridomas according to the classical method of Kohler and Milstein (Nature, 256: 495, 1975) or derivative methods thereof (see Harlow and Lane, Antibodies A Laboratory Manual, Cold Spring Harbor Laboratory, pp. 53-242, 1988).
Briefly, a mouse is repetitively inoculated with a few micrograms of the Kalpa or a portion thereof over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70: 419 (1980). Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Davis, L. et al. Basic Methods in Molecular Biology Elsevier, New York. Section 21-2.
Polyclonal Antibody Production by Immunization
Polyclonal antiserum containing antibodies to heterogeneous epitopes in the Kalpa or a portion thereof can be prepared by immunizing a suitable non-human animal with the Kalpa or a portion thereof, which can be unmodified or modified to enhance immunogenicity. The suitable non-human animal is preferably a non-human mammal, usually a mouse, rat, rabbit, goat, or horse. Alternatively, a crude preparation which has been enriched for Kalpa concentration can be used to generate antibodies. Such proteins, fragments or preparations are introduced into the non-human mammal in the presence of an appropriate adjuvant (e.g. aluminum hydroxide, RIBI, etc.) which is known in the art. In addition, the protein, fragment or preparation can be pretreated with an agent which will increase antigenicity; such agents are known in the art and include, for example, methylated bovine serum albumin (mBSA), bovine serum albumin (BSA), Hepatitis B surface antigen, and keyhole limpet hemocyanin (KLH). Serum from the immunized animal is collected, treated and tested according to known procedures. If the serum contains polyclonal antibodies to undesired epitopes, the polyclonal antibodies can be purified by immunoaffinity chromatography.
Effective polyclonal antibody production is affected by many factors related both to the antigen and the host species. Also, host animals vary in response to site of inoculations and dose, with both inadequate and excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple intradermal sites appear to be most reliable. Techniques for producing and processing polyclonal antisera are known in the art; see, for example, Mayer and Walker (1987). An effective immunization protocol for rabbits can be found in Vaitukaitis, J. et al. J. Clin. Endocrinol. Metab. 33: 988-991 (1971). Booster injections can be given at regular intervals, and antiserum harvested when the antibody titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology D. Wier (ed) Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 μM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, for example, by Fisher, D., Chap. 42 in: Manual of Clinical
Example 5
High throughput two-hybrid screening assay for drugs that modulate Kalpa/Kalpa-target protein interaction
To identify drugs that modulate Kalpa/WD-40 domain-containing protein interactions, a two-hybrid based high throughput screening assay is used.
AH109 yeast cells (Clontech) cotransformed with plasmids pGBKT7-Kalpa and pGADT7-WD-40 domain-containing protein are grown in 384-well plates in selective media lacking histidine and adenine, according to the manufacturer's instructions (MATCHMAKER two-hybrid system 3, Clontech).
Growth of the transformants on media lacking histidine and adenine is dependent on the Kalpa/WD-40 domain-containing protein two-hybrid interaction, and drugs that disrupt Kalpa/WD-40 domain-containing protein binding will therefore inhibit yeast cell growth.
Small molecules (5 mg/ml in DMSO; Chembridge) are added by using plastic 384-pin arrays (Genetix). The plates are incubated for 4 to 5 days at 30 °C, and small molecules which inhibit the growth of yeast cells by disrupting the Kalpa/WD-40 domain-containing protein two-hybrid interaction are selected for further analysis.
Example 6
High throughput in vitro assay to identify inhibitors of Kalpa/Kalpa target interaction
To identify small molecule modulators of Kalpa function, a high-throughput screen based on fluorescence polarization (FP) is used to monitor the displacement of a fluorescently labelled Kalpa protein from a recombinant fusion protein of glutathione-S-transferase (GST) and the Kalpa binding domain of a WD-40 domain protein.
Assays are carried out essentially as in Degterev et al, Nature Cell Biol. 3: 173-182 (2001) and Dandliker et al, Methods Enzymol. 74: 3-28 (1981). The assay can be calibrated by titrating a Kalpa peptide labelled with Oregon Green with increasing amounts of GST-WD-40 domain protein. Binding of the peptide is accompanied by an increase in polarization (mP, millipolarization).
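As a reminder of how the readout is defined, polarization in mP units is computed from the fluorescence intensities measured parallel and perpendicular to the excitation plane. The sketch below uses the standard definition; the instrument-specific G-factor correction is deliberately left out, and the function name is hypothetical.

```python
def millipolarization(intensity_parallel, intensity_perpendicular):
    """Fluorescence polarization in mP units.

    mP = 1000 * (I_par - I_perp) / (I_par + I_perp)
    No G-factor correction is applied in this simplified sketch.
    """
    total = intensity_parallel + intensity_perpendicular
    return 1000.0 * (intensity_parallel - intensity_perpendicular) / total
```

A free labelled peptide tumbles quickly and gives a low mP value; binding to the much larger GST fusion protein slows tumbling and raises it, so a compound that displaces the peptide lowers the reading.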
The Kalpa peptide, preferably a peptide comprising a TPR repeat domain, is expressed and purified using a QIAexpressionist kit (Qiagen) according to the manufacturer's instructions. Briefly, the entire Kalpa coding sequence is amplified by PCR using pGBKT7-Kalpa as a template and cloned into the BamHI site of the pQE30 vector (Qiagen). The resulting pQE30-HisKalpa plasmid is transformed into E. coli strain M15 (Qiagen). 6xHis-tagged Kalpa protein is purified from inclusion bodies on a Ni-Agarose column (Qiagen) under denaturing conditions, and the eluate is used for in vitro interaction assays. To produce GST-WD-40 fusion protein, WD-40 is amplified by PCR and cloned in frame downstream of the Glutathione S-Transferase (GST) ORF, into the BamHI site of the pGEX-2T prokaryotic expression vector (Amersham Pharmacia Biotech). GST-WD-40 fusion protein is expressed in E. coli DH5α (supE44, ΔlacU169 (φ80lacZΔM15), hsdR17, recA1, endA1, gyrA96, thi-1, relA1) and purified by affinity chromatography with glutathione sepharose according to the supplier's instructions (Amersham Pharmacia Biotech).
For screening small molecules, Kalpa peptide is labelled with succinimidyl Oregon Green (Molecular Probes, Oregon) and purified by HPLC. 33 nM labelled Kalpa peptide, 2 μM GST-WD-40 protein, 0.1% bovine gamma-globulin (Sigma) and 1 mM dithiothreitol mixed with PBS, pH 7.2 (Gibco), are added to 384-well black plates (Lab Systems) with a Multidrop (Lab Systems). Small molecules (5 mg/ml in DMSO; Chembridge) are transferred by using plastic 384-pin arrays (Genetix). The plates are incubated for 1-2 hours at 25 °C, and FP values are determined with an Analyst plate reader (LJL Biosystems).
Example 7
High throughput chip assay to identify inhibitors of Kalpa/Kalpa target interaction
A chip based binding assay (Degterev et al, Nature Cell Biol. 3: 173-182 (2001)) using unlabelled Kalpa and Kalpa target protein is used to identify molecules capable of interfering with Kalpa and Kalpa-family target interactions, providing high sensitivity and avoiding potential interference from label moieties. In this example, the Kalpa binding domain of a WD-40 domain containing protein (WD-40 protein) is covalently attached to a surface-enhanced laser desorption/ionization (SELDI) chip, and binding of unlabelled Kalpa protein to the immobilized protein in the presence of a test compound is monitored by mass spectrometry.
Recombinant Kalpa protein and GST-WD-40 fusion proteins are prepared as described in Example 6. Purified recombinant GST-WD-40 protein is coupled through its primary amine to SELDI chip surfaces derivatized with carbonyldiimidazole (Ciphergen). Kalpa protein is incubated in a total volume of 1 μl for 12 hours at 4 °C in a humidified chamber to allow binding to each spot of the SELDI chip, then washed with alternating high-pH and low-pH buffers (0.1 M sodium acetate containing 0.5 M NaCl, followed by 0.01 M HEPES, pH 7.3). The samples are embedded in alpha-cyano-4-hydroxycinnamic acid matrix and analysed for mass by matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry. Averages of 100 laser shots at a constant setting are collected over 20 spots in each sample.
Example 8
High throughput cell assay to identify inhibitors of Kalpa/Kalpa target interaction A fluoresence resonance energy transfer (FRET) assay is carried out between Kalpa and WD-40 domain-containing proteins fused with fluorescent proteins. Assays can be carried out as in Majhan et al, Nature Biotechnology 16: 547-552 (1998) and Degterev et al, Nature Cell Biol. 3: 173-182 (2001).
Kalpa protein is fused to cyan fluorescent protein (CFP) and WD-40 domain protein is fused to yellow fluorescent protein (YFP). Vectors containing Kalpa and Kalpa target proteins can be constructed essentially as in Majhan et al (1998). A Kalpa -CFP expression vector is generated by subcloning a Kalpa cDNA into the pECFP-Nl vector (Clontech). A WD-40 domain -YFP expression vector is generated by subcloning a WD-40 domain cDNA into the pEYFP-Nl vector (Clontech). Vectors are cotransfected to HEK-293 cells and cells are treated with test compounds.
HEK-293 cells are transfected with Kalpa -CFP and WD-40 domain-YFP expression vectors using Lipofect AMINE Plus (Gibco) or TransLT-1 (PanVera). 24 hours later cells are treated with test compounds and incubated for various time periods, preferably up to 48 hours. Cells are harvested in PBS, optionally supplemented with test compound, and fluorescence is determined with a C-60 fluorimeter (PTI) or a Wallac plate reader. Fluorescence in the samples separately expressing Kalpa -CFP and WD-40 domain-YFP is added together and used to estimate the FRET value in the absence of Kalpa / WD-40 domain binding.
The extent of FRET between CFP and YFP is determined as the ratio between the fluorescence at 527 nm and that at 475 nm after excitation at 433 nm. The cotransfection of Kalpa protein and WD-40 domain protein results in an increase of FRET ratio over a reference FRET ratio of 1.0 (determined using samples expressing the proteins separately). A change in the FRET ratio upon treatment with a test compound (over that observed after cotransfection in the absence of a test compound) indicates a compound capable of modulating the interaction of the Kalpa protein and the WD-40 domain protein.
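The ratio calculation described in this example can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the (emission at 527 nm, emission at 475 nm) tuple layout and the 10% relative-change cutoff are hypothetical choices made for illustration, and no cutoff is specified in the present example.

```python
def fret_ratio(f527: float, f475: float) -> float:
    """Emission ratio 527 nm / 475 nm, both read after excitation at 433 nm."""
    if f475 <= 0:
        raise ValueError("fluorescence at 475 nm must be positive")
    return f527 / f475


def normalized_fret(coexpressed, cfp_only, yfp_only):
    """Scale the co-expression ratio so that the no-binding reference equals 1.0.

    Each argument is a (f527, f475) tuple. The reference is built by adding the
    fluorescence of the two separately expressing samples, as in the text.
    """
    reference = fret_ratio(cfp_only[0] + yfp_only[0], cfp_only[1] + yfp_only[1])
    return fret_ratio(*coexpressed) / reference


def is_candidate_modulator(treated: float, untreated: float,
                           threshold: float = 0.10) -> bool:
    """Flag a compound when the normalized FRET ratio changes by more than
    `threshold` (hypothetical relative cutoff) versus untreated cells."""
    return abs(treated - untreated) / untreated > threshold


# Illustrative numbers only (arbitrary fluorescence units, not measured data).
cfp_only = (200.0, 1000.0)
yfp_only = (300.0, 0.0)
untreated = normalized_fret((900.0, 600.0), cfp_only, yfp_only)
treated = normalized_fret((600.0, 600.0), cfp_only, yfp_only)
print(untreated, treated, is_candidate_modulator(treated, untreated))
```

Normalizing to the summed single-expression samples keeps the reference ratio at 1.0, so values from different plates can be compared on the same scale.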
Example 9
Assay for aldo keto reductase activity
Aldo keto reductase activity is measured using the decrease in absorbance at 340 nm as NADPH is consumed. A standard reaction mixture is 135 mM sodium phosphate buffer (pH 6.2-7.2 depending on enzyme), 0.2 mM NADPH, 0.3 M lithium sulfate, 0.5-2.5 μg enzyme and an appropriate level of substrate. The reaction is incubated at 30°C and monitored continuously with a spectrophotometer. Enzyme activity is calculated as ml NADPH consumed/μg of enzyme.
The binding of lithocholic acid to the enzymes is directly assessed by determining the unbound fraction of the bile acid at 25°C by ultracentrifugation assay using microconcentrators (Microcon 10; Amicon). A volume of 0.4 ml of binding assay mixture (10 μM enzyme and 10 μM lithocholic acid or 20 μM NADP+ in 0.1 M potassium phosphate, pH 7.4) was filtered by centrifugation at 13000 g for 20 min. As a control, the same volume of the mixture without the enzyme was filtered. The concentration of lithocholic acid in the filtrate was determined enzymatically. The filtrate (0.2 ml) was added to a reaction mixture (1.0 ml) containing 0.1 M glycine/NaOH, pH 10.0 and 0.25 mM NADP+, and the reaction was started by addition of the enzyme (~10 μg), which oxidizes the bile acid. The fluorescence of NADPH was recorded until it was unchanged.
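The two calculations implied by this example (NADPH consumption from the absorbance decrease at 340 nm, and the bound fraction of lithocholic acid from the two filtrates) can be sketched as follows. This is a sketch under stated assumptions: it uses the Beer-Lambert law with the commonly used NADPH molar absorption coefficient of 6220 M^-1 cm^-1 at 340 nm, a 1 cm light path, a caller-supplied reaction volume, and nmol as the reporting unit. None of these values is stated in the example, and the function names are hypothetical.

```python
NADPH_EXT_COEFF = 6220.0  # M^-1 cm^-1 at 340 nm (assumed literature value)


def nadph_consumed_nmol(delta_a340: float, volume_ml: float,
                        path_cm: float = 1.0) -> float:
    """Convert a decrease in A340 into nmol NADPH consumed (Beer-Lambert law)."""
    conc_molar = delta_a340 / (NADPH_EXT_COEFF * path_cm)  # mol/L
    return conc_molar * (volume_ml * 1e-3) * 1e9           # mol -> nmol


def specific_activity(delta_a340_per_min: float, volume_ml: float,
                      enzyme_ug: float, path_cm: float = 1.0) -> float:
    """nmol NADPH consumed per minute per microgram of enzyme."""
    return nadph_consumed_nmol(delta_a340_per_min, volume_ml, path_cm) / enzyme_ug


def bound_fraction(filtrate_with_enzyme: float,
                   filtrate_without_enzyme: float) -> float:
    """Fraction of ligand bound, from ligand concentrations in the filtrate of
    the enzyme-containing mixture and of the enzyme-free control."""
    if filtrate_without_enzyme <= 0:
        raise ValueError("control filtrate concentration must be positive")
    return 1.0 - filtrate_with_enzyme / filtrate_without_enzyme


# Illustrative numbers only: 1.0 ml reaction (assumed), 2.0 ug enzyme.
print(specific_activity(0.0622, volume_ml=1.0, enzyme_ug=2.0))
print(bound_fraction(2.5, 10.0))
```

The enzyme-free control filtrate serves as the total ligand concentration, so the bound fraction is one minus the ratio of the two filtrate concentrations.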
Example 10
Expression analysis of the Kalpa gene
RT-PCR and Northern blots were carried out to experimentally characterize the product of the Kalpa gene.
(a) RT-PCR on the Kalpa gene
Two sets of specific PCR primers targeting the 5' end of the Kalpa cDNA (exons 2, 3 and 4) were designed. The RT-PCR was performed on commercially available RNA extracts from human brain and human liver. The first PCR reveals a distinctive band at 530 bp, shown in Fig 3A. The cDNA is detected in both the brain and the liver.
The specificity of the amplification was checked by performing a nested PCR. The second PCR (B' and L') was performed with the second set of primers. Figure 3B shows the result of the nested PCR, confirming the specificity of the reverse transcription. The first PCR product (530 bp) was sequenced, and the consensus sequence was aligned with the annotated Kalpa cDNA sequence. The match is perfect, indicating that the product is specific to the Kalpa gene region. This 530 bp PCR product will be used as a probe for Northern blot analysis.
(b) Northern Blot analysis
To examine the relative abundance of the human Kalpa mRNA in various adult human tissues, a Northern blot analysis was performed.
The 530 bp PCR fragment, amplified from the liver RNA extract as described above, was radiolabeled using the random priming labeling kit Rediprime II DNA labeling system (Amersham). 25 ng denatured PCR product and 5 μl Redivue [α-32P]dCTP (with a specific activity of 110 TBq/mmol) were added to the labeling reaction mixture according to the manufacturer's protocol. The labeled probe was purified using a Sephadex G50 column (Pharmacia) to remove the unincorporated nucleotides. Two human multiple tissue Northern Blots (Clontech) were pre-hybridized at 68°C for 1 hour in 25 ml of ExpressHyb hybridization buffer (Clontech), and then the radiolabeled probe was heat denatured and added to 25 ml fresh ExpressHyb hybridization buffer. The hybridization mixture was incubated for 2 hours at 68°C with continuous shaking.
The two blots were washed two times for 15 min with 2x SSC, 0.05% SDS at room temperature, and two times with 0.1x SSC, 0.1% SDS at 65°C for 30 min. The blots were exposed to Hyperfilm (Amersham) for 2 days at -70°C.
As shown in Fig 3C, the northern analysis reveals one Kalpa transcript at 3.63 kb. There appears to be a tissue-specific distribution of the band. The most intense signal is detected in the skeletal muscle. The signals detected in the kidney, liver and brain are quite intense, while the signals detected in the placenta, spleen, small intestine, tongue, thyroid, stomach, spinal cord and prostate are less intense. The signals detected in the brain and liver (blot 1 lanes 1 and 8) are in agreement with the RT-PCR results.
Although this invention has been described in terms of certain preferred embodiments, other embodiments which will be apparent to those of ordinary skill in the art in view of the disclosure herein are also within the scope of this invention. Accordingly, the scope of the invention is intended to be defined only by reference to the appended claims.
In addition to the references for which full citations are provided in the body of the text, the following additional references have also been cited in the present application :
Ajioka R.S. et al., Am. J. Hum. Genet., 60:1439-1447, 1997
Clark A.G. (1990) Mol. Biol. Evol. 7:111-122.
Dempster et al. (1977) J. R. Stat. Soc., 39B:1-38.
Excoffier L. and Slatkin M. (1995) Mol. Biol. Evol., 12(5):921-927.
Hawley M.E. et al. (1994) Am. J. Phys. Anthropol. 18:104.
Lander and Schork, Science, 265:2037-2048, 1994
Lange K. (1997) Mathematical and Statistical Methods for Genetic Analysis. Springer, New York.
Morton N.E., Am. J. Hum. Genet., 7:277-318, 1955
Newton et al. (1989) Nucleic Acids Res. 17:2503-2516.
Ott J., Analysis of Human Genetic Linkage, Johns Hopkins University Press, Baltimore, 1991
Risch N. and Merikangas K., Science, 273:1516-1517, 1996
Ruano et al. (1990) Proc. Natl. Acad. Sci. U.S.A. 87:6296-6300.
Sarkar G. and Sommer S.S. (1991) Biotechniques.
Schaid D.J. et al., Genet. Epidemiol., 13:423-450, 1996
Schneider et al. (1997) Arlequin: A Software For Population Genetics Data Analysis. University of Geneva.
Spielmann S. and Ewens W., Am. J. Hum. Genet., 62:450-458, 1998
Spielmann S. et al., Am. J. Hum. Genet., 52:506-516, 1993
Terwilliger J.D. and Ott J., Handbook of Human Genetic Linkage, Johns Hopkins University Press, London, 1994
Weir B.S. (1996) Genetic Data Analysis II: Methods for Discrete Population Genetic Data, Sinauer Assoc., Inc., Sunderland, MA, U.S.A.
Wu et al. (1989) Proc. Natl. Acad. Sci. U.S.A. 86:2757.
Zhao et al., Am. J. Hum. Genet., 63:225-240, 1998
Table 3
Figure imgf000142_0001
TABLE 4:
Figure imgf000143_0001
Figure imgf000143_0002
Figure imgf000143_0003
Legends of Table 4
Summary 1: UDP-N-acetylglucosamine--peptide N-acetylglucosaminyltransferase-like protein
Summary 2: The tetratricopeptide repeat of typically 34 amino acids was first described in the cell cycle regulator Cdc23p and later found to occur in a large number of proteins. A function for this repeat seems to be protein-protein interaction. It has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR proteins seem to aggregate to multi-protein complexes. Prominent examples of TPR proteins include Cdc16p, Cdc23p and Cdc27p, components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase.
Summary 3: ATP-binding cassette, sub-family C, member 4
Summary 4: ABC_membrane, ABC transporter transmembrane region. This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions.
Summary 5: ABC_tran, ABC transporter. ABC transporters form a large family of proteins responsible for translocation of a variety of compounds across biological membranes. ABC transporters are the largest family of proteins in many completely sequenced bacteria. ABC transporters are composed of two copies of this domain and two copies of a transmembrane domain pfam00664. These four domains may belong to a single polypeptide, or belong in different polypeptide chains.
Summary 6: ATPases associated with a variety of cellular activities
Summary 7: SRY (sex determining region Y)-box 21, SRY-box 21
Summary 8: HMG (high mobility group) box, HMG_box
Summary 9: HSP90-like protein

Claims

1. A method of identifying a candidate cholesterol-lowering compound, said method comprising:
(a) providing a Kalpa polypeptide or a fragment thereof, the method comprising contacting said polypeptide with a test compound; and
(b) providing a respective Kalpa-target polypeptide or a fragment thereof; and
(c) determining whether a test compound selectively inhibits the ability of said polypeptide of step (a) to bind to said respective target polypeptide of step (b).
2. The method of Claim 1, wherein said Kalpa polypeptide of step (a) is a polypeptide comprising at least one TPR domain.
3. The method of Claims 1 or 2, wherein said Kalpa polypeptide of step (a) is a polypeptide having the amino acid sequence of SEQ ID No 4, or a fragment comprising a contiguous span of at least 6 contiguous amino acids of a polypeptide according to SEQ ID No 4.
4. The method of Claims 1 to 3, wherein said Kalpa-target polypeptide is a polypeptide comprising a WD-40 domain, or a biologically active portion thereof.
5. The method of Claims 1 to 4, wherein said Kalpa-target polypeptide is SCAP polypeptide, or a biologically active portion thereof.
6. The method of Claims 1 to 5, further comprising administering said cholesterol-lowering compound to a non-human test animal.
7. The method of any one of Claims 1 to 6, further comprising assessing the ability of said cholesterol-lowering compound to increase expression of the LDL receptor.
8. The method of any one of Claims 1 to 7, wherein a determination that said test compound selectively inhibits the ability of said polypeptide of step (a) to bind to said target polypeptide of step (b) indicates that said compound is a candidate cholesterol-lowering compound.
9. The method of any one of Claims 1 to 8, comprising providing a cell comprising:
(a) a first expression vector comprising a nucleic acid encoding a polypeptide of SEQ ID No 4, or a fragment comprising a contiguous span of at least 6 contiguous amino acids of a polypeptide according to SEQ ID No 4; and
(b) a second expression vector comprising a nucleic acid encoding a respective Kalpa-target polypeptide or a fragment thereof.
10. A method of identifying a candidate cholesterol-lowering compound comprising:
(a) contacting a Kalpa polypeptide, or a biologically active fragment thereof, with a test compound; and
(b) determining whether said compound selectively modulates the activity of said polypeptide; wherein a determination that said compound selectively modulates the activity of said polypeptide indicates that said compound is a candidate cholesterol-lowering compound.
11. The method of Claim 10, wherein the polypeptide of step (a) comprises the amino acid sequence depicted in one of SEQ ID No 4, or a biologically active fragment thereof.
12. A method of assessing the biological activity of a Kalpa polypeptide, or a fragment thereof, comprising:
(a) providing a Kalpa polypeptide, or a biologically active fragment thereof; and
(b) assessing the ability of the said polypeptide to modulate a lipid phenotype selected from the group consisting of cholesterol regulation (CHOL), HDL-cholesterol (HDL) regulation, LDL-cholesterol (LDL) regulation, Triglycerides (TGRL) regulation and LDL/HDL ratio (LDL/HDL) regulation.
13. The method of Claim 12, wherein the polypeptide of step (a) comprises the amino acid sequence depicted in one of SEQ ID No 4, or a biologically active fragment thereof.
14. A method of assessing the biological activity of a Kalpa polypeptide, or a fragment thereof, comprising:
(a) providing a Kalpa polypeptide, or a biologically active fragment thereof; and
(b) assessing the ability of the said polypeptide to modulate apoB glycosylation.
15. A method of assessing the biological activity of a Kalpa polypeptide, or a fragment thereof, comprising:
(a) providing a Kalpa polypeptide, or a biologically active fragment thereof; and
(b) assessing the ability of the said polypeptide to interact with a WD-40 protein or modulate LDL receptor expression.
16. A method of determining whether a Kalpa polypeptide, or a fragment thereof, is expressed within a biological sample, said method comprising the steps of:
(a) contacting a biological sample from a subject suffering from, suspected of suffering from or susceptible to a disorder related to cholesterol regulation, with: a polynucleotide that hybridizes under stringent conditions to a Kalpa nucleic acid; or a detectable polypeptide that selectively binds to a Kalpa polypeptide; and
(b) detecting the presence or absence of hybridization between said polynucleotide and an RNA species within said sample, or the presence or absence of binding of said detectable polypeptide to a polypeptide within said sample; wherein a detection of said hybridization or of said binding indicates that said Kalpa polypeptide is expressed within said sample.
17. The method of claim 16, wherein said polynucleotide hybridizes under stringent conditions to a nucleic acid of any of SEQ ID No 2 or 3, or to a contiguous span of at least 15 contiguous nucleotides of a nucleic acid according to SEQ ID No 2 or 3.
18. The method of claim 16, wherein said polynucleotide hybridizes under stringent conditions to a nucleic acid encoding a polypeptide of any of SEQ ID No 4 or a contiguous span of at least 6 contiguous amino acids of a polypeptide according to SEQ ID No 4.
19. The method of claim 16, wherein said detectable polypeptide selectively binds to a polypeptide of any of SEQ ID No 4 or a biologically active fragment thereof.
20. The method of claim 16, wherein said polynucleotide is a primer, and wherein said hybridization is detected by detecting the presence of an amplification product comprising said primer sequence.
21. The method of claim 16, wherein said detectable polypeptide is an antibody.
22. The method of Claim 16, wherein said subject suffers from a cholesterol-related disorder selected from the group consisting of heart disease, coronary artery disease, myocardial infarct and lipid-related metabolic disorders.
23. A method of determining whether a mammal has an elevated or reduced level of expression of a cholesterol-lowering protein, said method comprising the steps of:
(a) providing a biological sample from a subject suffering from, suspected of suffering from or susceptible to a disorder related to cholesterol regulation; and
(b) comparing the amount of a Kalpa polypeptide or of a RNA species encoding a Kalpa polypeptide within said biological sample with a level detected in or expected from a control sample; wherein an increased amount of said polypeptide or RNA species within said biological sample compared to said level detected in or expected from said control sample indicates that said mammal has an elevated level of expression, and wherein a decreased amount of said polypeptide or said RNA species within said biological sample compared to said level detected in or expected from said control sample indicates that said mammal has a reduced level of expression.
24. The method of Claim 23, wherein said polypeptide of step (b) is selected from the group consisting of SEQ ID No 4, or a fragment comprising a contiguous span of at least 6 contiguous amino acids of a polypeptide according to SEQ ID No 4.
25. The method of Claim 23, wherein said RNA of step (b) encodes a polypeptide selected from the group consisting of SEQ ID No 4, or a fragment comprising a contiguous span of at least 6 contiguous amino acids of a polypeptide according to SEQ ID No 4.
26. The method of Claim 23, wherein said subject suffers from a cholesterol-related disorder selected from the group consisting of heart disease, coronary artery disease, myocardial infarct and lipid-related metabolic disorders.
27. A method of identifying a candidate cholesterol-lowering compound, said method comprising:
(a) contacting a Kalpa polypeptide with a test compound; and
(b) determining whether said compound selectively binds to said polypeptide; wherein a determination that said compound selectively binds to said polypeptide indicates that said compound is a candidate cholesterol-lowering compound.
28. A method according to claim 27, wherein a determination that said compound selectively binds to said polypeptide indicates that said compound is a candidate activator of said polypeptide.
29. A method according to claim 27, wherein a determination that said compound selectively binds to said polypeptide indicates that said compound is a candidate inhibitor of said polypeptide.
30. A method according to claim 27, wherein said polypeptide comprises an amino acid sequence of SEQ ID No 4, or a fragment thereof comprising a contiguous span of at least 6 contiguous amino acids of a polypeptide according to SEQ ID No 4.
31. A method of identifying a candidate cholesterol-lowering compound, said method comprising:
(a) contacting a Kalpa polypeptide with a test compound; and
(b) determining whether said compound selectively modulates a biological activity of said polypeptide; wherein a determination that said compound selectively modulates the activity of said polypeptide indicates that said compound is a candidate cholesterol-lowering compound.
32. A method according to claim 31, wherein said step of determining whether said compound selectively modulates the activity of said polypeptide comprises determining .
33. A method according to claim 31, wherein said compound selectively inhibits the activity of the polypeptide.
34. A method according to claim 31, wherein said compound selectively activates the activity of the polypeptide.
35. A method of identifying a candidate cholesterol-lowering compound, said method comprising:
(a) contacting a cell comprising a Kalpa polypeptide with a test compound; and
(b) determining whether said compound selectively modulates a biological activity of said polypeptide; wherein a determination that said compound selectively modulates the activity of said polypeptide indicates that said compound is a candidate cholesterol-lowering compound.
36. A method according to claim 35, wherein said polypeptide comprises an amino acid sequence of SEQ ID No 4, or a fragment thereof comprising a contiguous span of at least 6 contiguous amino acids of a polypeptide according to SEQ ID No 4.
37. A method according to claims 31 or 35, wherein said compound selectively inhibits the activity of the polypeptide.
38. A method according to claims 31 or 35, wherein said compound selectively activates the activity of the polypeptide.
39. The method of claims 31 or 35, wherein said candidate modulator is a candidate LDL-lowering molecule.
40. The method of claim 31 comprising introducing a nucleic acid comprising the nucleotide sequence encoding said Kalpa polypeptide to said cell.
41. The method of Claims 31 or 35, wherein determining whether said compound selectively modulates the activity of a Kalpa polypeptide comprises detecting glycosyltransferase activity.
42. The method of Claims 31 or 35, wherein determining whether said compound selectively modulates the activity of a Kalpa polypeptide comprises detecting aldo-keto reductase activity.
43. The method of Claims 31 or 35, wherein determining whether said compound selectively modulates the activity of a Kalpa polypeptide comprises detecting Kalpa-target binding activity.
44. The method of Claims 31 or 35, wherein determining whether said compound selectively modulates the activity of a Kalpa polypeptide comprises detecting interaction with a Kalpa-target polypeptide.
45. The method of Claims 31 or 35, wherein said Kalpa-target polypeptide comprises a WD-40 domain.
46. A method of genotyping comprising determining the identity of a nucleotide at a Kalpa-related single nucleotide polymorphism or the complement thereof in a biological sample.
47. A method according to claim 46, comprising determining the identity of a nucleotide at a Kalpa-related single nucleotide polymorphism of SEQ ID NOS 1 or 2 or 3.
48. A method according to claim 46, comprising determining the identity of the nucleotide at position 25 at a Kalpa single nucleotide polymorphism of any one of SEQ ID NOS 5 to 23.
49. A method according to claims 46 to 48, wherein said biological sample is obtained from an individual affected by, or suspected of being affected by or susceptible to a cholesterol-related disorder.
50. A method according to claims 46 to 48, wherein a lipid phenotype selected from the group consisting of phenotypes related to cholesterol regulation, HDL-cholesterol regulation, LDL-cholesterol regulation, Triglycerides regulation and LDL/HDL ratio regulation is determined.
51. A method according to claim 46, wherein said biological sample is derived from a single subject.
52. A method according to claims 46 to 48, wherein the identity of the nucleotides at said single nucleotide polymorphism is determined for both copies of said single nucleotide polymorphism present in said individual's genome.
53. A method according to claim 46, wherein said biological sample is derived from multiple subjects.
54. A method according to claim 46, further comprising amplifying a portion of said sequence comprising the single nucleotide polymorphism prior to said determining step.
55. A method of estimating the frequency of an allele of a Kalpa-related single nucleotide polymorphism in a population comprising: a) genotyping individuals from said population for said single nucleotide polymorphism according to the method of claim 46; and b) determining the proportional representation of said single nucleotide polymorphism in said population.
56. A method of detecting an association between a genotype and a trait related to a lipid phenotype, wherein said lipid phenotype is a phenotype related to cholesterol regulation (CHOL), HDL-cholesterol (HDL) regulation, LDL-cholesterol (LDL) regulation, Triglycerides (TGRL) regulation or LDL/HDL ratio (LDL/HDL) regulation, comprising the steps of: a) determining the frequency of at least one Kalpa-related single nucleotide polymorphism in a trait positive population according to the method of claim 55; b) determining the frequency of at least one Kalpa-related single nucleotide polymorphism in a control population according to the method of claim 55; and c) determining whether a statistically significant association exists between said genotype and said trait.
57. A method of estimating the frequency of a haplotype for a set of single nucleotide polymorphisms in a population having a selected lipid phenotype, wherein said lipid phenotype is a phenotype related to cholesterol regulation (CHOL), HDL-cholesterol (HDL) regulation, LDL-cholesterol (LDL) regulation, Triglycerides (TGRL) regulation or LDL/HDL ratio (LDL/HDL) regulation, comprising: a) genotyping at least one Kalpa-related single nucleotide polymorphism according to claim 80 for each individual in said population; b) genotyping a second single nucleotide polymorphism by determining the identity of the nucleotides at said second single nucleotide polymorphism for both copies of said second single nucleotide polymorphism present in the genome of each individual in said population; and c) applying a haplotype determination method to the identities of the nucleotides determined in steps a) and b) to obtain an estimate of said frequency.
58. A method according to claim 57, wherein said haplotype determination method is selected from the group consisting of asymmetric PCR amplification, double PCR amplification of specific alleles, the Clark algorithm, or an expectation-maximization algorithm.
59. A method of detecting an association between a haplotype and a trait related to a lipid phenotype, wherein said lipid phenotype is a phenotype related to cholesterol regulation (CHOL), HDL-cholesterol (HDL) regulation, LDL-cholesterol (LDL) regulation, Triglycerides (TGRL) regulation or LDL/HDL ratio (LDL/HDL) regulation, comprising the steps of: a) estimating the frequency of at least one haplotype in a trait positive population according to the method of claim 57; b) estimating the frequency of said haplotype in a control population according to the method of claim 57; and c) determining whether a statistically significant association exists between said haplotype and said trait.
60. A method according to claim 56, wherein said genotyping steps a) and b) are performed on a single pooled biological sample derived from each of said populations.
61. A method according to claim 56, wherein said genotyping steps a) and b) are performed separately on biological samples derived from each individual in said populations.
62. A method according to either claim 56 or 59, wherein said trait involves suffering from or suspected of suffering from a disease selected from the group consisting of heart disease, coronary artery disease, myocardial infarct and lipid-related metabolic disorders.
63. A method according to either claim 56 or 59, wherein said lipid phenotype is related to LDL receptor expression.
64. A method according to either claim 56 or 59, wherein said control population is a trait negative population.
65. A method according to either claim 56 or 59, wherein said case control population is a random population.
66. A method of determining whether an individual is at risk of developing a cholesterol-related disorder, comprising: a) genotyping at least one Kalpa-related single nucleotide polymorphism according to the method of claim 46; and b) correlating the result of step a) with a risk of developing a cholesterol-related disorder.
67. The method of Claim 66, wherein said cholesterol related disorder is heart disease, coronary artery disease, myocardial infarct or a lipid related metabolic disorder.
68. A method of selecting a candidate Kalpa modulator, or a method of identifying a candidate compound for the treatment of a cholesterol-related disorder, said method comprising:
(a) providing a compound capable of inhibiting or activating the activity of a Kalpa protein; and
(b) determining whether said compound is capable of, or likely to be capable of, modulating cholesterol or LDL regulation.
69. The method of claim 68, wherein step (a) comprises determining whether said test compound is capable of inhibiting or activating the activity of a Kalpa protein.
70. A method of selecting a candidate Kalpa modulator, or a method of identifying a candidate compound for the treatment of a cholesterol-related disorder, said method comprising:
(a) providing a compound capable of modulating the expression of a Kalpa protein; and
(b) determining whether said compound is capable of, or likely to be capable of, modulating cholesterol or LDL regulation.
71. The method of claim 70, wherein step (a) comprises determining whether said test compound is capable of modulating the expression of the activity of a Kalpa protein.
72. The method of claims 68 to 71, wherein determining whether said test compound is capable of inhibiting the activity of a Kalpa protein comprises determining whether said test compound inhibits binding to a target polypeptide.
73. The method of claims 68 to 71, wherein determining whether said test compound is capable of, or likely to be capable of, modulating cholesterol or LDL regulation comprises determining whether said test compound modulates blood or cellular cholesterol levels.
74. The method of claims 68 to 71, wherein determining whether said test compound is capable of, or likely to be capable of, modulating cholesterol or LDL regulation comprises determining whether said test compound modulates cellular LDL uptake.
75. The method of claims 68 to 71, wherein determining whether said test compound is capable of, or likely to be capable of, modulating cholesterol or LDL regulation comprises determining whether said test compound modulates or is likely to modulate LDL receptor expression.
76. The method of claims 68 to 71, wherein determining whether said test compound is capable of, or likely to be capable of, modulating cholesterol or LDL regulation comprises determining whether said test compound modulates or is likely to modulate SRE-mediated or SREBP-mediated expression.
77. The method of claims 68 to 71, wherein determining whether said test compound is capable of inhibiting the activity of a Kalpa protein comprises:
(i) contacting a Kalpa polypeptide or a fragment thereof with a test compound; and (ii) determining whether said compound selectively inhibits Kalpa activity.
78. The method of claims 68 to 71, wherein determining whether said test compound is capable of inhibiting the activity of a Kalpa protein comprises:
(i) providing a cell comprising a Kalpa polypeptide or a fragment thereof; (ii) contacting said cell with a test compound; and (iii) determining whether said compound selectively inhibits Kalpa activity.
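
The allele-frequency estimation and case/control association steps recited in claims 55 and 56 can be illustrated with the short Python sketch below. It is a sketch under stated assumptions only: genotypes are encoded as two-letter strings, the association is tested with a Pearson chi-square test on allele counts (one degree of freedom, no continuity correction, critical value 3.841 at the 5% level), and all function names are hypothetical. The claims do not prescribe any particular statistical test.

```python
from collections import Counter


def allele_counts(genotypes):
    """Count alleles over diploid genotypes such as 'AG' (two alleles each)."""
    return Counter(allele for g in genotypes for allele in g)


def allele_frequencies(genotypes):
    """Proportional representation of each allele in the genotyped population."""
    counts = allele_counts(genotypes)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    return n * (a * d - b * c) ** 2 / denom


def allele_association(trait_positive, controls, allele, critical=3.841):
    """Compare counts of `allele` between trait-positive and control groups.

    Returns (chi-square statistic, True if above the 5% critical value, 1 df).
    """
    pos, ctl = allele_counts(trait_positive), allele_counts(controls)
    a, c = pos[allele], ctl[allele]
    b, d = sum(pos.values()) - a, sum(ctl.values()) - c
    chi2 = chi_square_2x2(a, b, c, d)
    return chi2, chi2 > critical


# Illustrative genotypes only (not data from this application).
cases = ["AA"] * 6 + ["AG"] * 3 + ["GG"] * 1
ctrls = ["AA"] * 2 + ["AG"] * 3 + ["GG"] * 5
print(allele_frequencies(cases))
print(allele_association(cases, ctrls, "A"))
```

Counting alleles rather than individuals doubles the sample size per subject, which is a common simplification that assumes the two alleles of an individual are independent.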
PCT/IB2003/002376 2002-04-29 2003-04-29 Assays for identifying cholesterol - lowering molecules WO2003093826A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2003233124A AU2003233124A1 (en) 2002-04-29 2003-04-29 Assays for identifying cholesterol - lowering molecules

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US37651002P 2002-04-29 2002-04-29
US60/376.510 2002-04-29
US40229002P 2002-08-09 2002-08-09
US60/402.290 2002-08-09

Publications (2)

Publication Number Publication Date
WO2003093826A2 true WO2003093826A2 (en) 2003-11-13
WO2003093826A3 WO2003093826A3 (en) 2004-12-02

Family

ID=29406760

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2003/002376 WO2003093826A2 (en) 2002-04-29 2003-04-29 Assays for identifying cholesterol - lowering molecules

Country Status (2)

Country Link
AU (1) AU2003233124A1 (en)
WO (1) WO2003093826A2 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001051638A2 (en) * 2000-01-14 2001-07-19 Incyte Genomics, Inc. Drug metabolizing enzymes
WO2001053312A1 (en) * 1999-12-23 2001-07-26 Hyseq, Inc. Novel nucleic acids and polypeptides
WO2002000677A1 (en) * 2000-06-07 2002-01-03 Human Genome Sciences, Inc. Nucleic acids, proteins, and antibodies
WO2002026950A2 (en) * 2000-09-29 2002-04-04 Incyte Genomics, Inc. Transferases
US20020045271A1 (en) * 1998-06-10 2002-04-18 Licata And Tyrrell P.C. Compounds and methods for identifying compounds that interact with microsomal triglyceride transfer protein binding sites on apolipoprotein b and modulate lipid biosynthesis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GRAND-PERRET T ET AL: "SCAP ligands are potent new lipid-lowering drugs" NATURE MEDICINE, NATURE PUBLISHING, CO, US, vol. 7, no. 12, 2001, pages 1332-1338, XP002259713 ISSN: 1078-8956 *

Also Published As

Publication number Publication date
AU2003233124A1 (en) 2003-11-17
AU2003233124A8 (en) 2003-11-17
WO2003093826A3 (en) 2004-12-02

Similar Documents

Publication Publication Date Title
US20110207801A1 (en) Novel Genes, Compositions, and Methods for Modulating the Unfolded Protein Response
JP2002531069A (en) Novel members of the capsaicin / vanilloid receptor family of proteins and their uses
CA2406999A1 (en) Gene and sequence variation associated with sensing carbohydrate compounds and other sweeteners
WO2003065984A2 (en) Methods and compositions for treating cardiovascular disease
AU2003248794B2 (en) Therapeutic methods for reducing fat deposition and treating associated conditions
AU1618300A (en) Compositions and methods relating to the peroxisomal proliferator activated receptor-alpha mediated pathway
US20080199480A1 (en) Methods for Identifying Risk of Type II Diabetes and Treatments Thereof
WO2004063340A2 (en) Methods and compositions for treating cardiovascular disease using 1722, 10280, 59917, 85553, 10653, 9235, 21668, 17794, 2210, 6169, 10102, 21061, 17662, 1468, 12282, 6350, 9035, 1820, 23652, 7301, 8925, 8701, 3533, 9462, 9123, 12788, 17729, 65552, 1261, 21476, 33770, 9380, 2569654, 33556, 53656, 44143, 32612, 10671, 261, 44570, 4
JP2003506042A (en) Novel GPCR-like molecule 15571 of secretin-like family and use thereof
US20040018533A1 (en) Diagnosing predisposition to fat deposition and therapeutic methods for reducing fat deposition and treatment of associated conditions
JP2005535289A (en) Methods and compositions for treating AIDS and HIV related disorders using 1414 molecules, 1481 molecules, 1553 molecules, 34021 molecules, 1720 molecules, 1683 molecules, 1552 molecules, 1682 molecules, 1682 molecules, 1675 molecules, 12825 molecules, 9952 molecules, 5816 molecules, 12002 molecules, 1611 molecules, 1371 molecules, 14324 molecules, 126 molecules, 270 molecules, 312 molecules, 167 molecules, 326 molecules, 18926 molecules, 6747 molecules, 1793 molecules, 1784 molecules or 2045 molecules
WO2006022619A2 (en) Methods for identifying risk of type II diabetes and treatments thereof
US6558936B1 (en) Human lipase proteins, nucleic acids encoding them, and uses of both of these
US8212016B2 (en) NPC1L1 orthologues
WO2003093826A2 (en) Assays for identifying cholesterol-lowering molecules
WO2006022633A1 (en) Methods for identifying a risk of type II diabetes and treatments thereof
AU2002239363A1 (en) Expression analysis of inhibitor of differentiation nucleic acids and polypeptides useful in the diagnosis and treatment of prostate cancer
AU2002236503A1 (en) Expression analysis of KIAA nucleic acids and polypeptides useful in the diagnosis and treatment of prostate cancer
US20040110204A1 (en) Compositions, kits, and methods for prognostication, diagnosis, prevention, and treatment of bone-related disorders and other disorders
US20030054418A1 (en) Gene and sequence variation associated with cancer
US20030064372A1 (en) Gene and sequence variation associated with lipid disorder
US20020123098A1 (en) 55063, a novel human NMDA family member and uses thereof
WO2003004067A1 (en) Methods and compositions for the treatment and diagnosis of body weight disorders
JP2002525117A (en) SLGP protein and nucleic acid molecules and uses thereof
US20040081964A1 (en) Gene and sequence variation associated with sensing carbohydrate compounds and other sweeteners

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: The EPO has been informed by WIPO that EP was designated in this application
122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase in:

Ref country code: JP

WWW WIPO information: withdrawn in national office

Country of ref document: JP