GUCY1B2 GENETIC MARKERS FOR LDL CHOLESTEROL RESPONSE TO STATIN THERAPY
RELATED APPLICATIONS
This application claims the benefit of US Provisional Application 60/392,818 filed June 28, 2002.
FIELD OF THE INVENTION
This invention relates to the field of genomics and pharmacogenetics. More specifically, this invention relates to genetic markers of the gene for guanylate cyclase 1 (GUCY1B2) and their use as predictors of response to treatment with statins.
BACKGROUND OF THE INVENTION
Cardiovascular disease is a major health problem in the United States and worldwide (R. H. Knopp, N. Engl. J. Med. 341:498-511, 1999). The major cause of cardiovascular disease is atherosclerosis, which results from the formation of lipid-laden cellular lesions in one or more of the coronary arteries that supply the heart muscle with blood (Leff, T. and Graber, P.J., "Cardiovascular Diseases" in: Meyers, R., Molecular Biology and Biotechnology (VCH Publishers 1995) pp. 149-153). High levels of low-density lipoprotein cholesterol ("LDLC") have long been associated with an increased risk of developing atherosclerosis (Leff and Graber, supra). However, it is now widely accepted that high levels of plasma triglycerides ("TG") and low levels of high-density lipoprotein cholesterol ("HDLC") are associated with coronary artery disease as well (Gotto AM, American Journal of Cardiology, 87 (5 Suppl.) 13-18, 2001). Another risk factor for cardiovascular disease is high levels of LDL apolipoprotein B ("LDL Apo B"), which is the major lipoprotein associated with LDLC particles (American Heart Association National Center, News Release NR 96-4430 (Circ/apo B), August 1, 1996).
Patients with one or more of the above risk factors are frequently treated with one or more lipid-modifying drugs to achieve certain target levels of LDLC and HDLC that are recommended by the current National Cholesterol Education Program guidelines for treatment of hypercholesterolemia. Usual medical practice is to direct initial drug therapy toward elevated LDLC, with treatment of low HDLC as a secondary endpoint that is often managed by addition, after some weeks, of a second therapeutic agent.
One class of lipid-modifying drugs that is particularly useful for reducing elevated LDLC levels is the statins, which inhibit the activity of 3-hydroxy-3-methylglutaryl-coenzyme A (HMG-CoA) reductase, the rate-limiting enzyme for cholesterol formation in the liver and other tissues (Vaughan et al., J. Amer. College Cardiol. 35:1-10, 2000; Knopp, supra). In addition, in clinical trials of various statin compounds, increases in HDLC levels were observed, with a mean increase of 2% - 12%, depending upon the specific statin compound and the conditions under which it was studied. While most of the common side effects of statin therapy are mild, transient and reversible (e.g., dyspepsia, abdominal pain and flatulence), more severe, long-term adverse reactions to statins occur and include hepatitis, peripheral neuropathy, insomnia, difficulty in concentrating, and elevation of creatine phosphokinase, which is correlated with rhabdomyolysis (Knopp, supra; Lupattelli G et al., Nucl. Med. Commun. 22(5): 575-8, 2001; Moghadasian MH et al., Expert Opin. Pharmacother. 1(4): 683-95, 2000).
Currently, there are five statins sold in the United States: lovastatin and simvastatin (sold by Merck as Mevacor® and Zocor®, respectively); atorvastatin calcium (sold as Lipitor® by the Parke Davis Division of Pfizer); fluvastatin sodium (sold as Lescol® by Novartis); and pravastatin sodium (sold as Pravachol® by Bristol-Myers Squibb) (Knopp, supra). A sixth statin, cerivastatin sodium, was previously sold as Baycol® by Bayer, but was voluntarily removed from the market in 2001 because of safety concerns. Five of these drugs are metabolized by cytochrome P-450 enzyme systems, while the sixth, pravastatin sodium, is metabolized by sulfation and possibly other mechanisms (Knopp, supra).
Extensive studies have been performed with cerivastatin sodium, atorvastatin calcium, simvastatin, and pravastatin sodium to determine efficacy in the treatment of hypercholesterolemia. In three multicenter, placebo-controlled, dose-response studies of cerivastatin sodium, subjects with primary hypercholesterolemia experienced significantly reduced levels of total-cholesterol, LDLC, apolipoprotein B (apo B), triglycerides, total-cholesterol/HDLC ratio, and LDLC/HDLC ratios following treatment with cerivastatin sodium for an 8-week period. The mean decreases in LDLC and mean increases in HDLC for cerivastatin sodium administered once daily in the evening were 25% and 9% at 0.2 mg/day, 31% and 8% at 0.3 mg/day, and 34% and 7% at 0.4 mg/day (Physicians' Desk Reference, 2000, p. 675). Similarly, in two multicenter, placebo-controlled, dose-response studies, atorvastatin calcium given as a single dose over a six-week period significantly reduced total-cholesterol, LDLC, Apo B, and triglycerides. Atorvastatin calcium at 10, 20, 40, and 80 mg/day resulted in mean LDLC decreases/HDLC increases of 39%/6%, 43%/9%, 50%/6%, and 60%/5%, respectively (Physicians' Desk Reference, 2000, p. 2255). Also, a multicenter, double-blind, placebo-controlled, dose-response study of simvastatin showed a significant decrease in total-cholesterol, LDLC, total-cholesterol/HDLC ratio, and LDLC/HDLC ratios in subjects with familial or non-familial hypercholesterolemia (Physicians' Desk Reference, 2000, p. 1917). In comparative studies of simvastatin at a low daily dose versus a high daily dose, the mean percent decreases in LDLC and mean percent increases in HDLC observed were 26% and 10% for 5 mg, 30% and 12% for 10 mg, 41% and 9% for 40 mg, and 47% and 8% for 80 mg.
Finally, in multicenter, placebo-controlled studies of pravastatin sodium given in daily doses from 10-40 mg, subjects with primary hypercholesterolemia showed consistent and significant decreases in total-cholesterol, LDLC, triglycerides, total-cholesterol/HDLC ratio, and LDLC/HDLC ratios. The mean LDLC decreases and HDLC increases for pravastatin sodium administered once daily at bedtime were 22% and 7% at 10 mg/day, 32% and 2% at 20 mg/day, and 34% and 12% at 40 mg/day (Physicians' Desk Reference, 2000, p. 846).
Other comparative studies have suggested that statins differ in some of their clinical properties relevant to reducing the risks of atherosclerosis. A recent double-blind, randomized, parallel, 36-week dose escalation study with 826 hypercholesterolemic patients compared simvastatin and atorvastatin at 40 or 80 mg/day. As dose increased, simvastatin resulted in larger increases in HDLC than atorvastatin (Illingworth DR et al. (2001) Curr Med Res Opin 17(1):43-abstract only). Wierzbicki & Mikhailidis (2002 Int J. Cardiol. 84(1):53-57) reviewed five studies comparing the dose-response effects of atorvastatin and simvastatin on HDLC in hypercholesterolemic patients at daily doses of both drugs ranging from 10 to 80 mg. HDLC was significantly and consistently increased by all doses of simvastatin. However, while atorvastatin showed increases in HDLC at low dose, the pooled data from all five studies suggest a negative dose-response effect, with smaller increases in HDLC with increasing atorvastatin concentration.
The studies described above report population means for changes in LDLC and HDLC that disguise substantial evidence of significant interindividual variation in response to statins. Indeed, any particular individual treated with a statin may experience a 10% to 70% reduction in LDLC (Aguilar-Salinas SA et al., Atherosclerosis 141:203-207, 1998). In addition, physicians have observed that some patients treated with statins exhibit minimal or no increase in HDLC, which is not an optimal response for patients with low HDLC levels. However, physicians currently are unable to identify patients who are at risk for reduced efficacy of statin therapy, which can be expensive and is not without risk. Also, physicians must currently rely on trial and error to determine which statin and dose combination will produce the best LDLC or HDLC response in any particular patient. Thus it would be useful to understand the biological basis for variability of response to statins. Part of this biological basis may be genetic variation in proteins involved in lipid metabolism and/or atherogenicity (Kuivenhoven et al., supra).
One protein involved in regulation of blood pressure and platelet adhesion and aggregation is soluble guanylate cyclase 1. GUCY1B2 (also known as GCS-beta-2) is one subunit of soluble guanylate cyclase 1, a heterodimeric enzyme consisting of an alpha and a beta subunit. Soluble guanylate cyclase 1 catalyzes the conversion of GTP to cGMP in response to extracellular stimuli. cGMP, in turn, acts as a second messenger for a number of regulatory mechanisms, including protein kinases, phosphodiesterases and ion channels. Soluble guanylate cyclase 1 is particularly important as the receptor for the vasodilator nitric oxide (NO) and as the target protein of nitrovasodilator pharmaceuticals (vasodilators that result in the release of NO). Because of the role of NO in regulating blood pressure, soluble guanylate cyclase 1 is an important target for therapy designed to treat hypertension. Nitrovasodilators have been used for over 100 years to treat angina pectoris and coronary heart disease. In addition to blood pressure regulation, the guanylate cyclase-mediated NO-cGMP pathway is also involved in the regulation of platelet adhesion and aggregation, and modulation of neuronal transmission (since NO can act as a neurotransmitter) (Zabel et al., Biochem J 1998; 335 (Pt 1):51-57). Nitric oxide plays a major role in the cardiovascular system by regulating smooth muscle tone. There is also evidence that nitric oxide mediates oxidation of LDL-C, and that such oxidation contributes to fatty streak formation in arteries (Circulation 2002 Apr 30; 105(17):2078-82). Furthermore, statins are known to stimulate angiogenesis via an NO-mediated system (Circ Res 2001 Nov 9;89(10):866-73).
The guanylate cyclase 1, soluble, beta 2 gene is located on chromosome 13q14.3 and contains 17 exons that encode a 617 amino acid protein. Although a complete genomic sequence was not available for GUCY1B2, a reference sequence for the GUCY1B2 gene comprising all of the exons was assembled by Applicants from several non-contiguous sequences. This reference sequence, shown in the contiguous lines of Figure 1 (SEQ ID NO: 1), is a composite genomic sequence assembled from Genaissance Reference Nos. 10409167, 10409214, 10409220, 10409226, 10409232, 10409240, 10409250, 10409256, 10409262, 10409268 and 10409274. Reference sequences for the coding sequence (GenBank Accession No. AF038499.2) and protein are shown in Figures 2 (SEQ ID NO: 2) and 3 (SEQ ID NO: 3), respectively. Because of the potential for variation in the GUCY1B2 gene to affect the expression and function of the encoded protein, it would be useful to identify polymorphisms in the GUCY1B2 gene. Such information could be applied for studying the biological function of GUCY1B2 as well as in identifying drugs targeting this protein for the treatment of disorders related to its abnormal expression or function. In particular, it would also be useful to determine whether any GUCY1B2 polymorphisms are associated with variation in response to treatment with statins. Such information would assist the treating physician in developing the most appropriate therapy regimen for patients at risk for or diagnosed with cardiovascular disease.
SUMMARY OF THE INVENTION
Accordingly, the inventors have identified correlations between certain haplotypes in the GUCY1B2 gene and differential LDLC response to treatment with atorvastatin calcium in a cohort of individuals participating in a randomized, 16-week, open-label investigation of drug response in relationship to gene variants in adult subjects with primary hypercholesterolemia. The inventors have also discovered that the copy number of these GUCY1B2 haplotypes affects the change in LDLC level resulting after treatment with atorvastatin. The GUCY1B2 haplotypes shown to have association with LDLC level in response to treatment with atorvastatin calcium are shown in Table 1 below. One or two copies of any of haplotypes 1 to 10 or 27 to 38 in Table 1, zero copy of any of haplotypes 11-26 in Table 1 or zero or one copy of any of haplotypes 39-64 in Table 1 are defined herein as a statin response marker I and are correlated with a larger mean percent reduction in LDLC at a given atorvastatin dose (See Tables 6 and 7). Correspondingly, zero copy of any of haplotypes 1 to 10 or 27 to 38 in Table 1, at least one copy of any of haplotypes 11-26 in Table 1 or two copies of any of haplotypes 39-64 in Table 1 are defined herein as a statin response marker II and are correlated with a smaller mean percent reduction in LDLC at a given atorvastatin dose (See Tables 6 and 7). The GUCY1B2 haplotypes and copy number that comprise statin response markers I and statin response markers II are summarized in Table 1 below.
a The location of each polymorphic site (PS) in Fig. 1 (SEQ ID NO: 1) is shown below the PS number designation. An asterisk indicates the polymorphic site is not part of the haplotype.
In the patient cohort, the group of individuals having a statin response marker I experienced a larger mean percent decrease in LDLC in response to atorvastatin calcium than the group of individuals having a statin response marker II. Presence or absence of statin response marker I did not affect the mean percent decrease in LDLC observed in the groups of individuals treated with simvastatin or pravastatin sodium. Thus, testing for the presence of any of these statin response markers I or II in patients will provide valuable information that can be used by the treating physician to devise the most effective statin treatment regimen.
In addition, as described in more detail below, the inventors believe that additional statin response markers I and II may readily be identified based on linkage disequilibrium between the above GUCY1B2 haplotypes or their component polymorphisms and other haplotypes or polymorphisms, respectively, that are located in the GUCY1B2 gene or other genes. In particular, statin response markers of the invention include those comprising haplotypes that are in linkage disequilibrium with any of haplotypes 1 to 64 in Table 1, hereinafter referred to as "linked haplotypes", as well as "substitute haplotypes" for any of haplotypes 1 to 64 in Table 1 in which one or more of the polymorphic sites in the original haplotype is substituted with another polymorphic site, wherein the allele at the substitute polymorphic site is in linkage disequilibrium with the allele at the replaced or substituted polymorphic site.
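As an illustration of what linkage disequilibrium between two alleles means quantitatively, the following minimal Python sketch computes the textbook pairwise measures D, D' and r² from allele and haplotype frequencies. The passage above does not state which measure or threshold is used to identify linked or substitute haplotypes, so the function name `ld_stats`, its parameters, and the choice of measures are assumptions made for illustration only.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float) -> dict:
    """Pairwise LD between allele A at one site and allele B at another.

    p_ab : frequency of chromosomes carrying both A and B
    p_a  : frequency of allele A at the first polymorphic site
    p_b  : frequency of allele B at the second polymorphic site
    (Illustrative helper; not taken from the source document.)
    """
    # D is the deviation of the observed haplotype frequency from the
    # frequency expected if the two alleles were independent.
    d = p_ab - p_a * p_b

    # D' rescales D by its maximum attainable magnitude given the
    # allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two alleles.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = (d * d) / denom if denom > 0 else 0.0

    return {"D": d, "D_prime": d_prime, "r2": r2}
```

Under this sketch, two alleles that always occur together at equal frequency give r² of 1, while independent alleles give D of 0. Any cutoff for treating a site as a usable substitute would have to come from the detailed description, not from this example.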
The correlations between the different types of statin response markers and varying reduction in LDLC in response to treatment with statins suggest that testing for the presence of a statin response marker I or a statin response marker II in patients would provide valuable information that can be used by the treating physician to choose the most effective statin or combination therapy for achieving a desired effect on LDLC levels. In addition, these correlations suggest that any clinical trial of a statin should include in its design or analysis a consideration of the potential effect of statin response markers on the efficacy of statin response. Accordingly, some aspects of the invention are based on the correlations of statin response markers I and II with a differential LDLC response to treatment with atorvastatin or pharmaceutically acceptable salts of atorvastatin acid. In one aspect, the invention provides methods and kits for determining whether an individual has a statin response marker I or a statin response marker II. These methods and kits are useful for predicting the expected therapeutic response of an individual to treatment with statins, selecting an optimal statin for an individual or choosing appropriate therapy for an individual.
In one embodiment, a method for determining whether an individual has a statin response marker I or a statin response marker II comprises determining the copy number present in the individual of a particular haplotype. The haplotype is any one of GUCY1B2 haplotypes 1 to 64 shown in Table 1; a linked haplotype for any of haplotypes 1 to 64 in Table 1; and a substitute haplotype for any of haplotypes 1 to 64 in Table 1. The individual has a statin response marker I if the individual has at least one copy of any of haplotypes 1-10, 27-38, a haplotype linked to any of haplotypes 1-10, 27-38, or a substitute haplotype for any of haplotypes 1-10, 27-38; if the individual has zero copy of any of haplotypes 11-26, a haplotype linked to any of haplotypes 11-26, or a substitute haplotype for any of haplotypes 11-26; or if the individual has zero or one copy of any of haplotypes 39 to 64, a haplotype linked to any of haplotypes 39 to 64, or a substitute haplotype for any of haplotypes 39 to 64. The individual has a statin response marker II if the individual has zero copy of any of haplotypes 1-10, 27-38, a haplotype linked to any of haplotypes 1-10, 27-38, or a substitute haplotype for any of haplotypes 1-10, 27-38; if the individual has at least one copy of any of haplotypes 11-26, a haplotype linked to any of haplotypes 11-26, or a substitute haplotype for any of haplotypes 11-26; or if the individual has two copies of any of haplotypes 39 to 64, a haplotype linked to any of haplotypes 39 to 64, or a substitute haplotype for any of haplotypes 39 to 64. In preferred embodiments of these methods, the haplotype comprises one of haplotypes 1, 2, 27-37 in Table 1; preferably the haplotype comprises one of haplotypes 1, 31, 32 or 33.
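The copy-number rule in the preceding paragraph can be sketched as a small lookup. This is a minimal illustration that covers only the numbered haplotypes of Table 1 (not linked or substitute haplotypes), and the names `GROUP_A`, `GROUP_B`, `GROUP_C` and `statin_response_marker` are hypothetical labels introduced here, not terms from the specification.

```python
# Hypothetical groupings of the Table 1 haplotype numbers, following the
# three copy-number rules stated in the text.
GROUP_A = set(range(1, 11)) | set(range(27, 39))  # haplotypes 1-10 and 27-38
GROUP_B = set(range(11, 27))                      # haplotypes 11-26
GROUP_C = set(range(39, 65))                      # haplotypes 39-64


def statin_response_marker(haplotype_id: int, copies: int) -> str:
    """Return "I" or "II" for one haplotype and its copy number (0, 1 or 2)."""
    if copies not in (0, 1, 2):
        raise ValueError("copy number must be 0, 1 or 2")
    if haplotype_id in GROUP_A:
        # At least one copy gives marker I; zero copies gives marker II.
        return "I" if copies >= 1 else "II"
    if haplotype_id in GROUP_B:
        # Zero copies gives marker I; at least one copy gives marker II.
        return "I" if copies == 0 else "II"
    if haplotype_id in GROUP_C:
        # Zero or one copy gives marker I; two copies gives marker II.
        return "I" if copies <= 1 else "II"
    raise ValueError("haplotype number is not one of 1 to 64")
```

The function evaluates one haplotype at a time, mirroring the phrase "a particular haplotype" in the method; how results for several haplotypes would be reconciled in one individual is not addressed by this sketch.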
In another embodiment of the invention, a method for assigning an individual to a first or second statin response marker group comprises determining the copy number present in the individual of a particular haplotype and assigning the individual to a statin response marker group based on the copy number of that haplotype. The individual is assigned to the first statin response marker group if the individual has at least one copy of any of haplotypes 1-10, 27-38, a haplotype linked to any of haplotypes 1-10, 27-38, or a substitute haplotype for any of haplotypes 1-10, 27-38; if the individual has zero copy of any of haplotypes 11-26, a haplotype linked to any of haplotypes 11-26, or a substitute haplotype for any of haplotypes 11-26; or if the individual has zero or one copy of any of haplotypes 39 to 64, a haplotype linked to any of haplotypes 39 to 64, or a substitute haplotype for any of haplotypes 39 to 64. The individual is assigned to the second statin response marker group if the individual has zero copy of any of haplotypes 1-10, 27-38, a haplotype linked to any of haplotypes 1-10, 27-38, or a substitute haplotype for any of haplotypes 1-10, 27-38; if the individual has at least one copy of any of haplotypes 11-26, a haplotype linked to any of haplotypes 11-26, or a substitute haplotype for any of haplotypes 11-26; or if the individual has two copies of any of haplotypes 39 to 64, a haplotype linked to any of haplotypes 39 to 64, or a substitute haplotype for any of haplotypes 39 to 64. In preferred embodiments of these methods, the haplotype comprises one of haplotypes 1, 2, 27-37 in Table 1; preferably the haplotype comprises one of haplotypes 1, 31, 32 or 33.
One embodiment of a kit for determining whether an individual has a statin response marker I or a statin response marker II comprises a set of oligonucleotides designed for identifying at least one of the alleles present at each polymorphic site (PS) in a set of polymorphic sites. The set of polymorphic sites (PSs) comprises the set of PSs for any one of GUCY1B2 haplotypes 1 to 64 shown in Table 1, the set of PSs for a linked haplotype to any one of GUCY1B2 haplotypes 1 to 64 shown in Table 1; or the set of PSs for a substitute haplotype for any one of GUCY1B2 haplotypes 1 to 64 shown in Table 1. In a preferred embodiment, the set of PSs comprises at least PS11, and may further comprise PS2 and PS6. In a further embodiment, the kit comprises a manual with instructions for performing one or more reactions on a human nucleic acid sample to identify the allele(s) present in the individual at each polymorphic site in the set of polymorphic sites and determining if the individual has a statin response marker I or a statin response marker II based on the identified allele(s).
Another aspect of the invention is a method of selecting a statin therapy to provide an optimal LDLC response in an individual. The method comprises determining whether the individual has a statin response marker I or a statin response marker II and selecting a statin therapy based on the results of the determining step. If the individual has a statin response marker II, then the selected statin therapy comprises a higher than standard dose of atorvastatin or a pharmaceutically acceptable salt of atorvastatin acid. If the individual has a statin response marker I, then the selected statin therapy comprises a standard dose of atorvastatin or a pharmaceutically acceptable salt of atorvastatin acid. In yet another embodiment, the invention provides a method for predicting an individual's LDLC response to treatment with a statin. The statin is atorvastatin or a pharmaceutically acceptable salt of atorvastatin acid. The method comprises determining whether the individual has a statin response marker I or a statin response marker II and making a response prediction based on the results of the determining step. In some embodiments, if the individual is determined to have a statin response marker I, then the response prediction is that the individual will likely experience a larger reduction in LDLC in response to treatment with the statin than an individual determined to have a statin response marker II; and wherein if the individual is determined to have a statin response marker II, then the response prediction is that the individual will likely experience a smaller reduction in LDLC in response to treatment with the statin than an individual determined to have a statin response marker I.
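The selection and prediction steps just described reduce to a two-way mapping from marker to outcome, which might be sketched as follows. The function names and the returned strings are illustrative assumptions; the text specifies only "standard" versus "higher than standard" dose and "larger" versus "smaller" LDLC reduction, and no numeric doses are implied here.

```python
def select_atorvastatin_therapy(marker: str) -> str:
    """Map a statin response marker ("I" or "II") to the dose category in the text."""
    if marker == "I":
        return "standard dose"
    if marker == "II":
        return "higher than standard dose"
    raise ValueError("marker must be 'I' or 'II'")


def predict_ldlc_response(marker: str) -> str:
    """Map a statin response marker to the relative LDLC reduction predicted in the text."""
    if marker == "I":
        return "larger reduction than marker II"
    if marker == "II":
        return "smaller reduction than marker I"
    raise ValueError("marker must be 'I' or 'II'")
```

Both predictions are relative statements about group means in the studied cohort, so the sketch deliberately returns a comparison rather than a percentage.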
In other aspects, the invention provides: (i) a method of seeking regulatory approval for marketing a pharmaceutical formulation comprising a statin as at least one active ingredient for treating a disease or condition in a population partially or wholly defined by having a statin response marker, (ii) an article of manufacture comprising the pharmaceutical formulation that is marketed for treating the defined population, (iii) a method of manufacturing a drug product comprising the pharmaceutical formulation, and (iv) a method of marketing the drug product for treating the defined population. In preferred embodiments, the disease or condition is a cardiovascular or coronary artery disorder, e.g., hypercholesterolemia. The statin is atorvastatin or a pharmaceutically acceptable salt of atorvastatin acid. In some embodiments, the method of seeking regulatory approval comprises conducting at least one clinical trial which comprises administering the pharmaceutical formulation to first and second treatment groups of patients having the disease or condition, wherein each patient in the first treatment group has a statin response marker I and each patient in the second treatment group has a statin response marker II, demonstrating that the second treatment group exhibits a mean reduction in LDLC that is worse than the mean reduction in LDLC exhibited by the first treatment group. In some embodiments, the method further comprises filing with a regulatory agency an application for marketing approval of the pharmaceutical formulation with a label stating that a larger than standard dose of the pharmaceutical formulation is indicated for treating the disease or condition in patients having the statin response marker II.
In other embodiments, the method further comprises filing with a regulatory agency an application for marketing approval of the pharmaceutical formulation with a label stating that the pharmaceutical formulation is indicated for treating the disease or condition in patients having the statin response marker I. In preferred embodiments, the regulatory agency is the United States Food and Drug Administration (FDA) or the European Agency for the Evaluation of Medicinal Products (EMEA), or a future equivalent of these agencies. In one embodiment, the article of manufacture comprises the pharmaceutical formulation and at least one indicium identifying a population for whom a different dose regimen of the pharmaceutical formulation is indicated, wherein the identified population is partially or wholly defined by having a statin response marker II. In these embodiments, a trial population having the statin response marker II exhibits a smaller mean LDLC reduction in response to the statin than a trial population lacking the defining statin response marker. Another embodiment of the article of manufacture comprises packaging material and the pharmaceutical formulation contained within the packaging material, wherein the packaging material comprises a label approved by a regulatory agency for the pharmaceutical formulation, wherein the label states that a different dosing regimen of the pharmaceutical formulation is indicated for a population partially or wholly defined by having a statin response marker II. In yet other embodiments, an article of manufacture according to the invention comprises a pharmaceutical formulation comprising a statin and the defining statin response marker is a statin response marker I.
The method for manufacturing the drug product comprises combining in a package a pharmaceutical formulation comprising a statin as at least one active ingredient and a label which states that a different dosing regimen for the drug product is indicated for treating a population defined wholly or partially by having a statin response marker II.
The method for marketing the drug product comprises promoting to a target audience the use of the drug product for treating individuals who belong to the defined population.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 illustrates a reference sequence for the GUCY1B2 gene (contiguous lines), with the start and stop positions of each region of coding sequence indicated with a bracket ([ or ]) and the numerical position below the sequence and the polymorphic site(s) and polymorphism(s) identified by Applicants in a reference population indicated by the variant nucleotide positioned below the polymorphic site in the sequence. SEQ ID NO: 1 is equivalent to Figure 1, with the two alternative allelic variants of each polymorphic site indicated by the appropriate nucleotide symbol (R= G or A, Y= T or C, M= A or C, K= G or T, S= G or C, and W= A or T; WIPO standard ST.25). SEQ ID NO:41 is a modified version of SEQ ID NO: 1 that shows the context sequence of each polymorphic site enumerated in Table 5 in a uniform format to facilitate electronic searching. For each polymorphic site, SEQ ID NO:41 contains a block of 60 bases of the nucleotide sequence encompassing the centrally-located polymorphic site at the 30th position, followed by 60 bases of unspecified sequence to represent that each PS is separated by genomic sequence whose composition is defined elsewhere herein.
Figure 2 illustrates a reference sequence for the GUCY1B2 coding sequence (contiguous lines; SEQ ID NO: 2), with the polymorphic site(s) and polymorphism(s) identified by Applicants in a reference population indicated by the variant nucleotide positioned below the polymorphic site in the sequence.
Figure 3 illustrates a reference sequence for the GUCY1B2 protein (contiguous lines; SEQ ID NO: 3), with the variant amino acid(s) caused by the polymorphism(s) of Figure 2 positioned below the polymorphic site in the sequence.
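The sequence conventions described for SEQ ID NO: 1 and SEQ ID NO:41 can be illustrated with a short Python sketch. It decodes the six two-allele ambiguity symbols listed above and builds a 60-base context block with the polymorphic site at the 30th position, followed by 60 bases of unspecified sequence. The helper names, the 1-based position argument, and the use of "N" as the symbol for unspecified bases are assumptions made for this example.

```python
# Two-allele ambiguity symbols as listed in the Figure 1 description.
IUPAC_ALLELES = {
    "R": ("G", "A"),
    "Y": ("T", "C"),
    "M": ("A", "C"),
    "K": ("G", "T"),
    "S": ("G", "C"),
    "W": ("A", "T"),
}


def polymorphic_sites(seq: str) -> list:
    """Return (1-based position, alleles) for each ambiguity symbol in seq."""
    return [
        (i + 1, IUPAC_ALLELES[base])
        for i, base in enumerate(seq.upper())
        if base in IUPAC_ALLELES
    ]


def context_block(seq: str, ps_position: int, spacer: str = "N") -> str:
    """Build one SEQ ID NO:41-style block for the site at 1-based ps_position.

    The block is 60 bases with the polymorphic site at the 30th position,
    followed by 60 spacer symbols ("N" is an assumed placeholder).
    """
    start = ps_position - 30  # 0-based index of the first base in the window
    end = start + 60
    if start < 0 or end > len(seq):
        raise ValueError("not enough flanking sequence for a 60-base window")
    return seq[start:end] + spacer * 60
```

A window that places the site at position 30 of 60 has 29 bases upstream and 30 downstream, which is why the sketch rejects sites too close to either end of the input rather than padding them.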
DEFINITIONS
In the context of this disclosure, the terms below shall be defined as follows unless otherwise indicated:
Allele - A particular form of a genetic locus, distinguished from other forms by its particular nucleotide sequence, or one of the alternative polymorphisms found at a polymorphic site.
Gene - A segment of DNA that contains the coding sequence for a protein, wherein the segment may include promoters, exons, introns, and other untranslated regions that control expression.
Genotype - An unphased 5' to 3' sequence of nucleotide pair(s) found at a set of one or more polymorphic sites in a locus on a pair of homologous chromosomes in an individual. As used herein, genotype includes a full-genotype and/or a sub-genotype as described below.
Genotyping - A process for determining a genotype of an individual.
Haplotype - A 5' to 3' sequence of nucleotides found at a set of one or more polymorphic sites in a locus on a single chromosome from a single individual.
Haplotype pair - The two haplotypes found for a locus in a single individual.
Haplotyping - A process for determining one or more haplotypes in an individual and includes use of family pedigrees, molecular techniques and/or statistical inference.
Haplotype data - Information concerning one or more of the following for a specific gene: a listing of the haplotype pairs in an individual or in each individual in a population; a listing of the different haplotypes in a population; frequency of each haplotype in that or other populations, and any known associations between one or more haplotypes and a trait.
Isolated - As applied to a biological molecule such as RNA, DNA, oligonucleotide, or protein, isolated means the molecule is substantially free of other biological molecules such as nucleic acids, proteins, lipids, carbohydrates, or other material such as cellular debris and growth media. Generally, the term "isolated" is not intended to refer to a complete absence of such material or to absence of water, buffers, or salts, unless they are present in amounts that substantially interfere with the methods of the present invention.
Locus - A location on a chromosome or DNA molecule corresponding to a gene or a physical or phenotypic feature, where physical features include polymorphic sites.
Nucleotide pair - The nucleotides found at a polymorphic site on the two copies of a chromosome from an individual.
Phased - As applied to a sequence of nucleotide pairs for two or more polymorphic sites in a locus, phased means the combination of nucleotides present at those polymorphic sites on a single copy of the locus is known.
Polymorphic site (PS) - A position on a chromosome or DNA molecule at which at least two alternative sequences are found in a population.
Polymorphism - The sequence variation observed in an individual at a polymorphic site. Polymorphisms include nucleotide substitutions, insertions, deletions and microsatellites and may, but need not, result in detectable differences in gene expression or protein function.
Polynucleotide - A nucleic acid molecule comprised of single-stranded RNA or DNA or comprised of complementary, double-stranded DNA.
Population Group - A group of individuals sharing a common ethnogeographic origin.
Reference Population - A group of subjects or individuals who are predicted to be representative of the genetic variation found in the general population. Typically, the reference population represents the genetic variation in the population at a certainty level of at least 85%, preferably at least 90%, more preferably at least 95% and even more preferably at least 99%.
Set of Polymorphic Sites - One or more polymorphic sites.
Single Nucleotide Polymorphism (SNP) - Typically, the specific pair of nucleotides observed at a single polymorphic site. In rare cases, three or four nucleotides may be found.
Statin Response Marker I - At least one copy of any of haplotypes 1-10, 27-38 in Table 1, a linked haplotype to any one of haplotypes 1-10, 27-38, or a substitute haplotype for any one of haplotypes 1-10, 27-38; zero copy of any of haplotypes 11-26 in Table 1, a linked haplotype to any one of haplotypes 11-26, or a substitute haplotype for any one of haplotypes 11-26; or zero or one copy of any of haplotypes 39-64 in Table 1, a linked haplotype to any one of haplotypes 39-64, or a substitute haplotype for any one of haplotypes 39-64.
Statin Response Marker II ~ zero copy of any of haplotypes 1-10, 27-38 in Table 1, a linked haplotype to any one of haplotypes 1-10, 27-38, or a substitute haplotype for any one of haplotypes 1- 10, 27-38; at least one copy of any of haplotypes 11-26 in Table 1, a linked haplotype to any one of haplotypes 11-26, or a substitute haplotype for any one of haplotypes 11-26; or two copies of any of haplotypes 39-64 in Table 1, a linked haplotype to any one of haplotypes 39-64, or a substitute haplotype for any one of haplotypes 39-64.
Subject - A human individual whose genotypes or haplotypes or response to treatment or disease state are to be determined.
Treatment - A stimulus administered internally or externally to a subject.
Unphased - As applied to a sequence of nucleotide pairs for two or more polymorphic sites in a locus, unphased means the combination of nucleotides present at those polymorphic sites on a single copy of the locus is not known.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
The inventors herein have discovered certain haplotypes in the GUCY1B2 gene that are associated with variation in LDLC reduction in response to treatment with atorvastatin. The inventors have also discovered that the copy number of these GUCY1B2 haplotypes affects the LDLC reduction in response to atorvastatin. Each statin response marker of the invention is a combination of a particular haplotype, or genetic marker, and the copy number for that haplotype, or genetic marker. Preferably the genetic marker component of the statin response marker is one of the GUCY1B2 haplotypes shown in Table 1. The GUCY1B2 polymorphic sites in these GUCY1B2 haplotypes are referred to herein as PS2, PS5, PS6, PS11, PS15, PS16 and PS17 and are located in the GUCY1B2 gene at positions corresponding to those identified in Figure 1 (SEQ ID NO: 1). In describing the polymorphic sites in the statin response markers of the invention, reference is made to the sense strand of a gene for convenience. However, as recognized by the skilled artisan, nucleic acid molecules containing a particular gene may be complementary double stranded molecules and thus reference to a particular site or haplotype on the sense strand refers as well to the corresponding site or haplotype on the complementary antisense strand. Further, reference may be made to detecting a genetic marker or haplotype for one strand and it will be understood by the skilled artisan that this includes detection of the complementary haplotype on the other strand.
The location of a polymorphic site in an individual's GUCY1B2 gene or fragment may be identified by aligning the sequence of the gene or fragment against the corresponding region of SEQ ID NO: 1. Alignment of a polymorphism in SEQ ID NO: 1 against an alternative GUCY1B2 sequence to determine the corresponding position of the polymorphic site in that alternative GUCY1B2 sequence should use a context sequence from SEQ ID NO: 1 ranging from about 25 to about 500 nucleotides with the polymorphism in any position of the context sequence. The alignment should require a degree of homology appropriate for the length of the context sequence in establishing alignment between the two sequences. Determining the degree of homology or the permissible number of mismatches between the two sequences is well within the skills of the routine practitioner of sequence alignment algorithms. Preferably, the context sequence from SEQ ID NO: 1 is about 50 to about 300 nucleotides, with the polymorphism positioned in the approximate center of the context sequence. The number of nucleotides in SEQ ID NO: 1 separating any two polymorphic sites shown in Table 5, including nucleotide positions in SEQ ID NO: 1 not sequenced by Applicants, may not describe the number of nucleotides separating those polymorphic sites in a different GUCY1B2 reference sequence. The skilled practitioner would recognize the potential for variation in the relative spacing of the polymorphic sites depending on the chosen reference sequence and would be able to use the context sequences for each polymorphic site to determine the presence of any particular combination of polymorphisms comprising a haplotype within the different GUCY1B2 reference sequence.
As described in more detail in Examples 1-3, the statin response markers of the invention are based on the discovery by the inventors of correlations between certain haplotypes in the GUCY1B2 gene and variation in reduction in LDLC levels in response to statin treatment in a cohort of individuals participating in a randomized, 16-week, open-label investigation of drug response in relationship to gene variants in adult subjects with primary hypercholesterolemia. In particular, the inventors herein discovered that copy number of GUCY1B2 haplotypes 1 to 64 in Table 1 significantly affected the reduction in LDLC levels observed in patients participating in the study following treatment at high dose with Lipitor® (atorvastatin calcium). The group of patients with zero copy of any of haplotypes 1 to 10 or 27 to 38 in Table 1, at least one copy of any of haplotypes 11-26 in Table 1 or two copies of any of haplotypes 39-64 in Table 1 is correlated with a smaller mean percent reduction in LDLC at a given atorvastatin dose (see Tables 6 and 7) than the patient group having one or two copies of any of haplotypes 1 to 10 or 27 to 38 in Table 1, zero copy of any of haplotypes 11-26 in Table 1 or zero or one copy of any of haplotypes 39-64 in Table 1. Therefore these haplotypes, in combination with their haplotype copy number, can be used to differentiate the LDLC reduction that would be predicted to occur in an individual or a trial population after treatment with a statin. Consequently, one or two copies of any of haplotypes 1 to 10 or 27 to 38 in Table 1, zero copy of any of haplotypes 11-26 in Table 1 or zero or one copy of any of haplotypes 39-64 in Table 1 are referred to herein as a statin response marker I, while zero copy of any of haplotypes 1 to 10 or 27 to 38 in Table 1, at least one copy of any of haplotypes 11-26 in Table 1 or two copies of any of haplotypes 39-64 in Table 1 is referred to herein as a statin response marker II.
The NCEP ATP III report indicates that high LDLC is a risk factor for CHD. Therefore, herein, a "favorable", "better" or "best" LDLC response after treatment denotes that the change in LDLC measured after statin treatment, relative to the initial baseline LDLC, shows a larger or largest decrease in measured value. For example, no change or a decrease in LDLC after treatment relative to the baseline measurement is a better response than an increase in LDLC after treatment relative to the baseline measurement. Conversely, an "unfavorable", "worse" or "worst" LDLC response after treatment denotes herein that the change in LDLC measured after statin treatment, relative to the initial baseline LDLC value, shows a smaller or smallest decrease in measured value. For example, no change or a small decrease in LDLC after treatment relative to the baseline measurement is a worse response than a large decrease in LDLC after treatment relative to the baseline measurement. The comparison of two or more values for a change in LDLC relative to baseline after a treatment may be for a single individual before and after two different therapy regimens, e.g., a low dose vs. a high dose regimen, or for two different drugs. The comparison of two or more values for a change in LDLC relative to baseline after a treatment may be between two or more single individuals or two or more population groups before and after two different therapy regimens.
In addition, the skilled artisan would expect that there might be additional polymorphisms in the GUCY1B2 gene or elsewhere on chromosome 13 that are in high LD with one or more of the polymorphisms in the haplotypes comprising a statin response marker I or a statin response marker II. Two particular nucleotide alleles at different polymorphic sites are said to be in LD if the presence of one of the alleles at one of the sites tends to predict the presence of the other allele at the other site on the same chromosome (Stevens, JC, Mol. Diag. 4: 309-17, 1999). One of the most frequently used measures of linkage disequilibrium is Δ², which is calculated using the formula described in Devlin, B. and Risch, N. (1995, Genomics, 29(2):311-22). Basically, Δ² measures how well an allele X at a first polymorphic site predicts the occurrence of an allele Y at a second polymorphic site on the same chromosome. The measure only reaches 1.0 when the prediction is perfect (e.g., X if and only if Y). Thus, the skilled artisan would expect that all of the embodiments of the invention described herein may frequently be practiced by substituting the allele at any (or all) of the specifically identified GUCY1B2 polymorphic sites in GUCY1B2 haplotypes 1 to 64 in Table 1 with an allele at another polymorphic site that is in high LD with the allele at the specifically identified polymorphic site. This "substitute polymorphic site" may be one that is currently known or subsequently discovered and may be present at a polymorphic site in the GUCY1B2 gene or elsewhere on chromosome 13. Preferably, the substitute polymorphic site is present in the GUCY1B2 gene or in a genomic region of about 100 kilobases spanning the GUCY1B2 gene.
Further, the inventors contemplate that there will be other haplotypes in the GUCY1B2 gene or elsewhere on chromosome 13 that are in high LD with any one of GUCY1B2 haplotypes 1 to 64 in Table 1 that would therefore also be predictive of the LDLC response. Preferably, the linked haplotype is present in the GUCY1B2 gene or in a genomic region of about 100 kilobases spanning the GUCY1B2 gene. The linkage disequilibrium between a GUCY1B2 haplotype 1 to 64 in Table 1 and a linked haplotype can also be measured using Δ².
In preferred embodiments, the linkage disequilibrium between the allele at a polymorphic site in any of the GUCY1B2 haplotypes in Table 1 and the allele at a substitute polymorphic site to replace it, or between any of the GUCY1B2 haplotypes in Table 1 and a linked haplotype, has a Δ² value, as measured in a suitable reference population, of at least 0.75, more preferably at least 0.80, even more preferably at least 0.85 or at least 0.90, yet more preferably at least 0.95, and most preferably 1.0. A suitable reference population for this Δ² measurement is preferably selected from a population with the distribution of the ethnic background of its members reflecting the population of patients to be treated with statins, which may be the general population, a population using statins, a population with coronary heart disease (CHD), cardiovascular disease (CVD) or CHD (or CVD) risk factors, and the like.
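By way of illustration only, the Δ² comparison described above might be sketched as follows. The sketch assumes that Δ² is the squared correlation between allele X at the first site and allele Y at the second site, computed from haplotype and allele frequencies in a reference population; the function names and the example frequencies are hypothetical and are not taken from the Examples.

```python
def delta_squared(p_xy, p_x, p_y):
    """Squared correlation between allele X (site 1) and allele Y (site 2).

    p_xy: frequency of chromosomes carrying both X and Y
    p_x:  frequency of allele X at the first polymorphic site
    p_y:  frequency of allele Y at the second polymorphic site
    (Assumed form of the measure; all inputs are fractions.)
    """
    if not (0.0 < p_x < 1.0 and 0.0 < p_y < 1.0):
        raise ValueError("both sites must be polymorphic in the sample")
    d = p_xy - p_x * p_y  # deviation from independence
    return d * d / (p_x * (1.0 - p_x) * p_y * (1.0 - p_y))


def meets_ld_threshold(value, threshold=0.75):
    """True if the measured value reaches the chosen preferred threshold."""
    return value >= threshold


# Hypothetical frequencies: X occurs if and only if Y, so the value is 1.0.
perfect = delta_squared(p_xy=0.3, p_x=0.3, p_y=0.3)
# Hypothetical frequencies: X and Y are independent, so the value is 0.0.
independent = delta_squared(p_xy=0.25, p_x=0.5, p_y=0.5)
```

A candidate substitute polymorphic site would be retained under this sketch only if `meets_ld_threshold` returns true for the threshold selected from the preferred values recited above.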
LD patterns in genomic regions are readily determined empirically in appropriately chosen samples using various techniques known in the art for determining whether any two alleles (at two different polymorphic sites or two haplotypes) are in linkage disequilibrium (Weir, B.S. 1996 Genetic Data Analysis II, Sinauer Associates, Inc. Publishers, Sunderland, MA). The skilled artisan may readily select which method of determining LD will be best suited for a particular sample size and genomic region. Similarly, the ability of substitute haplotypes, that contain an allele at one or more substitute polymorphic sites, or of linked haplotypes, that are in high LD with one or more of haplotypes 1 to 64 in Table 1, to predict the LDLC response to one of the statins studied herein may also be readily tested by the skilled artisan.
Thus, reference herein to a statin response marker I is deemed to include at least one copy of haplotypes that (A) either (1) have a polymorphism sequence that is similar to that of any one of haplotypes 1-10, 27-38 shown in Table 1, but in which the allele at one or more of the specifically identified GUCY1B2 polymorphic sites in that haplotype has been substituted with the allele at a polymorphic site in high LD with the allele at the specifically identified polymorphic site (a "substitute haplotype"); or (2) are in high linkage disequilibrium with any one of haplotypes 1-10, 27-38 shown in Table 1 (a "linked haplotype"); and (B) behave similarly to at least one copy of any of haplotypes 1-10, 27-38 in terms of predicting an individual's LDLC reduction in response to atorvastatin. Additionally, reference herein to a statin response marker I is deemed to include zero copy of haplotypes that (A) either (1) are a substitute haplotype for any one of haplotypes 11-26 shown in Table 1, or (2) are a linked haplotype to any one of haplotypes 11-26 shown in Table 1; and (B) behave similarly to zero copy of any of haplotypes 11-26 in terms of predicting an individual's LDLC reduction in response to atorvastatin. Further, reference herein to a statin response marker I is deemed to include zero or one copy of haplotypes that (A) either (1) are a substitute haplotype for any one of haplotypes 39-64 shown in Table 1, or (2) are a linked haplotype to any one of haplotypes 39-64 shown in Table 1; and (B) behave similarly to zero or one copy of any of haplotypes 39-64 in terms of predicting an individual's LDLC reduction in response to atorvastatin.
Similarly, reference herein to a statin response marker II is deemed to include zero copy of haplotypes that (A) either (1) have a polymorphism sequence that is similar to that of any one of haplotypes 1-10, 27-38 shown in Table 1, but in which the allele at one or more of the specifically identified GUCY1B2 polymorphic sites in that haplotype has been substituted with the allele at a polymorphic site in high LD with the allele at the specifically identified polymorphic site (a "substitute haplotype"); or (2) are in high linkage disequilibrium with any one of haplotypes 1-10, 27-38 shown in Table 1 (a "linked haplotype"); and (B) behave similarly to zero copy of any of haplotypes 1-10, 27-38 in terms of predicting an individual's LDLC reduction in response to atorvastatin. Additionally, reference herein to a statin response marker II is deemed to include at least one copy of haplotypes that (A) either (1) are a substitute haplotype for any one of haplotypes 11-26 shown in Table 1, or (2) are a linked haplotype to any one of haplotypes 11-26 shown in Table 1; and (B) behave similarly to at least one copy of any of haplotypes 11-26 in terms of predicting an individual's LDLC reduction in response to atorvastatin. Further, reference herein to a statin response marker II is deemed to include two copies of haplotypes that (A) either (1) are a substitute haplotype for any one of haplotypes 39-64 shown in Table 1, or (2) are a linked haplotype to any one of haplotypes 39-64 shown in Table 1; and (B) behave similarly to two copies of any of haplotypes 39-64 in terms of predicting an individual's LDLC reduction in response to atorvastatin.
As described above and in the Examples below, the statin response markers of the invention are associated with effects on mean reduction in LDLC in response to treatment with atorvastatin calcium. Thus, the invention provides a method and kit for determining whether an individual has a statin response marker I or a statin response marker II.
In one embodiment, the invention provides a method for determining whether an individual has a statin response marker I or II. The method comprises determining the copy number present in the individual of a haplotype selected from the group consisting of haplotypes 1 to 64 in Table 1, a linked haplotype to any one of haplotypes 1 to 64 in Table 1; and a substitute haplotype for any one of haplotypes 1 to 64 in Table 1. The individual has a statin response marker I if the individual has at least one copy of any of haplotypes 1-10, 27-38, a haplotype linked to any of haplotypes 1-10, 27-38, or a substitute haplotype for any of haplotypes 1-10, 27-38; if the individual has zero copy of any of haplotypes 11-26, a haplotype linked to any of haplotypes 11-26, or a substitute haplotype for any of haplotypes 11-26; or if the individual has zero or one copy of any of haplotypes 39 to 64, a haplotype linked to any of haplotypes 39 to 64, or a substitute haplotype for any of haplotypes 39 to 64. The individual has a statin response marker II if the individual has zero copy of any of haplotypes 1-10, 27-38, a haplotype linked to any of haplotypes 1-10, 27-38, or a substitute haplotype for any of haplotypes 1-10, 27-38; if the individual has at least one copy of any of haplotypes 11-26, a haplotype linked to any of haplotypes 11-26, or a substitute haplotype for any of haplotypes 11-26; or if the individual has two copies of any of haplotypes 39 to 64, a haplotype linked to any of haplotypes 39 to 64, or a substitute haplotype for any of haplotypes 39 to 64.
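The copy-number rules recited above can be summarized in a short sketch. It is illustrative only: it covers the numbered haplotypes of Table 1 (not linked or substitute haplotypes, which would first have to be mapped to the haplotype they stand in for), and the names `GROUP_A`, `GROUP_B`, `GROUP_C` and `statin_response_marker` are hypothetical labels introduced here.

```python
# Haplotype numbers refer to Table 1; the group names are illustrative only.
GROUP_A = set(range(1, 11)) | set(range(27, 39))  # haplotypes 1-10 and 27-38
GROUP_B = set(range(11, 27))                      # haplotypes 11-26
GROUP_C = set(range(39, 65))                      # haplotypes 39-64


def statin_response_marker(haplotype_id, copies):
    """Return "I" or "II" for a Table 1 haplotype and its copy number."""
    if copies not in (0, 1, 2):
        raise ValueError("copy number must be 0, 1 or 2")
    if haplotype_id in GROUP_A:
        # At least one copy gives marker I; zero copy gives marker II.
        return "I" if copies >= 1 else "II"
    if haplotype_id in GROUP_B:
        # Zero copy gives marker I; at least one copy gives marker II.
        return "I" if copies == 0 else "II"
    if haplotype_id in GROUP_C:
        # Zero or one copy gives marker I; two copies give marker II.
        return "II" if copies == 2 else "I"
    raise ValueError("haplotype number is not in the range 1 to 64")
```

For example, `statin_response_marker(27, 1)` returns "I" and `statin_response_marker(39, 2)` returns "II" under these rules.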
In another embodiment, the invention provides a method for assigning an individual to a first or second statin response marker group. The method comprises determining the copy number present in the individual of a haplotype selected from the group consisting of haplotypes 1 to 64 in Table 1, a linked haplotype to any one of haplotypes 1 to 64 in Table 1 and assigning the individual to a statin response marker group based on the copy number of that haplotype. The individual is assigned to the first statin response marker group if the individual has at least one copy of any of haplotypes 1-10, 27-38, a haplotype linked to any of haplotypes 1-10, 27-38, or a substitute haplotype for any of haplotypes 1-10, 27-38; if the individual has zero copy of any of haplotypes 11-26, a haplotype linked to any of haplotypes 11-26, or a substitute haplotype for any of haplotypes 11-26; or if the individual has zero or one copy of any of haplotypes 39 to 64, a haplotype linked to any of haplotypes 39 to 64, or a substitute haplotype for any of haplotypes 39 to 64. The individual is assigned to the second statin response marker group if the individual has zero copy of any of haplotypes 1-10, 27-38, a haplotype linked to any of haplotypes 1-10, 27-38, or a substitute haplotype for any of haplotypes 1-10, 27-38; if the individual has at least one copy of any of haplotypes 11-26, a haplotype linked to any of haplotypes 11-26, or a substitute haplotype for any of haplotypes 11-26; or if the individual has two copies of any of haplotypes 39 to 64, a haplotype linked to any of haplotypes 39 to 64, or a substitute haplotype for any of haplotypes 39 to 64.
In preferred embodiments of the above methods, the haplotype comprises any one of haplotypes 1, 2, 27-37 in Table 1. More preferably, the selected haplotype is one of haplotypes 1, 31, 32 or 33. In some embodiments, the individual is Caucasian and may be diagnosed with a coronary artery disease or a cardiovascular disease, such as Type IIa or Type IIb hypercholesterolemia, may have risk factors associated with cardiovascular disease, or may be a candidate for treatment with a statin for an alternative reason.
In each of the above methods, the determining step comprises genotyping each polymorphic site in the set of polymorphic sites comprising the selected haplotype; and using the results of the genotyping step to identify the haplotype pair present in the individual. The genotyping step may be performed by any methods known to the art, including but not limited to those methods described herein. The determining step may also comprise consulting a data repository, such as a medical data card or a medical record for the individual, that provides information on the copy number present in the individual for the selected haplotype.
"Determining the copy number" of a haplotype may in some instances mean determining if zero, one or two copies is present in the individual, i.e., identifying the haplotype present on each chromosomal copy of the individual. In other instances, determining the copy number of a haplotype in an individual may mean determining a lower or upper limit on the number of copies, such as determining that there is at least one copy or fewer than two copies present in the individual. In these latter instances, the haplotype for only one chromosomal copy of the individual is identified. For some individuals and some haplotypes selected from haplotypes 1 to 64 in Table 1, a linked haplotype to any one of haplotypes 1 to 64 in Table 1, or a substitute haplotype for any one of haplotypes 1 to 64 in Table 1, this is an adequate amount of information to determine if the individual has a statin response marker I or II or belongs to the first or second statin response marker group. For example, if it is determined that an individual has one copy of haplotype 1, but the haplotype of the second chromosomal copy is not determined, the individual may still be identified as having a statin response marker I or be assigned to the first statin response marker group. On the other hand, if an individual is determined to have one copy of haplotype 39 and the haplotype of the individual's second chromosomal copy is not identified, the individual could have either a statin response marker I or II or belong to either statin response marker group. In such instances information on the haplotype of the individual's second chromosomal copy is essential to determining which marker is present in the individual or for assigning the individual to a statin response marker group.
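The effect of knowing only a lower or upper limit on the copy number may be illustrated with the following sketch, which lists every marker compatible with a range of copy numbers. The function names and the `min_copies`/`max_copies` parameters are hypothetical, and only the numbered haplotypes of Table 1 are handled.

```python
def marker_for(haplotype_id, copies):
    """Marker ("I" or "II") for a Table 1 haplotype with a known copy number."""
    if 1 <= haplotype_id <= 10 or 27 <= haplotype_id <= 38:
        return "I" if copies >= 1 else "II"
    if 11 <= haplotype_id <= 26:
        return "I" if copies == 0 else "II"
    if 39 <= haplotype_id <= 64:
        return "II" if copies == 2 else "I"
    raise ValueError("haplotype number is not in the range 1 to 64")


def possible_markers(haplotype_id, min_copies=0, max_copies=2):
    """Set of markers compatible with a copy number known only within limits."""
    if not 0 <= min_copies <= max_copies <= 2:
        raise ValueError("limits must satisfy 0 <= min <= max <= 2")
    return {marker_for(haplotype_id, c)
            for c in range(min_copies, max_copies + 1)}


# One copy of haplotype 1 seen, second chromosomal copy unknown: 1 or 2 copies.
hap1_result = possible_markers(1, min_copies=1, max_copies=2)
# One copy of haplotype 39 seen, second chromosomal copy unknown.
hap39_result = possible_markers(39, min_copies=1, max_copies=2)
```

A result containing a single marker means that one chromosomal copy was sufficient; a result containing both markers means the second chromosomal copy must be haplotyped.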
The presence in an individual of a statin response marker I or II may be determined by a variety of indirect or direct methods well known in the art for determining haplotypes or haplotype pairs for a set of polymorphic sites in one or both copies of the individual's genome, including those discussed below. The genotype for a polymorphic site in an individual may be determined by methods known in the art or as described below.
One indirect method for determining the copy number of any one of GUCY1B2 haplotypes 1 to 64 in Table 1 present in an individual is by prediction based on the individual's genotype determined at one or more of the polymorphic sites in the set of polymorphic sites comprising the haplotype and using the determined genotype at each site to determine the GUCY1B2 haplotypes present in the individual. The presence of zero, one or two copies of a GUCY1B2 haplotype of interest can be determined by visual inspection of the alleles at the polymorphic sites that comprise the haplotype. The GUCY1B2 haplotype pair is assigned by comparing the individual's genotype at each polymorphic site in the set of polymorphic sites with the genotypes at the same set of polymorphic sites corresponding to the haplotype pairs known to exist in the general population or in a specific population group or to the haplotype pairs that are theoretically possible based on the alternative alleles possible at each polymorphic site, and determining which haplotype pair is most likely to exist in the individual. In a related indirect haplotyping method, the copy number present in an individual of a GUCY1B2 haplotype disclosed herein is predicted from the individual's genotype for a set of polymorphic sites comprising the selected haplotype using information on haplotype pairs known to exist in a reference population. In one embodiment, this haplotype pair prediction method comprises identifying a genotype for the individual at the set of polymorphic sites comprising the selected haplotype, accessing data containing haplotype pairs identified in a reference population for a set of polymorphic sites comprising the polymorphic sites of the selected haplotype, and assigning to the individual a haplotype pair that is consistent with the individual's genotype. Whether the individual has a statin response marker I or a statin response marker II can be subsequently determined based on the assigned haplotype pair.
The haplotype pair can be assigned by comparing the individual's genotype with the genotypes corresponding to the haplotype pairs known to exist in the general population or in a specific population group, and determining which haplotype pair is consistent with the genotype of the individual. In some embodiments, the comparing step may be performed by visual inspection. When the genotype of the individual is consistent with more than one haplotype pair, frequency data may be used to determine which of these haplotype pairs is most likely to be present in the individual. If a particular haplotype pair consistent with the genotype of the individual is more frequent in the reference population than other haplotype pairs consistent with the genotype, then that haplotype pair with the highest frequency is the most likely to be present in the individual. The haplotype pair frequency data used in this determination is preferably for a reference population comprising the same ethnogeographic group as the individual. The determination of the haplotype pair of the individual may also be performed in some embodiments by visual inspection. In other embodiments, the comparison may be made by a computer-implemented algorithm with the genotype of the individual and the reference haplotype data stored in computer-readable formats. For example, as described in WO 01/80156, one computer-implemented algorithm to perform this comparison entails enumerating all possible haplotype pairs which are consistent with the genotype, accessing data containing GUCY1B2 haplotype pairs frequency data determined in a reference population to determine a probability that the individual has a possible haplotype pair, and analyzing the determined probabilities to assign a haplotype pair to the individual.
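A minimal sketch of such a comparison is given below. It is not the algorithm of WO 01/80156; it simply enumerates the haplotype pairs consistent with an unphased genotype and ranks them by reference-population pair frequency. The function names, the data layout and the frequencies shown are hypothetical.

```python
from itertools import product


def consistent_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    genotype: one (allele, allele) tuple per polymorphic site, in site order.
    Each haplotype is returned as a string of alleles; each pair is a sorted tuple.
    """
    pairs = set()
    # At every site, either allele may lie on the first chromosomal copy.
    for phasing in product(*[((a, b), (b, a)) for a, b in genotype]):
        first = "".join(site[0] for site in phasing)
        second = "".join(site[1] for site in phasing)
        pairs.add(tuple(sorted((first, second))))
    return pairs


def assign_pair(genotype, pair_freq):
    """Return (most likely pair, probabilities) using reference pair frequencies.

    pair_freq: dict mapping sorted haplotype-pair tuples to their frequency in
    the reference population (hypothetical data layout).
    """
    scores = {p: pair_freq.get(p, 0.0) for p in consistent_pairs(genotype)}
    total = sum(scores.values())
    if total == 0.0:
        return None, {}  # no consistent pair observed in the reference data
    probs = {p: f / total for p, f in scores.items()}
    return max(probs, key=probs.get), probs


# Genotype C/G at one site and A/G at another; frequencies are invented.
genotype = [("C", "G"), ("A", "G")]
reference = {("CA", "GG"): 0.06, ("CG", "GA"): 0.02}
best, probs = assign_pair(genotype, reference)
```

When `assign_pair` returns `None`, or when the top probabilities are close, a direct haplotyping method would be the more appropriate choice.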
Typically, the reference population is composed of randomly-selected individuals representing one or more of the major ethnogeographic groups of the world. A preferred reference population for use in the methods of the present invention consists of Caucasian individuals, the number of which is chosen based on how rare a haplotype is that one wants to be guaranteed to see. For example, if one wants to have a q% chance of not missing a haplotype that exists in the population at a p% frequency of occurring in the reference population, the number of individuals (n) who must be sampled is given by 2n = log(1-q)/log(1-p), where p and q are expressed as fractions. A preferred reference population allows the detection of any haplotype whose frequency is at least 10% with about 99% certainty. A particularly preferred reference population includes a 3-generation Caucasian family to serve as a control for checking quality of haplotyping procedures.
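As a worked illustration of the sample-size relation 2n = log(1-q)/log(1-p), the following sketch computes the smallest whole number of individuals satisfying it. The function name is hypothetical, and rounding up to a whole individual is an assumption made here.

```python
import math


def min_individuals(p, q):
    """Smallest n with 2n >= log(1 - q) / log(1 - p).

    p: haplotype frequency in the population, as a fraction
    q: desired probability of not missing the haplotype, as a fraction
    """
    if not (0.0 < p < 1.0 and 0.0 < q < 1.0):
        raise ValueError("p and q must be fractions strictly between 0 and 1")
    chromosomes = math.log(1.0 - q) / math.log(1.0 - p)  # this is 2n
    return math.ceil(chromosomes / 2.0)


# Haplotype frequency of 10% detected with 99% certainty.
n_preferred = min_individuals(p=0.10, q=0.99)
```

For p = 0.10 and q = 0.99 the relation gives 2n of about 43.7, so about 22 individuals (44 chromosomal copies) would be sampled under this sketch.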
If the reference population comprises more than one ethnogeographic group, the frequency data for each group is examined to determine whether it is consistent with Hardy-Weinberg equilibrium. Hardy-Weinberg equilibrium (D.L. Hartl et al., Principles of Population Genomics, Sinauer Associates (Sunderland, MA), 3rd Ed., 1997) postulates that the frequency of finding the haplotype pair H1/H2 is equal to p_H-W(H1/H2) = 2p(H1)p(H2) if H1 ≠ H2 and p_H-W(H1/H2) = p(H1)p(H2) if H1 = H2. A statistically significant difference between the observed and expected haplotype frequencies could be due to one or more factors including significant inbreeding in the population group, strong selective pressure on the gene, sampling bias, and/or errors in the genotyping process. If large deviations from Hardy-Weinberg equilibrium are observed in an ethnogeographic group, the number of individuals in that group can be increased to see if the deviation is due to a sampling bias. If a larger sample size does not reduce the difference between observed and expected haplotype pair frequencies, then one may wish to consider haplotyping the individual using a direct haplotyping method such as, for example, CLASPER System™ technology (U.S. Patent No. 5,866,404), single molecule dilution, or allele-specific long-range PCR (Michalatos-Beloin et al., Nucleic Acids Res. 24:4841-4843, 1996). In one embodiment of this method for predicting a haplotype pair for an individual, the assigning step involves performing the following analysis. First, each of the possible haplotype pairs is
compared to the haplotype pairs in the reference population. Generally, only one of the haplotype pairs in the reference population matches a possible haplotype pair and that pair is assigned to the individual. Occasionally, only one haplotype represented in the reference haplotype pairs is consistent with a possible haplotype pair for an individual, and in such cases the individual is assigned a haplotype pair containing this known haplotype and a new haplotype derived by subtracting the known haplotype from the possible haplotype pair. Alternatively, the haplotype pair in an individual may be predicted from the individual's genotype for that gene using reported methods (e.g., Clark et al. 1990, Mol Bio Evol 7:111-22 or WO 01/80156) or through a commercial haplotyping service such as offered by Genaissance Pharmaceuticals, Inc. (New Haven, CT). In rare cases, either no haplotypes in the reference population are consistent with the possible haplotype pairs, or alternatively, multiple reference haplotype pairs are consistent with the possible haplotype pairs. In such cases, the individual is preferably haplotyped using a direct molecular haplotyping method such as, for example, CLASPER System™ technology (U.S. Patent No. 5,866,404), SMD, or allele-specific long-range PCR (Michalatos-Beloin et al., supra). Determination of the number of haplotypes present in the individual from the genotypes is illustrated here for GUCY1B2 haplotype 27 in Table 1, comprising cytosine at PS2 and guanine at PS11.
Table 2. Possible copy numbers of GUCY1B2 Haplotype 27 based on the genotypes at PS2 and PS11
There are three genotypes that may possibly occur at each polymorphic site; thus there are 9 genotypes that could be detected at PS2 and PS11, using both chromosomal copies from an individual. Eight of the nine possible genotypes for the two sites allow unambiguous determination of the number of copies of the GUCY1B2 haplotype 27 present in the individual and therefore would allow unambiguous determination of whether the individual has a statin response marker I or II. However, an individual with the C/G, A/G genotype could possess either of the following haplotype pairs: CA/GG or GA/CG, and thus could have either 1 copy of GUCY1B2 haplotype 27, which is a statin response marker I, or 0 copy of GUCY1B2 haplotype 27, which is a statin response marker II. For this instance where there is ambiguity in the haplotype pair underlying the determined genotype C/G, A/G, frequency information may be used to determine the most probable haplotype pair and therefore the most likely number of copies of GUCY1B2 haplotype 27 in the individual. If a particular GUCY1B2 haplotype pair consistent with the genotype of the individual is more frequent in a reference population than other haplotype pairs consistent with the genotype, then that haplotype pair with the highest frequency is the most likely to be present in the individual. The copy number of the haplotype of interest in this haplotype pair can then be determined by visual inspection of the alleles at the polymorphic sites that comprise the response marker for each haplotype in the pair. The individual's genotype for the desired set of PSs may be determined using a variety of methods well-known in the art. Such methods typically include isolating from the individual a genomic DNA sample comprising both copies of the gene or locus of interest, amplifying from the sample one or more target regions containing the polymorphic sites to be genotyped, and detecting the nucleotide pair present at each PS of interest in the amplified target region(s). It is not necessary to use the same procedure to determine the genotype for each PS of interest.
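The count of unambiguous genotypes discussed above for haplotype 27 can be checked with a short enumeration. This is a sketch under the assumption, taken from the C/G, A/G example, that the alternative alleles are C and G at PS2 and A and G at PS11; the variable and function names are hypothetical.

```python
from itertools import combinations_with_replacement, product

PS2_ALLELES = ("C", "G")   # assumed alternative alleles at PS2
PS11_ALLELES = ("A", "G")  # assumed alternative alleles at PS11
TARGET = "CG"              # haplotype 27: cytosine at PS2, guanine at PS11


def possible_copy_numbers(g_ps2, g_ps11):
    """Copy numbers of TARGET compatible with an unphased two-site genotype."""
    counts = set()
    a1, a2 = g_ps2
    # Hold the PS2 alleles fixed and try both phasings of the PS11 alleles.
    for b1, b2 in (g_ps11, g_ps11[::-1]):
        haplotypes = (a1 + b1, a2 + b2)
        counts.add(haplotypes.count(TARGET))
    return counts


# The three unphased genotypes possible at each site, and all nine combinations.
ps2_genotypes = list(combinations_with_replacement(PS2_ALLELES, 2))
ps11_genotypes = list(combinations_with_replacement(PS11_ALLELES, 2))
table = {(g2, g11): possible_copy_numbers(g2, g11)
         for g2, g11 in product(ps2_genotypes, ps11_genotypes)}

# Genotypes for which more than one copy number is possible.
ambiguous = [genotype for genotype, counts in table.items() if len(counts) > 1]
```

Under these assumptions the only ambiguous entry is the doubly heterozygous genotype C/G, A/G, for which the copy number may be 0 or 1; the other eight genotypes each yield a single copy number.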
In addition, the identity of the allele(s) present at any of the novel polymorphic sites described herein may be indirectly determined by haplotyping or genotyping another polymorphic site that is in linkage disequilibrium with the polymorphic site that is of interest. Polymorphic sites in linkage disequilibrium with the presently disclosed polymorphic sites may be located in regions of the gene or in other genomic regions not examined herein. Detection of the allele(s) present at a polymorphic site in linkage disequilibrium with the novel polymorphic sites described herein may be performed by, but is not limited to, any of the above-mentioned methods for detecting the identity of the allele at a polymorphic site.
Alternatively, the presence in an individual of a haplotype or haplotype pair for a set of PSs comprising a statin response marker may be determined by directly haplotyping at least one of the copies of the individual's genomic region of interest, or suitable fragment thereof, using methods known in the art. Such direct haplotyping methods typically involve treating a genomic nucleic acid sample isolated from the individual in a manner that produces a hemizygous DNA sample that only has one of the two "copies" of the individual's genomic region which, as readily understood by the skilled artisan, may be the same allele or different alleles, amplifying from the sample one or more target regions containing the polymorphic sites to be genotyped, and detecting the nucleotide present at each PS of interest in the amplified target region(s). The nucleic acid sample may be obtained using a variety of methods known in the art for preparing hemizygous DNA samples, which include: targeted in vivo cloning (TIVC) in yeast as described in WO 98/01573, U.S. Patent No. 5,866,404, and U.S. Patent No. 5,972,614; generating hemizygous DNA targets using an allele specific oligonucleotide in combination with primer extension and exonuclease degradation as described in U.S. Patent No. 5,972,614; single molecule dilution (SMD) as described in Ruano et al., Proc. Natl. Acad. Sci. USA 87:6296-6300, 1990; and allele specific PCR (Ruano et al., 1989, supra; Ruano et al., 1991, supra; Michalatos-Beloin et al., supra). As will be readily appreciated by those skilled in the art, any individual clone will typically only provide haplotype information on one of the two genomic copies present in an individual. If haplotype information is desired for the individual's other copy, additional clones will usually need to be examined. Typically, at least five clones should be examined to have more than a 90% probability of
haplotyping both copies of the genomic locus in an individual. In some cases, however, once the haplotype for one genomic allele is directly determined, the haplotype for the other allele may be inferred if the individual has a known genotype for the polymorphic sites of interest or if the haplotype frequency or haplotype pair frequency for the individual's population group is known. While direct haplotyping of both copies of the gene is preferably performed with each copy of the gene being placed in separate containers, it is also envisioned that direct haplotyping could be performed in the same container if the two copies are labeled with different tags, or are otherwise separately distinguishable or identifiable. For example, if first and second copies of the gene are labeled with different first and second fluorescent dyes, respectively, and an allele-specific oligonucleotide labeled with yet a third different fluorescent dye is used to assay the polymorphic site(s), then detecting a combination of the first and third dyes would identify the polymorphism in the first gene copy while detecting a combination of the second and third dyes would identify the polymorphism in the second gene copy.
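The five-clone guideline stated above can be checked with a short calculation. The sketch below assumes, purely for illustration, that each clone independently carries either genomic copy with probability 1/2; the function names are hypothetical.

```python
def prob_both_copies_seen(n_clones):
    """Probability that n clones include both genomic copies at least once.

    Assumption: each clone is an independent draw, with either copy equally likely.
    The only failures are "all clones from copy 1" or "all clones from copy 2".
    """
    if n_clones < 2:
        return 0.0
    return 1.0 - 2.0 * 0.5 ** n_clones

def min_clones(threshold):
    """Smallest number of clones whose success probability exceeds the threshold."""
    n = 2
    while prob_both_copies_seen(n) <= threshold:
        n += 1
    return n

for n in range(2, 7):
    print(n, prob_both_copies_seen(n))
print("clones needed for > 90%:", min_clones(0.90))
```

Under this assumption four clones give a probability of 0.875 and five clones give 0.9375, which is consistent with the statement that at least five clones are typically needed to exceed 90%.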
The nucleic acid sample used in the above indirect and direct haplotyping methods is typically isolated from a biological sample taken from the individual, such as a blood sample or tissue sample. Suitable tissue samples include whole blood, saliva, tears, urine, skin and hair.
The target region(s) containing the PS of interest may be amplified using any oligonucleotide-directed amplification method, including but not limited to polymerase chain reaction (PCR) (U.S. Patent No. 4,965,188), ligase chain reaction (LCR) (Barany et al., Proc. Natl. Acad. Sci. USA 88:189-193, 1991; WO90/01069), and oligonucleotide ligation assay (OLA) (Landegren et al., Science 241:1077-1080, 1988). Other known nucleic acid amplification procedures may be used to amplify the target region(s) including transcription-based amplification systems (U.S. Patent No. 5,130,238; EP 329,822; U.S. Patent No. 5,169,766, WO89/06700) and isothermal methods (Walker et al., Proc. Natl. Acad. Sci. USA 89:392-396, 1992).
In both the direct and indirect haplotyping methods, the identity of a nucleotide (or nucleotide pair) at a polymorphic site(s) in the amplified target region may be determined by sequencing the amplified region(s) using conventional methods. If both copies of the gene are represented in the amplified target, it will be readily appreciated by the skilled artisan that only one nucleotide will be detected at a polymorphic site in individuals who are homozygous at that site, while two different nucleotides will be detected if the individual is heterozygous for that site. The polymorphism may be identified directly, known as positive-type identification, or by inference, referred to as negative-type identification. For example, where a polymorphism is known to be guanine and cytosine in a reference population, a site may be positively determined to be either guanine or cytosine for an individual homozygous at that site, or both guanine and cytosine, if the individual is heterozygous at that site. Alternatively, the site may be negatively determined to be not guanine (and thus cytosine/cytosine) or not cytosine (and thus guanine/guanine).
A polymorphic site in the target region may also be assayed before or after amplification using one of several hybridization-based methods known in the art. Typically, allele-specific oligonucleotides are utilized in performing such methods. The allele-specific oligonucleotides may be used as differently labeled probe pairs, with one member of the pair showing a perfect match to one variant of a target sequence and the other member showing a perfect match to a different variant. In some embodiments, more than one polymorphic site may be detected at once using a set of allele-specific oligonucleotides or oligonucleotide pairs. Preferably, the members of the set have melting temperatures within 5°C, and more preferably within 2°C, of each other when hybridizing to each of the polymorphic sites being detected.
Hybridization of an allele-specific oligonucleotide to a target polynucleotide may be performed with both entities in solution, or such hybridization may be performed when either the oligonucleotide or the target polynucleotide is covalently or noncovalently affixed to a solid support. Attachment may be mediated, for example, by antibody-antigen interactions, poly-L-Lys, streptavidin or avidin-biotin, salt bridges, hydrophobic interactions, chemical linkages, UV cross-linking, baking, etc. Allele-specific oligonucleotides may be synthesized directly on the solid support or attached to the solid support subsequent to synthesis. Solid supports suitable for use in detection methods of the invention include substrates made of silicon, glass, plastic, paper and the like, which may be formed, for example, into wells (as in 96-well plates), slides, sheets, membranes, fibers, chips, dishes, and beads. The solid support may be treated, coated or derivatized to facilitate the immobilization of the allele-specific oligonucleotide or target nucleic acid.
The nucleotide or nucleotide pair at a PS of interest may also be determined using a mismatch detection technique, including but not limited to the RNase protection method using riboprobes (Winter et al., Proc. Natl. Acad. Sci. USA 82:7575, 1985; Meyers et al., Science 230:1242, 1985) and proteins which recognize nucleotide mismatches, such as the E. coli mutS protein (Modrich, P. Ann. Rev. Genet. 25:229-253, 1991). Alternatively, variant alleles can be identified by single strand conformation polymorphism (SSCP) analysis (Orita et al., Genomics 5:874-879, 1989; Humphries et al., in Molecular Diagnosis of Genetic Diseases, R. Elles, ed., pp. 321-340, 1996) or denaturing gradient gel electrophoresis (DGGE) (Wartell et al., Nucl. Acids Res. 18:2699-2706, 1990; Sheffield et al., Proc. Natl. Acad. Sci. USA 86:232-236, 1989).
A polymerase-mediated primer extension method may also be used to identify the polymorphism(s). Several such methods have been described in the patent and scientific literature and include the "Genetic Bit Analysis" method (WO92/15712) and the ligase/polymerase mediated genetic bit analysis (U.S. Patent No. 5,679,524). Related methods are disclosed in WO91/02087, WO90/09455, WO95/17676, and U.S. Patent Nos. 5,302,509 and 5,945,283. Extended primers containing the complement of the polymorphism may be detected by mass spectrometry as described in U.S. Patent No. 5,605,798. Another primer extension method is allele-specific PCR (Ruano et al., Nucl. Acids Res. 17:8392, 1989; Ruano et al., Nucl. Acids Res. 19:6877-6882, 1991; WO 93/22456; Turki et al., J. Clin. Invest. 95:1635-1641, 1995). In addition, multiple polymorphic sites may be investigated by simultaneously amplifying multiple regions of the nucleic acid using sets of allele-specific primers as described in Wallace et al. (WO89/10414).
The genotype or haplotype for the GUCY1B2 gene of an individual may also be determined by hybridization of a nucleic acid sample containing one or both copies of the gene, mRNA, cDNA or fragment(s) thereof, to nucleic acid arrays and subarrays such as described in WO 95/11995. The arrays would contain a battery of allele-specific oligonucleotides representing each of the polymorphic sites to be included in the genotype or haplotype.
The invention also provides a kit for determining whether an individual has a statin response marker I or a statin response marker II. The kit comprises a set of oligonucleotides designed for determining the allele(s) present at a set of polymorphic sites (PS). In preferred embodiments, the set of polymorphic sites comprises at least PS11 and may comprise one or more of PS2, PS5, PS6, PS15, PS16 and PS17. In some embodiments, the set of polymorphic sites comprises a set of polymorphic sites selected from the group consisting of (1) PS11; (2) PS11 and PS16; (3) PS11 and PS17; (4) PS11, PS16 and PS17; (5) PS6 and PS11; (6) PS6, PS11 and PS16; (7) PS6, PS11 and PS17; (8) PS6, PS11, PS16 and PS17; (9) PS5 and PS11; (10) PS5, PS11, PS16 and PS17; (11) PS5, PS11 and PS17; (12) PS2, PS5, PS6 and PS11; (13) PS5, PS11 and PS16; (14) PS5, PS6, PS11 and PS17; (15) PS2, PS5, PS11 and PS16; (16) PS5, PS6, PS11 and PS16; (17) PS2, PS5, PS11 and PS16; (18) PS2, PS5 and PS11; (19) PS5, PS11 and PS15; (20) PS5, PS11, PS15 and PS16; (21) PS2, PS5, PS11 and PS15; (22) PS5, PS11, PS15 and PS17; (23) PS5, PS6, PS11 and PS15; (24) PS2 and PS11; (25) PS2, PS11 and PS16; (26) PS2, PS11 and PS17; (27) PS2, PS11, PS16 and PS17; (28) PS2, PS6 and PS11; (29) PS2, PS6, PS11 and PS16; (30) PS2, PS6, PS11 and PS17; (31) PS2, PS6, PS11 and PS15; (32) PS11 and PS15; (33) PS11, PS15 and PS16; (34) PS11, PS15, PS16 and PS17; (35) PS6, PS11, PS15 and PS16; (36) PS6, PS11, PS15 and PS17; (37) PS6, PS11 and PS15; (38) PS2, PS11 and PS15; (39) PS2, PS11, PS15 and PS16; (40) PS2, PS11, PS15 and PS17; (41) PS5, PS6 and PS11; (42) the set of PSs for a haplotype in linkage disequilibrium with any one of haplotypes 1 to 64 in Table 1; and (43) the set of PSs for a substitute haplotype for any one of haplotypes 1 to 64 in Table 1 in which the allele at one or more of the polymorphic sites in the original haplotype is replaced with an allele at a substitute polymorphic site in linkage disequilibrium with the allele at the replaced or substituted polymorphic site. Preferred sets of polymorphic sites comprise the sets of PSs for any one of haplotypes 1, 2, 27-37. More preferably the set of polymorphic sites comprises the set of PSs for any one of haplotypes 1 (PS11), 31 (PS2, PS11 and PS6), 32 (PS2, PS6, PS11 and PS16), or 33 (PS2, PS6, PS11 and PS17). In one embodiment, the kit comprises oligonucleotides for detecting at least one allele for each polymorphic site in the set of polymorphic sites, while in other embodiments the kit comprises oligonucleotides for detecting both alleles at each member of the set of polymorphic sites. Each genotyping oligonucleotide provided in the kit may be placed in the same or separate receptacles and may be provided together in a package.
As used herein, a genotyping oligonucleotide is a probe or primer capable of hybridizing to a target region that contains, or that is located close to, a polymorphic site of interest such as one of the polymorphic sites comprising a statin response marker described herein. The term "oligonucleotide" refers to a polynucleotide molecule having less than about 100 nucleotides. A preferred oligonucleotide of the invention is 10 to 35 nucleotides long. More preferably, the oligonucleotide is between 15 and 30, and most preferably, between 20 and 25 nucleotides in length. The exact length of the oligonucleotide will depend on the nature of the genomic region containing the PS of interest as well as the genotyping assay to be performed and can readily be determined by the skilled artisan.
The oligonucleotides used to practice the invention may be comprised of any phosphorylation state of ribonucleotides, deoxyribonucleotides, and acyclic nucleotide derivatives, and other functionally equivalent derivatives. Alternatively, oligonucleotides may have a phosphate-free backbone, which may be comprised of linkages such as carboxymethyl, acetamidate, carbamate, polyamide (peptide nucleic acid (PNA)) and the like (Varma, R. in Molecular Biology and Biotechnology, A Comprehensive Desk Reference, Ed. R. Meyers, VCH Publishers, Inc. (1995), pages 617-620). Oligonucleotides of the invention may be prepared by chemical synthesis using any suitable methodology known in the art, or may be derived from a biological sample, for example, by restriction digestion. The oligonucleotides may be labeled, according to any technique known in the art, including use of radiolabels, fluorescent labels, enzymatic labels, proteins, haptens, antibodies, sequence tags and the like.
Oligonucleotides of the invention must be capable of specifically hybridizing to a target region of a polynucleotide containing a desired locus. As used herein, specific hybridization means the oligonucleotide forms an anti-parallel double-stranded structure with the target region under certain hybridizing conditions, while failing to form such a structure when incubated with another region in the polynucleotide or with a polynucleotide lacking the desired locus under the same hybridizing conditions. Preferably, the oligonucleotide specifically hybridizes to the target region under conventional high stringency conditions. The skilled artisan can readily design and test oligonucleotide probes and primers suitable for detecting polymorphisms in the GUCY1B2 gene or adjacent regions of chromosome 13 in linkage disequilibrium with one of the haplotypes, using the polymorphism information provided herein in conjunction with the known sequence information for the GUCY1B2 gene and adjacent regions of chromosome 13, and routine techniques.
A nucleic acid molecule such as an oligonucleotide or polynucleotide is said to be a "perfect" or "complete" complement of another nucleic acid molecule if every nucleotide of one of the molecules is complementary to the nucleotide at the corresponding position of the other molecule. A nucleic acid molecule is "substantially complementary" to another molecule if it hybridizes to that molecule with sufficient stability to remain in a duplex form under conventional low-stringency conditions. Conventional hybridization conditions are described, for example, by Sambrook J. et al., in Molecular Cloning, A Laboratory Manual, 2nd Edition, Cold Spring Harbor Press, Cold Spring Harbor, NY (1989) and by Haymes, B.D. et al. in Nucleic Acid Hybridization, A Practical Approach, IRL Press, Washington, D.C. (1985). While perfectly complementary oligonucleotides are preferred for detecting polymorphisms, departures from complete complementarity are contemplated where such departures do not prevent the molecule from specifically hybridizing to the target region. For example, an oligonucleotide primer may have a non-complementary fragment at its 5' end, with the remainder of the primer being complementary to the target region. Alternatively, non-complementary nucleotides may be interspersed into the probe or primer as long as the resulting probe or primer is still capable of specifically hybridizing to the target region.
Preferred oligonucleotides of the invention, useful in determining if an individual has a statin response marker I or II, are allele-specific oligonucleotides. As used herein, the term allele-specific oligonucleotide (ASO) means an oligonucleotide that is able, under sufficiently stringent conditions, to hybridize specifically to one allele of a gene, or other locus, at a target region containing a polymorphic site while not hybridizing to the corresponding region in another allele(s). As understood by the skilled artisan, allele-specificity will depend upon a variety of readily optimized stringency conditions, including salt and formamide concentrations, as well as temperatures for both the hybridization and washing steps. Examples of hybridization and washing conditions typically used for ASO probes are found in Kogan et al., "Genetic Prediction of Hemophilia A" in PCR Protocols, A Guide to Methods and Applications, Academic Press, 1990 and Ruano et al., Proc. Natl. Acad. Sci. USA 87:6296-6300, 1990. Typically, an ASO will be perfectly complementary to one allele while containing a single mismatch for another allele.
Allele-specific oligonucleotides of the invention include ASO probes and ASO primers. ASO probes which usually provide good discrimination between different alleles are those in which a central position of the oligonucleotide probe aligns with the polymorphic site in the target region (e.g., approximately the 7th or 8th position in a 15mer, the 8th or 9th position in a 16mer, and the 10th or 11th position in a 20mer). An ASO primer of the invention has a 3'-terminal nucleotide, or preferably a 3'-penultimate nucleotide, that is complementary to only one of the nucleotide alleles of a particular polymorphic site, thereby acting as a primer for polymerase-mediated extension only if that nucleotide allele is present at the PS in the sample being genotyped. ASO probes and primers hybridizing to either the coding or noncoding strand are contemplated by the invention. ASO probes and primers listed below use the appropriate nucleotide symbol (R= G or A, Y= T or C, M= A or C, K= G or T, S= G or C, and W= A or T; WIPO standard ST.25) at the position of the polymorphic site to represent that the ASO contains either of the two alternative allelic variants observed at that polymorphic site.
Preferred ASO probes for detecting the alleles at the polymorphic sites comprising the preferred embodiments of the statin response markers I and II comprise a nucleotide sequence selected from the group consisting of:
PS2 GTTGGCASCTCGGGG (SEQ ID NO: 4) and its complement,
PS5 CCTTGCTYGGAGGCT (SEQ ID NO: 5) and its complement,
PS6 TGGAGGCYAGTGATG (SEQ ID NO: 6) and its complement,
PS11 CTCCATTRCTACTCG (SEQ ID NO: 7) and its complement,
PS15 CTGGGCCRATAGATC (SEQ ID NO: 8) and its complement,
PS16 TCTGGGCRTTTCCCA (SEQ ID NO: 9) and its complement, and
PS17 TTCCCATRCAGCTCT (SEQ ID NO: 10) and its complement.
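As an illustration of the degenerate-symbol notation used in the probe list above, the following Python sketch expands one listed ASO probe into its two allele-specific sequences and computes its complement. The helper names are hypothetical; the symbol table is the one quoted in the text (R, Y, M, K, S, W), and the PS2 probe (SEQ ID NO: 4) is used as the example.

```python
# Two-base ambiguity symbols as given in the text (WIPO standard ST.25).
IUPAC = {"R": "GA", "Y": "TC", "M": "AC", "K": "GT", "S": "GC", "W": "AT"}

# Complement table, including the complements of the ambiguity symbols.
_COMPLEMENT = str.maketrans("ACGTRYMKSW", "TGCAYRKMSW")

def reverse_complement(seq):
    """Return the reverse complement, preserving ambiguity symbols."""
    return seq.translate(_COMPLEMENT)[::-1]

def expand_aso(seq):
    """Return (1-based position of the polymorphic site, allele-specific sequences)."""
    sites = [i for i, base in enumerate(seq) if base in IUPAC]
    if len(sites) != 1:
        raise ValueError("expected exactly one ambiguity symbol")
    i = sites[0]
    return i + 1, [seq[:i] + allele + seq[i + 1:] for allele in IUPAC[seq[i]]]

ps2_probe = "GTTGGCASCTCGGGG"          # SEQ ID NO: 4 as listed above
site, alleles = expand_aso(ps2_probe)
print(site, alleles)                    # polymorphic base sits at the 8th of 15 positions
print(reverse_complement(ps2_probe))
```

The polymorphic base falls at the 8th position of the 15mer, in line with the central placement described for ASO probes, and the two expanded sequences differ from each other only at that position.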
Preferred ASO primers for detecting the alleles at the polymorphic sites comprising the preferred embodiments of the statin response markers I and II comprise a nucleotide sequence selected from the group consisting of:
PS2 TGTGCCGTTGGCASC (SEQ ID NO: 11); GCTTCTCCCCGAGST (SEQ ID NO: 12);
PS5 TGTGCTCCTTGCTYG (SEQ ID NO: 13); ATCACTAGCCTCCRA (SEQ ID NO: 14);
PS6 CTTGCTTGGAGGCYA (SEQ ID NO: 15); GGAGCTCATCACTRG (SEQ ID NO: 16);
PS11 ATGTTTCTCCATTRC (SEQ ID NO: 17); TCTATCCGAGTAGYA (SEQ ID NO: 18);
PS15 TTTGAGCTGGGCCRA (SEQ ID NO: 19); CAGCTGGATCTATYG (SEQ ID NO: 20);
PS16 TTCTTCTCTGGGCRT (SEQ ID NO: 21); GCTGCATGGGAAAYG (SEQ ID NO: 22);
PS17 GGGCGTTTCCCATRC (SEQ ID NO: 23); and TACCAGAGAGCTGYA (SEQ ID NO: 24).
Particularly preferred ASO probes and primers for genotyping PS11 are SEQ ID NO: 7, and its complement, and SEQ ID NOS: 17-18.
Other oligonucleotides useful in practicing the invention hybridize to a target region located one to several nucleotides downstream of a polymorphic site in a statin response marker. Such oligonucleotides are useful in polymerase-mediated primer extension methods for detecting one of the polymorphisms described herein and therefore such oligonucleotides are referred to herein as "primer-extension oligonucleotides". In a preferred embodiment, the 3'-terminus of a primer-extension oligonucleotide is a deoxynucleotide complementary to the nucleotide located immediately adjacent to the polymorphic site. Particularly preferred primer extension oligonucleotides for detecting GUCY1B2 gene polymorphisms at the different polymorphic sites in the set comprising a preferred statin response marker haplotype terminate in a nucleotide sequence selected from the group consisting of:
PS2 GCCGTTGGCA (SEQ ID NO: 25); TCTCCCCGAG (SEQ ID NO: 26);
PS5 GCTCCTTGCT (SEQ ID NO: 27); ACTAGCCTCC (SEQ ID NO: 28);
PS6 GCTTGGAGGC (SEQ ID NO: 29); GCTCATCACT (SEQ ID NO: 30);
PS11 TTTCTCCATT (SEQ ID NO: 31); ATCCGAGTAG (SEQ ID NO: 32);
PS15 GAGCTGGGCC (SEQ ID NO: 33); CTGGATCTAT (SEQ ID NO: 34);
PS16 TTCTCTGGGC (SEQ ID NO: 35); GCATGGGAAA (SEQ ID NO: 36);
PS17 CGTTTCCCAT (SEQ ID NO: 37); and CAGAGAGCTG (SEQ ID NO: 38).
Termination mixes are chosen to terminate extension of the oligonucleotide at the polymorphic site of interest, or one base thereafter, depending on the alternative nucleotides present at the polymorphic site.
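The relationship between the listed ASO primers and the primer-extension oligonucleotides can be sketched in Python as follows. The observation that each listed 10-base extension sequence matches the ten bases immediately 5' of the polymorphic position in the corresponding ASO primer is drawn from comparing the sequences listed in this document; the function name and the restriction to PS2 and PS11 are choices made only for this illustration.

```python
DEGENERATE = set("RYMKSW")  # ambiguity symbols marking the polymorphic position

def extension_oligo(aso_primer, length=10):
    """Return the `length` bases immediately 5' of the polymorphic position.

    An oligonucleotide ending here has its 3' terminus adjacent to the
    polymorphic site, so the next base added by the polymerase reports the allele.
    """
    sites = [i for i, base in enumerate(aso_primer) if base in DEGENERATE]
    if len(sites) != 1:
        raise ValueError("expected exactly one degenerate (polymorphic) base")
    site = sites[0]
    if site < length:
        raise ValueError("primer too short upstream of the polymorphic base")
    return aso_primer[site - length:site]

# ASO primer pairs as listed in the text (SEQ ID NOS: 11-12 and 17-18).
ASO_PRIMERS = {
    "PS2": ("TGTGCCGTTGGCASC", "GCTTCTCCCCGAGST"),
    "PS11": ("ATGTTTCTCCATTRC", "TCTATCCGAGTAGYA"),
}

derived = {
    ps: tuple(extension_oligo(primer) for primer in pair)
    for ps, pair in ASO_PRIMERS.items()
}
print(derived)
```

For PS2 and PS11 the derived sequences agree with the primer-extension sequences listed above (SEQ ID NOS: 25-26 and 31-32).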
In some embodiments, the genotyping oligonucleotides in a kit of the invention have different labels to allow probing of the identity of nucleotides or nucleotide pairs at two or more polymorphic sites simultaneously. It is also contemplated that a kit of the invention may contain two or more sets of allele-specific primer pairs to allow simultaneous targeting and amplification of two or more regions containing a polymorphic site in a statin response marker.
The oligonucleotides comprising a kit of the invention may also be immobilized on or synthesized on a solid surface such as a microchip, bead, or glass slide (see, e.g., WO 98/20020 and WO 98/20019). Such immobilized oligonucleotides may be used in a variety of polymorphism detection assays, including but not limited to probe hybridization and polymerase extension assays. Immobilized oligonucleotides useful in practicing the invention may comprise an ordered array of oligonucleotides designed to rapidly screen a nucleic acid sample for polymorphisms in multiple genes at the same time.
Kits of the invention may also contain other components such as hybridization buffer (e.g., where the oligonucleotides are to be used as allele-specific probes) or dideoxynucleotide triphosphates (ddNTPs; e.g., where the alleles at the polymorphic sites are to be detected by primer extension). In a preferred embodiment, the set of oligonucleotides consists of primer extension oligonucleotides. The kit may also contain a polymerase and a reaction buffer optimized for primer extension mediated by the polymerase. Preferred kits may also include detection reagents, such as biotin- or fluorescent-tagged oligonucleotides or ddNTPs and/or an enzyme-labeled antibody and one or more substrates that generate a detectable signal when acted on by the enzyme. It will be understood by the skilled artisan that the set of oligonucleotides and reagents for performing the genotyping or haplotyping assay will be provided in separate receptacles placed in the container if appropriate to preserve biological or chemical activity and enable proper use in the assay.
In a particularly preferred embodiment, each of the oligonucleotides and all other reagents in the kit have been quality tested for optimal performance in an assay for determining the alleles at a set of polymorphic sites comprising a statin response marker I or statin response marker II. In a further embodiment, the kit comprises a manual with instructions for performing genotyping assays on a nucleic acid sample from an individual and determining if the individual has a statin response marker I or a statin response marker II based on the results of the assay. The instructions may also contain information to help a physician determine whether or how to use particular statins, alone or in combination with other therapies affecting LDLC levels, to treat an individual with the determined statin response marker.
The methods and kits of the invention are useful for helping physicians make decisions about how to treat an individual. They can be used to predict the reduction in LDLC of an individual in response to treatment with a statin or in selecting a statin therapy for an individual to achieve an optimal LDLC reduction.
Thus, the invention provides a method for predicting the LDLC response of an individual to treatment with a statin. The method comprises determining whether the individual has a statin response marker I or a statin response marker II and making a response prediction based on the results of the determining step. The statin is atorvastatin or a pharmaceutically acceptable salt of atorvastatin acid. Preferably, the statin is atorvastatin calcium. In some embodiments, if the individual is determined to have a statin response marker I, then the response prediction is that the individual will likely experience a larger reduction in LDLC in response to treatment with the statin than an individual determined to have a statin response marker II. Also, if the individual is determined to have a statin response marker II, then the response prediction is that the individual will likely experience a smaller reduction in LDLC in response to treatment with the statin than an individual determined to have a statin response marker I. In some embodiments, the determining step comprises consulting a data repository that states whether the individual has a statin response marker I or a statin response marker II. The data repository may be the individual's medical records or a medical data card. In other embodiments, the determining step comprises determining the copy number of a haplotype selected from the group consisting of: haplotypes 1-64 in Table 1; a linked haplotype to any one of haplotypes 1 to 64 in Table 1; and a substitute haplotype for any one of haplotypes 1 to 64 in Table 1. If the selected haplotype is any one of haplotypes 1-10, 27-38, a linked haplotype to any one of haplotypes 1-10, 27-38, or a substitute haplotype for any one of haplotypes 1-10, 27-38, then the individual has a statin response marker I if the individual has at least one copy of the selected haplotype and a statin response marker II if the individual has zero copy of the selected haplotype. If the selected haplotype is any one of haplotypes 11-26, a linked haplotype to any one of haplotypes 11-26, or a substitute haplotype for any one of haplotypes 11-26, then the individual has a statin response marker I if the individual has zero copy of the selected haplotype and a statin response marker II if the individual has at least one copy of the selected haplotype. If the selected haplotype is any one of haplotypes 39-64, a linked haplotype to any one of haplotypes 39-64, or a substitute haplotype for any one of haplotypes 39-64, then the individual has a statin response marker I if the individual has zero or one copy of the selected haplotype and a statin response marker II if the individual has two copies of the selected haplotype. The determination of the statin response marker present in an individual can be made using one of the direct or indirect methods described herein or known in the art. In some preferred embodiments, the determining step comprises identifying for one or both copies of the genomic locus present in the individual the identity of the nucleotide or nucleotide pair at each member of the set of polymorphic sites comprising the selected haplotype. In preferred embodiments, the individual is Caucasian.
The invention also provides a method of selecting a statin therapy to provide an optimal LDLC response in an individual. The method comprises determining whether the individual has a statin response marker I or a statin response marker II and selecting a statin therapy based on the results of the determining step. Preferably, the statin is atorvastatin or a pharmaceutically acceptable salt of atorvastatin; most preferably the statin is atorvastatin calcium.
In some embodiments, if the individual has a statin response marker II, then the selected statin therapy comprises a higher than standard dose of atorvastatin or a pharmaceutically acceptable salt of atorvastatin acid, while if the individual has a statin response marker I, then the selected statin therapy comprises a standard dose of atorvastatin or a pharmaceutically acceptable salt of atorvastatin acid. The preferred pharmaceutically acceptable salt of atorvastatin acid is atorvastatin calcium. A standard dose means a dose required to achieve a given mean LDLC reduction in a population which has not been stratified by statin response marker or which has a statin response marker I.
One method to determine whether a statin response marker I or II is present in the individual comprises determining the copy number of any one of haplotypes 1 to 64 in Table 1; a linked haplotype to any one of haplotypes 1 to 64; and a substitute haplotype for any one of haplotypes 1 to 64. Alternatively, the determining step may comprise consulting a data repository that states the individual's copy number for one or more haplotypes comprising a statin response marker I or II. The data repository may be the individual's medical records or a medical data card. In preferred embodiments, the individual is Caucasian.
In other aspects, the invention provides an article of manufacture. In one embodiment, an article of manufacture comprises a pharmaceutical formulation and at least one indicium identifying a population for which the pharmaceutical formulation is indicated. The pharmaceutical formulation comprises a statin as at least one active ingredient. Additionally, the pharmaceutical formulation may be regulated and the indicium may comprise the approved label for the pharmaceutical formulation. The identified population is partially or wholly defined by having a statin response marker I or a statin response marker II. The identified population preferably may be further defined as Caucasian. A population wholly defined by having a statin response marker I or II is one for which there are no other factors which should be considered in identifying the population for which the pharmaceutical formulation is indicated. In contrast, a population that is partially defined by having a statin response marker is one for which other factors may be pertinent to identification of the population for which the pharmaceutical formulation is indicated. Examples of other such factors are age, weight, gender, disease state, possession of other genetic markers or biomarkers, or the like.
The statin response marker I or II comprises a copy number of a specific haplotype. The statin response marker I is at least one copy of any of haplotypes 1-10, 27-38 in Table 1, or a linked haplotype to or a substitute haplotype for any one of haplotypes 1-10, 27-38; zero copy of any of haplotypes 11-26 in Table 1, or a linked haplotype to or a substitute haplotype for any one of haplotypes 11-26; or zero or one copy of any of haplotypes 39-64 in Table 1, or a linked haplotype to or a substitute haplotype for any one of haplotypes 39-64. The statin response marker II is zero copy of any of haplotypes 1-10, 27-38 in Table 1, or a linked haplotype to or a substitute haplotype for any one of haplotypes 1-10, 27-38; at least one copy of any of haplotypes 11-26 in Table 1, or a linked haplotype to or a substitute haplotype for any one of haplotypes 11-26; or two copies of any of haplotypes 39-64 in Table 1, or a linked haplotype to or a substitute haplotype for any one of haplotypes 39-64. The specific haplotype comprising the statin response marker I or II is preferably one of haplotypes 1, 2, 27-37 in Table 1; most preferably the haplotype is one of haplotypes 1, 31, 32, or 33.
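The copy-number rules that define the two markers can be summarized in a minimal Python sketch. The function names and the return strings are hypothetical; the haplotype numbers refer to Table 1, which is not reproduced here, and linked or substitute haplotypes are assumed to follow the rule of the Table 1 haplotype they stand in for.

```python
def statin_response_marker(haplotype_id, copies):
    """Classify an individual as marker "I" or "II" from one haplotype's copy number.

    haplotype_id: number of the selected haplotype in Table 1 (1 to 64).
    copies: copies of that haplotype carried by the individual (0, 1 or 2).
    """
    if copies not in (0, 1, 2):
        raise ValueError("copy number must be 0, 1 or 2")
    if 1 <= haplotype_id <= 10 or 27 <= haplotype_id <= 38:
        return "I" if copies >= 1 else "II"   # at least one copy -> marker I
    if 11 <= haplotype_id <= 26:
        return "I" if copies == 0 else "II"   # zero copy -> marker I
    if 39 <= haplotype_id <= 64:
        return "I" if copies <= 1 else "II"   # two copies -> marker II
    raise ValueError("haplotype number must be between 1 and 64")

def dose_category(marker):
    """Dose category described for the therapy-selection embodiments."""
    return "higher than standard dose" if marker == "II" else "standard dose"

for hap, n in [(1, 1), (1, 0), (15, 1), (40, 1), (40, 2)]:
    marker = statin_response_marker(hap, n)
    print(hap, n, marker, dose_category(marker))
```

The sketch only restates the classification logic of the text; it is not a clinical decision tool, and any actual dosing decision would rest with the treating physician.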
In some embodiments, the pharmaceutical formulation is formulated, in any way known in the art, as a sustained release formulation, but most preferably as a transdermal patch. In other embodiments, the pharmaceutical formulation is a tablet or capsule and the article may further comprise an additional indicium comprising the color or shape of the tablet or capsule. In further embodiments, the article may further comprise an additional indicium comprising a symbol stamped on the tablet or capsule, or a symbol or logo printed on the approved label.
In some embodiments of this article, in a trial population, the group of individuals having a statin response marker II exhibits a smaller mean LDLC reduction in response to the statin than the group of individuals lacking the statin response marker II. The approved label may comprise a statement about the identified population having the statin response marker II for which a larger than standard dose of the pharmaceutical formulation is indicated to achieve a given LDLC reduction. In these embodiments, the approved label may comprise a statement that a differing dosing regimen for the pharmaceutical formulation is indicated for individuals identified as having the statin response marker II on a specified test, preferably a specified genetic test. In some or all of these embodiments, the label may describe the mean reduction in LDLC expected for the identified population.
In other embodiments of this article, a trial population having a statin response marker I exhibits a better mean Low Density Lipoprotein Cholesterol (LDLC) response to the statin than a trial population lacking the statin response marker I. The approved label may state that the pharmaceutical formulation
is indicated for individuals having the statin response marker I. The approved label may further state that a standard dose of the pharmaceutical formulation is indicated for individuals identified as having the statin response marker I on a specified genetic test.
Additionally, in some or all of these embodiments, the statin is present in the pharmaceutical formulation at an amount effective to reduce LDL cholesterol levels. In preferred embodiments, the statin comprises atorvastatin or a pharmaceutically acceptable salt of atorvastatin acid. More preferably, the statin is atorvastatin calcium and the effective amount ranges from 10 to 80 mg.
An additional embodiment of the article of manufacture provided by the invention comprises packaging material and a pharmaceutical formulation contained within said packaging material. The pharmaceutical formulation comprises a statin as at least one active ingredient. The statin is atorvastatin or a pharmaceutically acceptable salt of atorvastatin acid. In some embodiments, the packaging material may comprise a label that may state that a different dosing regimen for the pharmaceutical formulation is indicated for a population partly or wholly defined by having a statin response marker II. The label may further state that a higher than standard dose of the pharmaceutical formulation is indicated for individuals having the statin response marker II to achieve a given LDLC reduction. A population having a statin response marker II exhibits a smaller mean reduction in LDLC in response to the statin than a population lacking the statin response marker II.
In other embodiments of this article, the packaging material comprises a label which may state that the pharmaceutical formulation is indicated for a population partly or wholly defined by having a statin response marker I. A population having the statin response marker I exhibits a better mean LDLC response to the statin than a population lacking the statin response marker I.
The indicated population in any of the above articles may preferably be further defined as Caucasian. The label may further state that a specified test can be used to identify members of the indicated population. Preferably the specified test is a genetic test.
Additionally, other aspects of the invention provide a method of manufacturing a drug product comprising a statin as at least one active ingredient. The statin is atorvastatin or a pharmaceutically acceptable salt of atorvastatin acid. The method comprises combining in a package a pharmaceutical formulation comprising the statin and a label. In some embodiments, the label states that a differing dosing regimen for the pharmaceutical formulation is indicated for treating a population partially or wholly defined by having a statin response marker II or that a higher than standard dose of the pharmaceutical formulation is indicated for achieving a given LDLC reduction in individuals having a statin response marker II. In embodiments in which the defining statin response marker is a statin response marker II, a population having the statin response marker II was shown to have a smaller mean LDLC reduction in response to the pharmaceutical formulation than did those individuals lacking the statin response marker II.
Preferably, the statin is atorvastatin calcium and the effective amount ranges from about 10 to 80 mg. The indicated population having the defining statin response marker preferably may be further defined as Caucasian. The indicated and/or contraindicated populations may be identified on the
pharmaceutical formulation, on the label or on the package by at least one indicium, such as a symbol or logo, color, or the like.
Detecting the presence of a statin response marker I or II in an individual is also useful in a method of seeking regulatory approval for marketing a pharmaceutical formulation for treating a disease or condition in a population defined by the statin response marker. The method comprises: (a) conducting at least one clinical trial which comprises administering the pharmaceutical formulation to first and second treatment groups of patients having the disease or condition, wherein each patient in the first treatment group has a statin response marker I and each patient in the second treatment group has a statin response marker II; (b) demonstrating that the second treatment group exhibits a mean per cent change in Low Density Lipoprotein Cholesterol (LDLC) that is worse than the mean per cent change in LDLC exhibited by the first treatment group at any given dose of the pharmaceutical formulation; and (c) filing with a regulatory agency an application for marketing approval of the pharmaceutical formulation with a label stating that a higher dose of the pharmaceutical formulation is indicated for achieving any given LDLC reduction in patients having the statin response marker II than in patients having the statin response marker I. In some embodiments, the pharmaceutical formulation comprises atorvastatin or a pharmaceutically acceptable salt of atorvastatin acid. More preferably the pharmaceutical formulation comprises atorvastatin calcium.
The clinical trial may be conducted by recruiting patients with the disease or condition, determining whether they have a statin response marker I or II and assigning the patients to the first and second treatment groups based on the results of the determining step. The disease or condition may include any for which statin therapy is indicated, e.g., hyperlipidemia, hypercholesterolemia, cardiovascular disease (CVD), presence of CVD risk factors, coronary artery disease, and the like. The patients in each treatment group are preferably administered the same dose of the pharmaceutical formulation, which includes a statin compound as at least one active ingredient. The pharmaceutical formulation may contain other active ingredients, for example another compound known or believed to have therapeutic activity in treating the disease or condition examined in the study or a compound that serves to reduce or block one or more side effects caused by the statin compound.
The regulatory agency may be any person or group authorized by the government of a country anywhere in the world to control the marketing or distribution of drugs in that country. Preferably, the regulatory agency is authorized by the government of a major industrialized country, such as Australia, Canada, China, a member of the European Union, Japan, and the like. Most preferably the regulatory agency is authorized by the government of the United States and the type of application for approval that is filed will depend on the legal requirements set forth in the last enacted version of the Food, Drug and Cosmetic Act that are applicable for the pharmaceutical formulation and may also include other considerations such as the cost of making the regulatory filing and the marketing strategy for the composition. For example, if the pharmaceutical formulation has previously been approved for the same cognitive function, then the application might be a paper NDA, a supplemental NDA or an abbreviated NDA, but the application would be a full NDA if the pharmaceutical formulation has never been
approved before; with these terms having the meanings applied to them by those skilled in the pharmaceutical arts or as defined in the Drug Price Competition and Patent Term Restoration Act of 1984.
Further, in performing any of the methods described herein which require information on the haplotype content of the individual (i.e., the haplotypes and haplotype copy number present in the individual for the polymorphic sites in haplotypes comprising a statin response marker I or II) or which require knowing if a statin response marker I or II is present in the individual, the individual's GUCY1B2 haplotype content or statin response marker may be determined by consulting a data repository such as the individual's patient records, a medical data card, a file (e.g., a flat ASCII file) accessible by a computer or other electronic or non-electronic media on which information about the individual's GUCY1B2 haplotype content or statin response marker can be stored. As used herein, a medical data card is a portable storage device such as a magnetic data card, a smart card, which has an on-board processing unit and which is sold by vendors such as Siemens of Munich, Germany, or a flash-memory card. The medical data card may be, but does not have to be, credit-card sized so that it easily fits into pocketbooks, wallets and other such objects carried by the individual. The medical data card may be swiped through a device designed to access information stored on the data card. In an alternative embodiment, portable data storage devices other than data cards can be used. For example, a touch-memory device, such as the "i-button" produced by Dallas Semiconductor of Dallas, Texas, can store information about an individual's GUCY1B2 haplotype content or statin response marker, and this device can be incorporated into objects such as jewelry. The data storage device may be implemented so that it can wirelessly communicate with routing/intelligence devices through IEEE 802.11 wireless networking technology or through other methods well known to the skilled artisan.
Further, as stated above, information about an individual's GUCY1B2 haplotype content or statin response marker can also be stored in a file accessible by a computer; such files may be located on various media, including: a server, a client, a hard disk, a CD, a DVD, a personal digital assistant such as a Palm Pilot, a tape, a zip disk, the computer's internal ROM (read-only-memory) or the internet or worldwide web. Other media for the storage of files accessible by a computer will be obvious to one skilled in the art.
Any or all analytical and mathematical operations involved in practicing the methods of the present invention may be implemented by a computer. For example, the computer may execute a program that assigns GUCY1B2 haplotype pairs and/or a statin response marker I or II to individuals based on genotype data inputted by a laboratory technician or treating physician. In addition, the computer may output the predicted change in one or more lipoprotein levels in response to a statin following input of the individual's GUCY1B2 haplotype content or statin response marker, which was either determined by the computer program or input by the technician or physician. Data on which statin response markers were detected in an individual may be stored as part of a relational database (e.g., an instance of an Oracle database or a set of ASCII flat files) containing other clinical and/or haplotype data for the individual. These data may be stored on the computer's hard drive or may, for example, be stored on a CD ROM or on one or more other storage devices accessible by the computer.
For example, the data may be stored on one or more databases in communication with the computer via a network.
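As a non-limiting sketch of the storage described above, a statin response marker record might be kept in a small relational table. The example below uses the SQLite module of the Python standard library purely as a stand-in for whatever database is actually employed; the table name, column names and function names are hypothetical and are not taken from the specification.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical marker table; in-memory by default."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS statin_marker (
               patient_id   TEXT PRIMARY KEY,
               haplotype_no INTEGER NOT NULL,
               copies       INTEGER NOT NULL CHECK (copies IN (0, 1, 2)),
               marker       TEXT NOT NULL CHECK (marker IN ('I', 'II'))
           )"""
    )
    return conn


def save_marker(conn, patient_id, haplotype_no, copies, marker):
    """Insert or overwrite the record for one individual."""
    conn.execute(
        "INSERT OR REPLACE INTO statin_marker VALUES (?, ?, ?, ?)",
        (patient_id, haplotype_no, copies, marker),
    )
    conn.commit()


def lookup_marker(conn, patient_id):
    """Return 'I', 'II', or None if nothing is stored for the individual."""
    row = conn.execute(
        "SELECT marker FROM statin_marker WHERE patient_id = ?",
        (patient_id,),
    ).fetchone()
    return row[0] if row else None
```

A treating physician's software could then call `lookup_marker` instead of repeating the genotyping, which is the use of a data repository contemplated above.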
It is also contemplated that the above described methods and compositions of the invention may be utilized in combination with identifying genotype(s) and/or haplotype(s) for other genomic regions.
Preferred embodiments of the invention are described in the following examples. Other embodiments within the scope of the claims herein will be apparent to one skilled in the art from consideration of the specification or practice of the invention as disclosed herein. It is intended that the specification, together with the examples, be considered exemplary only, with the scope and spirit of the invention being indicated by the claims that follow the examples.
EXAMPLES
The Examples herein are meant to exemplify the various aspects of carrying out the invention and are not intended to limit the scope of the invention in any way. The Examples do not include detailed descriptions for conventional methods employed, such as in the performance of genomic DNA isolation, PCR and sequencing procedures. Such methods are well-known to those skilled in the art and are described in numerous publications, for example, Sambrook, Fritsch, and Maniatis, "Molecular Cloning: A Laboratory Manual", 2nd Edition, Cold Spring Harbor Laboratory Press, USA, (1989).
Example 1
This example illustrates the clinical and biochemical characterization of 679 patients in the patient cohort.
A multicenter, 17-week (16 weeks controlled), open-label, clinical discovery trial was designed to assess the relationship between genetic haplotype markers and treatment response associated with 4 different commercially available medications, all of which act as 3-hydroxy-3-methylglutaryl coenzyme A (HMG-CoA) reductase inhibitors (cerivastatin sodium [Baycol™], atorvastatin calcium [Lipitor®], simvastatin [Zocor®], and pravastatin sodium [Pravachol®]) in adult subjects with primary hypercholesterolemia. Study medications were packaged by their respective manufacturers and dispensed in a non-blinded fashion by a commercial pharmacist. The cerivastatin sodium arm of the study was discontinued at the time of the withdrawal of the drug from the market by the manufacturer and therefore data from the partially completed arm are excluded from this analysis.
Prior to randomization, all subjects underwent a screening and baseline period (up to 10 days). Then subjects were randomly assigned to the recommended starting dose, as stated in the package insert, of 1 of the 4 study medications. Following the initial 8 weeks of treatment, subjects proceeded to the highest allowed dose, as stated in the package insert, for an 8-week treatment period. As both periods incorporate a fixed-dose design, dosing adjustments other than the Week 8 increase were not permitted in this study. Thus, the total duration of therapy was maximally 16 weeks/subject from the point of randomization (8 weeks at the recommended starting dose, plus 8 weeks at the highest allowed dose as stated in the package insert).
Male or female outpatients aged 18 to 75 years with a diagnosis of type IIa or IIb hypercholesterolemia who had been on the American Heart Association (AHA) Step I or Step II diet for at least 6 weeks prior to the onset of screening were eligible to participate. Subjects were either treatment-naive or previously treated for hypercholesterolemia with any approved medications. Previously treated subjects must have discontinued antihyperlipidemic medication 4 weeks prior to screening (8 weeks prior to screening if clofibrate [Atromid-S®] was in use) to be eligible.
Subject inclusion criteria were based upon medical history assessments and laboratory determinations of cholesterol levels as described by the National Cholesterol Education Program (NCEP)-recommended goal for LDL-cholesterol (> 160 mg/dL for subjects with 0 to 1 coronary heart disease [CHD] risk factor, > 130 mg/dL for those with 2 or more CHD risk factors, or > 100 mg/dL for those with documented CHD or peripheral vascular disease) and had triglyceride levels < 400 mg/dL prior to randomization. Eligible subjects had an LDL-cholesterol level < 240 mg/dL at screening and baseline. Subjects had to demonstrate dietary compliance with the AHA Step I or Step II diet as measured by a food diary at baseline to be eligible for randomization. The entire patient cohort comprised 679 patients. Subjects were randomly assigned to 1 of 4 treatment groups: 0.4 mg/day cerivastatin sodium, 10 mg/day atorvastatin calcium, 20 mg/day simvastatin, or 10 mg/day pravastatin sodium at baseline. All medication was taken once daily in the evening.
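The lipid thresholds recited above can be restated, for illustration only, as a short Python check. This is a sketch under stated assumptions: the function names are hypothetical, the comparisons are taken as strict inequalities exactly as printed, and both LDL limits are assumed to apply at the screening and the baseline visit. Age, diet compliance and washout requirements are not modeled.

```python
def ldl_lower_limit(chd_risk_factors, has_chd_or_pvd):
    """LDL-cholesterol level (mg/dL) that must be exceeded, per the text."""
    if has_chd_or_pvd:
        return 100  # documented CHD or peripheral vascular disease
    return 130 if chd_risk_factors >= 2 else 160


def is_eligible(ldl_screening, ldl_baseline, triglycerides,
                chd_risk_factors, has_chd_or_pvd=False):
    """Hypothetical lipid-only eligibility check (all values in mg/dL)."""
    lower = ldl_lower_limit(chd_risk_factors, has_chd_or_pvd)
    # Assumption: the lower limit and the 240 mg/dL ceiling are both
    # checked at the screening and the baseline visit.
    ldl_ok = all(lower < value < 240 for value in (ldl_screening, ldl_baseline))
    return ldl_ok and triglycerides < 400
```

For instance, a subject with one CHD risk factor, LDL-cholesterol of 170 and 175 mg/dL at the two visits and triglycerides of 200 mg/dL would pass this check, whereas the same subject with LDL-cholesterol of 150 mg/dL would not.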
At the Week 8 visit, all subjects proceeded to the highest allowed dose, as stated in the package insert, of their assigned medication. The doses for the treatment groups were as follows: 0.8 mg/day cerivastatin sodium, 80 mg/day atorvastatin calcium, 80 mg/day simvastatin, and 40 mg/day pravastatin sodium. All medication was taken once daily in the evening.
The primary phenotypic endpoint used in the association of treatment response to genetic variability was the percent change from baseline in LDL-cholesterol values after 8 weeks and after 16 weeks of treatment, separately. The final Week 8 value was defined as the mean of the last 2 measurements (Weeks 6 and 8) during the first 8 weeks (low dose) of therapy. The final Week 16 value was defined as the mean of the last 2 measurements (Weeks 14 and 16) during the final 8 weeks (high dose) of therapy. Baseline was defined as the mean of the measurements taken at screening and baseline. The patient cohorts for the three completed statin arms were characterized with respect to statin taken in treatment as shown below.
Table 3. Demographics, baseline characteristics, and lipid changes for the low-dose compliant population.
Characteristic      Atorvastatin     Simvastatin      Pravastatin      Pooled
                    (n=155)          (n=168)          (n=153)          (n=476)
Male                65 (41.9%)       91 (54.2%)       64 (41.8%)       220 (46.2%)
Caucasian           135 (87.1%)      146 (86.9%)      126 (82.4%)      407 (85.5%)
Smoker              27 (17.4%)       34 (20.2%)       31 (20.3%)       92 (19.3%)
Drinker             95 (61.3%)       104 (61.9%)      84 (54.9%)       283 (59.5%)
Age (yr)*           56.8 ± 9.7       56.0 ± 10.4      57.0 ± 10.4      56.6 ± 10.2
Height (cm)*        168 ± 11         169 ± 10         169 ± 10         169 ± 10
Weight (kg)*        81.8 ± 16.6      84.1 ± 17.3      84.1 ± 19.0      83.3 ± 17.6
BMI (kg/m2)*        28.8 ± 4.5       29.3 ± 5.1       29.3 ± 6.0       29.1 ± 5.2
LDL-C*
  BL (mg/dL)        172 ± 27         175 ± 25         173 ± 25         173 ± 26
  8-week %Δ         -39.3 ± 10.0     -35.8 ± 11.0     -21.3 ± 11.3     -32.3 ± 13.2
  16-week %Δ†       -52.2 ± 11.9     -45.1 ± 11.4     -28.8 ± 11.9     -42.0 ± 15.2
HDL-C*
  BL (mg/dL)        50.6 ± 13.7      47.3 ± 11.2      48.9 ± 12.4      48.9 ± 12.5
  8-week %Δ         -0.3 ± 10.5      2.1 ± 9.7        0.5 ± 8.6        0.9 ± 9.7
  16-week %Δ†       -3.3 ± 10.4      1.6 ± 10.9       1.2 ± 9.1        -0.1 ± 10.4
TG‡
  BL (mg/dL)        164 (70, 361)    173 (60, 384)    166 (54, 370)    167 (54, 384)
  8-week %Δ         -18 (-57, 52)    -12 (-62, 234)   -6 (-53, 183)    -12 (-62, 234)
  16-week %Δ†       -32 (-74, 45)    -26 (-60, 70)    -10 (-60, 300)   -22 (-74, 300)
*Mean ± Standard Deviation shown; BL = baseline.
†16-week percent changes are based on the high-dose compliant population: pooled n=409.
‡Median (Min, Max).
                    Statin
Ethnicity           Lipitor®         Zocor®           Pravachol®
Afr Am              4 (2%)           7 (4%)           12 (8%)
Am Ind              -                1 (0.6%)         -
Asian               5 (3%)           3 (2%)           1 (0.6%)
Cauc                133 (79%)        135 (75%)        123 (77%)
Hisp-Lat            10 (6%)          9 (5%)           9 (6%)
Other               1 (0.6%)         2 (1%)           2 (1%)
Not Assigned        15 (9%)          24 (13%)         12 (8%)
Missing             -                1 (0.6%)         -
Example 2
This example illustrates determination of the genotype of 854 individuals for the polymorphic sites of interest herein by sequencing. The population of 854 individuals subjected to genotyping comprised individuals initially recruited into the statin study as well as a reference population. The reference population included 93 human individuals, organized into population subgroups by their self-identified ethnogeographic origin. Within this reference population were 82 self-identified unrelated individuals belonging to one of four major population groups: Caucasian (21 individuals), African descent (20 individuals), Asian (20 individuals), or Hispanic/Latino (18 individuals). In addition, the reference population contained three unrelated indigenous American Indians (one from each of North, Central and South America), one three-generation Caucasian family (from the CEPH Utah cohort) and one two-generation African-American family.
Amplification of Target Regions
The following target regions of the GUCY1B2 gene were amplified using "tailed" PCR primers, each of which includes a universal sequence forming a noncomplementary "tail" attached to the 5' end of each unique sequence in the PCR primer pairs. The universal "tail" sequence for the forward PCR primers comprises the sequence 5'-TGTAAAACGACGGCCAGT-3' (SEQ ID NO:39) and the universal "tail" sequence for the reverse PCR primers comprises the sequence 5'-AGGAAACAGCTATGACCAT-3' (SEQ ID NO:40). The nucleotide positions of the first and last nucleotide of the forward and reverse primers for each region amplified are presented below and correspond to positions in SEQ ID NO:1 (Figure 1).
Table 4. PCR Primer Pairs
Fragment No.     Forward Primer     Reverse Primer               PCR Product
Fragment 1       1000-1021          complement of 1546-1527      547 nt
Fragment 2       1319-1339          complement of 1851-1831      533 nt
Fragment 3       9058-9082          complement of 9455-9435      398 nt
Fragment 4       23115-23136        complement of 23651-23628    537 nt
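The PCR product sizes in Table 4 follow directly from the primer coordinates: each product spans the template from the first nucleotide of the forward primer to the first nucleotide listed for the reverse primer, inclusive. A minimal Python sketch of that arithmetic is given below. The names `FRAGMENTS` and `product_length` are hypothetical, and the universal "tail" sequences are deliberately left out because the tabulated sizes are reproduced without them.

```python
# (forward primer first position, reverse primer first position) in SEQ ID NO:1,
# copied from Table 4. The reverse primer is the complement of e.g. 1546-1527,
# so its 5' end sits at the higher coordinate.
FRAGMENTS = {
    1: (1000, 1546),
    2: (1319, 1851),
    3: (9058, 9455),
    4: (23115, 23651),
}


def product_length(forward_start, reverse_start):
    """Inclusive template span covered by a primer pair, in nucleotides."""
    return reverse_start - forward_start + 1


for number, (fwd, rev) in FRAGMENTS.items():
    print(f"Fragment {number}: {product_length(fwd, rev)} nt")
```

Running the loop reproduces the four sizes listed in the PCR Product column of Table 4.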
These primer pairs were used in PCR reactions containing genomic DNA isolated from immortalized cell lines for each member of the Index Repository. The PCR reactions were carried out under the following conditions:
Reaction volume = 10 μl
10 x Advantage 2 Polymerase reaction buffer (Clontech) = 1 μl
100 ng of human genomic DNA = 1 μl
10 mM dNTP = 0.4 μl
Advantage 2 Polymerase enzyme mix (Clontech) = 0.2 μl
Forward Primer (10 μM) = 0.4 μl
Reverse Primer (10 μM) = 0.4 μl
Water = 6.6 μl
Amplification profile:
97°C - 2 min.                                      1 cycle
97°C - 15 sec., 70°C - 45 sec., 72°C - 45 sec.     10 cycles
97°C - 15 sec., 64°C - 45 sec., 72°C - 45 sec.     35 cycles
Sequencing of PCR Products
The PCR products were purified using a Whatman/Polyfiltronics 100 μl 384 well unifilter plate essentially according to the manufacturer's protocol. The purified DNA was eluted in 50 μl of distilled water. Sequencing reactions were set up using Applied Biosystems Big Dye Terminator chemistry essentially according to the manufacturer's protocol. The purified PCR products were sequenced in both directions using the appropriate universal "tail" sequence as a primer. Reaction products were purified by isopropanol precipitation, and run on an Applied Biosystems 3700 DNA Analyzer.
Analysis of Sequences for Alleles at the Polymorphic Sites
Sequence information was analyzed for the presence of the alleles present at the polymorphic sites of interest using the Polyphred program (Nickerson et al., Nucleic Acids Res. 14:2745-2751, 1997).
The presence of the alleles on each strand was generally determined for each individual. The polymorphisms studied and their locations in the GUCY1B2 reference genomic sequence of Figure 1 (SEQ ID NO:1) are listed in Table 5 below.
Table 5. Polymorphic Sites Identified in the GUCY1B2 Gene
Polymorphic                   Nucleotide   Reference   Variant   CDS Variant   AA
Site Number    Poly Id(a)     Position     Allele      Allele    Position      Variant
PS2            13768223       1234         C           G
PS5            13768505       1398         T           C
PS6            13768599       1405         T           C
PS11           13769810       9253         A           G         164           Y55C
PS15           13770753       23363        G           A
PS16           13770847       23419        G           A
PS17           13770941       23428        G           A
(a) PolyId is a unique identifier assigned to each PS by Genaissance Pharmaceuticals, Inc.
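The PS11 entry of Table 5 can be cross-checked with simple codon arithmetic: coding position 164 falls at the second position of codon 55, and an A to G change at the second position of a tyrosine codon yields a cysteine codon under the standard genetic code. The Python sketch below illustrates this. The function names are hypothetical, and because Table 5 does not give the third base of the reference codon, both tyrosine codons are shown.

```python
# Only the four codons relevant to a Tyr -> Cys change (standard genetic code).
AMINO_ACID = {"TAT": "Y", "TAC": "Y", "TGT": "C", "TGC": "C"}


def codon_number(cds_position):
    """1-based codon index for a 1-based coding-sequence position."""
    return (cds_position - 1) // 3 + 1


def position_in_codon(cds_position):
    """1, 2 or 3: where the nucleotide sits within its codon."""
    return (cds_position - 1) % 3 + 1


def missense_label(reference_codon, cds_position, variant_base):
    """Build a label such as 'Y55C' from a reference codon and a substitution."""
    index = position_in_codon(cds_position) - 1
    variant_codon = reference_codon[:index] + variant_base + reference_codon[index + 1:]
    return (f"{AMINO_ACID[reference_codon]}"
            f"{codon_number(cds_position)}"
            f"{AMINO_ACID[variant_codon]}")


for codon in ("TAT", "TAC"):
    print(codon, missense_label(codon, 164, "G"))
```

Either assumed reference codon gives the label Y55C, in agreement with the AA Variant column for PS11.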
Example 3
This example illustrates analysis of the genetic data for the individuals for association with response to statin.
Haplotypes for the members of the experimental population were assigned using a computer-implemented algorithm for assigning haplotypes to unrelated individuals in a population sample described in WO 01/80156. All 679 randomized patients in the STRENGTH study were part of the experimental population used for collection of genetic data. However, for analyses assessing the associations between genetic markers and percent change in LDLC at each dose, the cohort was limited to patients taking one of the three statins Lipitor®, Pravachol®, and Zocor®. For low-dose analyses, patients who were not compliant in taking their statins (i.e., who had at least two consecutive visits with an answer "No" to the question of whether they had been between 80% and 120% compliant since the previous visit) during the low-dose period of the study were excluded; for the high-dose analyses, the patients who were not compliant during the low-dose period of the study were excluded as well as those who were not compliant during the high-dose period by the same definition (i.e., had at least two consecutive visits with an answer "No" to the question of whether they had been between 80% and 120% compliant since the previous visit). Additionally, patients with incomplete covariate information were excluded. The percent change in LDL cholesterol was determined from the clinical data using the formulas:
Low-dose percent change in LDL = (Low-dose follow-up LDL - Baseline LDL)/(Baseline LDL)*100.
High-dose percent change in LDL = (High-dose follow-up LDL - Baseline LDL)/(Baseline LDL)*100.
The parameters in these formulas were defined as follows:
Baseline LDL - the average of the screening and baseline visits' LDL values, unless the screening and baseline LDL values were more than 15% apart from one another, in which case a second baseline sample of all lipids was collected, and baseline LDL was the average of the two baseline LDL values.
Low-dose follow-up LDL - the average of the 6-week and 8-week values if both were available; otherwise the last single value from among the 4-week, 6-week, 8-week and early termination (if applicable) values was used.
High-dose follow-up LDL - the average of the 14-week and 16-week values if both were available; otherwise the last single value from among the 12-week, 14-week, 16-week and early termination (if applicable) values was used.
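The endpoint definitions above may be summarized in a short Python sketch. It is illustrative only: the function names and visit labels are hypothetical, the early termination value is assumed to be the latest of the candidate visits, and the decision to draw a second baseline sample (the 15% rule) is left to the caller, who signals it by supplying `repeat_baseline`.

```python
LOW_DOSE_ORDER = ("wk4", "wk6", "wk8", "early_termination")
HIGH_DOSE_ORDER = ("wk12", "wk14", "wk16", "early_termination")


def baseline_ldl(screening, baseline, repeat_baseline=None):
    """Average of screening and baseline, or of the two baseline samples
    when a second baseline sample was collected."""
    if repeat_baseline is not None:
        return (baseline + repeat_baseline) / 2
    return (screening + baseline) / 2


def follow_up_ldl(visits, preferred_pair, ordered_labels):
    """Mean of the preferred pair if both exist, else the last available value.

    visits maps a visit label to an LDL value (missing or None if not measured).
    """
    first, second = (visits.get(label) for label in preferred_pair)
    if first is not None and second is not None:
        return (first + second) / 2
    available = [visits[label] for label in ordered_labels
                 if visits.get(label) is not None]
    if not available:
        raise ValueError("no follow-up LDL value available")
    return available[-1]


def percent_change(follow_up, baseline):
    """(follow-up LDL - baseline LDL) / baseline LDL * 100."""
    return (follow_up - baseline) / baseline * 100
```

For example, a baseline LDL of 160 mg/dL and a low-dose follow-up LDL of 120 mg/dL correspond to a low-dose percent change of -25%.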
All possible haplotypes of the GUCY1B2 gene containing up to a maximum of four polymorphisms were enumerated for the polymorphic sites disclosed in Table 5. Each unique haplotype with a frequency of >1% was tested for association with the clinical response. Each individual in the analysis cohort was classified as having 0, 1, or 2 copies of the haplotype. Only dominant (i.e., 1 or 2 copies of the marker vs. 0 copy of the marker) and recessive (i.e., 0 or 1 copy of the marker vs. 2 copies) models were considered in this analysis. Analysis of covariance (ANCOVA) models with one degree of freedom for marker were used. Association of the marker with percent change in LDLC at high dose or low dose was tested for the pooled statin class and for each of the 3 individual statins (atorvastatin calcium, simvastatin, or pravastatin sodium). The genetic analysis utilized as covariates age, gender, statin assignment (in the all statins combined model only), ethnicity, baseline level of LDLC, alcohol consumption, smoking status and body mass index (BMI). Since many haplotypes were assessed in this gene, a multiple comparison adjustment to the raw marker p-values was necessary. A permutation test (Good, PI (2000) Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses (Springer Series in Statistics)) was performed to adjust for multiple comparisons, while appropriately accounting for the non-independence of the markers in the gene. In this procedure, the clinical outcome and covariates were held constant, and the set of dominant and recessive markers generated by the buildup routine was randomly permuted 1,000 times. The minimum p-value from among the many markers was noted for each of the 1,000 permutations. Then an observed p-value's quantile in this distribution was used as the adjusted p-value.
For example, if 4.5% of the minimum p-values from the permutations are smaller than a marker's observed p-value, then that marker's adjusted p-value would be 0.045. Analyzing the data by statin showed that there were highly significant associations for the atorvastatin calcium group, but no significant associations for the other two statin groups or for the pooled statin grouping. Tables 6 and 7 below present least squares means for the percent change in LDLC observed in the group possessing the indicated number of copies of this marker at low and high statin doses for atorvastatin calcium (Lipitor®). Additionally, lower and upper 95% confidence limits on the means are presented. The unadjusted p-values characterizing the significance of the difference in percent LDLC for each subset and the number of people (count) in the dose compliant cohort having a particular number of copies of the marker are also shown in Tables 6 and 7. No significant difference in
least squares means for the percent change in LDLC as a function of having or lacking any of these markers was observed among individuals taking either dose of pravastatin sodium (Pravachol®) or simvastatin (Zocor®).
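The enumeration of candidate haplotypes and the dominant and recessive copy-number codings described earlier in this Example can be sketched in Python as follows. This is an illustration only and not the buildup routine itself: the names are hypothetical, the alleles are those of Table 5, each individual is assumed to be represented by two phased chromosomes, and the >1% frequency filter is not applied.

```python
from itertools import combinations, product

# Reference and variant allele for each polymorphic site, from Table 5.
SITES = {
    "PS2": ("C", "G"), "PS5": ("T", "C"), "PS6": ("T", "C"),
    "PS11": ("A", "G"), "PS15": ("G", "A"), "PS16": ("G", "A"),
    "PS17": ("G", "A"),
}


def enumerate_haplotypes(sites, max_sites=4):
    """All allele combinations over every subset of 1..max_sites sites."""
    haplotypes = []
    for size in range(1, max_sites + 1):
        for subset in combinations(sites, size):
            for alleles in product(*(sites[name] for name in subset)):
                haplotypes.append(dict(zip(subset, alleles)))
    return haplotypes


def copies(haplotype, chromosome_pair):
    """Number (0, 1 or 2) of an individual's chromosomes carrying the haplotype."""
    return sum(
        all(chromosome[site] == allele for site, allele in haplotype.items())
        for chromosome in chromosome_pair
    )


def dominant(copy_number):
    """1 for 1 or 2 copies, 0 for 0 copy."""
    return int(copy_number >= 1)


def recessive(copy_number):
    """1 for 2 copies, 0 for 0 or 1 copy."""
    return int(copy_number == 2)
```

With seven biallelic sites and at most four sites per haplotype, `enumerate_haplotypes(SITES)` returns 938 candidates, which gives a sense of why a multiple comparison adjustment was needed.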
Table 6 and Table 7 below present the data showing the association between least squares means for the percent change in LDLC at high dose or low dose observed for patients treated with atorvastatin calcium possessing the indicated number of copies of a given marker, or marker class. Table 6 presents the data for dominant markers with associations to percent change in LDLC after atorvastatin calcium treatment at high dose (HD). For selected markers in Table 6 percent change in LDLC after the low dose (LD) treatment is also presented. Table 7 presents the data for recessive markers with associations to percent change in LDLC after the high dose statin course, with low dose data provided for selected markers. In Tables 6 and 7, all haplotypes identified with Marker Numbers starting with the same letter indicate that those haplotypes were equivalent (i.e., divided the cohort identically into the two copy number groups) for the cohort in the analysis after the high dose treatment and therefore had identical statistical results. Since additional patients were analyzed for the low dose analyses, some of the markers in an equivalent marker group for the high dose analyses have differing low dose statistical results as the HD marker group divided into two groups of equivalent LD markers. The column labeled Mean % Change in LDLC (LC, UC) for each of the two copy number groups (i.e., 0 copy vs. 1 or 2 copies for the dominant model; 0 or 1 copy vs. 2 copies for recessive markers) presents the mean percent change in LDLC relative to baseline after the 8 week low dose regimen or after the 16 week high dose regimen with lower and upper 95% confidence limits for the mean provided in the parentheses for that number of copies. The Count columns present the number of members of the statin treatment cohort belonging to each of these copy number groups. Unadj. 
P presents the p value calculated for the marker-phenotype association using the recessive or dominant ANCOVA model, as appropriate, while the column labeled Perm. P presents the p value for the association determined by the permutation test described above.
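The minimum p-value permutation adjustment described above may be illustrated with the following Python sketch. Several points are assumptions made for the sake of a self-contained example: a simple two-group comparison with a normal approximation stands in for the ANCOVA marker test, covariates are omitted, and the markers are permuted jointly across individuals so that the dependence among markers is preserved. All function names are hypothetical.

```python
import math
import random
from statistics import mean, variance


def two_group_p(outcome, indicator):
    """Stand-in for the ANCOVA test: two-sided normal-approximation p-value
    for a difference in mean outcome between indicator == 1 and == 0."""
    carriers = [y for y, flag in zip(outcome, indicator) if flag]
    others = [y for y, flag in zip(outcome, indicator) if not flag]
    if len(carriers) < 2 or len(others) < 2:
        return 1.0
    difference = mean(carriers) - mean(others)
    se = math.sqrt(variance(carriers) / len(carriers)
                   + variance(others) / len(others))
    if se == 0.0:
        return 1.0 if difference == 0 else 0.0
    return math.erfc(abs(difference / se) / math.sqrt(2))


def min_p_distribution(outcome, markers, n_permutations=1000, seed=0):
    """markers maps a marker name to a list of 0/1 codes, one per individual.
    The outcome stays fixed; individuals' marker codes are shuffled together."""
    rng = random.Random(seed)
    order = list(range(len(outcome)))
    minima = []
    for _ in range(n_permutations):
        rng.shuffle(order)
        minima.append(min(
            two_group_p(outcome, [codes[i] for i in order])
            for codes in markers.values()
        ))
    return minima


def adjusted_p(observed_p, minima):
    """Fraction of permutation minima smaller than the observed p-value."""
    return sum(m < observed_p for m in minima) / len(minima)
```

If 45 of 1,000 permutation minima fall below a marker's observed p-value, `adjusted_p` returns 0.045, matching the worked example given in the text.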
For the haplotypes presented in Table 6 below, individuals with 1 or 2 copies of any haplotype marker in Marker Classes A-C, F-J show a statistically significantly greater mean percent reduction in LDLC after the high dose statin treatment than individuals with zero copy of those markers. In contrast, individuals with 1 or 2 copies of any haplotype marker in Marker Classes D and E in Table 6 show a smaller mean percent reduction in LDLC after the high dose statin treatment than do individuals with zero copy of those markers. For the markers for which low dose data are also provided, the same trend with copy number is observed after the low dose treatment for all haplotypes in Table 6 even if the p value was not significant after permutation.
Individuals with 0 or 1 copy of any haplotype marker in Marker Classes K-N in Table 7 below show a greater mean percent reduction in LDLC after high dose statin treatment than do individuals with 2 copies of these markers. Similar to the results for the haplotypes analyzed using a dominant genetic model, for the haplotypes for which low dose data are also provided in Table 7, the same trend with copy number is observed after the low dose treatment for all haplotypes in the table, even if the p value was not significant after permutation.
Thus, each of these markers is useful in predicting potential LDLC response by an individual to treatment with a statin or in optimizing the treatment selected for the individual.
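The direction of the associations stated above for the high dose course can be summarized as a small lookup. This is only a sketch of the group-level trends described in the text, not a clinical prediction method; the names MARKER_CLASS_EFFECT and relative_reduction are hypothetical, and the result is a comparison with the other copy number group, not an absolute estimate of response.

```python
# Marker class -> (genetic model, relative mean percent reduction in LDLC
# for the higher copy number group), as stated for the high dose course.
MARKER_CLASS_EFFECT = {
    **{cls: ("dominant", "greater") for cls in "ABCFGHIJ"},  # Classes A-C, F-J
    **{cls: ("dominant", "smaller") for cls in "DE"},        # Classes D and E
    **{cls: ("recessive", "smaller") for cls in "KLMN"},     # Classes K-N
}


def relative_reduction(marker_class, copies):
    """Return 'greater' or 'smaller': the mean percent LDLC reduction of the
    individual's copy number group relative to the other group."""
    if copies not in (0, 1, 2):
        raise ValueError("copies must be 0, 1 or 2")
    model, effect_in_higher_group = MARKER_CLASS_EFFECT[marker_class]
    # Dominant model: 1 or 2 copies form the higher group; recessive: 2 copies.
    threshold = 1 if model == "dominant" else 2
    if copies >= threshold:
        return effect_in_higher_group
    return "smaller" if effect_in_higher_group == "greater" else "greater"
```

For example, an individual with one copy of a Class K marker falls in the 0 or 1 copy group, which the text associates with the greater mean percent reduction.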
a The least squares mean percent change in LDLC is presented. (LC, UC) denotes the lower and upper 95% confidence limits on the mean.
b Count denotes the number of individuals with that number of markers in the indicated statin subset. An asterisk in a PS indicates that the PS is not part of the haplotype.
In view of the above, it will be seen that the several advantages of the invention are achieved and other advantageous results attained.
For any and all embodiments of the present invention discussed herein, in which a feature is described in terms of a Markush group or other grouping of alternatives, the inventors contemplate that such feature may also be described by, and that their invention specifically includes, any individual member or subgroup of members of such Markush group or other group.
As various changes could be made in the above methods and compositions without departing from the scope of the invention, it is intended that all matter contained in the above description and shown in the accompanying drawings shall be interpreted as illustrative and not in a limiting sense.
All references cited in this specification, including patents and patent applications, are hereby incorporated in their entirety by reference. The discussion of references herein is intended merely to summarize the assertions made by their authors and no admission is made that any reference constitutes prior art. Applicants reserve the right to challenge the accuracy and pertinency of the cited references.