WO2005111239A2 - Haplotypes in the human thioredoxin interacting protein homologue (arrdc3) gene associated with obesity - Google Patents

Haplotypes in the human thioredoxin interacting protein homologue (arrdc3) gene associated with obesity

Info

Publication number
WO2005111239A2
Authority
WO
WIPO (PCT)
Prior art keywords
haplotype
arrdc3
risk
obesity
gene
Prior art date
Application number
PCT/US2005/013900
Other languages
French (fr)
Other versions
WO2005111239A3 (en)
Inventor
Valur Emilsson
Gudmar Thorleifsson
Jesus Sainz
Gudmundur Bragi Walters
Jeffrey R. Gulcher
John R. Lamb
Eric E. Schadt
Original Assignee
Decode Genetics Ehf.
Rosetta Inpharmatics Llc
Priority date
Filing date
Publication date
Application filed by Decode Genetics Ehf., Rosetta Inpharmatics Llc
Publication of WO2005111239A2
Publication of WO2005111239A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers
    • C12Q2600/158: Expression markers
    • C12Q2600/172: Haplotypes

Definitions

  • Obesity is one of the most serious and widespread health problems facing the world community. It is estimated that currently in the United States, 55% of adults are obese or overweight and 20% of teenagers are either obese or significantly overweight. In addition, 6% of the total population of the United States is morbidly obese (defined as having a body mass index (BMI) of more than forty). This data is alarming, as it indicates an obesity epidemic. Many health conditions are consequences of being overweight. For example, obesity is a known risk factor for the development of diabetes and is asserted to be the cause of approximately 80% of Type 2 diabetes (e.g., adult onset diabetes) in the United States.
  • BMI body mass index
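As a brief illustration of the BMI cut-offs used in this document, the sketch below computes BMI with the standard formula (weight in kilograms divided by the square of height in meters) and applies the two thresholds mentioned here: BMI above 40 for morbid obesity and BMI above 35 for the severe obesity group of FIG. 2. The function names, the labels and the sample subject are illustrative assumptions, not part of the original text.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def classify(bmi_value: float) -> str:
    """Apply the two cut-offs quoted in this document (labels are illustrative)."""
    if bmi_value > 40:
        return "morbidly obese (BMI > 40)"
    if bmi_value > 35:
        return "severely obese (BMI > 35)"
    return "below the severe-obesity cut-off"


if __name__ == "__main__":
    value = bmi(120.0, 1.70)  # hypothetical subject
    print(f"{value:.1f}", classify(value))
```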
  • Obesity is also a substantial risk factor for a wide range of cardiovascular, metabolic and other diseases and disorders (e.g., coronary artery disease, dyslipidemias (e.g., hyperlipidemia), stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis and some forms of cancer (e.g., colorectal cancer, breast cancer)). More recently, researchers have documented links between obesity and infertility, and obesity and miscarriages.
  • cardiovascular, metabolic and other diseases and disorders e.g., coronary artery disease, dyslipidemias (e.g., hyperlipidemia), stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis and some forms of cancer (e.g., colorectal cancer, breast cancer)
  • Susceptibility to obesity is determined by genetic, environmental (e.g., food availability, sociocultural factors, lifestyle) and regulatory factors (e.g., pregnancy, increases in fat cells and adipose tissue during infancy, childhood and/or adulthood, brain damage, drugs, endocrine factors and psychological factors (e.g., binge eating disorder, night-eating disorder)).
  • environmental e.g., food availability, sociocultural factors, lifestyle
  • regulatory factors e.g., pregnancy, increases in fat cells and adipose tissue during infancy, childhood and/or adulthood, brain damage, drugs, endocrine factors and psychological factors (e.g., binge eating disorder, night-eating disorder)
  • Studies using twins, adopted children and animal models of obesity have demonstrated that genetic factors are clearly implicated in the dynamics of gaining weight. Morbid obesity in humans appears to have a particularly strong genetic component. Genetic risk is conferred by subtle differences in genes among individuals in a population.
  • SNPs single nucleotide polymorphisms
  • SNPs are located on average every 1000 base pairs in the human genome. Accordingly, a typical human gene containing 250,000 base pairs may contain 250 different SNPs. Only a small number of SNPs are located in exons and alter the amino acid sequence of the protein encoded by a gene. Most SNPs have no effect on gene function, while others may alter transcription, splicing, translation or stability of the mRNA encoded by a gene.
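The arithmetic behind the 250-SNP figure can be written out as a short sketch. It assumes only the average spacing stated above (one SNP per 1000 base pairs); the function name is hypothetical, and real SNP density varies along the genome.

```python
def expected_snp_count(gene_length_bp: int, mean_spacing_bp: int = 1000) -> float:
    """Expected number of SNPs in a gene, given an average SNP spacing."""
    if gene_length_bp < 0 or mean_spacing_bp <= 0:
        raise ValueError("length must be non-negative and spacing positive")
    return gene_length_bp / mean_spacing_bp


if __name__ == "__main__":
    # 250,000 bp gene at one SNP per 1000 bp, as in the text above.
    print(expected_snp_count(250_000))
```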
  • the gene thioredoxin interacting protein homologue, known as ARRDC3 (and sometimes referred to as TXNIPH or KIAA1376), is associated with obesity and obesity-associated conditions. It has been discovered that particular combinations of genetic markers (“haplotypes”) are present at a higher than expected frequency in obese patients. The markers that are included in the haplotypes described herein are associated with the genomic region that directs expression of the ARRDC3 gene.
  • the invention relates to a method of diagnosing a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the presence of the at-risk haplotype associated with the ARRDC3 gene is indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition.
  • the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2.
  • the at-risk haplotype comprises a haplotype comprising one or more markers selected from FIGS. 2, 6, 7.1, or 8.
  • the at-risk haplotype is selected from the group consisting of: haplotype I, haplotype II, haplotype III, haplotype IV, and a combination of haplotype I, haplotype II, haplotype III, and haplotype IV.
  • the invention relates to a method of diagnosing a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype I, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition.
  • in another aspect, the invention relates to a method of diagnosing a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype II, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition.
  • in another aspect, the invention relates to a method of diagnosing a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype III, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition.
  • in another aspect, the invention relates to a method of diagnosing a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype IV, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition.
  • in another aspect, the invention relates to a method of determining the genetic basis of obesity or an obesity-associated condition, comprising detecting the presence of an at-risk haplotype associated with the ARRDC3 gene, wherein the presence of the at-risk haplotype is indicative that the obesity and/or obesity-associated condition is mediated by ARRDC3.
  • the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2.
  • the at-risk haplotype comprises a haplotype comprising one or more markers selected from FIGS. 2, 6, 7.1, or 8.
  • the at-risk haplotype is selected from the group consisting of: haplotype I, haplotype II, haplotype III, haplotype IV, and a combination of haplotype I, haplotype II, haplotype III, and haplotype IV.
  • the invention features a method of treating or preventing obesity and/or an obesity-associated condition in a subject, comprising administering a compound that increases the expression or biological activity of ARRDC3 to the subject in need thereof, in a therapeutically effective amount.
  • the subject has an at-risk haplotype associated with the ARRDC3 gene.
  • the subject has decreased ARRDC3 expression or activity.
  • the subject has increased thioredoxin expression or activity.
  • the invention relates to a method of reducing triglyceride levels in a subject, comprising administering a compound that increases the expression or biological activity of ARRDC3 to the subject, in a therapeutically effective amount.
  • the subject has an at-risk haplotype associated with the ARRDC3 gene.
  • the subject has decreased ARRDC3 expression or activity.
  • the subject has increased thioredoxin expression or activity.
  • the invention relates to a method of increasing fatty acid oxidation in a subject, comprising administering a compound that increases the expression or biological activity of ARRDC3 to the subject, in a therapeutically effective amount.
  • the subject has an at-risk haplotype associated with the ARRDC3 gene. In another embodiment, the subject has decreased ARRDC3 expression or activity. In another embodiment, the subject has increased thioredoxin expression or activity. In another aspect, the invention relates to a method of assessing a subject for an increased risk of obesity and/or an obesity-associated condition, comprising assessing the interaction between ARRDC3 and thioredoxin in the subject, wherein an increased level of interaction is indicative of a decreased risk of obesity and/or an obesity-associated condition. In one embodiment, the subject has an at-risk haplotype associated with the ARRDC3 gene. In another embodiment, the subject has decreased ARRDC3 expression or activity.
  • the subject has increased thioredoxin expression or activity.
  • the invention relates to a method of assessing response to treatment with a compound that increases the level of expression or biological activity of ARRDC3 by a subject in a target population, comprising: assessing the level of expression or biological activity of ARRDC3 in the subject before treatment with a compound that increases the expression or biological activity of ARRDC3; assessing the level of expression or biological activity of ARRDC3 in the subject during or after treatment with the compound that increases the expression or biological activity of ARRDC3; and comparing the level of the expression or biological activity of ARRDC3 before treatment with the level of the expression or biological activity of ARRDC3 during or after treatment, wherein a level of the expression or biological activity of ARRDC3 during or after treatment that is significantly higher than the level of the expression or biological activity of ARRDC3 before treatment is indicative of efficacy of treatment with the compound that increases the expression or biological activity of ARRDC3.
  • the invention features a method of diagnosing a predisposition or susceptibility to obesity in a subject, comprising detecting the presence or absence of a genetic marker associated with the ARRDC3 gene, the marker having a p-value of 1×10⁻⁵ or less, wherein the presence of the marker associated with the ARRDC3 gene is indicative of a predisposition or susceptibility to obesity.
  • the invention features a method of diagnosing a predisposition or susceptibility to an obesity-associated condition in a subject, comprising detecting the presence or absence of a genetic marker associated with the ARRDC3 gene, the marker having a p-value of 1×10⁻⁵ or less, wherein the presence of the marker associated with the ARRDC3 gene is indicative of a predisposition or susceptibility to an obesity-associated condition.
  • the at-risk haplotype comprises haplotype I, haplotype II, haplotype III, haplotype IV, or combinations of haplotype I, haplotype II, haplotype III, and haplotype IV.
  • determination of the presence or absence of the at-risk haplotype comprises enzymatic amplification, electrophoretic analysis, sequence analysis and/or restriction fragment length polymorphism analysis.
  • the at-risk haplotype has a relative risk of at least 1.5, at least 2.5 or at least 3.0.
  • the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1×10⁻⁵ or less, 1×10⁻⁶ or less, 1×10⁻⁷ or less or 1×10⁻⁸ or less.
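The relative-risk and p-value thresholds above can be illustrated with a minimal sketch. This passage does not state how relative risk is estimated; the code assumes one estimate commonly used in haplotype association studies under a multiplicative model, namely the ratio of the haplotype-frequency odds in patients to that in controls. The function names and the example frequencies are hypothetical.

```python
def relative_risk(freq_patients: float, freq_controls: float) -> float:
    """Ratio of haplotype-frequency odds in patients versus controls.

    Assumption: a multiplicative model; this is not necessarily the
    estimator used for the haplotypes described in this document.
    """
    for freq in (freq_patients, freq_controls):
        if not 0.0 < freq < 1.0:
            raise ValueError("frequencies must lie strictly between 0 and 1")
    odds_patients = freq_patients / (1.0 - freq_patients)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_patients / odds_controls


def meets_thresholds(rr: float, p_value: float,
                     min_rr: float = 1.5, max_p: float = 1e-5) -> bool:
    """True if the haplotype meets the loosest thresholds listed above."""
    return rr >= min_rr and p_value <= max_p


if __name__ == "__main__":
    rr = relative_risk(0.20, 0.10)  # hypothetical frequencies
    print(round(rr, 2), meets_thresholds(rr, 1e-6))
```

Tighter variants of the claim correspond to passing `min_rr=2.5` or `min_rr=3.0`, or a smaller `max_p`.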
  • the method is for diagnosing an obesity-associated condition such as diabetes (e.g., Type 2 or adult onset diabetes), coronary artery disease, peripheral arterial occlusive disease, myocardial infarction, dyslipidemias, stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis, infertility, miscarriages, or cancer, or a susceptibility to an obesity-associated condition, in a subject.
  • diabetes e.g., Type 2 or adult onset diabetes
  • coronary artery disease e.g., peripheral arterial occlusive disease
  • the invention features a kit for assaying a sample from a subject to detect a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, wherein the kit comprises one or more reagents for detecting an at-risk haplotype associated with the ARRDC3 gene.
  • the nucleic acid comprises at least one contiguous nucleotide sequence that is completely complementary to a region comprising at least one of the markers of the at-risk haplotype.
  • the one or more reagents comprise one or more nucleic acids that are capable of detecting one or more specific markers of an at-risk haplotype associated with the ARRDC3 gene.
  • the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2, for example, a haplotype selected from the group consisting of: haplotype I, haplotype II, haplotype III, haplotype IV, and a combination of haplotype I, haplotype II, haplotype III, and haplotype IV.
  • the at-risk haplotype comprises a haplotype comprising one or more markers selected from FIGS. 2, 6, 7.1, or 8.
  • the invention features a kit for assaying a sample from a subject to detect a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, wherein the kit comprises: a) one or more labeled nucleic acids capable of detecting one or more specific markers of an at-risk haplotype associated with the ARRDC3 gene; and b) reagents for detection of the label.
  • the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2, for example, a haplotype selected from the group consisting of: haplotype I, haplotype II, haplotype III, haplotype IV, and a combination of haplotype I, haplotype II, haplotype III, and haplotype IV.
  • the at-risk haplotype comprises a haplotype comprising one or more markers selected from FIGS. 2, 6, 7.1, or 8.
  • the obesity-associated condition can be, for example, diabetes (e.g., Type 2 or adult onset diabetes), coronary artery disease, peripheral arterial occlusive disease, myocardial infarction, dyslipidemias, stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis, infertility, miscarriages or cancer.
  • diabetes e.g., Type 2 or adult onset diabetes
  • coronary artery disease e.g., peripheral arterial occlusive disease, myocardial infarction, peripheral arterial occlusive disease, dyslipidemias, stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis, infertility, miscarriages or cancer.
  • FIG. 1 is a graph depicting linkage of obesity in males to human chromosome 5. Lod scores are plotted for human chromosome 5.
  • FIG. 2 is a table depicting haplotype I and its association with severe obesity (BMI>35) in males.
  • FIG. 3.1 is a schematic of annotated sequences contained in the 280 kb region defined by the boundary markers of haplotype I.
  • FIG. 3.2 is a table summarizing the analysis of the expressed sequences depicted in FIG. 3.1. Genes containing or missing exons are indicated.
  • FIG. 4 is a schematic of expressed sequences ARRDC3 (KIAA1376) and XM_299045, which are contained in the region defined by the boundary markers of haplotype I.
  • FIG. 5 is a schematic of expressed sequences ARRDC3 (KIAA1376) and XM_299045, which are contained in the region defined by the boundary markers of haplotype I. ARRDC3 is also contained within the LD block.
  • FIG. 6 is a table depicting haplotype II and its association with severe obesity (BMI: top 10%) in males.
  • FIG. 7.1 is a table depicting SNPs and markers (haplotype III) associated with severe obesity (BMI: top 10%) in males.
  • FIG. 7.2 is a table showing that the markers exhibiting the strongest association are found in the same LD block and are highly correlated.
  • FIG. 8 is a table showing a SNP-only haplotype (haplotype IV) identified across the LD block that is associated with severe obesity (BMI: top 10%) in males.
  • FIG. 9 is a schematic representation of a hypothetical model for the role of TXNIPH in the regulation of obesity. It is proposed that ARRDC3 shifts fuel metabolism from storage to oxidative by inhibiting the reducing activity of thioredoxin (TXN).
  • TXN undergoes a reversible oxidation to the cystine disulfide through transfer of reducing equivalents (a disulfide substrate).
  • the oxidized TXN is reduced back to the Cys form by the NADPH-dependent thioredoxin reductase (TXNR).
  • TXNR NADPH-dependent thioredoxin reductase
  • ARRDC3 may interact with the reduced form of TXN and inhibit its reducing activity.
  • increased activity of ARRDC3 would alter the cellular redox state, resulting in a decreased ratio of NADH to NAD+.
  • Increased levels of NAD+ would then activate the TCA cycle and shift fuel metabolism from storage to oxidative.
  • FIGS. 11.1 to 11.2 are the cDNA sequence of ARRDC3 (SEQ ID NO: 2; GenBank Accession No.: NM_020801).
  • FIG. 12 is the ARRDC3 polypeptide sequence (SEQ ID NO: 3; GenBank Accession No.: NP_065852).
  • FIG. 13 is a schematic showing a role in fuel oxidation and/or desensitization of G-protein coupled receptors (GPCR) for ARRDC3.
  • FIG. 14 shows synteny between the regions of human and murine chromosome 5 associated with obesity.
  • FIG. 15 shows the mean expression levels of ARRDC3 in over 80 human tissues and cell lines.
  • FIG. 16 shows the lod score curve for the chromosome 9 linkage with the obesity related locus on chromosome 5.
  • FIG. 17 shows correlation plots for correlation between ARRDC3 expression and BMI in males and females, using either subcutaneous or omental fat.
  • FIG. 18 highlights a pattern of expression associated with ARRDC3 expression in adipose tissue.
  • FIG. 19 shows an adipose cluster of genes most correlated with ARRDC3 in adipose and blood tissues.
  • FIG. 20 shows clustering over the same set of genes described in FIG. 19, but using the expression results from the blood profiling study.
  • FIG. 21 is a schematic representation of fasting/feeding schedule of the volunteers in Group 2 in the study of fasting signature in human subcutaneous fat, Example 8.
  • FIG. 22 shows clustering results of 420 genes identified in the study of fasting signature in human subcutaneous fat in Group 2, Example 8.
  • FIG. 23 shows clustering results of 420 genes identified in the study of fasting signature in human subcutaneous fat in Groups 1 and 2, Example 8.
  • FIG. 24.1 shows reduction in transcript levels of ARRDC3 in adipose tissue upon food intake in Group 2, Example 8.
  • FIG. 24.2 shows reduction in transcript levels of ARRDC3 in adipose tissue upon food intake in the combined pool of Group 1 and Group 2, Example 8.
  • FIG. 25.1 shows reduction in transcript levels of PDK4 in adipose tissue upon food intake in Group 2, Example 8.
  • FIG. 25.2 shows reduction in transcript levels of PDK4 in adipose tissue upon food intake in Group 2, Example 8.
  • ARRDC3 thioredoxin interacting protein homolog
  • haplotypes a combination of SNPs and/or microsatellites
  • Kits for assaying a subject to detect a predisposition or susceptibility to obesity and/or an obesity-associated condition are also encompassed by the invention.
  • methods for treating obesity, a susceptibility to obesity, an obesity-associated condition and/or a susceptibility to an obesity-associated condition in a subject are also encompassed by the invention.
  • the ARRDC3-associated haplotypes describe a set of genetic markers associated with ARRDC3.
  • the haplotype can comprise one or more markers, two or more markers, three or more markers, four or more markers, five or more markers, six or more markers, seven or more markers, eight or more markers, nine or more markers, ten or more markers, eleven or more markers, twelve or more markers, thirteen or more markers or fourteen or more markers.
  • the genetic markers are particular "alleles" at "polymorphic sites" associated with ARRDC3.
  • a nucleotide position at which more than one nucleotide is possible in a population is referred to herein as a "polymorphic site".
  • where a polymorphic site is a single nucleotide in length, the site is referred to as a single nucleotide polymorphism ("SNP").
  • SNP single nucleotide polymorphism
  • Polymorphic sites can allow for differences in sequences based on substitutions, insertions or deletions. Each version of the sequence with respect to the polymorphic site is referred to herein as an "allele" of the polymorphic site.
  • the SNP allows for both an adenine allele and a thymine allele.
  • a reference sequence is referred to for a particular sequence. Alleles that differ from the reference are referred to as "variant" alleles.
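The terminology above (polymorphic site, allele, reference, variant) can be sketched as follows. The function names and the adenine/thymine example are illustrative assumptions that mirror the example SNP mentioned above; they do not correspond to any specific marker in this document.

```python
def is_polymorphic(observed_alleles) -> bool:
    """A site is polymorphic if more than one allele is seen in the population."""
    return len(set(observed_alleles)) > 1


def classify_alleles(reference_allele: str, observed_alleles) -> dict:
    """Label each distinct allele at one site as 'reference' or 'variant'."""
    return {
        allele: "reference" if allele == reference_allele else "variant"
        for allele in set(observed_alleles)
    }


if __name__ == "__main__":
    # Hypothetical SNP that allows an adenine allele and a thymine allele.
    population = ["A", "T", "A", "A", "T"]
    print(is_polymorphic(population))
    print(classify_alleles("A", population))
```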
  • the reference genomic sequence that contains the gene encoding ARRDC3, and which is associated with obesity or a predisposition or susceptibility to obesity and/or an obesity-associated condition is described herein by SEQ ID NO: 1.
  • variant ARRDC3 refers to an ARRDC3 sequence that differs from SEQ ID NO: 1, but is otherwise substantially similar.
  • the genetic markers that make up the haplotypes described herein are ARRDC3 variants.
  • the variants of ARRDC3 that are used to determine the haplotypes disclosed herein are associated with a susceptibility to a number of obesity and obesity-associated phenotypes. Additional variants can include changes that affect a polypeptide, e.g., the ARRDC3 polypeptide.
  • sequence differences when compared to a reference nucleotide sequence, can include the insertion or deletion of a single nucleotide, or of more than one nucleotide, resulting in a frame shift; the change of at least one nucleotide, resulting in a change in the encoded amino acid; the change of at least one nucleotide, resulting in the generation of a premature stop codon; the deletion of several nucleotides, resulting in a deletion of one or more amino acids encoded by the nucleotides; the insertion of one or several nucleotides, such as by unequal recombination or gene conversion, resulting in an interruption of the coding sequence of a reading frame; duplication of all or a part of a sequence; transposition; or a rearrangement of a nucleotide sequence.
  • Such sequence changes alter the polypeptide encoded by an ARRDC3 nucleic acid.
  • where the change in the nucleic acid sequence causes a frame shift, the frame shift can result in a change in the encoded amino acids, and/or can result in the generation of a premature stop codon, causing generation of a truncated polypeptide.
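The effect of a frame shift described above can be demonstrated with a small sketch that translates a coding sequence using the standard genetic code. The reference sequence below is a made-up toy example, not the ARRDC3 cDNA, and the function names are hypothetical. Deleting a single nucleotide shifts the reading frame and, in this example, creates a premature stop codon that truncates the polypeptide.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(cdna) - 2, 3):
        amino_acid = CODON_TABLE[cdna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the polypeptide
            break
        peptide.append(amino_acid)
    return "".join(peptide)


def delete_base(cdna: str, index: int) -> str:
    """Return the sequence with the nucleotide at `index` removed."""
    return cdna[:index] + cdna[index + 1:]


# Toy reference coding sequence (illustrative only).
REFERENCE = "ATGCTGAGGCAAGGTTAA"

if __name__ == "__main__":
    variant = delete_base(REFERENCE, 3)  # single-nucleotide deletion
    print(translate(REFERENCE))  # full-length toy polypeptide
    print(translate(variant))    # truncated by a premature stop codon
```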
  • a polymorphism associated with obesity, a susceptibility to obesity, an obesity-associated condition and/or a susceptibility to an obesity-associated condition can be a synonymous change in one or more nucleotides (i.e., a change that does not result in a change in the encoded ARRDC3 amino acid sequence).
  • the polypeptide encoded by the ARRDC3 cDNA sequence is the "reference" ARRDC3 polypeptide (SEQ ID NO: 3).
  • Polypeptides encoded by variant alleles are referred to as "variant" polypeptides with variant amino acid sequences.
  • the reference genomic sequence that contains the gene encoding ARRDC3 is described herein by SEQ ID NO: 1.
  • variant ARRDC3 refers to an ARRDC3 sequence that differs from SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3, but is otherwise substantially similar.
  • a "substantially similar sequence" as used herein refers to an ARRDC3 sequence that shares at least about 80% amino acid or nucleotide sequence identity with a naturally occurring ARRDC3 sequence.
  • a variant ARRDC3 sequence shares at least about 90% sequence identity, and more preferably at least about 95% sequence identity with a naturally occurring ARRDC3 sequence (e.g., SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3).
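The identity thresholds above (about 80%, 90% and 95%) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length without gaps, which is a simplification: the alignment method is not specified in this passage, and the function names and toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of matching positions between two pre-aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def is_substantially_similar(seq_a: str, seq_b: str,
                             threshold: float = 80.0) -> bool:
    """Apply an identity threshold (80% by default, per the text above)."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    reference = "ACGTACGTAC"  # toy sequences, not SEQ ID NO: 1-3
    variant = "ACGTACGTAA"
    print(percent_identity(reference, variant))
    print(is_substantially_similar(reference, variant, threshold=95.0))
```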
  • The variants of ARRDC3 that are used to determine the haplotypes disclosed herein are associated with a susceptibility to a number of obesity and obesity-associated conditions.
  • Haplotype analysis involves defining a candidate susceptibility locus using LOD scores. The defined regions are then ultra-fine mapped with microsatellite markers with an average spacing between markers of less than 100 kb. All usable microsatellite markers that are found in public databases and mapped within that region can be used. In addition, microsatellite markers identified within the deCODE genetics sequence assembly of the human genome can be used. Further discussion of haplotype analysis and identification follows.
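For readers unfamiliar with LOD scores, the sketch below computes the textbook two-point LOD score for phase-known meioses: the base-10 logarithm of the likelihood of the data at a recombination fraction theta divided by the likelihood under no linkage (theta = 0.5). This is an assumption made for illustration; this passage does not state how the LOD scores in the figures were computed, and the function name and counts are hypothetical.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """Textbook two-point LOD score for phase-known meioses."""
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    likelihood_linked = (theta ** recombinants
                         * (1.0 - theta) ** (meioses - recombinants))
    likelihood_unlinked = 0.5 ** meioses
    if likelihood_linked == 0.0:
        return float("-inf")  # data impossible at this theta
    return math.log10(likelihood_linked / likelihood_unlinked)


if __name__ == "__main__":
    # Hypothetical data: 1 recombinant in 10 informative meioses.
    for theta in (0.05, 0.1, 0.2, 0.5):
        print(theta, round(two_point_lod(1, 10, theta), 3))
```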
  • diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject is made by detecting a haplotype associated with ARRDC3 as described herein.
  • because the haplotypes described herein are a combination of various genetic markers, e.g., SNPs and/or microsatellites, detecting haplotypes can be accomplished by methods known in the art for detecting sequences at polymorphic sites.
  • diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject is made by detecting one of the haplotypes listed in Table 1 and/or markers listed in Table 2.
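The idea of detecting a haplotype as a combination of marker alleles can be sketched as below. This is a deliberately simplified illustration: it only checks that a subject carries every marker allele of the haplotype on at least one chromosome, which is necessary but not sufficient, because a true haplotype requires the alleles to lie on the same chromosome (phase), and phase is usually inferred statistically. The marker names, alleles and function name are hypothetical and are not taken from Table 1 or Table 2.

```python
def carries_all_marker_alleles(genotype: dict, haplotype: dict) -> bool:
    """Check that each haplotype allele appears in the subject's genotype.

    genotype:  marker name -> pair of alleles observed in the subject
    haplotype: marker name -> allele that defines the haplotype
    Phase is ignored, so a True result is only a screening signal.
    """
    return all(
        marker in genotype and allele in genotype[marker]
        for marker, allele in haplotype.items()
    )


if __name__ == "__main__":
    # Hypothetical markers: two SNPs and one microsatellite (allele as a length).
    at_risk = {"snp_1": "A", "snp_2": "T", "msat_1": "4"}
    subject = {"snp_1": ("A", "G"), "snp_2": ("T", "T"), "msat_1": ("0", "4")}
    print(carries_all_marker_alleles(subject, at_risk))
```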
  • an obesity-associated condition refers to a condition, disease and/or disorder (e.g., a cardiovascular disorder or a metabolic disorder) that is associated with obesity.
  • Such obesity-associated conditions include, e.g., diabetes (e.g., Type 2 or adult onset diabetes), coronary artery disease, peripheral arterial occlusive disease, myocardial infarction, dyslipidemias (e.g., hyperlipidemia), stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis, infertility, miscarriages and cancer (e.g., colorectal cancer or breast cancer).
  • Other diseases, conditions and/or disorders associated with obesity are known to those of skill in the art. Diagnostic assays can be designed for assessing markers near or in the ARRDC3 locus.
  • Such assays can be used alone or in combination with other assays for identifying a predisposition or susceptibility to obesity (e.g., determining BMI, determining waist-to-hip ratio, or determining relative body fat (e.g., by bioimpedance)).
  • Anthropometry can be used to determine whether the obesity is a result of defects in the thioredoxin system.
  • Combinations of genetic markers are referred to herein as "haplotypes," and the present invention describes methods whereby detection of particular haplotypes is indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition.
  • detection of the markers can also determine the genetic basis for obesity or an obesity-associated condition (e.g., by determining if the obesity or obesity-associated condition is mediated through the thioredoxin pathway or through ARRDC3 specifically).
  • the detection of the particular genetic markers that make up the particular haplotypes can be performed by a variety of methods described herein and known in the art.
  • genetic markers can be detected at the nucleic acid level, e.g., by direct nucleotide sequencing, or at the amino acid level if the genetic marker affects the coding sequence of ARRDC3, e.g., by protein sequencing or by immunoassays using antibodies that recognize the ARRDC3 protein or a particular ARRDC3 variant protein.
  • the assays are used in the context of a biological sample (e.g., blood, serum, cells, tissue) to thereby determine whether a subject is susceptible to (i.e., is at risk for or has a predisposition for) obesity and/or an obesity-associated condition.
  • the invention also provides for prognostic (or predictive) assays for determining whether a subject is predisposed or susceptible to obesity and/or an obesity-associated condition. For example, variations in a nucleic acid sequence can be assayed in a biological sample.
  • Such assays can be used for prognostic or predictive purposes to thereby allow for the prophylactic treatment of a subject prior to the onset of symptoms associated with obesity and/or an obesity-associated condition.
  • hybridization methods such as Southern analysis, Northern analysis, and/or in situ hybridizations, can be used (see Current Protocols in Molecular Biology, Ausubel, F. et al., eds., John Wiley & Sons, including all supplements).
  • a biological sample from a test subject (a "test sample") of genomic DNA, RNA, or cDNA, is obtained from a subject who is obese (if trying to determine the genetic basis of the obesity) or is suspected of having, being susceptible to or predisposed for, obesity and/or an obesity-associated condition (the "test subject").
  • the subject can be an adult, child, or fetus.
  • the test sample can be from any source that contains genomic DNA, such as a blood sample, sample of amniotic fluid, sample of cerebrospinal fluid, or tissue sample from skin, muscle, buccal or conjunctival mucosa, placenta, gastrointestinal tract or other organs.
  • a test sample of DNA from fetal cells or tissue can be obtained by appropriate methods, such as by amniocentesis or chorionic villus sampling.
  • the DNA, RNA, or cDNA sample is then examined to determine whether a polymorphism that is associated with the region that directs expression of ARRDC3 is present.
  • nucleic acid probe specific for the particular allele.
  • a sequence-specific probe can be directed to hybridize to genomic DNA, RNA, or cDNA.
  • a "nucleic acid probe", as used herein, can be a DNA probe or an RNA probe that hybridizes to a complementary sequence.
  • One of skill in the art would know how to design such a probe such that sequence specific hybridization will occur only if a particular allele is present in a genomic sequence from a test sample.
  • a hybridization sample is formed by contacting the test sample containing a nucleic acid encoding ARRDC3, with at least one nucleic acid probe.
  • a non-limiting example of a probe for detecting mRNA or genomic DNA is a labeled nucleic acid probe capable of hybridizing to mRNA or genomic DNA sequences described herein.
  • the nucleic acid probe can be, for example, a full-length nucleic acid molecule, or a portion thereof, such as an oligonucleotide of at least 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to appropriate mRNA or genomic DNA.
  • the nucleic acid probe can be all or a portion of SEQ ID NO: 1 or SEQ ID NO: 2, optionally comprising at least one allele contained in the haplotypes described herein, or the probe can be the complementary sequence of such a sequence.
  • Other suitable probes for use in the diagnostic assays of the invention are described herein.
  • the hybridization sample is maintained under conditions that are sufficient to allow specific hybridization of the nucleic acid probe to the nucleic acid encoding ARRDC3.
  • Specific hybridization indicates exact hybridization (e.g., with no mismatches). Specific hybridization can be performed under high stringency conditions or moderate stringency conditions (see below). In one embodiment, the hybridization conditions for specific hybridization are high stringency (e.g., as described herein). Specific hybridization, if present, is then detected using standard methods. If specific hybridization occurs between the nucleic acid probe and the nucleic acid encoding ARRDC3 in the test sample, then the sample contains the allele that is complementary to the nucleotide that is present in the nucleic acid probe.
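The exact-match criterion described above can be illustrated with a minimal Python sketch. It treats a probe as specifically hybridizing only when its reverse complement occurs in the target strand with no mismatches. The sequences and function names are hypothetical and are not taken from SEQ ID NO: 1 or SEQ ID NO: 2, and the string comparison does not model stringency conditions.

```python
# Illustrative only: "specific hybridization" is modeled as a perfect,
# mismatch-free match. All sequences below are made up for the example.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def hybridizes_exactly(probe: str, target: str) -> bool:
    """True if the probe can pair with the target strand with no mismatches."""
    return reverse_complement(probe) in target.upper()


# Hypothetical target strands that differ at one polymorphic position (G vs A).
target_allele_g = "TTACGGATCGGCTAAGCTTAGC"
target_allele_a = "TTACGGATCGACTAAGCTTAGC"

# Probe designed as the reverse complement of the region around the G allele.
probe_for_g = reverse_complement("GGATCGGCTAAGCTT")

print(hybridizes_exactly(probe_for_g, target_allele_g))
print(hybridizes_exactly(probe_for_g, target_allele_a))
```

Under this simplification a single mismatch at the polymorphic position is enough to prevent detection, which is the behavior a sequence-specific probe is designed to approximate under high stringency.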
  • the process can be repeated for the other markers that make up the haplotype, or multiple probes can be used concurrently to detect more than one marker at a time. It is also possible to design a single probe containing more than one marker of a particular haplotype (e.g., a probe containing alleles complementary to 2, 3, 4, 5 and/or all of the markers that make up a particular haplotype). Detection of the particular markers of the haplotype in the sample is indicative that the source of the sample has the particular haplotype (e.g., an at-risk haplotype) and therefore is predisposed or susceptible to obesity and/or an obesity-associated condition. In another hybridization method, Northern analysis (see Current Protocols in Molecular Biology, Ausubel, F.
  • RNA is obtained from the subject by appropriate means.
  • Specific hybridization of a nucleic acid probe, as described above, to RNA from the subject is indicative of a particular allele complementary to the probe.
  • nucleic acid probes see, for example, U.S. Patent Nos. 5,288,611 and 4,851,330.
  • a peptide nucleic acid (PNA) probe can be used in addition to, or instead of, a nucleic acid probe in the hybridization methods described above.
  • a PNA is a DNA mimic having a peptide-like, inorganic backbone, such as N-(2-aminoethyl)glycine units, with an organic base (A, G, C, T or U) attached to the glycine nitrogen via a methylene carbonyl linker (see, for example, Nielsen, P., et al, Bioconjug. Chem., 5:3-7 (1994)).
  • the PNA probe can be designed to specifically hybridize to a molecule in a sample suspected of containing one or more of the genetic markers of a haplotype that is associated with obesity, a predisposition or susceptibility to obesity and/or an obesity-associated condition.
  • Hybridization of the PNA probe is diagnostic for a predisposition or susceptibility to obesity and/or an obesity-associated condition.
  • Hybridization of the probe can also confirm that the obesity or obesity-associated condition is mediated through the thioredoxin pathway or through ARRDC3 specifically.
  • diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition, or the determination of the genetic basis of the obesity or obesity-associated condition is accomplished through enzymatic amplification of a nucleic acid from the subject.
  • a test sample containing genomic DNA can be obtained from the subject and the polymerase chain reaction (PCR) can be used to amplify the genomic ARRDC3 region (including flanking sequences if necessary) in the test sample.
  • identification of a particular haplotype (e.g., an at-risk haplotype) associated with the amplified ARRDC3 region can be accomplished using a variety of methods (e.g., sequence analysis, analysis by restriction digestion, specific hybridization, single stranded conformation polymorphism assays (SSCP), electrophoretic analysis, etc.).
  • diagnosis is accomplished by expression analysis using quantitative PCR (kinetic thermal cycling). This technique can, for example, utilize commercially available technologies such as TaqMan® (Applied Biosystems, Foster City, CA), to allow the identification of polymorphisms and haplotypes (e.g., at-risk haplotypes).
  • the technique can assess the presence of an alteration in the expression or composition of the polypeptide encoded by ARRDC3 or splicing variants. Further, the expression of the variants can be quantified as physically or functionally different.
  • analysis by restriction digestion can be used to detect a particular allele if the allele results in the creation or elimination of a restriction site relative to a reference sequence.
  • a test sample containing genomic DNA is obtained from the subject. PCR can be used to amplify the genomic ARRDC3 region (including flanking sequences if necessary) in the test sample from the test subject.
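As a hedged illustration of how an allele that creates or eliminates a restriction site changes the digestion pattern, the Python sketch below computes fragment lengths for a linear amplicon. The amplicon sequences are hypothetical and are not ARRDC3 sequences; EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example enzyme, and the `digest` helper is an assumption made for this sketch.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every
    occurrence of `site`; the cut falls `cut_offset` bases into the site."""
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# EcoRI recognizes GAATTC and cuts after the first G (G^AATTC).
ECORI_SITE, ECORI_OFFSET = "GAATTC", 1

# Hypothetical amplicons: the variant allele (A -> G) destroys the site.
reference_amplicon = "ACGTACGTGAATTCACGTACGTAC"
variant_amplicon = "ACGTACGTGAGTTCACGTACGTAC"

print(digest(reference_amplicon, ECORI_SITE, ECORI_OFFSET))
print(digest(variant_amplicon, ECORI_SITE, ECORI_OFFSET))
```

The reference amplicon is cut into two fragments while the variant amplicon stays intact, so the two alleles would give different band patterns on a gel.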
  • Restriction fragment length polymorphism (RFLP) analysis is conducted as described (see, e.g., Current Protocols in Molecular Biology, supra). The digestion pattern of the relevant DNA fragment indicates the presence or absence of the particular allele in the sample. Sequence analysis can also be used to detect specific alleles at polymorphic sites associated with ARRDC3. Therefore, in one embodiment, determination of the presence or absence of a particular haplotype (e.g., an at-risk haplotype) comprises sequence analysis. For example, a test sample of DNA or RNA can be obtained from the test subject. PCR or other appropriate methods can be used to amplify
  • Allele-specific oligonucleotides can also be used to detect the presence of a particular allele at a polymorphic site associated with ARRDC3, through the use of dot-blot hybridization of amplified oligonucleotides with allele-specific oligonucleotide (ASO) probes (see, for example, Saiki, R. et al., Nature, 324:163-166 (1986)).
  • an “allele-specific oligonucleotide” (also referred to herein as an “allele-specific oligonucleotide probe”) is an oligonucleotide of approximately 10-50 base pairs or approximately 15-30 base pairs, that specifically hybridizes to ARRDC3 or its flanking sequences, and that contains a specific allele at a polymorphic site as indicated by the polymorphisms and haplotypes described herein.
  • An allele-specific oligonucleotide probe that is specific for one or more particular polymorphisms in ARRDC3 can be prepared, using standard methods (see, e.g., Current Protocols in Molecular Biology, supra).
  • PCR can be used to amplify all or a fragment of ARRDC3, as well as genomic flanking sequences.
  • the DNA containing the amplified ARRDC3 (or fragment of the gene and/or flanking sequences) is dot-blotted, using standard methods (see, e.g., Current Protocols in Molecular Biology, supra), and the blot is contacted with the oligonucleotide probe. The presence of specific hybridization of the probe to the amplified ARRDC3 is then detected. Specific hybridization of an allele-specific oligonucleotide probe to DNA from the subject is indicative of a specific allele at a polymorphic site associated with ARRDC3.
  • An allele-specific primer hybridizes to a site on target DNA overlapping a polymorphic site and only primes amplification of an allelic form to which the primer exhibits perfect complementarity (see, e.g., Gibbs, R. et al, Nucleic Acids Res., 17:2437-2448 (1989)).
  • This primer is used in conjunction with a second primer, which hybridizes at a distal site on the opposite strand. Amplification proceeds from the two primers, resulting in a detectable product, which indicates that the particular allelic form is present.
  • a control is usually performed with a second pair of primers, one of which contains a single base mismatch at the polymorphic site and the other of which exhibits perfect complementarity to a distal site.
  • the single-base mismatch prevents amplification and no detectable product is formed.
  • the method works best when the mismatch is included in the 3'-most position of the oligonucleotide aligned with the polymorphism because this position is most destabilizing to elongation from the primer (see, e.g., WO 93/22456).
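The allele-specific priming logic can be sketched as follows, under the simplifying assumption that a primer is extended only when every base, including its 3'-most base aligned with the polymorphic site, matches the template. The sequences, the polymorphic position, and the function name are hypothetical; real primers can tolerate some internal mismatches, which this model ignores.

```python
def primes_amplification(primer: str, target: str, snp_index: int) -> bool:
    """Simplified model of allele-specific priming.

    `primer` is written 5'->3' in the same sense as `target`, and its 3'-most
    base is aligned with target[snp_index] (0-based). Extension is assumed to
    occur only when the primer matches the target at every position.
    """
    start = snp_index - len(primer) + 1
    if start < 0:
        return False
    return target[start:snp_index + 1].upper() == primer.upper()


# Hypothetical targets differing at 0-based position 10 (G vs A).
target_allele_g = "TTACGGATCGGCTAAGCTTAGC"
target_allele_a = "TTACGGATCGACTAAGCTTAGC"
SNP_INDEX = 10

primer_for_g = "TACGGATCGG"  # 3'-most base pairs with the G allele
primer_for_a = "TACGGATCGA"  # control primer with a single 3' base change

print(primes_amplification(primer_for_g, target_allele_g, SNP_INDEX))
print(primes_amplification(primer_for_g, target_allele_a, SNP_INDEX))
```

Placing the discriminating base at the 3' end mirrors the point made above: a mismatch at that position is the most disruptive to elongation, so product formation reports on the allele present.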
  • Locked nucleic acids (LNAs) are a novel class of bicyclic DNA analogs in which the 2' and 4' positions in the furanose ring are joined via an O-methylene (oxy-LNA), S-methylene (thio-LNA), or amino methylene (amino-LNA) moiety.
  • Common to all of these LNA variants is an affinity toward complementary nucleic acids, which is by far the highest reported for a DNA analog.
  • In particular, all oxy-LNA nonamers have been shown to have melting temperatures (Tm) of 64°C and 74°C when in complex with complementary DNA or RNA, respectively, as opposed to 28°C for both DNA and RNA for the corresponding DNA nonamer.
  • Substantial increases in Tm are also obtained when LNA monomers are used in combination with standard DNA or RNA monomers.
  • the Tm could be increased considerably.
  • arrays of oligonucleotide probes that are complementary to target nucleic acid sequence segments from a subject can be used to identify polymorphisms in a nucleic acid encoding ARRDC3 and/or its flanking sequence.
  • an oligonucleotide array can be used.
  • Oligonucleotide arrays typically comprise a plurality of different oligonucleotide probes that are coupled to a surface of a substrate in different known locations. These oligonucleotide arrays, also described as “Genechips™,” have been generally described in the art (see, e.g., U.S. Patent No. 5,143,854, PCT Patent Publication Nos. WO 90/15070 and WO 92/10092). These arrays can generally be produced using mechanical synthesis methods or light directed synthesis methods that incorporate a combination of photolithographic methods and solid phase oligonucleotide synthesis methods (Fodor, S.
  • a nucleic acid of interest is allowed to hybridize with the array.
  • Detection of hybridization is a detection of a particular allele in the nucleic acid of interest.
  • Hybridization and scanning are generally carried out by methods described herein and also in, e.g., published PCT Application Nos. WO 92/10092 and WO 95/11995, and U.S. Patent No. 5,424,186, the entire teachings of each of which are incorporated by reference herein.
  • a target nucleic acid sequence which includes one or more previously identified polymorphic markers, is amplified by well-known amplification techniques, e.g., PCR.
  • arrays can include multiple detection blocks, and thus be capable of analyzing multiple, specific polymorphisms (e.g., multiple polymorphisms of a particular haplotype (e.g., an at-risk haplotype)).
  • detection blocks can be grouped within a single array or in multiple, separate arrays so that varying, optimal conditions can be used during the hybridization of the target to the array. For example, it will often be desirable to provide for the detection of those polymorphisms that fall within G-C rich stretches of a genomic sequence, separately from those falling in A-T rich segments.
  • oligonucleotide arrays for detection of polymorphisms can be found, for example, in U.S. Patent Nos. 5,858,659 and 5,837,832, the entire teachings of both of which are incorporated by reference herein.
  • Other methods of nucleic acid analysis can be used to detect a particular allele at a polymorphic site associated with ARRDC3. Representative methods include, for example, direct manual sequencing (Church and Gilbert, Proc. Natl. Acad. Sci. USA, 81:1991-1995 (1988); Sanger, F., et al., Proc. Natl. Acad. Sci.
  • diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition, or the genetic basis of the obesity or obesity-associated condition, can be made by examining expression and/or composition of an ARRDC3 polypeptide in those instances where the genetic marker contained in a haplotype described herein results in a change in the expression of the polypeptide (e.g., a resulting altered amino acid sequence leading to decreased or increased expression, or altered 5' or 3' nucleic acid sequences that flank the ARRDC3 gene and alter transcription of the gene).
  • a variety of methods can be used to make such a detection, including enzyme linked immunosorbent assays (ELISA), Western blots, immunoprecipitations and immunofluorescence.
  • a test sample from a subject is assessed for the presence of an alteration in the expression and/or an alteration in composition of the polypeptide encoded by ARRDC3.
  • An alteration in expression of a polypeptide encoded by ARRDC3 can be, for example, an alteration in the quantitative polypeptide expression (i.e., the amount of polypeptide produced).
  • An alteration in the composition of a polypeptide encoded by ARRDC3 is an alteration in the qualitative polypeptide expression (e.g., expression of a mutant ARRDC3 polypeptide or of a different splicing variant).
  • diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition, or determination of the genetic basis of obesity or an obesity-associated condition is made by detecting a particular splicing variant encoded by ARRDC3, or a particular pattern of splicing variants. Both such alterations (quantitative and qualitative) can also be present.
  • alteration in the polypeptide expression or composition refers to an alteration in expression or composition in a test sample, as compared to the expression or composition of polypeptide by ARRDC3 in a control sample.
  • a control sample is a sample that corresponds to the test sample (e.g., is from the same type of cells), and is from a subject who is not affected by obesity or a predisposition or susceptibility to obesity and/or an obesity-associated condition (e.g., a subject that does not possess an at-risk haplotype as described herein).
  • the presence of one or more different splicing variants in the test sample, or the presence of significantly different amounts of different splicing variants in the test sample, as compared with the control sample can be indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition.
  • An alteration in the expression or composition of the polypeptide in the test sample, as compared with the control sample can be indicative of a specific allele in the instance where the allele alters a splice site relative to the reference in the control sample.
  • Various means of examining expression or composition of the polypeptide encoded by ARRDC3 can be used, including spectroscopy, colorimetry, electrophoresis, isoelectric focusing, and immunoassays (e.g., David et al, U.S. Patent No. 4,376,110) such as immunoblotting (see, e.g., Current Protocols in Molecular Biology, particularly chapter 10).
  • an antibody e.g., an antibody with a detectable label
  • Antibodies can be polyclonal or monoclonal.
  • an intact antibody, or a fragment thereof can be used.
  • labeled with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with another reagent that is directly labeled.
  • indirect labeling include detection of a primary antibody using a fluorescently-labeled secondary antibody and end-labeling of a DNA probe with biotin such that it can be detected with fluorescently-labeled streptavidin.
  • Western blot analysis, using an antibody that specifically binds to a polypeptide encoded by a variant ARRDC3 allele, or an antibody that specifically binds to a polypeptide encoded by a reference allele, can be used to identify the presence in a test sample of a polypeptide encoded by a variant ARRDC3 allele, or the absence in a test sample of a polypeptide encoded by the reference allele.
  • the level or amount of polypeptide encoded by ARRDC3 in a test sample is compared with the level or amount of the polypeptide encoded by ARRDC3 in a control sample.
  • a level or amount of the polypeptide in the test sample that is higher or lower than the level or amount of the polypeptide in the control sample, such that the difference is statistically significant, is indicative of an alteration in the expression of the polypeptide encoded by ARRDC3, and is diagnostic for a particular allele responsible for causing the difference in expression.
  • the composition of the polypeptide encoded by ARRDC3 in a test sample is compared with the composition of the polypeptide encoded by ARRDC3 in a control sample.
  • both the level or amount and the composition of the polypeptide can be assessed in the test sample and in the control sample.
  • the diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition, or determination of the genetic basis of obesity or an obesity-associated condition, is made by detecting at least one ARRDC3-associated allele in combination with an additional assay (e.g., determining BMI, determining waist-to-hip ratio, determining relative body fat (e.g., by bioimpedance)).
  • DIAGNOSTIC KITS
  • Kits useful in the methods of diagnosis comprise components useful in any of the methods described herein, including for example, hybridization probes, restriction enzymes (e.g., for RFLP analysis), allele-specific oligonucleotides, antibodies that bind to an altered ARRDC3 polypeptide (e.g., to a polypeptide having the sequence depicted in SEQ ID NO: 3, but comprising at least one genetic marker included in the haplotypes described herein) or to non-altered (native) ARRDC3 polypeptide (e.g., to a polypeptide having the sequence depicted in SEQ ID NO: 3), means for amplification of nucleic acids comprising ARRDC3, means for analyzing the nucleic acid sequence of ARRDC3, means for analyzing the amino acid sequence of an ARRDC3 polypeptide, etc.
  • kits can provide reagents for assays to be used in combination with the methods of the present invention, e.g., reagents for use in determining BMI (e.g., a scale, a tape measure), waist-to-hip ratio (e.g., a tape measure) and/or relative body fat (e.g., calipers, a bioimpedance-measuring device).
  • Kits useful in the methods of diagnosis comprise components useful in any of the methods described herein, including for example, hybridization probes or primers as described herein (e.g., labeled probes or primers), reagents for detection of labeled molecules, restriction enzymes (e.g., for RFLP analysis), allele-specific oligonucleotides, antibodies that bind to altered or to non-altered (native) ARRDC3 polypeptide, means for amplification of nucleic acids comprising ARRDC3, means for analyzing the nucleic acid sequence of an ARRDC3 nucleic acid, means for analyzing the amino acid sequence of an ARRDC3 polypeptide, etc.
  • the invention is a kit for assaying a sample from a subject to determine the genetic basis of obesity and/or an obesity-associated condition, or to detect a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, wherein the kit comprises one or more reagents for detecting an at-risk haplotype associated with the ARRDC3 gene.
  • the kit can comprise, e.g., at least one contiguous nucleotide sequence that is completely complementary to a region comprising at least one of the markers of the at-risk haplotype, one or more nucleic acids that are capable of detecting one or more specific markers of an at-risk haplotype.
  • nucleic acids (e.g., oligonucleotide primers) can be designed using portions of the nucleic acids flanking SNPs that are indicative of obesity or an obesity-associated condition or a predisposition or susceptibility to obesity and/or an obesity-associated condition.
  • Such nucleic acids are designed to amplify regions of the ARRDC3 nucleic acid (and/or flanking sequences) that are associated with an at-risk haplotype for obesity or an obesity-associated condition.
  • the kit comprises one or more labeled nucleic acids capable of detecting one or more specific markers of an at-risk haplotype associated with the ARRDC3 gene and reagents for detection of the label.
  • Suitable labels include, e.g., a radioisotope, a fluorescent label, an enzyme label, an enzyme co-factor label, a magnetic label, a spin label, an epitope label.
  • Table 1 depicts such at-risk haplotypes (e.g., haplotype I, haplotype II; haplotype III; haplotype IV) and markers (see also, Table 2).
  • the at-risk haplotype to be detected by the reagents of the kit comprises two or more markers selected from the group consisting of the markers in Table 1.
  • the kit comprises two or more markers selected from the markers comprising haplotype I, haplotype II, haplotype III, or haplotype IV.
  • the presence of the at-risk haplotype is indicative of obesity or an obesity-associated condition, or a predisposition or susceptibility to obesity and/or an obesity-associated condition.
  • Haplotypes and single markers associated with obesity or an obesity-associated condition, or a predisposition or susceptibility to obesity and/or an obesity-associated condition.
  • RR Relative Risk
  • ARRDC3-ASSOCIATED OBESITY AND/OR ASSOCIATED CONDITIONS USING HAPLOTYPES
  • Certain haplotypes described herein, such as those shown in FIGs. 1 and 2, have been found more frequently in individuals with obesity than in individuals without obesity. Therefore, these "at-risk" haplotypes can be used to diagnose ARRDC3-associated obesity and/or associated condition(s). Identification of ARRDC3-associated obesity and/or associated condition(s) facilitates treatment planning, as treatment can be designed and therapeutics selected to target components involved in fuel metabolism, for example, those components shown in FIG. 9.
  • diagnosis of ARRDC3-associated obesity or an associated condition is made by detecting a polymorphism in an ARRDC3 nucleic acid (e.g., using the methods described above and/or other methods known in the art).
  • the invention pertains to a method for the diagnosis and identification of ARRDC3-associated obesity or an associated condition in a subject, by identifying the presence of an at-risk haplotype in ARRDC3 as described in detail herein.
  • the haplotypes described herein in Table 1 are found more frequently in obese individuals and/or individuals having an obesity-associated condition than in individuals not affected by these conditions.
  • an at-risk haplotype is characterized by the presence of polymorphism(s) depicted in Table 2.
  • the at-risk haplotype is selected from the group consisting of haplotype I, haplotype II, haplotype III and haplotype IV.
  • the at-risk haplotype can also comprise a combination of the markers in haplotype I, haplotype II, haplotype III and haplotype IV.
  • the methods described herein can be used to assess a sample from a subject for the presence or absence of an at-risk haplotype; the presence of an at-risk haplotype is indicative of ARRDC3-associated obesity or an associated condition.
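A minimal Python sketch of the presence-or-absence assessment follows. It reports a haplotype as present when every marker allele that defines it is detected in the sample. The marker names, alleles, and sample genotypes are placeholders and are not the markers of Table 1 or Table 2. The check also assumes that detecting all alleles is sufficient; with unphased genotype data, confirming that the alleles lie on the same chromosome would require phasing or statistical haplotype estimation.

```python
def carries_haplotype(genotype_calls: dict[str, set[str]],
                      haplotype: dict[str, str]) -> bool:
    """True if every marker allele of the haplotype was detected in the sample."""
    return all(
        allele in genotype_calls.get(marker, set())
        for marker, allele in haplotype.items()
    )


# Placeholder haplotype definition: marker name -> at-risk allele.
hypothetical_haplotype = {"marker_A": "G", "marker_B": "T", "marker_C": "C"}

# Placeholder genotype calls: marker name -> alleles detected in the sample.
sample_1 = {"marker_A": {"G", "A"}, "marker_B": {"T"}, "marker_C": {"C", "T"}}
sample_2 = {"marker_A": {"G"}, "marker_B": {"C"}, "marker_C": {"C"}}

print(carries_haplotype(sample_1, hypothetical_haplotype))
print(carries_haplotype(sample_2, hypothetical_haplotype))
```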
  • the invention is a method for the diagnosis and identification of a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, or for determining the genetic basis of obesity or an obesity-associated condition, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene.
  • the at-risk haplotype is one that confers a significant risk of predisposition or susceptibility to obesity.
  • the at-risk haplotype is one that confers a significant risk of a predisposition or susceptibility to an obesity-associated condition.
  • significance associated with a haplotype is measured by relative risk (RR).
  • RR is the ratio of the incidence of the condition among subjects who contain the haplotype to the incidence of the condition among subjects who do not contain the haplotype.
  • the at-risk haplotype has a relative risk of at least 1.8. In other embodiments, the at-risk haplotype has a relative risk of at least 2.7, or at least 3.0.
  • the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1×10⁻⁵ or less, 1×10⁻⁶ or less, 1×10⁻⁷ or less or 1×10⁻⁸ or less.
  • significance associated with a haplotype is measured by an odds ratio.
  • a significant risk is measured as an odds ratio of at least about 1.2, including but not limited to: 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, and 1.9.
  • an odds ratio of at least 1.2 is significant.
  • an odds ratio of at least about 1.5 is significant.
  • an odds ratio of at least about 1.7 is significant.
  • the significance is measured by a percentage.
  • a significant increase in risk is at least about 20%, including but not limited to about 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% and 98%.
  • a significant increase in risk is at least about 50%. It is understood however, that identifying whether a risk is medically significant may also depend on a variety of factors, including the specific disease, the haplotype, and often, environmental factors.
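The three measures of significance named above (relative risk, odds ratio, and percentage increase in risk) can be computed from a two-by-two table of haplotype carriers and non-carriers. The Python sketch below uses invented counts chosen only to make the arithmetic easy to follow; they are not data from the study described herein, and the function names are assumptions of this sketch.

```python
def relative_risk(carriers_affected: int, carriers_total: int,
                  noncarriers_affected: int, noncarriers_total: int) -> float:
    """Incidence among haplotype carriers divided by incidence among non-carriers."""
    incidence_carriers = carriers_affected / carriers_total
    incidence_noncarriers = noncarriers_affected / noncarriers_total
    return incidence_carriers / incidence_noncarriers


def odds_ratio(carriers_affected: int, carriers_unaffected: int,
               noncarriers_affected: int, noncarriers_unaffected: int) -> float:
    """Odds of being affected among carriers divided by the odds among non-carriers."""
    return (carriers_affected * noncarriers_unaffected) / (
        carriers_unaffected * noncarriers_affected
    )


def percent_increase_in_risk(rr: float) -> float:
    """Express a relative risk as a percentage increase over the baseline risk."""
    return (rr - 1.0) * 100.0


# Invented counts: 200 carriers (60 affected), 800 non-carriers (120 affected).
rr = relative_risk(60, 200, 120, 800)   # (60/200) / (120/800) = 0.30 / 0.15
oddsr = odds_ratio(60, 140, 120, 680)   # (60 * 680) / (140 * 120)

print(rr, oddsr, percent_increase_in_risk(rr))
```

With these invented counts the relative risk is 2.0, which corresponds to a 100% increase in risk, while the odds ratio is larger (about 2.43) because the condition is not rare in the example population.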
  • the invention also pertains to methods of diagnosing a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, or determining the genetic basis of obesity or an obesity-associated condition, comprising screening for an at-risk haplotype associated with the ARRDC3 nucleic acid that is more frequently present in a subject who is obese or has an obesity-associated condition, or is predisposed or susceptible to obesity and/or an obesity-associated condition (affected), compared to the frequency of its presence in a healthy subject (control).
  • the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition.
  • the method comprises assessing in a subject the presence or frequency of one or more specific SNP alleles and/or microsatellite alleles (e.g., alleles that are present in an at-risk haplotype) associated with ARRDC3 and linked to obesity or an obesity-associated condition or a predisposition or susceptibility to obesity and/or an obesity-associated condition.
  • an excess or higher frequency of the allele(s), as compared to a healthy control subject, is indicative that the subject is predisposed or susceptible to obesity and/or an obesity-associated condition.
  • the invention also relates to methods of assessing an individual for an increased risk of obesity or an obesity-associated condition comprising assessing the interaction between ARRDC3 and thioredoxin (TXN).
  • An individual can be screened by obtaining a biological sample from the individual, and measuring the binding of ARRDC3 to TXN (e.g., by detecting the reducing activity of TXN in the sample).
  • the reducing activity can be detected and measured, for example, using an NADPH/TXN reductase dependent insulin reducing assay (Spyrou, G., et al, J. Biol. Chem., 272:2936-2941 (1997)). If the reducing activity of TXN in the sample is decreased (i.e., more inhibition of reducing activity) compared to that in a healthy subject (control), then the patient has an increased risk of obesity or an obesity-associated condition.
  • the individual has 10% less thioredoxin reducing activity compared to controls. In other embodiments the individual has 10%, 20%, 25%, 50%, 75%, 80%, 90% or 95% less thioredoxin reducing activity compared to controls.
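The comparison against controls can be expressed as a percentage reduction in measured reducing activity. The sketch below is illustrative only: the activity values are invented, the units are arbitrary, and the default threshold of 10% is taken from the first embodiment above.

```python
def percent_reduction(sample_activity: float, control_activity: float) -> float:
    """Percentage by which the sample's TXN reducing activity falls below control."""
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    return (control_activity - sample_activity) / control_activity * 100.0


def meets_threshold(sample_activity: float, control_activity: float,
                    threshold_percent: float = 10.0) -> bool:
    """True if the reduction relative to control is at least the threshold."""
    return percent_reduction(sample_activity, control_activity) >= threshold_percent


# Invented activity readings in arbitrary units.
print(percent_reduction(75.0, 100.0))
print(meets_threshold(75.0, 100.0))
print(meets_threshold(95.0, 100.0))
```

A sample at 75 units against a control of 100 units shows a 25% reduction and meets the 10% threshold, whereas a sample at 95 units (a 5% reduction) does not.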
  • ARRDC3 gene product may affect long-term body weight through fuel metabolism. Based on the homology of ARRDC3 with TXNIP, the role of ARRDC3 in fuel metabolism and regulation of body weight can be hypothesized.
  • Thioredoxin (TXN) is a multifunctional oxidoreductase with a conserved amino-acid sequence (-Cys-Gly-Pro-Cys-) at its active site, regulating a number of cellular processes via thiol redox control (TXNred / TXNox) (Holmgren, A., Annu. Rev. Biochem., 54:237-271 (1985)) (FIG. 9).
  • Three thioredoxin molecules exist in humans (Powis and Montfort, Annu. Rev. Biomol. Struct., 30:421-455 (2001)). Of the TXN molecules, TXN-1 has been characterized the most. In addition to the two catalytic Cys residues, Cys32 and Cys35, TXN-1 contains three other Cys residues, Cys62, Cys69 and Cys73 (with noncatalytic activities). TXN-1 is a secreted protein and is able to homodimerize, resulting in a loss of reducing activity. A larger protein, TXN-2 (of unknown function), contains a mitochondrial import sequence and is predominantly found in mitochondria.
  • TXN-2 contains the catalytic Cys residues but lacks the other Cys sites found in TXN-1.
  • A TXN-like cytosolic protein has recently been cloned from a human testis cDNA library. TXN is induced by various stress components including ultraviolet-induced cytocide and hydrogen peroxide (oxidizing agent) and its induction is thought to mediate cytoprotective responses (Powis and Montfort (supra)).
  • Transgenic (Tg) overexpression (3 fold increase) of TXN in mice has been performed (Mitsui, et al, Antioxid. Redox Signal., 4:693-696 (2002)). Here, the Tg mice are fertile showing normal growth and normal behavior.
  • mice show increased resistance to oxidative stress and have extended lifespans.
  • Mouse TXN homozygous knock-out results in early embryonic lethality suggesting a major role for TXN in differentiation and morphogenesis (Matsui, et al, Develop. Biol, 178:179-185
  • TXNs affect a wide range of cellular processes including growth and antiapoptosis.
  • TXNIP, originally identified as vitamin D3-upregulated gene 1 (VDUP1) (Chen and DeLuca, Biochim. Biophys. Acta, 1219:26-32 (1994)), binds reduced but not oxidized TXN and inhibits its reducing activity (measured by NADPH/TXN reductase dependent insulin reducing assay). Further, increased expression of ARRDC3 results in reduction of TXN expression (Nishiyama et al, J. Biol. Chem., 274:21645-21650 (1999)). Recently, TXNIP mRNA expression was found to be markedly induced in pancreatic islet cells by glucose treatment (Shalev et al,
  • TXNIP-regulated redox state in cells is important for metabolism and signal transduction. While not wishing to be limited to a particular theory, given the close identity of ARRDC3 (5q14) to TXNIP (1q21) it is reasonable to believe that their activities overlap to some extent.
  • the nonsense mutation in TXNIP is a loss of function (reduced transcript levels and amino acids essential for binding to TXN are missing) mutation, resulting in a 2-fold increase in liver content of triglycerides (Bodnar et al (supra)).
  • FIG. 13 highlights a role in fuel oxidation and/or desensitization of G-protein coupled receptors (GPCRs) for ARRDC3. Regarding the role it may play in the development of obesity, FIG. 13 shows how ARRDC3 may regulate fatty acid oxidation and fatty acid biosynthesis.
  • ARRDC3 is a gene that is linked to obesity.
  • the ARRDC3 protein has sequence homology to thioredoxin interacting protein (TXNIP).
  • TXN regulated redox state in cells is important for metabolism and signal transduction.
  • the invention is a method of treating or preventing obesity and/or an obesity-associated condition in a subject comprising administering to the subject an agonist (e.g., a promoter) of ARRDC3.
  • an agonist of ARRDC3 refers to an agent (compound) that increases the expression or biological activity or function of ARRDC3 (e.g., the inhibition of TXN reducing activity).
  • Such agonists include proteins, fusion proteins, polypeptides, peptidomimetics, prodrugs, receptors, binding agents, antibodies, small molecules or other drugs, and ribozymes.
  • Test agents (compounds) can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the 'one-bead one-compound' library method; and synthetic library methods using affinity chromatography selection.
  • an ARRDC3-specific substrate processing assay.
  • an assay can be performed wherein ARRDC3 and a substrate of ARRDC3 (e.g., TXN) are contacted in the presence and absence of the agent to be tested. If the presence of the agent results in an increase in the binding of the substrate, by an amount that is statistically significant, then the agent is an agonist of ARRDC3.
  • the present invention also relates to an assay for identifying agents that alter (preferably increase) the expression of the ARRDC3 gene (e.g., fusion proteins, polypeptides, peptidomimetics, prodrugs, receptors, binding agents, antibodies, small molecules or other drugs, or ribozymes) which alter (e.g., increase) expression (e.g., transcription or translation) of the ARRDC3 gene, as well as agents identifiable by the assays.
  • the solution can comprise, for example, cells containing the nucleic acid or cell lysate containing the nucleic acid; alternatively, the solution can be another solution that comprises elements necessary for transcription/translation of the nucleic acid. Cells not suspended in solution can also be employed, if desired.
  • the level and/or pattern of ARRDC3 expression (e.g., the level and/or pattern of mRNA or of protein expressed)
  • a control (i.e., the level and/or pattern of the ARRDC3 expression in the absence of the agent to be tested).
  • agents which alter (e.g., increase) the expression of the ARRDC3 gene can be identified using a cell, cell lysate, or solution containing a nucleic acid encoding the promoter region (or other 5' or 3' sequences flanking the ARRDC3 gene) of the ARRDC3 gene operably linked to a reporter gene.
  • the level of expression of the reporter gene (e.g., the level of mRNA or of protein expressed) is assessed, and is compared with the level of expression in a control (i.e., the level of the expression of the reporter gene in the absence of the agent to be tested). If the level in the presence of the agent differs (e.g., is increased), by an amount or in a manner that is statistically significant, from the level in the absence of the agent, then the agent is an agent that alters the expression of ARRDC3, as indicated by its ability to alter expression of a gene that is operably linked to the ARRDC3 gene promoter. Enhancement of the expression of the reporter indicates that the agent is an agonist of ARRDC3 expression and/or biological activity.
  • inhibition of the expression of the reporter indicates that the agent is an antagonist of ARRDC3 expression and/or biological activity.
  • the level of expression of the reporter in the presence of the agent to be tested is compared with a control level that has previously been established. A level in the presence of the agent that differs from the control level by an amount or in a manner that is statistically significant indicates that the agent alters ARRDC3 expression. Agents that increase ARRDC3 expression or biological activity are particularly useful for treating or preventing obesity or an obesity-associated condition.
  • ARRDC3 agonists identified as described herein can be used not only to treat or prevent obesity and/or an obesity-associated condition, but also to reduce triglyceride levels in a subject, or to increase fatty acid oxidation in a subject, or to decrease the interaction between TXN and ARRDC3 in an individual in need thereof.
  • Other agents that regulate the activity of ARRDC3 indirectly (e.g., agents that regulate the activity of ARRDC3 by regulating the activity of other proteins which in turn regulate the activity of ARRDC3) are also encompassed by the invention.
  • the invention also encompasses agents (e.g., agonists of ARRDC3) that are identified by the methods of the invention.
  • an agent identified as described herein in an appropriate animal model.
  • an agent identified as described herein can be used in an animal model to determine the efficacy, toxicity, or side effects of treatment with such an agent.
  • an agent identified as described herein can be used in an animal model to determine the mechanism of action of such an agent.
  • this invention pertains to uses of novel agents identified by the above-described screening assays for treatments as described herein.
  • an agent identified as described herein can be used to alter activity of a polypeptide encoded by ARRDC3, or to alter expression of ARRDC3, by contacting the polypeptide or the gene (or contacting a cell comprising the polypeptide or the gene) with the agent identified as described herein.
  • PHARMACEUTICAL COMPOSITIONS The present invention also pertains to pharmaceutical compositions comprising agents (compounds) described herein, and/or an agent that alters (e.g., enhances or inhibits) ARRDC3 gene expression or ARRDC3 polypeptide biological activity as described herein.
  • a polypeptide, protein, or an agent that alters ARRDC3 gene expression or biological activity can be formulated with a physiologically acceptable carrier or excipient to prepare a pharmaceutical composition.
  • the carrier and composition can be sterile.
  • the formulation should suit the mode of administration.
  • Suitable pharmaceutically acceptable carriers include but are not limited to water, salt solutions (e.g., NaCl), saline, buffered saline, alcohols, glycerol, ethanol, gum arabic, vegetable oils, benzyl alcohols, polyethylene glycols, gelatin, carbohydrates such as lactose, amylose or starch, dextrose, magnesium stearate, talc, silicic acid, viscous paraffin, perfume oil, fatty acid esters, hydroxymethylcellulose, polyvinyl pyrrolidone, etc., as well as combinations thereof.
  • the pharmaceutical preparations can, if desired, be mixed with auxiliary agents, e.g., lubricants, preservatives, stabilizers, wetting agents, emulsifiers, salts for influencing osmotic pressure, buffers, coloring, flavoring and/or aromatic substances and the like which do not deleteriously react with the active agents.
  • the composition if desired, can also contain minor amounts of wetting or emulsifying agents, or pH buffering agents.
  • the composition can be a liquid solution, suspension, emulsion, tablet, pill, capsule, sustained release formulation, or powder.
  • the composition can be formulated as a suppository, with traditional binders and carriers such as trigly
  • Oral formulations can include standard carriers such as pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, polyvinyl pyrrolidone, sodium saccharine, cellulose, magnesium carbonate, etc.
  • Methods of introduction of these compositions include, but are not limited to, intradermal, intramuscular, intraperitoneal, intraocular, intravenous, subcutaneous, topical, oral and intranasal.
  • Other suitable methods of introduction can also include gene therapy (as described below), rechargeable or biodegradable devices, particle acceleration devices ("gene guns") and slow release polymeric devices.
  • the pharmaceutical compositions of this invention can also be administered as part of a combinatorial therapy with other agents.
  • compositions for intravenous administration typically are solutions in sterile isotonic aqueous buffer.
  • the composition may also include a solubilizing agent and a local anesthetic to ease pain at the site of the injection.
  • the ingredients are supplied either separately or mixed together in unit dosage form, for example, as a dry lyophilized powder or water free concentrate in a hermetically sealed container such as an ampule or sachette indicating the quantity of active agent.
  • Where the composition is to be administered by infusion, it can be dispensed with an infusion bottle containing sterile pharmaceutical grade water, saline or dextrose/water.
  • Where the composition is administered by injection, an ampule of sterile water for injection or saline can be provided so that the ingredients may be mixed prior to administration.
  • nonsprayable forms, viscous to semi-solid or solid forms comprising a carrier compatible with topical application and having a dynamic viscosity preferably greater than water, can be employed.
  • Suitable formulations include but are not limited to solutions, suspensions, emulsions, creams, ointments, powders, enemas, lotions, sols, liniments, salves, aerosols, etc., which are, if desired, sterilized or mixed with auxiliary agents, e.g., preservatives, stabilizers, wetting agents, buffers or salts for influencing osmotic pressure, etc.
  • the agent may be incorporated into a cosmetic formulation.
  • sprayable aerosol preparations wherein the active ingredient, preferably in combination with a solid or liquid inert carrier material, is packaged in a squeeze bottle or in admixture with a pressurized volatile, normally gaseous propellant, e.g., pressurized air.
  • Agents described herein can be formulated as neutral or salt forms.
  • Pharmaceutically acceptable salts include those formed with free amino groups such as those derived from hydrochloric, phosphoric, acetic, oxalic, tartaric acids, etc., and those formed with free carboxyl groups such as those derived from sodium, potassium, ammonium, calcium, ferric hydroxides, isopropylamine, triethylamine, 2-ethylamino ethanol, histidine, procaine, etc.
  • the agents are administered in a therapeutically effective amount.
  • the amount of agents which will be therapeutically effective in the treatment of a particular disorder or condition will depend on the nature of the disorder or condition, and can be determined by standard clinical techniques.
  • in vitro or in vivo assays may optionally be employed to help identify optimal dosage ranges.
  • the precise dose to be employed in the formulation will also depend on the route of administration, and the seriousness of the symptoms of obesity, and should be decided according to the judgment of a practitioner and each patient's circumstances. Effective doses may be extrapolated from dose-response curves derived from in vitro or animal model test systems.
  • the invention also provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the pharmaceutical compositions of the invention.
  • Optionally associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration.
  • the pack or kit can be labeled with information regarding mode of administration, sequence of drug administration (e.g., separately, sequentially or concurrently), or the like.
  • the pack or kit may also include means for reminding the patient to take the therapy.
  • the pack or kit can be a single unit dosage of the combination therapy or it can be a plurality of unit dosages.
  • the agents can be separated, mixed together in any combination, or present in a single vial or tablet. Agents assembled in a blister pack or other dispensing means are preferred.
  • unit dosage is intended to mean a dosage that is dependent on the individual pharmacodynamics of each agent and administered in FDA approved dosages in standard time courses.
  • the therapeutic agent(s) are administered in a therapeutically effective amount (i.e., an amount that is sufficient to treat the disease (e.g., obesity or an obesity-associated condition), such as by ameliorating symptoms associated with the disease, preventing or delaying the onset of the disease, and/or also lessening the severity or frequency of symptoms of the disease).
  • the amount which will be therapeutically effective in the treatment of a particular individual's disorder or condition will depend on the symptoms and severity of the disease, and can be determined by standard clinical techniques.
  • in vitro or in vivo assays may optionally be employed to help identify optimal dosage ranges.
  • Effective doses may be extrapolated from dose-response curves derived from in vitro or animal model test systems.
  • haplotypes can be used to identify individuals at risk for obesity and associated conditions.
  • Haplotypes are a combination of genetic markers, e.g., particular alleles at polymorphic sites. Markers can include, for example, SNPs and microsatellites.
  • the haplotypes can comprise a combination of various genetic markers; therefore, detecting haplotypes can be accomplished by methods known in the art for detecting sequences at polymorphic sites. For example, standard techniques for genotyping for the presence of SNPs and/or microsatellite markers can be used, such as fluorescent based techniques (Chen, et al, Genome Res.
  • markers and SNPs can be identified in at-risk haplotypes. Certain methods of identifying relevant markers and SNPs include the use of linkage disequilibrium (LD) and/or LOD scores.
  • Linkage Disequilibrium Linkage Disequilibrium refers to a non-random assortment of two genetic elements. For example, if a particular genetic element (e.g., "alleles" at a polymorphic site; see below) occurs in a population at a frequency of 0.25 and another occurs at a frequency of 0.25, then the predicted occurrence of a person's having both elements is 0.0625 (0.25 x 0.25), assuming a random distribution of the elements. However, if it is discovered that the two elements occur together at a frequency higher than 0.0625, then the elements are said to be in linkage disequilibrium since they tend to be inherited together at a higher rate than what their independent allele frequencies would predict.
  • $r^2$ (sometimes denoted $\Delta^2$) and $|D'|$ are two pairwise measures of LD.
  • Both measures range from 0 (no disequilibrium) to 1 ('complete' disequilibrium), but their interpretation is slightly different.
  • $|D'|$ is defined in such a way that it is equal to 1 if just two or three of the possible haplotypes are present, and it is <1 if all four possible haplotypes are present.
  • a value of $|D'|$ that is <1 indicates that historical recombination has occurred between two sites (recurrent mutation can also cause
  • the measure $r^2$ represents the statistical correlation between two sites, and takes the value of 1 if only two haplotypes are present. It is arguably the most relevant measure for association mapping, because there is a simple inverse relationship between $r^2$ and the sample size required to detect association between susceptibility loci and SNPs.
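By way of illustration only, the two pairwise measures can be computed for a pair of biallelic sites from one haplotype frequency and the two allele frequencies. The sketch uses the standard population-genetics definitions (D as the departure from the product of allele frequencies, |D'| as D scaled by its maximum attainable magnitude, and r squared as the squared correlation); the function name and argument names are assumptions made for this sketch.

```python
from typing import Tuple


def ld_measures(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float, float]:
    """Return (D, |D'|, r^2) for two biallelic sites.

    p_ab: frequency of the haplotype carrying allele A at site 1 and allele B at site 2
    p_a:  frequency of allele A at site 1
    p_b:  frequency of allele B at site 2
    """
    d = p_ab - p_a * p_b  # departure from random assortment
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2
```

For instance, with haplotype frequencies AB = 0.2, Ab = 0.3, aB = 0 and ab = 0.5 (three of the four possible haplotypes present), the call `ld_measures(0.2, 0.5, 0.2)` gives |D'| of 1 while r squared is 0.25, which reflects the difference in interpretation between the two measures.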
  • a determination of how strong LD is across an entire region that contains many polymorphic sites might be desirable (e.g., testing whether the strength of LD differs significantly among loci or across populations, or whether there is more or less LD in a region than predicted under a particular model).
  • Measuring LD across a region is not straightforward, but one approach is to use the measure $\rho$, which was developed in population genetics. Roughly speaking, $\rho$ measures how much recombination would be required under a particular population model to generate the LD that is seen in the data. This type of method can potentially also provide a statistically rigorous approach to the problem of determining whether LD data provide evidence for the presence of recombination hotspots.
  • a significant $r^2$ value can be 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1.0.
  • haplotype analysis involves defining a candidate susceptibility locus using LOD scores. The defined regions are then ultra-fine mapped with microsatellite markers with an average spacing between markers of less than 100 kb. All usable microsatellite markers that are found in public databases and mapped within that region can be used. In addition, microsatellite markers identified within the deCODE genetics sequence assembly of the human genome can be used. The frequencies of haplotypes in the patient and the control groups can be estimated using an expectation-maximization algorithm (Dempster A. et al, 1977. J. R. Stat. Soc. B, 39:1-38).
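By way of illustration only, an expectation-maximization estimate of haplotype frequencies can be sketched for the simplest case of two biallelic markers and unphased genotypes. This is a minimal sketch under stated assumptions, not the implementation used in the analyses described herein: the genotype coding (0, 1 or 2 copies of allele "1" at each site), the function name and the convergence settings are all hypothetical, and missing genotypes and multi-allelic markers are not handled.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Haplotype = Tuple[int, int]  # (allele at site 1, allele at site 2), alleles coded 0/1


def em_haplotype_frequencies(genotypes: Iterable[Tuple[int, int]],
                             n_iter: int = 200,
                             tol: float = 1e-10) -> Dict[Haplotype, float]:
    """Estimate two-site haplotype frequencies from unphased genotypes by EM."""
    geno_counts = Counter(genotypes)
    n_chrom = 2 * sum(geno_counts.values())
    if n_chrom == 0:
        raise ValueError("at least one genotype is required")
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    freqs = {h: 0.25 for h in haps}  # uniform starting values
    for _ in range(n_iter):
        counts = {h: 0.0 for h in haps}
        for (g1, g2), n in geno_counts.items():
            if g1 == 1 and g2 == 1:
                # E-step: a double heterozygote is either 11/00 or 10/01;
                # split it according to the current frequency estimates.
                w_cis = freqs[(1, 1)] * freqs[(0, 0)]
                w_trans = freqs[(1, 0)] * freqs[(0, 1)]
                total = w_cis + w_trans
                share = w_cis / total if total > 0 else 0.5
                counts[(1, 1)] += n * share
                counts[(0, 0)] += n * share
                counts[(1, 0)] += n * (1 - share)
                counts[(0, 1)] += n * (1 - share)
            else:
                # Phase is unambiguous when at most one site is heterozygous.
                first = (1 if g1 >= 1 else 0, 1 if g2 >= 1 else 0)
                second = (1 if g1 == 2 else 0, 1 if g2 == 2 else 0)
                counts[first] += n
                counts[second] += n
        # M-step: new frequencies are the expected counts over all chromosomes.
        new_freqs = {h: c / n_chrom for h, c in counts.items()}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in haps)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs
```

In a case-control setting the function would be applied separately to the patient and control groups (or jointly under a null hypothesis), and the resulting estimates compared.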
  • haplotype analysis is then repeated and the most significant p-value registered is determined.
  • This randomization scheme can be repeated, for example, over 100 times to construct an empirical distribution of p-values.
  • a p-value of <0.05 is indicative of an at-risk haplotype.
  • One general approach to haplotype analysis involves using likelihood-based inference applied to NEsted MOdels. The method is implemented in the program NEMO, which allows for many polymorphic markers, SNPs and microsatellites. The method and software are specifically designed for case-control studies where the purpose is to identify haplotype groups that confer different risks. It is also a tool for studying LD structures. When investigating haplotypes constructed from many markers, apart from looking at each haplotype individually, meaningful summaries often require putting haplotypes into groups. A particular partition of the haplotype space is a model that assumes haplotypes within a group have the same risk, while haplotypes in different groups can have different risks.
  • Two models/partitions are nested when one, the alternative model, is a finer partition compared to the other, the null model, i.e., the alternative model allows some haplotypes assumed to have the same risk in the null model to have different risks.
  • One common way to handle uncertainty in phase and missing genotypes is a two-step method of first estimating haplotype counts and then treating the estimated counts as the exact counts, a method that can sometimes be problematic (e.g., see the information measure section below) and may require randomization to properly evaluate statistical significance.
  • maximum likelihood estimates, likelihood ratios and p-values are calculated directly, with the aid of the EM algorithm, for the observed data treating it as a missing-data problem.
  • NEMO allows complete flexibility for partitions. For example, the first haplotype problem described in the Methods section on Statistical analysis considers testing whether $h_1$ has the same risk as the other haplotypes $h_2, \ldots, h_k$.
  • the alternative grouping is $[h_1], [h_2, \ldots, h_k]$ and the null grouping is $[h_1, \ldots, h_k]$.
  • the alternative grouping is $[h_1], [h_2], [h_3]$ and the null grouping is $[h_1, h_2], [h_3]$. If composite alleles exist, one could collapse these alleles into one at the data processing stage, and perform the test as described. This is a perfectly valid approach, and indeed, whether we collapse or not makes no difference if there were no missing information regarding phase.
  • the alternative grouping is $[h_1], [h_{2a}, h_{2b}, \ldots, h_{2e}], [h_{3a}, h_{3b}, \ldots, h_{3e}]$ and the null grouping is $[h_1, h_{2a}, h_{2b}, \ldots, h_{2e}], [h_{3a}, h_{3b}, \ldots, h_{3e}]$.
  • the same method can be used to handle composite alleles where collapsing at the data processing stage is not even an option since Lc represents multiple haplotypes constructed from multiple SNPs.
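By way of illustration only, the nesting requirement on the two groupings (the alternative grouping must be a finer partition of the same haplotypes than the null grouping) can be checked mechanically. The sketch below is not part of NEMO; the function name and the representation of a grouping as a list of lists of haplotype labels are assumptions made for this sketch.

```python
from typing import Hashable, List, Sequence


def is_nested(alternative: Sequence[Sequence[Hashable]],
              null: Sequence[Sequence[Hashable]]) -> bool:
    """True when `alternative` partitions the same haplotypes as `null`
    and every alternative group lies inside a single null group.
    Identical partitions count as nested (a non-strict refinement)."""
    alt_groups: List[set] = [set(g) for g in alternative]
    null_groups: List[set] = [set(g) for g in null]
    alt_items = set().union(*alt_groups) if alt_groups else set()
    null_items = set().union(*null_groups) if null_groups else set()
    if alt_items != null_items:
        return False  # the two groupings must cover the same haplotypes
    # Within each grouping, the groups must not overlap.
    if sum(len(g) for g in alt_groups) != len(alt_items):
        return False
    if sum(len(g) for g in null_groups) != len(null_items):
        return False
    return all(any(a <= n for n in null_groups) for a in alt_groups)
```

For example, `is_nested([["h1"], ["h2"], ["h3"]], [["h1", "h2"], ["h3"]])` holds, whereas swapping the two arguments does not.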
  • $\Lambda^*$ is useful because the ratio $\Lambda/\Lambda^*$ happens to be a good measure of information, or $1 - (\Lambda/\Lambda^*)$ is a measure of the fraction of information lost due to missing information.
  • This information measure for haplotype analysis is described in Hoffman and Kong, Technical Report 537, Department of Statistics, University of Chicago, Revised for Biometrics (2003) as a natural extension of information measures defined for linkage analysis, and is implemented in NEMO.
  • the Fisher exact test can be used to calculate two-sided p-values for each individual allele. All p-values are presented unadjusted for multiple comparisons unless specifically indicated. The presented frequencies (for microsatellites, SNPs and haplotypes) are allelic frequencies as opposed to carrier frequencies. To minimize any bias due to the relatedness of the patients who were recruited as families for the linkage analysis, first and second-degree relatives can be eliminated from the patient list. Furthermore, the test can be repeated for association correcting for any remaining relatedness among the patients, by extending a variance adjustment procedure (e.g., as described in Risch, N.
  • Cohorts of patients and controls can be randomized and the association analysis redone multiple times (e.g., up to 500,000 times) and the p-value is the fraction of replications that produced a p-value for some marker allele that is lower than or equal to the p-value we observed using the original patient and control cohorts.
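By way of illustration only, the single-allele Fisher exact test and the randomization procedure can be sketched together. This is a minimal sketch under stated assumptions: each record is treated as one chromosome given as a tuple with one allele per marker, records are shuffled independently between the two cohorts (relatedness is ignored), and all function names are hypothetical. The number of replications is kept small here; the text contemplates far more.

```python
import math
import random
from typing import Hashable, Sequence, Tuple

Chromosome = Tuple[Hashable, ...]  # one allele per marker


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables that are no more probable than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def min_allele_p(cases: Sequence[Chromosome], controls: Sequence[Chromosome]) -> float:
    """Smallest Fisher p-value over every allele at every marker (cohorts non-empty)."""
    best = 1.0
    for m in range(len(cases[0])):
        alleles = {ch[m] for ch in cases} | {ch[m] for ch in controls}
        for allele in alleles:
            a = sum(1 for ch in cases if ch[m] == allele)
            c = sum(1 for ch in controls if ch[m] == allele)
            p = fisher_exact_two_sided(a, len(cases) - a, c, len(controls) - c)
            best = min(best, p)
    return best


def adjusted_p_value(cases: Sequence[Chromosome], controls: Sequence[Chromosome],
                     n_rep: int = 1000, seed: int = 0) -> float:
    """Fraction of randomized replications whose smallest allele p-value is
    lower than or equal to the smallest p-value in the original cohorts."""
    rng = random.Random(seed)
    observed = min_allele_p(cases, controls)
    pooled = list(cases) + list(controls)
    hits = 0
    for _ in range(n_rep):
        rng.shuffle(pooled)  # reassign patient/control status at random
        replicate = min_allele_p(pooled[:len(cases)], pooled[len(cases):])
        if replicate <= observed * (1 + 1e-9):
            hits += 1
    return hits / n_rep
```

Because the smallest p-value over all marker alleles is recomputed in every replication, the resulting fraction accounts for the number of alleles tested.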
  • relative risk (RR) and the population attributable risk (PAR) can be calculated assuming a multiplicative model (haplotype relative risk model), (Terwilliger, J.D. & Ott, J., Hum Hered, 42, 337-46 (1992) and Falk, CT.
  • haplotype counts of the affecteds and controls each have multinomial distributions, but with different haplotype frequencies under the alternative hypothesis.
  • for two haplotypes $h_i$ and $h_j$, $\mathrm{risk}(h_i)/\mathrm{risk}(h_j) = (f_i/p_i)/(f_j/p_j)$, where $f$ and $p$ denote respectively frequencies in the affected population and in the control population.
  • p-values are always valid since they are computed with respect to the null hypothesis.
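By way of illustration only, the relative risk of one haplotype against all other haplotypes pooled follows from the frequency ratio of the multiplicative model by taking the pooled remainder as the second haplotype, giving [f/(1-f)] / [p/(1-p)]. The population attributable risk formula used below, 1 - 1/(p*RR + 1 - p)^2, is an assumption of this sketch: it is a commonly used form under a multiplicative model with Hardy-Weinberg proportions and is not quoted from the text. The function names are likewise hypothetical.

```python
def relative_risk(f: float, p: float) -> float:
    """Risk of a haplotype relative to all other haplotypes pooled.

    f: haplotype frequency in the affected population
    p: haplotype frequency in the control population
    """
    if not (0 < f < 1 and 0 < p < 1):
        raise ValueError("frequencies must lie strictly between 0 and 1")
    return (f / (1 - f)) / (p / (1 - p))


def population_attributable_risk(p: float, rr: float) -> float:
    """PAR assuming a multiplicative model and Hardy-Weinberg proportions.

    The average risk relative to non-carriers is (p*rr + 1 - p)**2, so the
    fraction of risk attributable to the haplotype is 1 - 1/(p*rr + 1 - p)**2.
    """
    if not (0 <= p <= 1) or rr <= 0:
        raise ValueError("p must be in [0, 1] and rr must be positive")
    return 1 - 1 / (p * rr + (1 - p)) ** 2
```

For example, a haplotype with frequency 0.2 in affecteds and 0.1 in controls has a relative risk of 2.25 under this model.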
  • haplotype frequencies are estimated by maximum likelihood and tests of differences between cases and controls are performed using a generalized likelihood ratio test (Rice, J.A.
  • the test statistic is $2\,[\ell(\hat{r}, \hat{p}_1, \hat{p}_2, \ldots, \hat{p}_{k-1}) - \ell(1, \tilde{p}_1, \tilde{p}_2, \ldots, \tilde{p}_{k-1})]$
  • $\ell$ denotes $\log_e$ likelihood
  • $\tilde{\ }$ and $\hat{\ }$ denote maximum likelihood estimates under the null hypothesis and alternative hypothesis respectively.
  • the statistic has asymptotically a chi-square distribution with 1-df, under the null hypothesis. Slightly more complicated null and alternative hypotheses can also be used. For example, let $h_1$ be GO, $h_2$ be GX and $h_3$ be AX.
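By way of illustration only, the generalized likelihood ratio test can be sketched for the simplified situation in which haplotype counts are observed directly (phase known, no missing genotypes) and one haplotype is tested against all others. Under that assumption the frequencies of the other haplotypes contribute equally to both log likelihoods and cancel, so a binomial likelihood for the tested haplotype suffices. The function names are hypothetical, and the 1-df chi-square tail probability is obtained from the complementary error function.

```python
import math
from typing import Tuple


def _binomial_loglik(k: int, n: int, q: float) -> float:
    """Log likelihood (constant term dropped) of k copies among n at frequency q."""
    ll = 0.0
    if k > 0:
        ll += k * math.log(q)
    if n - k > 0:
        ll += (n - k) * math.log(1 - q)
    return ll


def lrt_haplotype(case_h: int, case_total: int,
                  control_h: int, control_total: int) -> Tuple[float, float]:
    """Return (statistic, p-value) for the null hypothesis of relative risk 1."""
    f_hat = case_h / case_total          # alternative: frequency in affecteds
    p_hat = control_h / control_total    # alternative: frequency in controls
    p_tilde = (case_h + control_h) / (case_total + control_total)  # null: shared
    ll_alt = (_binomial_loglik(case_h, case_total, f_hat)
              + _binomial_loglik(control_h, control_total, p_hat))
    ll_null = (_binomial_loglik(case_h, case_total, p_tilde)
               + _binomial_loglik(control_h, control_total, p_tilde))
    stat = max(0.0, 2 * (ll_alt - ll_null))
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

When phase is uncertain or genotypes are missing, the two log likelihoods would instead be maximized over the observed data, for example with the EM algorithm.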
  • the second P-value can be calculated by comparing the observed LOD-score with its complete data sampling distribution under the null hypothesis (e.g., Gudbjartsson et al, Nat. Genet. 25:12-3, 2000). When the data consist of more than a few families, these two P-values tend to be very similar.
  • haplotype analysis involves defining a candidate susceptibility locus based on "haplotype blocks." It has been reported that portions of the human genome can be broken into series of discrete haplotype blocks containing a few common haplotypes; for these blocks, linkage disequilibrium data provided little evidence indicating recombination (see, e.g., Wall, J.D. and
  • haplotype block includes blocks defined by either characteristic. Representative methods for identification of haplotype blocks are set forth, for example, in U.S. Published Patent Applications 20030099964; 20030170665; 20040023237; 20040146870. Haplotype blocks can be used readily to map associations between phenotype and haplotype status.
  • the main haplotypes can be identified in each haplotype block, and then a set of "tagging" SNPs or markers (the smallest set of SNPs or markers needed to distinguish among the haplotypes) can then be identified. These tagging SNPs or markers can then be used in assessment of samples from groups of individuals, in order to identify association between phenotype and haplotype. If desired, neighboring haplotype blocks can be assessed concurrently, as there may also exist linkage disequilibrium among the haplotype blocks.
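By way of illustration only, the smallest set of tagging SNPs for a haplotype block can be found by exhaustive search over SNP subsets of increasing size. This is a minimal sketch, assuming each main haplotype is given as an equal-length sequence with one allele per SNP; the function name is hypothetical, and because the search is exponential in the number of SNPs it is suitable only for small blocks.

```python
from itertools import combinations
from typing import Sequence, Tuple


def smallest_tagging_set(haplotypes: Sequence[Sequence[str]]) -> Tuple[int, ...]:
    """Return indices of a smallest SNP subset that distinguishes all
    distinct haplotypes in the block."""
    distinct = sorted(set(tuple(h) for h in haplotypes))
    if not distinct:
        raise ValueError("at least one haplotype is required")
    n_snps = len(distinct[0])
    for size in range(n_snps + 1):
        for subset in combinations(range(n_snps), size):
            # Project every haplotype onto the candidate SNPs.
            projected = {tuple(h[i] for i in subset) for h in distinct}
            if len(projected) == len(distinct):
                return subset
    # Not reached: the full SNP set always separates distinct haplotypes.
    return tuple(range(n_snps))
```

For the four haplotypes ACGT, ACGA, TCGT and TCGA, the first and last SNPs together distinguish all four, so the call returns `(0, 3)`.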
  • NUCLEIC ACIDS AND POLYPEPTIDES OF THE INVENTION All nucleotide positions are relative to SEQ ID NO: 1 (FIGS. 10.1 to 10.122; Hs_Build34_ chromosome 5_90600642-90925734) as indicated.
  • the nucleic acids, polypeptides and antibodies described herein can be used in methods of diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition, as well as in kits useful for such diagnosis.
  • the reference amino acid sequence for human ARRDC3 (GenBank Accession No.: NP_065852) is described by SEQ ID NO: 3 (FIG. 12).
  • association with the ARRDC3 gene means in proximity to the ARRDC3 gene as described herein.
  • a haplotype is within about 350 kb, 300 kb, 250 kb, 200 kb, 150 kb, 100 kb, 75 kb, 50 kb, 25 kb, 10 kb, 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of the ARRDC3 gene, and is thereby associated with the ARRDC3 gene.
  • an "isolated" nucleic acid molecule is one that is separated from nucleic acids that normally flank the gene or nucleotide sequence (as in genomic sequences) and/or has been completely or partially purified from other transcribed sequences (e.g., as in an RNA library).
  • an isolated nucleic acid of the invention can be substantially isolated with respect to the complex cellular milieu in which it naturally occurs, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized.
  • the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system or reagent mix.
  • the material can be purified to essential homogeneity, for example as determined by polyacrylamide gel electrophoresis (PAGE) or column chromatography (e.g., HPLC).
  • An isolated nucleic acid molecule of the invention can comprise at least about 50%, at least about 80% or at least about 90% (on a molar basis) of all macromolecular species present.
  • With regard to genomic DNA, the term "isolated" also can refer to nucleic acid molecules that are separated from the chromosome with which the genomic DNA is naturally associated.
  • the isolated nucleic acid molecule can contain less than about 350 kb, 300 kb, 250 kb, 200 kb, 150 kb, 100 kb, 75 kb, 50 kb, 25 kb, 10 kb, 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of the nucleotides that flank the nucleic acid molecule in the genomic DNA of the cell from which the nucleic acid molecule is derived.
  • the nucleic acid molecule can be fused to other coding or regulatory sequences and still be considered isolated.
  • recombinant DNA contained in a vector is included in the definition of "isolated" as used herein.
  • isolated nucleic acid molecules include recombinant DNA molecules in heterologous host cells or heterologous organisms, as well as partially or substantially purified DNA molecules in solution. "Isolated" nucleic acid molecules also encompass in vivo and in vitro RNA transcripts of the DNA molecules of the present invention.
  • An isolated nucleic acid molecule or nucleotide sequence can include a nucleic acid molecule or nucleotide sequence that is synthesized chemically or by recombinant means.
  • nucleotide sequences are useful, for example, in the manufacture of the encoded polypeptide, as probes for isolating homologous sequences (e.g., from other mammalian species), for gene mapping (e.g., by in situ hybridization with chromosomes), or for detecting expression of the gene in tissue (e.g., human tissue), such as by Northern blot analysis or other hybridization techniques.
  • the invention also pertains to nucleic acid molecules that hybridize under high stringency hybridization conditions, such as for selective hybridization, to a nucleotide sequence described herein (e.g., nucleic acid molecules that specifically hybridize to a nucleotide sequence containing a polymorphic site associated with a haplotype described herein).
  • the invention includes variants that hybridize under high stringency hybridization and wash conditions (e.g., for selective hybridization) to a nucleotide sequence that comprises SEQ ID NO: 1 or a fragment thereof (or a nucleotide sequence comprising the complement of SEQ ID NO: 1 or a fragment thereof), wherein the nucleotide sequence comprises at least one polymorphic allele contained in the haplotypes (e.g., at-risk haplotypes) described herein.
  • Such nucleic acid molecules can be detected and/or isolated by allele- or sequence-specific hybridization (e.g., under high stringency conditions).
  • Specific hybridization refers to the ability of a first nucleic acid to hybridize to a second nucleic acid in a manner such that the first nucleic acid does not hybridize to any nucleic acid other than to the second nucleic acid (e.g., when the first nucleic acid has a higher complementarity to the second nucleic acid than to any other nucleic acid in a sample wherein the hybridization is to be performed).
  • “Stringency conditions” for hybridization is a term of art that refers to the incubation and wash conditions, e.g., conditions of temperature and buffer concentration, that permit hybridization of a particular nucleic acid to a second nucleic acid; the first nucleic acid can be perfectly (i.e., 100%) complementary to the second, or the first and second can share some degree of complementarity that is less than perfect (e.g., 70%, 75%, 85%, 95%). For example, certain high stringency conditions can be used to distinguish perfectly complementary nucleic acids from those of less complementarity.
  • the exact conditions that determine the stringency of hybridization depend not only on ionic strength (e.g., 0.2XSSC, 0.1XSSC), temperature (e.g., room temperature, 42°C, 68°C) and the concentration of destabilizing agents such as formamide or denaturing agents such as SDS, but also on factors such as the length of the nucleic acid sequence, base composition, percent mismatch between hybridizing sequences and the frequency of occurrence of subsets of that sequence within other non-identical sequences.
  • equivalent conditions can be determined by varying one or more of these parameters while maintaining a similar degree of identity or similarity between the two nucleic acid molecules.
  • conditions are used such that sequences of at least about 60% identity, at least about 70% identity, at least about 80% identity, at least about 90% identity or at least about 95% identity remain hybridized to one another.
  • By varying hybridization conditions from a level of stringency at which no hybridization occurs to a level at which hybridization is first observed, conditions that will allow a given sequence to hybridize (e.g., selectively) with the most complementary sequences in the sample can be determined.
  • Exemplary conditions that describe the determination of wash conditions for moderate or low stringency conditions are described in Kraus, M. and Aaronson, S., Methods Enzymol, 200:546-556 (1991); and in, Ausubel, F.
  • washing is the step in which conditions are usually set so as to determine a minimum level of complementarity of the hybrids. Generally, starting from the lowest temperature at which only homologous hybridization occurs, each °C by which the final wash temperature is reduced (holding SSC concentration constant) allows an increase by 1% in the maximum mismatch percentage among the sequences that hybridize. Generally, doubling the concentration of SSC results in an increase in Tm of about 17°C. Using these guidelines, the wash temperature can be determined empirically for high, moderate or low stringency, depending on the level of mismatch sought.
  • a low stringency wash can comprise washing in a solution containing 0.2XSSC/0.1% SDS for 10 minutes at room temperature; a moderate stringency wash can comprise washing in a pre-warmed (42°C) solution containing 0.2XSSC/0.1% SDS for 15 minutes at 42°C; and a high stringency wash can comprise washing in a pre-warmed (68°C) solution containing 0.1XSSC/0.1% SDS for 15 minutes at 68°C.
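  • The two rules of thumb stated above (about 1% additional mismatch tolerated per °C of wash-temperature reduction, and about 17°C increase in Tm per doubling of SSC concentration) can be sketched as simple arithmetic. This is a rough illustration, not a validated protocol calculator: the function names are hypothetical, and applying the 17°C figure through a base-2 logarithm to changes other than an exact doubling is an assumption of this sketch.

```python
import math


def allowed_mismatch_percent(lowest_homologous_temp_c, wash_temp_c, percent_per_degree=1.0):
    """Approximate maximum mismatch (%) tolerated at a given final wash temperature.

    lowest_homologous_temp_c is the lowest temperature at which only homologous
    hybridization occurs; each degree below it adds about 1% tolerated mismatch
    (SSC concentration held constant).
    """
    degrees_below = lowest_homologous_temp_c - wash_temp_c
    return max(0.0, degrees_below * percent_per_degree)


def tm_shift_c(ssc_before, ssc_after, shift_per_doubling_c=17.0):
    """Approximate change in Tm (°C) when the SSC concentration is changed.

    Assumption: the "about 17°C per doubling" guideline is extended
    logarithmically to arbitrary concentration ratios.
    """
    if ssc_before <= 0 or ssc_after <= 0:
        raise ValueError("SSC concentrations must be positive")
    return shift_per_doubling_c * math.log2(ssc_after / ssc_before)
```

For example, lowering the final wash from 68°C to 58°C at constant SSC would, under this guideline, tolerate roughly 10% mismatch, and moving from 0.1X to 0.2X SSC would raise Tm by roughly 17°C.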
  • washes can be performed repeatedly or sequentially to obtain a desired result, as is known in the art.
  • Equivalent conditions can be determined by varying one or more of the above parameters, as is known in the art, while maintaining a similar degree of complementarity between the target nucleic acid molecule and the primer or probe used (e.g., the sequence to be hybridized).
  • the length of a sequence aligned for comparison purposes is at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90% or at least 95% of the length of the reference sequence.
  • the actual comparison of the two sequences can be accomplished by well-known methods, for example, using a mathematical algorithm.
  • a non-limiting example of such a mathematical algorithm is described in Karlin, S. and Altschul, S. (Proc. Natl. Acad. Sci. USA, 90:5873-5877 (1993)). Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0), as described in Altschul, S. et al. (Nucleic Acids Res., 25:3389-3402 (1997)).
  • Another non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, CABIOS (1989). Such an algorithm is incorporated into the ALIGN program (version 2.0), which is part of the GCG sequence alignment software package.
  • When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. Additional algorithms for sequence analysis are known in the art and include ADVANCE and ADAM as described in Torellis, A. and Robotti, C. (Comput. Appl. Biosci., 10:3-5 (1994)); and FASTA described in Pearson, W. and Lipman, D. (Proc. Natl. Acad. Sci. USA, 85:2444-8 (1988)).
  • the percent identity between two amino acid sequences can be accomplished using the GAP program in the GCG software package (Accelrys, Cambridge, UK) using either a Blossom 63 matrix or a PAM250 matrix, and a gap weight of 12, 10, 8, 6, or 4 and a length weight of 2, 3, or 4.
  • the percent identity between two nucleic acid sequences can be accomplished using the GAP program in the GCG software package, using a gap weight of 50 and a length weight of 3.
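  • As a minimal sketch of what "percent identity" means once two sequences have been aligned, the function below counts identical positions in a pre-aligned, gapped pair. It does not reproduce BLAST, ALIGN or GAP and performs no alignment itself; the function name, the "-" gap symbol and the choice to count gap columns in the denominator are assumptions of this illustration (other conventions exist).

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must already be aligned and therefore of equal length.
    Columns where both sequences have a gap are ignored; columns with a gap
    in only one sequence count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # uninformative column
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

For the aligned pair `ACGT-A` and `ACGTTA`, five of six columns match, giving about 83.3% identity under this convention.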
  • the present invention also provides isolated nucleic acid molecules that contain a fragment or portion that hybridizes under highly stringent conditions to a nucleic acid that comprises SEQ ID NO: 1 or a fragment thereof (or a nucleotide sequence comprising the complement of SEQ ID NO: 1 or a fragment thereof), wherein the nucleotide sequence comprises at least one polymorphic allele contained in the haplotypes (e.g., at-risk haplotypes) described herein.
  • the invention also provides isolated nucleic acid molecules that contain a fragment or portion that hybridizes under highly stringent conditions to a nucleotide sequence encoding an amino acid sequence selected from SEQ ID NO: 3, a polymorphic variant thereof, or a fragment or portion thereof.
  • the nucleic acid fragments of the invention are at least about 15, at least about 18, 20, 23 or 25 nucleotides, and can be 30, 40, 50, 100 or 200 or more nucleotides in length. Longer fragments, for example, 30 or more nucleotides in length, which encode antigenic polypeptides described herein, are particularly useful, such as for the generation of antibodies as described below.
  • the nucleic acid fragments of the invention are used as probes or primers in assays such as those described herein.
  • "Probes" or “primers” are oligonucleotides that hybridize in a base-specific manner to a complementary strand of a nucleic acid molecule.
  • probes and primers include peptide nucleic acids (PNA), as described in Nielsen, P. et al. (Science, 254:1497-1500 (1991)).
  • a probe or primer comprises a region of nucleotide sequence that hybridizes to at least about 15, typically about 20-25, and in certain embodiments about 40, 50 or 75, consecutive nucleotides of a nucleic acid molecule comprising a contiguous nucleotide sequence from SEQ ID NO: 1 and comprising at least one allele contained in one or more haplotypes described herein, and the complement thereof.
  • a probe or primer can comprise 100 or fewer nucleotides; for example, in certain embodiments from 6 to 50 nucleotides, or, for example, from 12 to 30 nucleotides.
  • the probe or primer is at least 70% identical, at least 80% identical, at least 85% identical, at least 90% identical or at least 95% identical to the contiguous nucleotide sequence or to the complement of the contiguous nucleotide sequence.
  • the probe or primer is capable of selectively hybridizing to the contiguous nucleotide sequence or to the complement of the contiguous nucleotide sequence.
  • the probe or primer further comprises a label, e.g., a radioisotope, a fluorescent label, an enzyme label, an enzyme co-factor label, a magnetic label, a spin label, an epitope label.
  • nucleic acid molecules can be amplified and isolated by the polymerase chain reaction using synthetic oligonucleotide primers that are designed based on the sequence provided in SEQ ID NO: 1 (and optionally comprising at least one allele contained in one or more haplotypes described herein) and/or the complement thereof.
  • the nucleic acid molecules can be amplified using cDNA, mRNA or genomic DNA as a template, cloned into an appropriate vector and characterized by DNA sequence analysis.
  • Other suitable amplification methods include the ligase chain reaction (LCR; see Wu, D. and Wallace, R., Genomics, 4:560-569 (1989); Landegren, U. et al., Science, 241:1077-1080 (1988)), transcription amplification (Kwoh, D.
  • ssRNA single-stranded RNA
  • dsDNA double-stranded DNA
  • the amplified DNA can be labeled (e.g., radiolabeled) and used as a probe for screening a cDNA library derived from human cells.
  • the cDNA can be derived from mRNA and contained in zap express (Stratagene, La Jolla, CA), ZIPLOX (Gibco BRL, Gaithersburg, MD) or other suitable vector.
  • Corresponding clones can be isolated, DNA can be obtained following in vivo excision, and the cloned insert can be sequenced in either or both orientations by art-recognized methods to identify the correct reading frame encoding a polypeptide of the appropriate molecular weight.
  • the direct analysis of the nucleotide sequence of nucleic acid molecules of the present invention can be accomplished using well-known methods that are commercially available. See, for example, Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd Ed., CSHP, New York 1989); Zyskind et al., Recombinant DNA Laboratory Manual (Acad. Press, 1988). Additionally, fluorescence methods are also available for analyzing nucleic acids (Chen, X. et al., Genome Res., 9:492-498 (1999)) and polypeptides. Using these or similar methods, the polypeptide and the DNA encoding the polypeptide can be isolated, sequenced and further characterized.
  • the isolated nucleic acid sequences of the invention can be used as molecular weight markers on Southern gels, and as chromosome markers that are labeled to map related gene positions.
  • the nucleic acid sequences can also be used to compare with endogenous DNA sequences in patients to identify genetic disorders (e.g., obesity, a susceptibility to obesity, an obesity-associated condition and/or a susceptibility to an obesity-associated condition), and as probes, such as to hybridize and discover related DNA sequences or to subtract out known sequences from a sample (e.g., subtractive hybridization).
  • RNA interference small double-stranded interfering RNA
  • RNAi is a post-transcriptional process in which double-stranded RNA is introduced and sequence-specific gene silencing results, through catalytic degradation of the targeted mRNA.
  • RNAi is used routinely to investigate gene function in a high throughput fashion or to modulate gene expression in human diseases (Chi, et al., Proc. Natl. Acad. Sci. USA, 100(11):6343-6346 (2003)). Introduction of long double-stranded RNA leads to sequence-specific degradation of homologous gene transcripts.
  • the long double stranded RNA is metabolized to small 21-23 nucleotide siRNA (small interfering RNA).
  • the siRNA then binds to the protein complex RISC (RNA-induced silencing complex), which has a dual-function helicase.
  • the helicase has RNase activity and is able to unwind the RNA.
  • the unwound siRNA allows an antisense strand to bind to a target. This results in sequence-dependent degradation of cognate mRNA.
  • exogenous RNAi chemically synthesized or recombinantly produced can also be used.
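  • To illustrate the sequence relationship described above, the sketch below lists candidate target windows of siRNA length (21 nucleotides by default, within the 21-23 range stated) along an mRNA and derives the antisense strand that would base-pair with a chosen window. It is a toy illustration under stated assumptions: the function names are hypothetical, no siRNA design rules (GC content, overhangs, off-target checks) are applied, and the example sequence in the usage note is invented.

```python
# Watson-Crick complement table for RNA bases.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_strand(target_rna):
    """Reverse complement of an RNA target, written 5'->3'."""
    return target_rna.upper().translate(_RNA_COMPLEMENT)[::-1]


def candidate_target_windows(mrna, length=21):
    """All contiguous windows of the given length along an mRNA (5'->3')."""
    mrna = mrna.upper()
    return [mrna[i:i + length] for i in range(len(mrna) - length + 1)]
```

For instance, `antisense_strand("AUGC")` complements the target to `UACG` and reverses it, giving `GCAU`; a 25-nucleotide mRNA yields five 21-nucleotide windows.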
  • two polypeptides are substantially homologous or identical when the amino acid sequences are at least about 45-55% homologous or identical. In other embodiments, two polypeptides (or a region of the polypeptides) are substantially homologous or identical when they are at least about 70-75%, at least about 80-85%, at least about 90%, or at least about 95% homologous, or are identical.
  • a substantially homologous amino acid sequence will be encoded by a nucleic acid molecule comprising SEQ ID NO: 1 or a portion thereof, and further comprising at least one polymorphism as shown in Table 1 or Table 2, wherein the encoding nucleic acid will hybridize to SEQ ID NO: 1 under stringent conditions as more particularly described above.
  • a substantially homologous amino acid sequence will also be encoded by a nucleic acid molecule hybridizing to a nucleic acid sequence encoding SEQ ID NO: 3 or a portion thereof, or a polymorphic variant thereof, under stringent conditions as more particularly described above.
  • variant polypeptide can differ in amino acid sequence by one or more substitutions, deletions, insertions, inversions, fusions, and truncations or a combination of any of these. Further, variant polypeptides can be fully functional or can lack function in one or more activities. Fully functional variants typically contain only conservative variation or variation in non-critical residues or in non- critical regions. Functional variants can also contain substitution of similar amino acids that result in no change or an insignificant change in function. Alternatively, such substitutions can positively or negatively affect function to some degree. Nonfunctional variants typically contain one or more non-conservative amino acid substitutions, deletions, insertions, inversions, or truncation or a substitution, insertion, inversion, or deletion in a critical residue or critical region.
  • Amino acids that are essential for function can be identified by methods known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham, B. and Wells, J., Science, 244:1081-1085 (1989)). The latter procedure introduces single alanine mutations at every residue in the molecule. The resulting mutant molecules are then tested for biological activity (e.g., using an in vitro assay). Sites that are critical for polypeptide activity can also be determined by structural analysis, for example, by crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith, L. et al, J. Mol. Biol, 224:899-904 (1992); de Vos, A.
  • the isolated polypeptide can be purified from cells that naturally express it, purified from cells that have been altered to express it (recombinant), or synthesized using known protein synthesis methods.
  • the polypeptide is produced by recombinant DNA techniques. For example, a nucleic acid molecule encoding the polypeptide is cloned into an expression vector, the expression vector introduced into a host cell and the polypeptide expressed in the host cell. The polypeptide can then be isolated from the cells by an appropriate purification scheme using standard protein purification techniques.
  • polypeptides of the present invention can be used as a molecular weight marker on SDS-PAGE gels or on molecular sieve gel filtration columns using art-recognized methods.
  • the polypeptides of the present invention can be used to raise antibodies or to elicit an immune response.
  • the polypeptides can also be used as a reagent, e.g., a labeled reagent, in assays to quantitatively determine levels of the polypeptide or a molecule to which it binds (e.g., a receptor or a ligand) in biological fluids.
  • the polypeptides can also be used as markers for cells or tissues in which the corresponding polypeptide is preferentially expressed, either constitutively, during tissue differentiation, or in a diseased state.
  • the polypeptides can be used to isolate a corresponding binding partner, e.g., receptor or ligand, such as, for example, in an interaction trap assay, and to screen for peptide or small molecule antagonists or agonists of the binding interaction.
  • Antibodies that specifically bind one form of the gene product but not to the other form of the gene product are also provided. Antibodies are also provided that bind a portion of either the variant or the reference gene product that contains the polymorphic site or sites.
  • the invention provides antibodies to polypeptides having an amino acid sequence of SEQ ID NO: 3 or a variant ARRDC3 polypeptide.
  • antibody refers to immunoglobulin molecules and immunologically active portions of immunoglobulin molecules, i.e., molecules that contain an antigen binding site that specifically binds an antigen.
  • a molecule that specifically binds to a polypeptide of the invention is a molecule that binds to that polypeptide or a fragment thereof, but does not substantially bind other molecules in a sample, e.g., a biological sample that naturally contains the polypeptide.
  • immunologically active portions of immunoglobulin molecules include Fv, Fab, Fab' and F(ab')2 fragments. Such fragments can be produced by enzymatic cleavage or by recombinant techniques. For example, papain or pepsin cleavage can generate Fab or F(ab')2 fragments, respectively. Other proteases with the requisite substrate specificity can also be used to generate Fab or F(ab')2 fragments.
  • the invention provides polyclonal and monoclonal antibodies that bind to a polypeptide of the invention.
  • a monoclonal antibody composition thus typically displays a single binding affinity for a particular polypeptide of the invention with which it immunoreacts.
  • Polyclonal antibodies can be prepared as described above by immunizing a suitable subject with a desired immunogen, e.g., polypeptide of the invention or fragment thereof.
  • the antibody titer in the immunized subject can be monitored over time by standard techniques, such as with an enzyme linked immunosorbent assay (ELISA) using an immobilized polypeptide.
  • the antibody molecules directed against the polypeptide can be isolated from the mammal (e.g., from the blood) and further purified by well-known techniques, such as protein A chromatography (e.g., to obtain the IgG fraction).
  • antibody-producing cells can be obtained from the subject and used to prepare monoclonal antibodies by standard techniques, such as the hybridoma technique (Kohler, G.
  • an immortal cell typically a myeloma
  • a lymphocyte typically a splenocyte
  • the culture supernatants of the resulting hybridoma cells are screened to identify a hybridoma producing a monoclonal antibody that binds a polypeptide of the invention.
  • a monoclonal antibody to a polypeptide of the invention can be identified and isolated by screening a recombinant combinatorial immunoglobulin library (e.g., an antibody phage display library) with the polypeptide to thereby isolate immunoglobulin library members that bind the polypeptide.
  • Kits for generating and screening phage display libraries are commercially available (e.g., Pharmacia Recombinant Phage Antibody System, Catalog No. 27-9400-01; Stratagene SurfZAP™ Phage Display Kit, Catalog No. 240612). Additionally, examples of methods and reagents particularly amenable for use in generating and screening antibody display libraries can be found in, for example, U.S.
  • Patent No. 5,223,409; published PCT Application Nos. WO 92/18619, WO 91/17271, WO 92/20791, WO 92/15679, WO 93/01288, WO 92/01047, WO 92/09690 and WO 90/02809; Fuchs, P. et al., Biotechnology (NY), 9:1369-1372 (1991); Hay, B. et al., Hum. Antibodies Hybridomas, 3:81-85 (1992); Huse, W. et al., Science, 246:1275-1281 (1989); and Griffiths, A. et al., EMBO J., 12:725-734 (1993).
  • recombinant antibodies such as chimeric and humanized antibodies (e.g., antibodies comprising both human and non-human portions), which can be made using standard recombinant DNA techniques, are within the scope of the invention.
  • chimeric and humanized antibodies e.g., monoclonal antibodies
  • monoclonal antibodies can be produced by recombinant DNA techniques known in the art.
  • antibodies of the invention e.g., a monoclonal antibody
  • antibodies of the invention can be used to detect a polypeptide (e.g., in a cellular lysate, cell supernatant, tissue sample) in order to evaluate the abundance and pattern of expression of the polypeptide.
  • Antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, for example, to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling the antibody to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials.
  • suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase;
  • suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin;
  • suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin;
  • an example of a luminescent material includes luminol;
  • examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include 125I, 131I, 35S, 32P, 33P, 14C or 3H.
  • Example 1: Identification of At-Risk Haplotypes Associated with the ARRDC3 Gene and with Obesity. Genome scans were carried out with Icelandic family material that included > 2 affecteds and 5 or 6 meiotic events per cluster (connectivity ascertained by the Icelandic genealogy database).
  • Subjects suffering from clinical obesity (BMI > 30) were initially ascertained based on various end-point complications including type 2 diabetes, hypertension, stroke, myocardial infarction, familial combined hyperlipidemia and peripheral arterial occlusive disease. Different overlapping phenotypes (e.g., BMI >27, >30, >31....>35) were then examined.
  • Genome-wide Scans for Obesity Susceptibility Loci. The genome-wide scan was performed using a set of 1100 framework markers (microsatellites) with an average intermarker distance of 4 cM.
  • HRGM high-resolution genetic map
  • LOD logarithms of odds
  • NPL nonparametric linkage
  • the Allegro program produces LOD scores on the basis of multipoint calculations.
  • the baseline linkage analysis utilizes the S_pairs scoring function (Whittemore and Halpern, Biometrics, 50:118-127 (1994); Kruglyak et al., Am. J. Hum. Genet., 58:1347-1363 (1996)), the exponential allele-sharing model, and a family-weighting scheme that is half-way on the log scale between weighting each affected pair equally and weighting each family equally (Gretarsdottir et al., Am. J. Hum. Genet., 70:593-603 (2002)).
  • the genome-wide scan was initiated using 51 pedigrees containing 124 affected males and 257 relatives.
  • the unrestricted analysis (all) used 168 families containing 520 affected persons and 883 relatives and the lod score in the peak regions was 2.15.
  • the one-lod drop region is 7 cM.
  • the corresponding p-value was defined as the minimum p-value for the pair of markers over the same combination, provided the joint probability was higher than 0.05.
  • the LD assessment was carried out across the one-lod drop interval showing a near-uniform distribution of LD blocks.
  • the marker density at the 5q14 locus was one marker per 42 kb.
  • Haplotype frequencies in patients and random controls were estimated using an EM (expectation-maximization) algorithm. To assess the correct significance levels, multiple testing was adjusted for using randomization methods. For haplotypes consisting of one, two, or three markers, all possible haplotype combinations were tested for association with the disease.
  • the combined patient and control group was then randomly divided into two groups, equal in size to the original patient and control groups, and the association analysis was repeated.
  • N 100 to 1000
  • an iterative procedure was used (for computational reasons).
  • the most significant 3 markers haplotypes were extended to 4 markers, by including the remaining markers one-by-one, and those haplotypes tested for association. This procedure was then iterated, i.e., selecting at each step the most significant haplotypes and adding other markers, until haplotypes including (typically) 6 to 10 markers were reached.
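  • The randomization step described above (pooling patients and controls, re-dividing them at random into groups of the original sizes, and repeating the association analysis) can be sketched as a permutation test. This is a simplified illustration, not the procedure actually used: it assumes haplotype carrier status is already known for each individual (1 = carrier, 0 = non-carrier) rather than estimated by EM, uses the difference in carrier frequency as a hypothetical test statistic, tests one-sidedly for excess in patients, and applies a "+1" correction to the p-value. All names are illustrative.

```python
import random


def carrier_freq_diff(patients, controls):
    """Carrier frequency in patients minus carrier frequency in controls."""
    return sum(patients) / len(patients) - sum(controls) / len(controls)


def permutation_p_value(patients, controls, n_perm=1000, seed=0):
    """One-sided permutation p-value for excess carrier frequency in patients.

    The pooled sample is reshuffled n_perm times and split into groups of the
    original sizes; the p-value is the share of shuffles whose statistic is at
    least as large as the observed one (with a +1 correction).
    """
    rng = random.Random(seed)
    observed = carrier_freq_diff(patients, controls)
    pooled = list(patients) + list(controls)
    n_patients = len(patients)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if carrier_freq_diff(pooled[:n_patients], pooled[n_patients:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

In the text the number of randomizations is given as 100 to 1000; `n_perm` plays that role here. When many haplotypes are tested, the same shuffles would be applied to every haplotype and the most significant result per shuffle retained, which this single-haplotype sketch does not do.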
  • This haplotype involves microsatellite markers DG5S744, DG5S740, DG5S1387 and DG5S748 (haplotype I; see Table 1).
  • the CEPH sample 1347-02 CEPH genomics repository
  • the lower allele of each microsatellite in this sample is set at 0 and all other alleles in other samples are numbered accordingly in relation to this reference.
  • allele 1 is 1 bp longer than the lower allele in the CEPH sample 1347-02
  • allele 2 is 2 bp longer than the lower allele in the CEPH sample 1347-02
  • allele 3 is 3 bp longer than the lower allele in the CEPH sample 1347-02
  • allele 4 is 4 bp longer than the lower allele in the CEPH sample 1347-02
  • allele -1 is 1 bp shorter than the lower allele in the CEPH sample 1347-02
  • allele -2 is 2 bp shorter than the lower allele in the CEPH sample 1347-02
  • and so on; this same CEPH sample is a standard that is widely used throughout the world for calibration and comparison of alleles.
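  • The allele-naming convention above amounts to subtracting the length of the lower allele in the CEPH reference sample from the observed fragment length. A minimal sketch follows; the function names and the fragment lengths in the usage note are hypothetical and are not taken from the genotype data.

```python
def relative_allele(fragment_length_bp, reference_lower_allele_bp):
    """Allele number relative to the lower allele of the CEPH reference sample.

    0 means the same length as the reference lower allele, positive values are
    longer by that many bp, and negative values are shorter.
    """
    return fragment_length_bp - reference_lower_allele_bp


def relative_genotype(fragment_lengths_bp, reference_lower_allele_bp):
    """Sorted allele numbers for a genotype given as observed fragment lengths."""
    return sorted(
        relative_allele(length, reference_lower_allele_bp)
        for length in fragment_lengths_bp
    )
```

If the reference lower allele were 150 bp long (an invented value), fragments of 152 bp and 149 bp would be called allele 2 and allele -1, respectively.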
  • the KIAA1376 protein (e.g., GenBank Accession No.: NP_065852), one of the two included genes, shows a relatively strong identity (40-60%) to the human thioredoxin interacting protein (TXNIP) located on chromosome 1q21 (FIG. 4).
  • TXNIP is a major locus for familial combined hyperlipidemia in various populations (Pajukanta et al.
  • KIAA1376 was named "TXNIP homologue" (ARRDC3). Further characterization revealed arrestin-like motifs in the ARRDC3 protein.
  • a number of SNPs in and around the ARRDC3 gene were identified by sequencing 282 obese male humans. Using microsatellite markers and SNPs, a strong LD block of 120 kb was identified. The LD block encompasses the 14 kb ARRDC3 (KIAA1376) gene and 106 kb in the 5' UTR region including the ARRDC3 (KIAA1376) promoter. The LD block does not encompass any other genes but ARRDC3 (KIAA1376) (FIG. 5).
  • haplotype II involves the markers DG5S745, SG05S41, SG05S422, DG5S741, SG05S32, SG05S31, SG05S30 and SG05S651 (See Table 1 and Table 2).
  • SNPs in combination with microsatellite markers
  • carrier frequency in affecteds males with BMI in the top 10% of the distribution
  • PAR population attributable risk
  • the number of affecteds was 755, while the number of controls was 406.
  • This haplotype includes the markers: SG05S40, SG05S37, SG05S421, SG05S31, SG05S30 and DG5S743.
  • these markers all reside within the LD block and are highly correlated (FIG. 7.2).
  • a SNP-only haplotype was also identified.
  • This haplotype includes the markers SG05S40, SG05S436, SG05S435, SG05S433, SG05S37,
  • DG5S748 CCGATCAGGATCTCATTTAATCTGTCCATATTAAATGCAATAGCCTCCTC AGTATAAAGATGTGTGTGTATATGAAATGCATATGTCGTGTGTGTGTG TGTGTGTGTGTACATATATATAGGATAAAGGTTCCGACAGCT (SEQ ID NO: 4)
  • DG5S743 ATTCTTGGGACAACCAATGCCAGAAATTTTGACAAATATTCTCATAAGTG CTAAAGCAGATGAGTGAGGAAGCACAAGGTGTTAGTTAAAGAGTAACTG AAAGCAAGGAGATTTTATCTGGGTGTCACCAGAAGTGCAGCATTTGTAT TGAAGTAGCTGGAGGACATTAGAAAAAACAGACAGACAGACAGGAATA AATGAGGATGTTAAGGAAGAATTTTACAAAATCTCTCTACTAAGTTTCCCA GATCAGCAAACATTTTCAGACAGAGAGAGAGAGAGAGAGAGAGAGA GAGAGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGT
  • Example 2: The ARRDC3 human obesity linkage region is syntenic to a murine obesity linkage region. Following positional cloning of the male obesity susceptibility locus described in Example 1, a syntenic murine obesity susceptibility locus has been identified.
  • the term "syntenic" describes a group of two or more chromosomal regions, found on one locus in a first species, that are also found in a homologous locus in a second species.
  • FIG. 1 highlights the human chromosome 5 linkage to BMI in obese males supporting the ARRDC3 locus. Existence of two distinct loci controlling for BMI in males is supported by the broad linkage region.
  • although the linkage is clearly significant, with a peak lod score greater than 4, the region supported by the linkage is quite large.
  • One method that has proven useful in narrowing such linkage regions is "intersecting" them with linkages in other species.
  • the left panel depicts several peaks corresponding to linkages on murine chromosome 13 to obesity-related traits in the BXD mouse cross described by Drake et al Physiol Genomics (2001) Apr 27; 5(4): 205-15.
  • the traits represented include insulin levels, leptin levels, and fat mass.
  • ARRDC3 is a candidate for the distal peak in both species identified through positional cloning.
  • The ARRDC3 gene region was examined to determine if DNA variations in this gene could be found that were polymorphic between these two strains. In total, three single nucleotide polymorphisms were identified between the transcription START and STOP sites, with two falling in the 3'UTR and one in an intron. Further, ARRDC3 is located in a region that was determined not to be identical by descent between the B6 and DBA strains of mice, using the existing genomic sequence for these strains. These data are consistent with ARRDC3 as a positional candidate for the murine QTL (i.e., we would expect the B6 and DBA strains to carry different haplotypes for this gene if the gene were the gene carrying the QTL).
  • Example 3 ARRDC3 is expressed in diverse tissues The general expression of ARRDC3 was assessed to determine whether it was expressed in various tissue types in both human and mouse populations.
  • FIG. 15 depicts the mean expression levels of ARRDC3 in over 80 human tissues and cell lines. This expression atlas was constructed by examining the relative transcript abundances in multiple samples for each of these tissue types for approximately 23,000 genes. ARRDC3 is seen to be more active in peripheral tissue like blood, adipose and liver tissue, compared to more central tissues like hypothalamus known to be involved in obesity related traits. In fact, effectively no expression of ARRDC3 was seen in hypothalamus tissue, suggesting that its function related to obesity may in fact be specific to peripheral tissues where it is seen to be more highly expressed.
  • Example 4 ARRDC3 expression in blood is linked to a hotspot eLOD region on chromosome 9
  • NPL nonparametric linkage analysis
  • a genome-wide scan for linkage was conducted for the ARRDC3 gene expression trait.
  • the chromosome 9 locus is a hotspot for eLOD activity, with the expression of hundreds of genes in blood partially explained by this locus.
  • the GO functional categories associated with genes represented on the array are tested using the Fisher Exact Test to determine if the categories are over represented in the 250 gene set.
  • a significant p-value suggests possible functional roles for the cluster.
  • Proteome GeneOntology Functional Categories P-value nucleus 1.91E-08 transcription factor activity 2.10E-05 small nucleolar ribonucleoprotein complex 0.000162887 negative regulation of G-protein coupled receptor protein signaling pathway 0.000222109 transcriptional activator activity 0.000347302 regulation of transcription, DNA-dependent 0.000413329 nucleosome spacing 0.000759217
  • the group of genes co-localizing to the chromosome 9 locus strongly controlling ARRDC3 expression in females is significantly enriched for genes that are significantly correlated with obesity-related traits. This is of note because the ARRDC3 gene has been implicated in obesity for males with much less of an effect observed in females. Therefore, it may be that the control of ARRDC3 in females is different than in males, so that even though females may carry the specific alleles that lead to obesity when carried in males, there may be compensatory factors at play in females that result in less exposure to disease. The female specific genetic control of the expression of this gene observed in the family pilot study is consistent with these observations.
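The Fisher Exact Test used above for GO category over-representation can be sketched as a one-sided hypergeometric tail probability. This is a minimal illustration, not the analysis pipeline used in the study: the function name and all counts in the usage lines are hypothetical, and only the array size (about 23,000 genes) and the 250 gene set size come from the text.

```python
from math import comb

def fisher_exact_greater(in_set_in_cat, in_set_total, in_cat_total, array_total):
    """One-sided Fisher exact p-value for over-representation.

    Probability of seeing at least `in_set_in_cat` category members in a
    gene set of size `in_set_total`, when `in_cat_total` of the
    `array_total` genes on the array belong to the category.
    """
    k_max = min(in_set_total, in_cat_total)
    denom = comb(array_total, in_set_total)
    tail = sum(
        comb(in_cat_total, k) * comb(array_total - in_cat_total, in_set_total - k)
        for k in range(in_set_in_cat, k_max + 1)
    )
    return tail / denom

# Hypothetical counts: 40 of the 250 genes fall in a GO category that
# holds 1,500 of the 23,000 genes on the array.
p_value = fisher_exact_greater(40, 250, 1500, 23000)
print(f"one-sided p-value: {p_value:.3g}")
```

A small p-value here means the category appears in the gene set more often than random sampling from the array would explain. In practice a correction for testing many categories would also be needed, which this sketch omits.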
  • Example 5 ARRDC3 expression in adipose tissue is significantly correlated with BMI in males but not females
  • the expression of ARRDC3 was further explored to examine patterns of expression in adipose associated with obesity-related traits.
  • subcutaneous and omental fat tissue samples were collected from approximately 80 individuals.
  • the same survey/phenotyping protocol employed for the experiment described in Example 4 was employed in this study.
  • RNA was isolated from the adipose samples of each individual and hybridized to a gene expression microarray representing 23,000+ genes, including ARRDC3.
  • FIG. 17 highlights a significant correlation between ARRDC3 expression and BMI for those individuals participating in this pilot study.
  • Example 6 Patterns of adipose tissue expression associated with ARRDC3 expression discriminate high BMI individuals from low BMI individuals
  • Two hundred and fifty genes (the GenBank accession numbers of the genes used in this experiment are given in Table 4) representing the genes most correlated with ARRDC3 expression in adipose tissue were clustered using 2-dimensional, unsupervised agglomerative hierarchical clustering, where in both dimensions the heuristic criteria parameter was set to Average Link and the similarity measure was error weighted Pearson correlation coefficient.
  • FIG. 18 shows a pattern of expression associated with ARRDC3 expression in adipose tissue.
  • the pattern of expression more strongly supports the association to BMI compared to the correlation of ARRDC3 expression alone.
  • the genes are clustered along the x-axis, and the adipose samples are clustered along the y-axis.
  • ARRDC3 -associated genes and BMI indicate that this pattern is biologically meaningful, but the over-representation of Gene Ontology (GO) functional categories in the set of genes used in the clustering procedure also supports the significance of the cluster as it relates to obesity traits and suggests possible functional roles for ARRDC3. If the set of genes whose expression were associated with ARRDC3 expression were an artifact, we would not expect to see significant over representation of the GO categories.
  • Table 4 A list of GO categories over represented in the 250 gene set is given in Table 4.
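The clustering procedure described in Example 6 (agglomerative, Average Link, correlation-based similarity) can be sketched in one dimension as follows. This is an illustration under stated assumptions: it uses the plain Pearson correlation rather than the error weighted variant named in the text, the function names and toy profiles are hypothetical, and every profile is assumed to have non-zero variance.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def average_link_clusters(profiles, n_clusters):
    """Agglomerative clustering with average linkage.

    Repeatedly merges the two clusters whose members have the highest
    mean pairwise correlation, until `n_clusters` clusters remain.
    Returns lists of profile indices.
    """
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sims = [pearson(profiles[i], profiles[j])
                        for i in clusters[a] for j in clusters[b]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Toy expression profiles (hypothetical): two rising, two falling.
toy_profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
    [8.0, 6.0, 4.0, 2.5],
]
toy_clusters = average_link_clusters(toy_profiles, 2)
print(toy_clusters)
```

For a 2-dimensional clustering as in the text, the same procedure would be applied once to the gene profiles and once to the sample profiles (the transposed matrix).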
  • NM 022734 Genes associated with ARRDC3 expression in adipose tissue are also associated with ARRDC3 expression in blood It was of interest to determine whether genes interacting with ARRDC3 expression in adipose tissue were also interacting with ARRDC3 expression in other tissues. Lists of genes associated with ARRDC3 in blood and adipose were constructed and the overlap between the lists was examined to determine if there was a statistically significant enrichment. Of the top 950 genes most significantly associated with ARRDC3 in blood and adipose tissues, there were 116 genes in the overlap, a very significant enrichment, given there were more than 23,000 genes represented on the array. Table 5 gives the GenBank accession numbers for the 116 genes. This set of genes potentially represents a more "core" set of genes most strongly interacting with ARRDC3, given their interaction with ARRDC3 is seen in multiple tissues.
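To see why an overlap of 116 genes counts as enrichment, it helps to compare it with the overlap expected by chance. The arithmetic below is a rough sketch that assumes two independent lists of 950 genes each, drawn from exactly 23,000 array genes. The text says "more than 23,000", so the chance expectation computed here is, if anything, slightly high.

```python
# Counts taken from the text; the independence model is an assumption.
array_genes = 23000       # lower bound on genes represented on the array
top_blood = 950           # top genes associated with ARRDC3 in blood
top_adipose = 950         # top genes associated with ARRDC3 in adipose
observed_overlap = 116

# Expected overlap if the two lists were unrelated random draws.
expected_overlap = top_blood * top_adipose / array_genes
fold_enrichment = observed_overlap / expected_overlap

print(f"expected overlap by chance: {expected_overlap:.1f} genes")
print(f"fold enrichment: {fold_enrichment:.2f}")
```

Under these assumptions the observed overlap is roughly three times the chance expectation of about 39 genes.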
  • Example 8 Expression pattern of ARRDC3 gene in human subcutaneous fat distinguishes fed and fasted states
  • The gene expression pattern in adipose tissue was examined in either fasted or fed individuals. Samples of RNA were collected from two groups of individuals. In Group 1, two biopsies of subcutaneous adipose tissue were collected one week apart from ten healthy donors. All had been fasting overnight. In Group 2, two biopsies of subcutaneous adipose tissue were collected one week apart from ten healthy donors.
  • FIG. 21 illustrates the experimental design for Group 2.
  • RNA was isolated from the fat tissue samples and hybridized to DNA microarrays using 24,000 gene-specific probes. Expression levels were analyzed using ANalysis Of VAriance (ANOVA), a calculation procedure that partitions the variation in the data and determines whether it is significant or attributable to random noise.
  • ANOVA ANalysis Of VAriance
  • Intra-individual comparison between two fasting data points (week 1 and week 2) in Group 1 (n = 10) yielded 0 results (0 genes responding to feeding) at ANOVA p-value < 0.001 and 4 genes responding to feeding at ANOVA p-value < 0.01.
  • intra- and inter-individual comparison yielded 114 genes that are responding to feeding at ANOVA p-value < 0.001 and 402 genes at ANOVA p-value < 0.01. The 402 genes found responding to feeding in the combined intra- and inter-individual comparisons at ANOVA p-value < 0.01 were subjected to clustering analysis similar to the one described in Example 6. As shown in FIG. 22 and FIG.
  • PDK4 is also found among the genes that cluster with ARRDC3 gene expression in visceral fat, a cluster that discriminates between subjects with high BMI from subjects with low BMI. As shown in FIGs. 25.1 and 25.2 for Group 2 (vertical axis shows log ratio of the expression level), the transcript levels of PDK4 in adipose tissue were reduced upon food intake.
  • fasting signature distinguishes between fasted and fed states.
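The ANOVA step in Example 8 compares variation between conditions with variation within them. The sketch below computes a one-way F statistic for a single gene. It is a simplification: it treats fasted and fed measurements as independent groups and ignores the pairing of biopsies from the same donor, the function name and the log-ratio values are hypothetical, and converting F to a p-value would need an F-distribution routine that is not shown here.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of value groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups for v in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical log-ratio expression values for one gene.
fasted = [1.0, 2.0, 3.0]
fed = [2.0, 3.0, 4.0]
f_stat = one_way_anova_f([fasted, fed])
print(f"F = {f_stat:.2f}")
```

A large F means the difference between fasted and fed means is large relative to the scatter within each state. A gene would be called responsive to feeding when the corresponding p-value falls below the chosen threshold (0.001 or 0.01 in the text).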

Abstract

A role of the human ARRDC3 (KIAA1376) gene in obesity is disclosed. Methods for diagnosis of obesity and treatment for obesity are also disclosed.

Description

HAPLOTYPES IN THE HUMAN THIOREDOXIN INTERACTING PROTEIN HOMOLOGUE (ARRDC3) GENE ASSOCIATED WITH OBESITY
RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Application No. 60/649,895, filed on February 2, 2005 and U.S. Provisional Application No. 60/566,982, filed on April 30, 2004. The entire teachings of the above applications are incorporated herein by reference.
BACKGROUND OF THE INVENTION
Obesity is one of the most serious and widespread health problems facing the world community. It is estimated that currently in the United States, 55% of adults are obese or overweight and 20% of teenagers are either obese or significantly overweight. In addition, 6% of the total population of the United States is morbidly obese (defined as having a body mass index (BMI) of more than forty). This data is alarming, as it indicates an obesity epidemic. Many health conditions are consequences of being overweight. For example, obesity is a known risk factor for the development of diabetes and is asserted to be the cause of approximately 80% of Type 2 diabetes (e.g., adult onset diabetes) in the United States. Obesity is also a substantial risk factor for a wide range of cardiovascular, metabolic and other diseases and disorders (e.g., coronary artery disease, dyslipidemias (e.g., hyperlipidemia), stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis and some forms of cancer (e.g., colorectal cancer, breast cancer)). More recently, researchers have documented links between obesity and infertility, and obesity and miscarriages. Susceptibility to obesity is determined by genetic, environmental (e.g., food availability, sociocultural factors, lifestyle) and regulatory factors (e.g., pregnancy, increases in fat cells and adipose tissue during infancy, childhood and/or adulthood, brain damage, drugs, endocrine factors and psychological factors (e.g., binge eating disorder, night-eating disorder)). Studies using twins, adopted children and animal models of obesity have demonstrated that genetic factors are clearly implicated in the dynamics of gaining weight. Morbid obesity in humans appears to have a particularly strong genetic component. Genetic risk is conferred by subtle differences in genes among individuals in a population. 
Genes differ between individuals most frequently due to single nucleotide polymorphisms (SNPs), although other variations can occur (e.g., insertions, deletions, translocations or inversions of either short or long stretches of DNA). SNPs are located on average every 1000 base pairs in the human genome. Accordingly, a typical human gene containing 250,000 base pairs may contain 250 different SNPs. Only a minor number of SNPs are located in exons and alter the amino acid sequence of the protein encoded by a gene. Most SNPs have no effect on gene function, while others may alter transcription, splicing, translation or stability of the mRNA encoded by a gene. As genetic polymorphisms that confer risk of disease are uncovered, genetic testing for such risk factors becomes important for clinical medicine. At the present time, the prognosis for obesity is poor. If left untreated, obesity tends to progress. There are a number of treatments that patients undertake in an attempt to lose weight, including dieting, exercise, behavior therapy, drugs and surgery. Even with treatment resulting in weight loss, most patients return to their pre-treatment weight within five years of treatment. Given the genetic link to obesity, there is a need to identify molecular markers for the early detection of susceptible individuals so that intervention regimes may be instituted for delay or prevention of obesity and its complications. Further, identification of specific susceptibility genes will allow for the design of more effective treatments that can target specific molecular targets.
SUMMARY OF THE INVENTION
As described herein, the gene thioredoxin interacting protein homologue, known as ARRDC3 (and sometimes referred to as TXNIPH or KIAA1376), is associated with obesity and obesity-associated conditions. It has been discovered that particular combinations of genetic markers ("haplotypes") are present at a higher than expected frequency in obese patients. The markers that are included in the haplotypes described herein are associated with the genomic region that directs expression of the ARRDC3 gene. In one aspect, the invention relates to a method of diagnosing a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the presence of the at-risk haplotype associated with the ARRDC3 gene is indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition. In one embodiment, the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2. In another embodiment, the at-risk haplotype comprises a haplotype comprising one or more markers selected from FIGS. 2, 6, 7.1, or 8. In another embodiment, the at-risk haplotype is selected from the group consisting of: haplotype I, haplotype II, haplotype III, haplotype IV, and a combination of haplotype I, haplotype II, haplotype III, and haplotype IV. In another aspect, the invention relates to a method of diagnosing a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype I, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition. 
In another aspect, the invention relates to a method of diagnosing a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype II, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition. In another aspect, the invention relates to a method of diagnosing a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype III, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition. In another aspect, the invention relates to a method of diagnosing a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype IV, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition. In another aspect, the invention relates to a method of determining the genetic basis of obesity or an obesity-associated condition, comprising detecting the presence of an at-risk haplotype associated with the ARRDC3 gene, wherein the presence of the at-risk haplotype is indicative that the obesity and/or obesity- associated condition is mediated by ARRDC3. In one embodiment, the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2. 
In another embodiment, the at-risk haplotype comprises a haplotype comprising one or more markers selected from FIGS. 2, 6, 7.1, or 8. In another embodiment, the at-risk haplotype is selected from the group consisting of: haplotype I, haplotype II, haplotype III, haplotype IV, and a combination of haplotype I, haplotype II, haplotype III, and haplotype IV. In another aspect, the invention features a method of treating or preventing obesity and/or an obesity-associated condition in a subject, comprising administering a compound that increases the expression or biological activity of ARRDC3 to the subject in need thereof, in a therapeutically effective amount. In one embodiment, the subject has an at-risk haplotype associated with the ARRDC3 gene. In another embodiment, the subject has decreased ARRDC3 expression or activity. In another embodiment, the subject has increased thioredoxin expression or activity. In another aspect, the invention relates to a method of reducing triglyceride levels in a subject, comprising administering a compound that increases the expression or biological activity of ARRDC3 to the subject, in a therapeutically effective amount. In one embodiment, the subject has an at-risk haplotype associated with the ARRDC3 gene. In another embodiment, the subject has decreased ARRDC3 expression or activity. In another embodiment, the subject has increased thioredoxin expression or activity. In another aspect, the invention relates to a method of increasing fatty acid oxidation in a subject, comprising administering a compound that increases the expression or biological activity of ARRDC3 to the subject, in a therapeutically effective amount. In one embodiment, the subject has an at-risk haplotype associated with the ARRDC3 gene. In another embodiment, the subject has decreased ARRDC3 expression or activity. In another embodiment, the subject has increased thioredoxin expression or activity. 
In another aspect, the invention relates to a method of assessing a subject for an increased risk of obesity and/or an obesity-associated condition, comprising assessing the interaction between ARRDC3 and thioredoxin in the subject, wherein an increased level of interaction is indicative of a decreased risk of obesity and/or an obesity-associated condition. In one embodiment, the subject has an at-risk haplotype associated with the ARRDC3 gene. In another embodiment, the subject has decreased ARRDC3 expression or activity. In another embodiment, the subject has increased thioredoxin expression or activity. In another aspect, the invention relates to a method of assessing response to treatment with a compound that increases the level of expression or biological activity of ARRDC3 by a subject in a target population, comprising: assessing the level of expression or biological activity of ARRDC3 in the subject before treatment with a compound that increases the expression or biological activity of ARRDC3; assessing the level of expression or biological activity of ARRDC3 in the subject during or after treatment with the compound that increases the expression or biological activity of ARRDC3; and comparing the level of the expression or biological activity of ARRDC3 before treatment with the level of the expression or biological activity of ARRDC3 during or after treatment, wherein a level of the expression or biological activity of ARRDC3 during or after treatment that is significantly higher than the level of the expression or biological activity of ARRDC3 before treatment, is indicative of efficacy of treatment with the compound that increases the expression or biological activity of ARRDC3. 
In another embodiment, the invention features a method of diagnosing a predisposition or susceptibility to obesity in a subject, comprising detecting the presence or absence of a genetic marker associated with the ARRDC3 gene, the marker having a p-value of 1×10⁻⁵ or less, wherein the presence of the marker associated with the ARRDC3 gene is indicative of a predisposition or susceptibility to obesity. In another embodiment, the invention features a method of diagnosing a predisposition or susceptibility to an obesity-associated condition in a subject, comprising detecting the presence or absence of a genetic marker associated with the ARRDC3 gene, the marker having a p-value of 1×10⁻⁵ or less, wherein the presence of the marker associated with the ARRDC3 gene is indicative of a predisposition or susceptibility to an obesity-associated condition. In particular embodiments of the methods of the invention, the at-risk haplotype comprises haplotype I, haplotype II, haplotype III, haplotype IV, or combinations of haplotype I, haplotype II, haplotype III, and haplotype IV. In other particular embodiments of the methods of the invention, determination of the presence or absence of the at-risk haplotype comprises enzymatic amplification, electrophoretic analysis, sequence analysis and/or restriction fragment length polymorphism analysis. In other embodiments, the at-risk haplotype has a relative risk of at least 1.5, at least 2.5 or at least 3.0. In other embodiments, the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1×10⁻⁵ or less, 1×10⁻⁶ or less, 1×10⁻⁷ or less or 1×10⁻⁸ or less. 
In still other embodiments, the method is for diagnosing an obesity-associated condition such as diabetes (e.g., Type 2 or adult onset diabetes), coronary artery disease, peripheral arterial occlusive disease, myocardial infarction, dyslipidemias, stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis, infertility, miscarriages, or cancer, or a susceptibility to an obesity-associated condition in a subject. In another aspect, the invention features a kit for assaying a sample from a subject to detect a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, wherein the kit comprises one or more reagents for detecting an at-risk haplotype associated with the ARRDC3 gene. In one embodiment, the nucleic acid comprises at least one contiguous nucleotide sequence that is completely complementary to a region comprising at least one of the markers of the at-risk haplotype. In another embodiment, the one or more reagents comprise one or more nucleic acids that are capable of detecting one or more specific markers of an at-risk haplotype associated with the ARRDC3 gene. In another embodiment, the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2, for example, a haplotype selected from the group consisting of: haplotype I, haplotype II, haplotype III, haplotype IV, and a combination of haplotype I, haplotype II, haplotype III, and haplotype IV. In another embodiment, the at-risk haplotype comprises a haplotype comprising one or more markers selected from FIGS. 2, 6, 7.1, or 8. 
In another aspect, the invention features a kit for assaying a sample from a subject to detect a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, wherein the kit comprises: a) one or more labeled nucleic acids capable of detecting one or more specific markers of an at-risk haplotype associated with the ARRDC3 gene; and b) reagents for detection of the label. In one embodiment, the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2, for example, a haplotype selected from the group consisting of: haplotype I, haplotype II, haplotype III, haplotype IV, and a combination of haplotype I, haplotype II, haplotype III, and haplotype IV. In another embodiment, the at-risk haplotype comprises a haplotype comprising one or more markers selected from FIGS. 2, 6, 7.1, or 8. In the above kit claims, the obesity-associated condition can be, for example, diabetes (e.g., Type 2 or adult onset diabetes), coronary artery disease, peripheral arterial occlusive disease, myocardial infarction, dyslipidemias, stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis, infertility, miscarriages or cancer. The invention helps meet unmet medical needs in at least two major ways. First, the invention provides a means to define patients at higher risk for obesity or an obesity-associated condition than the general population. These at-risk patients can be more aggressively managed by their physicians in an effort to prevent obesity. Secondly, the invention identifies a drug target that can be used to screen and develop therapeutic agents that can be used to treat or prevent obesity and/or an obesity-associated condition.
BRIEF DESCRIPTION OF THE DRAWINGS
The patent or application file contains at least one drawing executed in color. 
Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee. The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention. FIG. 1 is a graph depicting linkage of obesity in males to human chromosome 5. Lod scores are plotted for human chromosome 5. The region between microsatellites D5S428 and D5S644 and particular genes contained therein are shown below the Lod graph. FIG. 2 is a table depicting haplotype I and its association with severe obesity (BMI>35) in males. FIG. 3.1 is a schematic of annotated sequences contained in the 280 kb region defined by the boundary markers of haplotype I. FIG. 3.2 is a table summarizing the analysis of the expressed sequences depicted in FIG. 3.1. Genes containing or missing exons are indicated. FIG. 4 is a schematic of expressed sequences ARRDC3 (KIAA1376) and XM_299045, which are contained in the region defined by the boundary markers of haplotype I. FIG. 5 is a schematic of expressed sequences ARRDC3 (KIAA1376) and XM_299045, which are contained in the region defined by the boundary markers of haplotype I. ARRDC3 is also contained within the LD block. FIG. 6 is a table depicting haplotype II and its association with severe obesity (BMI: top 10%) in males. FIG. 7.1 is a table depicting SNPs and markers (haplotype III) associated with severe obesity (BMI: top 10%) in males. FIG. 7.2 is a table showing that the markers exhibiting the strongest association are found in the same LD block and are highly correlated. LD between markers was calculated using the standard definitions of D' and R2 (Lewontin, R., Genetics, 49:49-67 (1964) and Hill, W.G. & Robertson, A. Theor. Appl. Genet., 22:226-231 (1968)). 
FIG. 8 is a table showing a SNP-only haplotype (haplotype IV) identified across the LD block that is associated with severe obesity (BMI: top 10%) in males. FIG. 9 is a schematic representation of a hypothetical model for the role of TXNIPH in the regulation of obesity. It is proposed that ARRDC3 shifts fuel metabolism from storage to oxidative by inhibiting the reducing activity of thioredoxin (TXN). The proposed model is based on the close identity between ARRDC3 and TXNIP, as well as the biology of TXNIP. TXN undergoes a reversible oxidation to the cystine disulfide through transfer of reducing equivalents (a disulfide substrate). The oxidized TXN is reduced back to the Cys form by the NADPH-dependent thioredoxin reductase (TXNR). ARRDC3 may interact with the reduced form of TXN and inhibit its reducing activity. Thus increased activity of ARRDC3 would alter the cellular redox state, resulting in a decreased ratio of NADH to NAD+. Increased levels of NAD+ would then activate the TCA cycle and shift fuel metabolism from storage to oxidative. FIGS. 10.1 to 10.122 show the genomic sequence of the human ARRDC3 gene and surrounding sequence that directs expression of ARRDC3 (SEQ ID NO: 1). FIGS. 11.1 to 11.2 are the cDNA sequence of ARRDC3 (SEQ ID NO: 2; GenBank Accession No.: NM_020801). FIG. 12 is the ARRDC3 polypeptide sequence (SEQ ID NO: 3; GenBank Accession No.: NP_065852). FIG. 13 is a schematic showing a role in fuel oxidation and/or desensitization of G-protein coupled receptors (GPCR) for ARRDC3. FIG. 14 shows synteny between the regions of human and murine chromosome 5 associated with obesity. FIG. 15 shows the mean expression levels of ARRDC3 in over 80 human tissues and cell lines. FIG. 16 shows the lod score curve for the chromosome 9 linkage with the obesity related locus on chromosome 5. FIG. 17 shows correlation plots for correlation between ARRDC3 expression and BMI in males and females, using either subcutaneous or omental fat. FIG. 
18 highlights a pattern of expression associated with ARRDC3 expression in adipose tissue. FIG. 19 shows an adipose cluster of genes most correlated with ARRDC3 in adipose and blood tissues. FIG. 20 shows clustering over the same set of genes described in FIG. 19, but using the expression results from the blood profiling study. FIG. 21 is a schematic representation of fasting/feeding schedule of the volunteers in Group 2 in the study of fasting signature in human subcutaneous fat, Example 8. FIG. 22 shows clustering results of 420 genes identified in the study of fasting signature in human subcutaneous fat in Group 2, Example 8. FIG. 23 shows clustering results of 420 genes identified in the study of fasting signature in human subcutaneous fat in Groups 1 and 2, Example 8. FIG. 24.1 shows reduction in transcript levels of ARRDC3 in adipose tissue upon food intake in Group 2, Example 8. FIG. 24.2 shows reduction in transcript levels of ARRDC3 in adipose tissue upon food intake in the combined pool of Group 1 and Group 2, Example 8. FIG. 25.1 shows reduction in transcript levels of PDK4 in adipose tissue upon food intake in Group 2, Example 8. FIG. 25.2 shows reduction in transcript levels of PDK4 in adipose tissue upon food intake in Group 2, Example 8.
DETAILED DESCRIPTION OF THE INVENTION
A genome wide scan of large extended families (data on over 20,000 individuals, > 17,000 genotyped) and fine-mapping efforts were performed to collect phenotypic and genotypic data on obese subjects. Of the 20,000 individuals that were studied, more than 4,000 satisfied the clinical definition of obesity, defined as having a Body Mass Index (BMI) of 30 or more (BMI is calculated as the weight in kilograms divided by the square of the height in meters). Genome scans for obesity revealed a major locus at 5q14 with evidence for linkage (lod score = 4.56, p-value = 0.0000023, information 0.93) using male affecteds only (FIG. 1). Ultra fine mapping and association studies revealed that the locus 5q14 encompassed a gene called thioredoxin interacting protein homolog (ARRDC3; also called TXNIPH or KIAA1376). Using genotypes based on microsatellites and single nucleotide polymorphisms (SNPs), it is demonstrated herein that particular combinations of genetic markers, e.g., a combination of SNPs and/or microsatellites (referred to hereinafter as "haplotypes") encompassing ARRDC3 are significantly associated with obesity or susceptibility to obesity. In particular, it has been discovered that particular haplotypes appear at higher than expected frequencies in subjects who are obese. Methods for the diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition are described herein and are encompassed by the invention. Kits for assaying a subject to detect a predisposition or susceptibility to obesity and/or an obesity-associated condition are also encompassed by the invention. In addition, methods for treating obesity, a susceptibility to obesity, an obesity-associated condition and/or a susceptibility to an obesity-associated condition in a subject are also encompassed by the invention. The ARRDC3-associated haplotypes describe a set of genetic markers associated with ARRDC3. 
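The two numeric definitions in the passage above, BMI and the relation between a lod score and its p-value, can be made concrete with a short sketch. The text does not say how the p-value was derived, so the conversion below is an assumption: it uses the common large-sample relation in which 2·ln(10)·lod is compared with a chi-square distribution with one degree of freedom, halved for a one-sided test. The function names and the example weight and height are hypothetical.

```python
from math import erfc, log, sqrt

def bmi(weight_kg, height_m):
    """Body Mass Index: weight in kilograms over height in meters squared."""
    return weight_kg / height_m ** 2

def lod_to_p(lod):
    """Approximate one-sided p-value for a lod score (assumed relation).

    2 * ln(10) * lod is treated as chi-square with 1 degree of freedom;
    the upper tail of that distribution equals erfc(sqrt(x / 2)).
    """
    chi_sq = 2.0 * log(10.0) * lod
    return 0.5 * erfc(sqrt(chi_sq / 2.0))

# Hypothetical subject: 90 kg and 1.73 m gives a BMI just above the
# obesity threshold of 30 used in the text.
print(f"BMI: {bmi(90.0, 1.73):.1f}")

# Lod score reported for the chromosome 5 locus.
print(f"p-value for lod 4.56: {lod_to_p(4.56):.2g}")
```

Under this assumed relation, the lod score of 4.56 gives a p-value close to the 0.0000023 stated in the text, which suggests a conversion of this kind, though the exact procedure used is not given.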
In a certain embodiment, the haplotype can comprise one or more markers, two or more markers, three or more markers, four or more markers, five or more markers, six or more markers, seven or more markers, eight or more markers, nine or more markers, ten or more markers, eleven or more markers, twelve or more markers, thirteen or more markers or fourteen or more markers. The genetic markers are particular "alleles" at "polymorphic sites" associated with ARRDC3. A nucleotide position at which more than one nucleotide is possible in a population (either a natural population or a synthetic population, e.g., a library of synthetic molecules) is referred to herein as a "polymorphic site".
Where a polymorphic site is a single nucleotide in length, the site is referred to as a single nucleotide polymorphism ("SNP"). For example, if at a particular chromosomal location, one member of a population has an adenine and another member of the population has a thymine at the same position, then this position is a polymorphic site, and, more specifically, the polymorphic site is a SNP.
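The definition of a polymorphic site given above can be sketched in code. This is an illustrative example only: the function name and the toy sequences are hypothetical, the sequences are assumed to be already aligned to the same length, and only single-nucleotide substitutions are considered (insertions and deletions are ignored).

```python
from collections import Counter


def polymorphic_sites(aligned_sequences):
    """Return {position: Counter of alleles} for each column with more than one allele.

    Positions are 0-based. Sequences must already be aligned and of equal length.
    """
    if not aligned_sequences:
        return {}
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")

    sites = {}
    for pos in range(length):
        alleles = Counter(seq[pos].upper() for seq in aligned_sequences)
        if len(alleles) > 1:  # more than one nucleotide observed at this position
            sites[pos] = alleles
    return sites


# Toy population: one member carries a thymine where the others carry an adenine.
population = ["GGCATTC", "GGCTTTC", "GGCATTC"]
sites = polymorphic_sites(population)

if __name__ == "__main__":
    for pos, alleles in sites.items():
        print(f"SNP at position {pos}: {dict(alleles)}")
```

In this toy population, position 3 is a SNP with an adenine allele and a thymine allele, mirroring the example in the text.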
Polymorphic sites can allow for differences in sequences based on substitutions, insertions or deletions. Each version of the sequence with respect to the polymorphic site is referred to herein as an "allele" of the polymorphic site. Thus, in the previous example, the SNP allows for both an adenine allele and a thymine allele. Typically, a reference sequence is referred to for a particular sequence. Alleles that differ from the reference are referred to as "variant" alleles. For example, the reference genomic sequence that contains the gene encoding ARRDC3, and which is associated with obesity or a predisposition or susceptibility to obesity and/or an obesity-associated condition is described herein by SEQ ID NO: 1. The term, "variant ARRDC3", as used herein, refers to a ARRDC3 sequence that differs from SEQ ID NO: 1, but is otherwise substantially similar. The genetic markers that make up the haplotypes described herein are ARRDC3 variants. The variants of ARRDC3 that are used to determine the haplotypes disclosed herein are associated with a susceptibility to a number of obesity and obesity-associated phenotypes. Additional variants can include changes that affect a polypeptide, e.g., the ARRDC3 polypeptide. 
These sequence differences, when compared to a reference nucleotide sequence, can include the insertion or deletion of a single nucleotide, or of more than one nucleotide, resulting in a frame shift; the change of at least one nucleotide, resulting in a change in the encoded amino acid; the change of at least one nucleotide, resulting in the generation of a premature stop codon; the deletion of several nucleotides, resulting in a deletion of one or more amino acids encoded by the nucleotides; the insertion of one or several nucleotides, such as by unequal recombination or gene conversion, resulting in an interruption of the coding sequence of a reading frame; duplication of all or a part of a sequence; transposition; or a rearrangement of a nucleotide sequence. Such sequence changes alter the polypeptide encoded by a ARRDC3 nucleic acid. For example, if the change in the nucleic acid sequence causes a frame shift, the frame shift can result in a change in the encoded amino acids, and/or can result in the generation of a premature stop codon, causing generation of a truncated polypeptide. Alternatively, a polymorphism associated with obesity, a susceptibility to obesity, an obesity-associated condition and/or a susceptibility to an obesity-associated condition can be a synonymous change in one or more nucleotides (i.e., a change that does not result in a change in the encoded ARRDC3 amino acid sequence). Such a polymorphism can, for example, alter splice sites, affect the stability or transport of mRNA, or otherwise affect the transcription and/or translation of the polypeptide. The polypeptide encoded by the ARRDC3 cDNA sequence (SEQ ID NO: 2) is the "reference" ARRDC3 polypeptide (SEQ ID NO: 3). Polypeptides encoded by variant alleles are referred to as "variant" polypeptides with variant amino acid sequences. For example, the reference genomic sequence that contains the gene encoding ARRDC3 is described herein by SEQ ID NO: 1.
The term, "variant ARRDC3", as used herein, refers to a ARRDC3 sequence that differs from SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3, but is otherwise substantially similar. A "substantially similar sequence" as used herein refers to a ARRDC3 sequence that shares at least about 80% amino acid or nucleotide sequence identity with a naturally occurring ARRDC3 sequence. In particular embodiments, a variant ARRDC3 sequence shares at least about 90% sequence identity, and more preferably at least about 95% sequence identity with a naturally occurring ARRDC3 sequence (e.g., SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3). Methods for calculating sequence identity are known in the art and are further described herein.
The genetic markers that make up the haplotypes described herein are ARRDC3 variants. The variants of ARRDC3 that are used to determine the haplotypes disclosed herein are associated with a susceptibility to a number of obesity and obesity-associated conditions. Haplotype analysis involves defining a candidate susceptibility locus using LOD scores. The defined regions are then ultra-fine mapped with microsatellite markers with an average spacing between markers of less than 100 kb. All usable microsatellite markers that are found in public databases and mapped within that region can be used. In addition, microsatellite markers identified within the deCODE genetics sequence assembly of the human genome can be used. Additional discussion of haplotype analysis and identification follows.
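The sequence identity thresholds described above (at least about 80%, 90% or 95%) can be illustrated with a simple percent-identity calculation. This is a minimal sketch under stated assumptions, not the identity method described elsewhere in the specification: the two sequences are assumed to be pre-aligned to equal length (with "-" marking gaps), a gap opposite a residue is counted as a mismatch, and the function names and toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap ('-') are skipped; a gap
    opposite a residue counts as a mismatch.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and aligned to equal length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def similarity_tier(identity: float):
    """Return the highest threshold (95, 90 or 80) met by the identity, else None."""
    for threshold in (95, 90, 80):
        if identity >= threshold:
            return threshold
    return None


# Toy 20-nucleotide reference and two hypothetical variants.
reference = "ATGGCCATTGTAATGGGCCG"
variant_one_change = "ATGGCCATTGTAATGGGCCA"     # 1 mismatch  -> 95% identity
variant_three_changes = "TTGGCCATTGTAATGGGCAA"  # 3 mismatches -> 85% identity

if __name__ == "__main__":
    for name, seq in [("one change", variant_one_change),
                      ("three changes", variant_three_changes)]:
        identity = percent_identity(reference, seq)
        print(name, identity, similarity_tier(identity))
```

Because the text says "at least about", the hard cutoffs in `similarity_tier` are a simplification chosen for the example.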
SCREENING ASSAYS OF THE INVENTION In one embodiment of the invention, diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject is made by detecting a haplotype associated with ARRDC3 as described herein. Because the haplotypes described herein are a combination of various genetic markers, e.g., SNPs and/or microsatellites, detecting haplotypes can be accomplished by methods known in the art for detecting sequences at polymorphic sites. In one embodiment, diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject is made by detecting one of the haplotypes listed in Table 1 and/or markers listed in Table 2. Detection of a haplotype associated with ARRDC3 can also be used to determine the genetic basis of obesity or an obesity-associated condition. As used herein, "an obesity-associated condition" refers to a condition, disease and/or disorder (e.g., a cardiovascular disorder or a metabolic disorder) that is associated with obesity. Such obesity-associated conditions include, e.g., diabetes (e.g., Type 2 or adult onset diabetes), coronary artery disease, peripheral arterial occlusive disease, myocardial infarction, dyslipidemias (e.g., hyperlipidemia), stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis, infertility, miscarriages and cancer (e.g., colorectal cancer or breast cancer). Other diseases, conditions and/or disorders associated with obesity are known to those of skill in the art. Diagnostic assays can be designed for assessing markers near or in the ARRDC3 locus. Such assays can be used alone or in combination with other assays for identifying a predisposition or susceptibility to obesity (e.g., determining BMI, determining waist-to-hip ratio, or determining relative body fat (e.g., by bioimpedance)).
Anthropometry can be used to determine whether the obesity is a result of defects in the thioredoxin system. Combinations of genetic markers are referred to herein as "haplotypes," and the present invention describes methods whereby detection of particular haplotypes is indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition. Similarly, detection of the markers can also determine the genetic basis for obesity or an obesity-associated condition (e.g., by determining if the obesity or obesity-associated condition is mediated through the thioredoxin pathway or through ARRDC3 specifically). The detection of the particular genetic markers that make up the particular haplotypes can be performed by a variety of methods described herein and known in the art. For example, genetic markers can be detected at the nucleic acid level, e.g., by direct nucleotide sequencing, or at the amino acid level if the genetic marker affects the coding sequence of ARRDC3, e.g., by protein sequencing or by immunoassays using antibodies that recognize the ARRDC3 protein or a particular ARRDC3 variant protein. In one embodiment, the assays are used in the context of a biological sample (e.g., blood, serum, cells, tissue) to thereby determine whether a subject is susceptible to (i.e., is at risk for or has a predisposition for) obesity and/or an obesity-associated condition. The invention also provides for prognostic (or predictive) assays for determining whether a subject is predisposed or susceptible to obesity and/or an obesity-associated condition. For example, variations in a nucleic acid sequence can be assayed in a biological sample. Such assays can be used for prognostic or predictive purposes to thereby allow for the prophylactic treatment of a subject prior to the onset of symptoms associated with obesity and/or an obesity-associated condition.
In one method of diagnosing a predisposition or susceptibility to obesity and/or an obesity-associated condition, or determining the genetic basis of obesity or an obesity-associated condition, hybridization methods, such as Southern analysis, Northern analysis, and/or in situ hybridizations, can be used (see Current Protocols in Molecular Biology, Ausubel, F. et al., eds., John Wiley & Sons, including all supplements). For example, a biological sample from a test subject (a "test sample") of genomic DNA, RNA, or cDNA, is obtained from a subject who is obese (if trying to determine the genetic basis of the obesity) or is suspected of having, being susceptible to or predisposed for, obesity and/or an obesity-associated condition (the "test subject"). The subject can be an adult, child, or fetus. The test sample can be from any source that contains genomic DNA, such as a blood sample, sample of amniotic fluid, sample of cerebrospinal fluid, or tissue sample from skin, muscle, buccal or conjunctival mucosa, placenta, gastrointestinal tract or other organs. A test sample of DNA from fetal cells or tissue can be obtained by appropriate methods, such as by amniocentesis or chorionic villus sampling. The DNA, RNA, or cDNA sample is then examined to determine whether a polymorphism that is associated with the region that directs expression of ARRDC3 is present. The presence of an allele of the haplotype can be indicated by sequence-specific hybridization of a nucleic acid probe specific for the particular allele. A sequence-specific probe can be directed to hybridize to genomic DNA, RNA, or cDNA. A "nucleic acid probe", as used herein, can be a DNA probe or an RNA probe that hybridizes to a complementary sequence. One of skill in the art would know how to design such a probe such that sequence-specific hybridization will occur only if a particular allele is present in a genomic sequence from a test sample.
To diagnose a predisposition or susceptibility to obesity and/or an obesity-associated condition, or the genetic basis of the obesity or obesity-associated condition, a hybridization sample is formed by contacting the test sample containing a nucleic acid encoding ARRDC3, with at least one nucleic acid probe. A non-limiting example of a probe for detecting mRNA or genomic DNA is a labeled nucleic acid probe capable of hybridizing to mRNA or genomic DNA sequences described herein. The nucleic acid probe can be, for example, a full-length nucleic acid molecule, or a portion thereof, such as an oligonucleotide of at least 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to appropriate mRNA or genomic DNA. For example, the nucleic acid probe can be all or a portion of SEQ ID NO: 1 or SEQ ID NO: 2, optionally comprising at least one allele contained in the haplotypes described herein, or the probe can be the complementary sequence of such a sequence. Other suitable probes for use in the diagnostic assays of the invention are described herein. The hybridization sample is maintained under conditions that are sufficient to allow specific hybridization of the nucleic acid probe to the nucleic acid encoding ARRDC3. "Specific hybridization", as used herein, indicates exact hybridization (e.g., with no mismatches). Specific hybridization can be performed under high stringency conditions or moderate stringency conditions (see below). In one embodiment, the hybridization conditions for specific hybridization are high stringency (e.g., as described herein). Specific hybridization, if present, is then detected using standard methods. If specific hybridization occurs between the nucleic acid probe and the nucleic acid encoding ARRDC3 in the test sample, then the sample contains the allele that is complementary to the nucleotide that is present in the nucleic acid probe.
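The notion of specific hybridization as exact pairing with no mismatches can be modeled in a few lines. The sketch below is an idealized, sequence-only illustration: it ignores stringency, temperature and probe length effects that govern real hybridization, and the function names and toy sequences are hypothetical (they are not taken from SEQ ID NO: 1 or SEQ ID NO: 2).

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def hybridizes_exactly(probe: str, target: str) -> bool:
    """Idealized specific hybridization: True only if the probe pairs with
    the target strand along its full length with no mismatches."""
    return reverse_complement(probe).upper() in target.upper()


# Toy target strands differing at a single position (index 8: G versus A).
target_g_allele = "GATTACAGGCATCGTAC"
target_a_allele = "GATTACAGACATCGTAC"

# Probe designed against the G allele, spanning the polymorphic site.
probe_for_g = reverse_complement(target_g_allele[3:14])

if __name__ == "__main__":
    print(probe_for_g)
    print(hybridizes_exactly(probe_for_g, target_g_allele))
    print(hybridizes_exactly(probe_for_g, target_a_allele))
```

Under this model, the probe pairs with the strand carrying the complementary allele and not with the strand carrying the alternative allele, which is the behavior the text relies on for allele detection.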
The process can be repeated for the other markers that make up the haplotype, or multiple probes can be used concurrently to detect more than one marker at a time. It is also possible to design a single probe containing more than one marker of a particular haplotype (e.g., a probe containing alleles complementary to 2, 3, 4, 5 and/or all of the markers that make up a particular haplotype). Detection of the particular markers of the haplotype in the sample is indicative that the source of the sample has the particular haplotype (e.g., an at-risk haplotype) and therefore is predisposed or susceptible to obesity and/or an obesity-associated condition. In another hybridization method, Northern analysis (see Current Protocols in Molecular Biology, Ausubel, F. et al., eds., John Wiley & Sons, supra) is used to identify the presence of a polymorphism associated with obesity, a predisposition or susceptibility to obesity and/or an obesity-associated condition. For Northern analysis, a test sample of RNA is obtained from the subject by appropriate means. Specific hybridization of a nucleic acid probe, as described above, to RNA from the subject is indicative of a particular allele complementary to the probe. For representative examples of use of nucleic acid probes, see, for example, U.S. Patent Nos. 5,288,611 and 4,851,330. Alternatively, a peptide nucleic acid (PNA) probe can be used in addition to or instead of a nucleic acid probe in the hybridization methods described above. A PNA is a DNA mimic having a peptide-like, inorganic backbone, such as N-(2-aminoethyl)glycine units, with an organic base (A, G, C, T or U) attached to the glycine nitrogen via a methylene carbonyl linker (see, for example, Nielsen, P., et al., Bioconjug. Chem., 5:3-7 (1994)).
The PNA probe can be designed to specifically hybridize to a molecule in a sample suspected of containing one or more of the genetic markers of a haplotype that is associated with obesity, a predisposition or susceptibility to obesity and/or an obesity-associated condition. Hybridization of the PNA probe is diagnostic for a predisposition or susceptibility to obesity and/or an obesity-associated condition. Hybridization of the probe can also confirm that the obesity or obesity-associated condition is mediated through the thioredoxin pathway or through ARRDC3 specifically. In one embodiment of the invention, diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition, or the determination of the genetic basis of the obesity or obesity-associated condition, is accomplished through enzymatic amplification of a nucleic acid from the subject. For example, a test sample containing genomic DNA can be obtained from the subject and the polymerase chain reaction (PCR) can be used to amplify the genomic ARRDC3 region (including flanking sequences if necessary) in the test sample. As described herein, identification of a particular haplotype (e.g., an at-risk haplotype) associated with the amplified ARRDC3 region can be accomplished using a variety of methods (e.g., sequence analysis, analysis by restriction digestion, specific hybridization, single stranded conformation polymorphism assays (SSCP), electrophoretic analysis, etc.). In another embodiment, diagnosis is accomplished by expression analysis using quantitative PCR (kinetic thermal cycling). This technique can, for example, utilize commercially available technologies such as TaqMan® (Applied Biosystems, Foster City, CA), to allow the identification of polymorphisms and haplotypes (e.g., at-risk haplotypes). The technique can assess the presence of an alteration in the expression or composition of the polypeptide encoded by ARRDC3 or splicing variants. 
Further, the expression of the variants can be quantified as physically or functionally different. In another method of the invention, analysis by restriction digestion can be used to detect a particular allele if the allele results in the creation or elimination of a restriction site relative to a reference sequence. A test sample containing genomic DNA is obtained from the subject. PCR can be used to amplify the genomic ARRDC3 region (including flanking sequences if necessary) in the test sample from the test subject. Restriction fragment length polymorphism (RFLP) analysis is conducted as described (see, e.g., Current Protocols in Molecular Biology, supra). The digestion pattern of the relevant DNA fragment indicates the presence or absence of the particular allele in the sample.
Sequence analysis can also be used to detect specific alleles at polymorphic sites associated with ARRDC3. Therefore, in one embodiment, determination of the presence or absence of a particular haplotype (e.g., an at-risk haplotype) comprises sequence analysis. For example, a test sample of DNA or RNA can be obtained from the test subject. PCR or other appropriate methods can be used to amplify ARRDC3 and/or its flanking sequences, if desired. The presence of a specific allele then can be detected directly by sequencing the polymorphic site of the genomic DNA in the sample.
Allele-specific oligonucleotides can also be used to detect the presence of a particular allele at a polymorphic site associated with ARRDC3, through the use of dot-blot hybridization of amplified oligonucleotides with allele-specific oligonucleotide (ASO) probes (see, for example, Saiki, R. et al., Nature, 324:163-166 (1986)). An "allele-specific oligonucleotide" (also referred to herein as an "allele-specific oligonucleotide probe") is an oligonucleotide of approximately 10-50 base pairs or approximately 15-30 base pairs, that specifically hybridizes to ARRDC3 or its flanking sequences, and that contains a specific allele at a polymorphic site as indicated by the polymorphisms and haplotypes described herein. An allele-specific oligonucleotide probe that is specific for one or more particular polymorphisms in ARRDC3 can be prepared, using standard methods (see, e.g., Current Protocols in Molecular Biology, supra). PCR can be used to amplify all or a fragment of ARRDC3, as well as genomic flanking sequences. The DNA containing the amplified ARRDC3 (or fragment of the gene and/or flanking sequences) is dot-blotted, using standard methods (see, e.g., Current Protocols in Molecular Biology, supra), and the blot is contacted with the oligonucleotide probe. The presence of specific hybridization of the probe to the amplified ARRDC3 is then detected. Specific hybridization of an allele-specific oligonucleotide probe to DNA from the subject is indicative of a specific allele at a polymorphic site associated with ARRDC3. An allele-specific primer hybridizes to a site on target DNA overlapping a polymorphic site and only primes amplification of an allelic form to which the primer exhibits perfect complementarity (see, e.g., Gibbs, R. et al., Nucleic Acids Res., 17:2437-2448 (1989)).
This primer is used in conjunction with a second primer, which hybridizes at a distal site on the opposite strand. Amplification proceeds from the two primers, resulting in a detectable product, which indicates that the particular allelic form is present. A control is usually performed with a second pair of primers, one of which contains a single base mismatch at the polymorphic site and the other of which exhibits perfect complementarity to a distal site. The single-base mismatch prevents amplification and no detectable product is formed. The method works best when the mismatch is included in the 3'-most position of the oligonucleotide aligned with the polymorphism because this position is most destabilizing to elongation from the primer (see, e.g., WO 93/22456). With the addition of such analogs as locked nucleic acids (LNAs), the size of primers and probes can be reduced to as few as 8 bases. LNAs are a novel class of bicyclic DNA analogs in which the 2' and 4' positions in the furanose ring are joined via an O-methylene (oxy-LNA), S-methylene (thio-LNA), or amino methylene (amino-LNA) moiety. Common to all of these LNA variants is an affinity toward complementary nucleic acids, which is by far the highest reported for a DNA analog. For example, particular all oxy-LNA nonamers have been shown to have melting temperatures (Tm) of 64°C and 74°C when in complex with complementary DNA or RNA, respectively, as opposed to 28°C for both DNA and RNA for the corresponding DNA nonamer. Substantial increases in Tm are also obtained when LNA monomers are used in combination with standard DNA or RNA monomers. For primers and probes, depending on where the LNA monomers are included (e.g., the 3' end, the 5' end, or in the middle), the Tm could be increased considerably.
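The allele-specific primer approach described above, in which the 3'-most base of the primer sits on the polymorphic site, can be sketched as follows. This is an idealized model under stated assumptions: the primer is written in the same sense as the strand shown (it would anneal to the opposite strand), any mismatch is treated as completely blocking amplification, and the function names, primer length and toy sequence are hypothetical.

```python
def allele_specific_primer(strand: str, snp_index: int, allele: str, length: int = 18) -> str:
    """Build a same-sense primer whose 3'-terminal base is `allele` at `snp_index`."""
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for a primer of this length")
    return strand[start:snp_index] + allele


def is_amplified(primer: str, strand: str, snp_index: int) -> bool:
    """Idealized outcome: a product forms only if the primer matches the sample
    perfectly, including its 3'-terminal base at the polymorphic site."""
    start = snp_index - len(primer) + 1
    if start < 0:
        return False
    return strand[start:snp_index + 1].upper() == primer.upper()


# Toy sample strand carrying a G at the polymorphic position (index 20).
sample = "ACGTTGCAAGGCTTACGATCGGATCCA"
SNP_INDEX = 20

primer_g = allele_specific_primer(sample, SNP_INDEX, "G", length=12)
primer_a = allele_specific_primer(sample, SNP_INDEX, "A", length=12)

if __name__ == "__main__":
    print("G-allele primer amplifies:", is_amplified(primer_g, sample, SNP_INDEX))
    print("A-allele primer amplifies:", is_amplified(primer_a, sample, SNP_INDEX))
```

The mismatched primer plays the role of the control primer pair in the text: its single 3' mismatch yields no product on this sample.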
In another embodiment, arrays of oligonucleotide probes that are complementary to target nucleic acid sequence segments from a subject can be used to identify polymorphisms in a nucleic acid encoding ARRDC3 and/or its flanking sequence. For example, an oligonucleotide array can be used. Oligonucleotide arrays typically comprise a plurality of different oligonucleotide probes that are coupled to a surface of a substrate in different known locations. These oligonucleotide arrays, also described as "Genechips," have been generally described in the art (see, e.g., U.S. Patent No. 5,143,854, PCT Patent Publication Nos. WO 90/15070 and WO 92/10092). These arrays can generally be produced using mechanical synthesis methods or light directed synthesis methods that incorporate a combination of photolithographic methods and solid phase oligonucleotide synthesis methods (Fodor, S. et al., Science, 251:767-773 (1991); Pirrung et al., U.S. Patent No. 5,143,854 (see also published PCT Application No. WO 90/15070); and Fodor et al., published PCT Application No. WO 92/10092 and U.S. Patent No. 5,424,186, the entire teachings of each of which are incorporated by reference herein). Techniques for the synthesis of these arrays using mechanical synthesis methods are described in, e.g., U.S. Patent No. 5,384,261, the entire teachings of which are incorporated by reference herein. In another example, linear arrays can be utilized. Once an oligonucleotide array is prepared, a nucleic acid of interest is allowed to hybridize with the array. Detection of hybridization is a detection of a particular allele in the nucleic acid of interest. Hybridization and scanning are generally carried out by methods described herein and also in, e.g., published PCT Application Nos. WO 92/10092 and WO 95/11995, and U.S. Patent No. 5,424,186, the entire teachings of each of which are incorporated by reference herein.
In brief, a target nucleic acid sequence, which includes one or more previously identified polymorphic markers, is amplified by well-known amplification techniques, e.g., PCR. Typically this involves the use of primer sequences that are complementary to the two strands of the target sequence, both upstream and downstream from the polymorphic site. Asymmetric PCR techniques can also be used. Amplified target, generally incorporating a label, is then allowed to hybridize with the array under appropriate conditions that allow for sequence-specific hybridization. Upon completion of hybridization and washing of the array, the array is scanned to determine the position on the array to which the target sequence hybridizes. The hybridization data obtained from the scan is typically in the form of fluorescence intensities as a function of location on the array. Although primarily described in terms of a single detection block, e.g., for detection of a single polymorphic site, arrays can include multiple detection blocks, and thus be capable of analyzing multiple, specific polymorphisms (e.g., multiple polymorphisms of a particular haplotype (e.g., an at-risk haplotype)). In alternate arrangements, it will generally be understood that detection blocks can be grouped within a single array or in multiple, separate arrays so that varying, optimal conditions can be used during the hybridization of the target to the array. For example, it will often be desirable to provide for the detection of those polymorphisms that fall within G-C rich stretches of a genomic sequence, separately from those falling in A-T rich segments. This allows for the separate optimization of hybridization conditions for each situation. Additional descriptions of use of oligonucleotide arrays for detection of polymorphisms can be found, for example, in U.S. Patent Nos. 5,858,659 and 5,837,832, the entire teachings of both of which are incorporated by reference herein.
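The grouping of detection blocks by base composition mentioned above can be illustrated by sorting probes according to their G-C content. This is a minimal sketch: the 50% cutoff, the function names, the marker names and the probe sequences are all assumptions made for illustration and do not come from the specification.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def group_probes_by_gc(probes: dict, cutoff: float = 0.5) -> dict:
    """Split probes into G-C rich and A-T rich detection blocks.

    `cutoff` is an illustrative threshold, not a value given in the text.
    """
    blocks = {"gc_rich": [], "at_rich": []}
    for name, seq in probes.items():
        key = "gc_rich" if gc_fraction(seq) >= cutoff else "at_rich"
        blocks[key].append(name)
    return blocks


# Hypothetical probes for three markers.
probes = {
    "marker_1": "GCGCCGGCATGC",
    "marker_2": "ATTATAAGTTAA",
    "marker_3": "GGCATCCGGACG",
}
blocks = group_probes_by_gc(probes)

if __name__ == "__main__":
    print(blocks)
```

Each resulting block could then be hybridized under conditions tuned to its composition, which is the design rationale the text gives for separating G-C rich from A-T rich targets.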
Other methods of nucleic acid analysis can be used to detect a particular allele at a polymorphic site associated with ARRDC3. Representative methods include, for example, direct manual sequencing (Church and Gilbert, Proc. Natl. Acad. Sci. USA, 81:1991-1995 (1984); Sanger, F., et al., Proc. Natl. Acad. Sci. USA, 74:5463-5467 (1977); Beavis, et al., U.S. Patent No. 5,288,644); automated fluorescent sequencing; single-stranded conformation polymorphism assays (SSCP); clamped denaturing gel electrophoresis (CDGE); denaturing gradient gel electrophoresis (DGGE) (Sheffield, V., et al., Proc. Natl. Acad. Sci. USA, 86:232-236 (1989)); mobility shift analysis (Orita, M., et al., Proc. Natl. Acad. Sci. USA, 86:2766-2770 (1989)); restriction enzyme analysis (Flavell, R., et al., Cell, 15:25-41 (1978); Geever, R., et al., Proc. Natl. Acad. Sci. USA, 78:5081-5085 (1981)); heteroduplex analysis; chemical mismatch cleavage (CMC) (Cotton, R., et al., Proc. Natl. Acad. Sci. USA, 85:4397-4401 (1988)); RNase protection assays (Myers, R., et al., Science, 230:1242-1246 (1985)); use of polypeptides that recognize nucleotide mismatches, such as E. coli mutS protein; and allele-specific PCR. In another embodiment of the invention, diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition, or the genetic basis of the obesity or obesity-associated condition, can be made by examining expression and/or composition of a ARRDC3 polypeptide in those instances where the genetic marker contained in a haplotype described herein results in a change in the expression of the polypeptide (e.g., a resulting altered amino acid sequence leading to decreased or increased expression, or altered 5' or 3' nucleic acid sequences that flank the ARRDC3 gene and alter transcription of the gene). A variety of methods can be used to make such a detection, including enzyme linked immunosorbent assays (ELISA), Western blots, immunoprecipitations and immunofluorescence.
A test sample from a subject is assessed for the presence of an alteration in the expression and/or an alteration in composition of the polypeptide encoded by ARRDC3. An alteration in expression of a polypeptide encoded by ARRDC3 can be, for example, an alteration in the quantitative polypeptide expression (i.e., the amount of polypeptide produced). An alteration in the composition of a polypeptide encoded by ARRDC3 is an alteration in the qualitative polypeptide expression (e.g., expression of a mutant ARRDC3 polypeptide or of a different splicing variant). In one embodiment, diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition, or determination of the genetic basis of obesity or an obesity-associated condition, is made by detecting a particular splicing variant encoded by ARRDC3, or a particular pattern of splicing variants. Both such alterations (quantitative and qualitative) can also be present.
An "alteration" in the polypeptide expression or composition, as used herein, refers to an alteration in expression or composition in a test sample, as compared to the expression or composition of the polypeptide encoded by ARRDC3 in a control sample. A control sample is a sample that corresponds to the test sample (e.g., is from the same type of cells), and is from a subject who is not affected by obesity or a predisposition or susceptibility to obesity and/or an obesity-associated condition (e.g., a subject that does not possess an at-risk haplotype as described herein). Similarly, the presence of one or more different splicing variants in the test sample, or the presence of significantly different amounts of different splicing variants in the test sample, as compared with the control sample, can be indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition. An alteration in the expression or composition of the polypeptide in the test sample, as compared with the control sample, can be indicative of a specific allele in the instance where the allele alters a splice site relative to the reference in the control sample. Various means of examining expression or composition of the polypeptide encoded by ARRDC3 can be used, including spectroscopy, colorimetry, electrophoresis, isoelectric focusing, and immunoassays (e.g., David et al., U.S. Patent No. 4,376,110) such as immunoblotting (see, e.g., Current Protocols in Molecular Biology, particularly chapter 10). For example, in one embodiment, an antibody (e.g., an antibody with a detectable label) that is capable of binding to the polypeptide can be used. Antibodies can be polyclonal or monoclonal. An intact antibody, or a fragment thereof (e.g., Fv, Fab, Fab', F(ab')2) can be used.
The term "labeled", with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with another reagent that is directly labeled. Examples of indirect labeling include detection of a primary antibody using a fluorescently-labeled secondary antibody and end-labeling of a DNA probe with biotin such that it can be detected with fluorescently-labeled streptavidin. Western blot analysis (e.g., using an antibody that specifically binds to a polypeptide encoded by a variant ARRDC3, or an antibody that specifically binds to a polypeptide encoded by a reference allele) can be used to identify the presence in a test sample of a polypeptide encoded by a variant ARRDC3 allele, or the absence in a test sample of a polypeptide encoded by the reference allele. In one embodiment of this method, the level or amount of polypeptide encoded by ARRDC3 in a test sample is compared with the level or amount of the polypeptide encoded by ARRDC3 in a control sample. A level or amount of the polypeptide in the test sample that is higher or lower than the level or amount of the polypeptide in the control sample, such that the difference is statistically significant, is indicative of an alteration in the expression of the polypeptide encoded by ARRDC3, and is diagnostic for a particular allele responsible for causing the difference in expression. Alternatively, the composition of the polypeptide encoded by ARRDC3 in a test sample is compared with the composition of the polypeptide encoded by ARRDC3 in a control sample. In another embodiment, both the level or amount and the composition of the polypeptide can be assessed in the test sample and in the control sample. 
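The comparison of polypeptide levels between test and control samples, with a requirement that the difference be statistically significant, can be sketched with a simple exact permutation test. The specification does not name a statistical test, so the choice of test, the 0.05 significance level, the function names and the measurement values below are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(test_levels, control_levels) -> float:
    """Exact two-sided permutation test on the difference in mean levels.

    Enumerates every way of splitting the pooled measurements into groups of
    the original sizes and returns the fraction of splits whose absolute
    difference in means is at least as large as the observed one.
    """
    pooled = list(test_levels) + list(control_levels)
    n_test = len(test_levels)
    n_control = len(pooled) - n_test
    if n_test == 0 or n_control == 0:
        raise ValueError("both groups need at least one measurement")

    observed = abs(mean(test_levels) - mean(control_levels))
    total = sum(pooled)
    extreme = 0
    count = 0
    for indices in combinations(range(len(pooled)), n_test):
        group_sum = sum(pooled[i] for i in indices)
        diff = abs(group_sum / n_test - (total - group_sum) / n_control)
        count += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point rounding
            extreme += 1
    return extreme / count


def is_altered(test_levels, control_levels, alpha: float = 0.05) -> bool:
    """Flag an alteration in expression when the difference is significant.

    `alpha` is an illustrative significance level, not a value from the text.
    """
    return permutation_p_value(test_levels, control_levels) < alpha


if __name__ == "__main__":
    # Hypothetical polypeptide levels in arbitrary units.
    test = [10, 11, 12, 13, 14]
    control = [1, 2, 3, 4, 5]
    print(permutation_p_value(test, control), is_altered(test, control))
```

Exhaustive enumeration is practical only for small groups; with larger sample sizes a sampled permutation test or a parametric test would be a more realistic choice.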
In one embodiment, the diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition, or determination of the genetic basis of obesity or an obesity-associated condition, is made by detecting at least one ARRDC3-associated allele in combination with an additional assay (e.g., determining BMI, determining waist-to-hip ratio, or determining relative body fat (e.g., by bioimpedance)).
DIAGNOSTIC KITS
Kits useful in the methods of diagnosis comprise components useful in any of the methods described herein, including for example, hybridization probes, restriction enzymes (e.g., for RFLP analysis), allele-specific oligonucleotides, antibodies that bind to an altered ARRDC3 polypeptide (e.g., to a polypeptide having the sequence depicted in SEQ ID NO: 3, but comprising at least one genetic marker included in the haplotypes described herein) or to non-altered (native) ARRDC3 polypeptide (e.g., to a polypeptide having the sequence depicted in SEQ ID NO: 3), means for amplification of nucleic acids comprising ARRDC3, means for analyzing the nucleic acid sequence of ARRDC3, means for analyzing the amino acid sequence of an ARRDC3 polypeptide, etc. Additionally, kits can provide reagents for assays to be used in combination with the methods of the present invention, e.g., reagents for use in determining BMI (e.g., a scale, a tape measure), waist-to-hip ratio (e.g., a tape measure) and/or relative body fat (e.g., calipers, a bioimpedance-measuring device). Kits (e.g., reagent kits) useful in the methods of diagnosis comprise components useful in any of the methods described herein, including for example, hybridization probes or primers as described herein (e.g., labeled probes or primers), reagents for detection of labeled molecules, restriction enzymes (e.g., for RFLP analysis), allele-specific oligonucleotides, antibodies that bind to altered or to non-altered (native) ARRDC3 polypeptide, means for amplification of nucleic acids comprising ARRDC3, means for analyzing the nucleic acid sequence of an ARRDC3 nucleic acid, means for analyzing the amino acid sequence of an ARRDC3 polypeptide, etc. 
In one embodiment, the invention is a kit for assaying a sample from a subject to determine the genetic basis of obesity and/or an obesity-associated condition, or to detect a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, wherein the kit comprises one or more reagents for detecting an at-risk haplotype associated with the ARRDC3 gene. In particular embodiments, the kit can comprise, e.g., at least one contiguous nucleotide sequence that is completely complementary to a region comprising at least one of the markers of the at-risk haplotype, or one or more nucleic acids that are capable of detecting one or more specific markers of an at-risk haplotype. Such nucleic acids (e.g., oligonucleotide primers) can be designed using portions of the nucleic acids flanking SNPs that are indicative of obesity or an obesity-associated condition or a predisposition or susceptibility to obesity and/or an obesity-associated condition. Such nucleic acids (e.g., oligonucleotide primers) are designed to amplify regions of the ARRDC3 nucleic acid (and/or flanking sequences) that are associated with an at-risk haplotype for obesity or an obesity-associated condition. In another embodiment, the kit comprises one or more labeled nucleic acids capable of detecting one or more specific markers of an at-risk haplotype associated with the ARRDC3 gene and reagents for detection of the label. Suitable labels include, e.g., a radioisotope, a fluorescent label, an enzyme label, an enzyme co-factor label, a magnetic label, a spin label, and an epitope label. Table 1 depicts such at-risk haplotypes (e.g., haplotype I, haplotype II, haplotype III, haplotype IV) and markers (see also, Table 2). In one embodiment, the at-risk haplotype to be detected by the reagents of the kit comprises two or more markers selected from the group consisting of the markers in Table 1. 
In particular embodiments, the kit comprises two or more markers selected from the markers comprising haplotype I, haplotype II, haplotype III, or haplotype IV. In these embodiments, the presence of the at-risk haplotype is indicative of obesity or an obesity-associated condition, or a predisposition or susceptibility to obesity and/or an obesity-associated condition.
Table 1. Haplotypes (listing markers) and single markers associated with obesity or an obesity-associated condition, or a predisposition or susceptibility to obesity and/or an obesity-associated condition.
[Table 1 is provided as an image in the original publication.]
a Markers are described further in Table 2.
b Relative Risk (RR) of an allele or a haplotype, i.e., the ratio of the incidence of the condition among subjects who contain the haplotype normalized to the incidence of the condition among subjects who do not contain the haplotype (calculated assuming the multiplicative model, together with the population attributable risk (PAR)).
DIAGNOSIS OF ARRDC3-ASSOCIATED OBESITY AND/OR ASSOCIATED CONDITIONS USING HAPLOTYPES
Certain haplotypes described herein, such as those shown in FIGs. 1 and 2, have been found more frequently in individuals with obesity than in individuals without obesity. Therefore, these "at-risk" haplotypes can be used to diagnose ARRDC3-associated obesity and/or associated condition(s). Identification of ARRDC3-associated obesity and/or associated condition(s) facilitates treatment planning, as treatment can be designed and therapeutics selected to target components involved in fuel metabolism, for example, those components shown in FIG. 9. In one embodiment of the invention, diagnosis of ARRDC3-associated obesity or an associated condition is made by detecting a polymorphism in an ARRDC3 nucleic acid (e.g., using the methods described above and/or other methods known in the art). In other embodiments, the invention pertains to a method for the diagnosis and identification of ARRDC3-associated obesity or an associated condition in a subject, by identifying the presence of an at-risk haplotype in ARRDC3 as described in detail herein. For example, the haplotypes described herein in Table 1 are found more frequently in obese individuals and/or individuals having an obesity-associated condition than in individuals not affected by these conditions. Therefore, these haplotypes have predictive value for detecting ARRDC3-associated obesity or an associated condition in a subject. In certain embodiments, an at-risk haplotype is characterized by the presence of polymorphism(s) depicted in Table 2. In other embodiments, the at-risk haplotype is selected from the group consisting of haplotype I, haplotype II, haplotype III and haplotype IV. The at-risk haplotype can also comprise a combination of the markers in haplotype I, haplotype II, haplotype III and haplotype IV. 
The methods described herein can be used to assess a sample from a subject for the presence or absence of an at-risk haplotype; the presence of an at-risk haplotype is indicative of ARRDC3-associated obesity or an associated condition. In one embodiment, the invention is a method for the diagnosis and identification of a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, or for determining the genetic basis of obesity or an obesity-associated condition, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene. In one embodiment, the at-risk haplotype is one that confers a significant risk of predisposition or susceptibility to obesity. In another embodiment, the at-risk haplotype is one that confers a significant risk of a predisposition or susceptibility to an obesity-associated condition. In one embodiment, significance associated with a haplotype is measured by relative risk (RR). RR is the ratio of the incidence of the condition among subjects who contain the haplotype to the incidence of the condition among subjects who do not contain the haplotype. In another embodiment, the at-risk haplotype has a relative risk of at least 1.8. In other embodiments, the at-risk haplotype has a relative risk of at least 2.7, or at least 3.0. In another embodiment, the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1×10⁻⁵ or less, 1×10⁻⁶ or less, 1×10⁻⁷ or less, or 1×10⁻⁸ or less. In another embodiment, significance associated with a haplotype is measured by an odds ratio. In one embodiment, a significant risk is measured as an odds ratio of at least about 1.2, including but not limited to: 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, and 1.9. In another embodiment, an odds ratio of at least 1.2 is significant. In a further embodiment, an odds ratio of at least about 1.5 is significant. In still a further embodiment, an odds ratio of at least about 1.7 is significant. 
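The relative risk and odds ratio measures described above can be illustrated with a minimal Python sketch. The function names and all counts are hypothetical, chosen only to show the arithmetic; they are not data from this document, and the sketch does not apply the multiplicative model mentioned in the Table 1 footnote.

```python
def relative_risk(affected_carriers, total_carriers,
                  affected_noncarriers, total_noncarriers):
    """RR: incidence among haplotype carriers divided by incidence among non-carriers."""
    incidence_carriers = affected_carriers / total_carriers
    incidence_noncarriers = affected_noncarriers / total_noncarriers
    return incidence_carriers / incidence_noncarriers


def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers):
    """OR: odds of carrying the haplotype among cases divided by the odds among controls."""
    return (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)


def percent_increase_in_risk(ratio):
    """Express a risk ratio as a percentage increase over the baseline (ratio 1.0 -> 0%)."""
    return (ratio - 1.0) * 100.0


# Hypothetical cohort: 40 of 100 carriers affected, 20 of 100 non-carriers affected.
rr = relative_risk(40, 100, 20, 100)
# Hypothetical case-control sample: 30 of 100 cases and 20 of 100 controls carry the haplotype.
oratio = odds_ratio(30, 70, 20, 80)
print(f"RR = {rr:.2f}, OR = {oratio:.2f}, increase = {percent_increase_in_risk(rr):.0f}%")
```

Under these assumed counts the relative risk exceeds the 1.8 threshold and the odds ratio exceeds the 1.7 threshold named in the embodiments above; whether such a difference is statistically significant is a separate question.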
In another embodiment, the significance is measured by a percentage. In one embodiment, a significant increase in risk is at least about 20%, including but not limited to about 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% and 98%. In another embodiment, a significant increase in risk is at least about 50%. It is understood, however, that identifying whether a risk is medically significant may also depend on a variety of factors, including the specific disease, the haplotype, and often, environmental factors. The invention also pertains to methods of diagnosing a predisposition or susceptibility to obesity and/or an obesity-associated condition in a subject, or determining the genetic basis of obesity or an obesity-associated condition, comprising screening for an at-risk haplotype associated with the ARRDC3 nucleic acid that is more frequently present in a subject who is obese or has an obesity-associated condition, or is predisposed or susceptible to obesity and/or an obesity-associated condition (affected), compared to the frequency of its presence in a healthy subject (control). In this embodiment, the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to obesity and/or an obesity-associated condition. Standard techniques for genotyping for the presence of SNPs and/or microsatellite markers associated with obesity can be used, such as fluorescent-based techniques (Chen, X., et al., Genome Res., 9:492-498 (1999)), PCR, LCR, Nested PCR and other techniques for nucleic acid amplification. In one embodiment, the method comprises assessing in a subject the presence or frequency of one or more specific SNP alleles and/or microsatellite alleles (e.g., alleles that are present in an at-risk haplotype) associated with ARRDC3 and linked to obesity or an obesity-associated condition or a predisposition or susceptibility to obesity and/or an obesity-associated condition. 
In this embodiment, an excess or higher frequency of the allele(s), as compared to a healthy control subject, is indicative that the subject is predisposed or susceptible to obesity and/or an obesity-associated condition. A detailed discussion of statistical methods of identifying and analyzing haplotypes is given below in the section "Screening for and Identification of Haplotypes".
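As an illustration of comparing allele frequencies between affected and control groups, the following minimal Python sketch computes a Pearson chi-square statistic for a 2×2 allele-count table and its one-degree-of-freedom p-value. The choice of this particular test, the function names, and the counts are assumptions made for illustration; the document does not specify this procedure here.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table
    [[a, b], [c, d]], e.g. rows = affected/control, columns = risk allele/other allele."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with one degree of freedom.
    For 1 df, P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical allele counts: 120 risk alleles among 400 chromosomes from affected
# subjects, 80 risk alleles among 400 chromosomes from controls.
affected_risk, affected_other = 120, 280
control_risk, control_other = 80, 320

chi2 = chi_square_2x2(affected_risk, affected_other, control_risk, control_other)
p = p_value_1df(chi2)
print(f"affected frequency = {affected_risk / 400:.2f}, "
      f"control frequency = {control_risk / 400:.2f}, chi2 = {chi2:.2f}, p = {p:.4f}")
```

Counting alleles rather than individuals treats the two chromosomes of each subject as independent observations, which is itself an assumption.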
DIAGNOSING ARRDC3-ASSOCIATED OBESITY AND ASSOCIATED CONDITIONS BY ASSESSING THE INTERACTION BETWEEN TXN AND ARRDC3
The invention also relates to methods of assessing an individual for an increased risk of obesity or an obesity-associated condition comprising assessing the interaction between ARRDC3 and thioredoxin (TXN). As described herein, it is reasonable to believe that one way in which ARRDC3 functions is by binding to reduced TXN and inhibiting TXN's reducing activity. An individual can be screened by obtaining a biological sample from the individual, and measuring the binding of ARRDC3 to TXN (e.g., by detecting the reducing activity of TXN in the sample). The reducing activity can be detected and measured, for example, using an NADPH/TXN reductase dependent insulin reducing assay (Spyrou, G., et al., J. Biol. Chem., 272:2936-2941 (1997)). If the reducing activity of TXN in the sample is decreased (i.e., less inhibition of reducing activity) compared to that in a healthy subject (control), then the patient has an increased risk of obesity or an obesity-associated condition. In one embodiment, the individual has 10% less thioredoxin reducing activity compared to controls. In other embodiments the individual has 10%, 20%, 25%, 50%, 75%, 80%, 90% or 95% less thioredoxin reducing activity compared to controls.
The Role of ARRDC3 in the Regulation of Obesity
It is proposed that the ARRDC3 gene product may affect long-term body weight through fuel metabolism. Based on the homology of ARRDC3 with TXNIP, the role of ARRDC3 in fuel metabolism and regulation of body weight can be hypothesized. Thioredoxin (TXN) is a multifunctional oxidoreductase with a conserved amino-acid sequence (-Cys-Gly-Pro-Cys-) at its active site, regulating a number of cellular processes via thiol redox control (TXNred / TXNox) (Holmgren, A., Annu. Rev. Biochem., 54:237-271 (1985)) (FIG. 9). Three thioredoxin molecules exist in humans (Powis and Montfort, Annu. Rev. Biophys. Biomol. Struct., 30:421-455 (2001)). Of the TXN molecules, TXN-1 has been characterized the most. In addition to the two catalytic Cys residues, Cys32 and Cys35, TXN-1 contains three other Cys residues, Cys62, Cys69 and Cys73 (with noncatalytic activities). TXN-1 is a secreted protein and is able to homodimerize, resulting in a loss of reducing activity. A larger protein, TXN-2 (of unknown function), contains a mitochondrial import sequence and is predominantly found in mitochondria. TXN-2 contains the catalytic Cys residues but lacks the other Cys sites found in TXN-1. Finally, a TXN-like cytosolic protein has recently been cloned from a human testis cDNA library. TXN is induced by various stress components including ultraviolet-induced cytocide and hydrogen peroxide (oxidizing agent) and its induction is thought to mediate cytoprotective responses (Powis and Montfort (supra)). Transgenic (Tg) overexpression (3-fold increase) of TXN in mice has been performed (Mitsui et al., Antioxid. Redox Signal., 4:693-696 (2002)). Here, the Tg mice are fertile, showing normal growth and normal behavior. Moreover, the Tg mice show increased resistance to oxidative stress and have extended lifespans. Mouse TXN homozygous knock-out results in early embryonic lethality, suggesting a major role for TXN in differentiation and morphogenesis (Matsui et al., Develop. Biol., 178:179-185 (1996)). In contrast, heterozygous knock-out animals are viable, fertile and exhibit normal growth (Matsui (supra)). In summary, TXNs affect a wide range of cellular processes including growth and antiapoptosis.
TXNIP, originally identified as vitamin D3-upregulated gene 1 (VDUP1) (Chen and DeLuca, Biochim. Biophys. Acta, 1219:26-32 (1994)), binds reduced but not oxidized TXN and inhibits its reducing activity (measured by NADPH/TXN reductase dependent insulin reducing assay). Further, increased expression of ARRDC3 results in reduction of TXN expression (Nishiyama et al., J. Biol. Chem., 274:21645-21650 (1999)). Recently, TXNIP mRNA expression was found to be markedly induced in pancreatic islet cells by glucose treatment (Shalev et al.,
Endocrinology, 143:3695-3698 (2002)). These data suggest that the TXNIP system is linked to cellular nutrient sensing pathways. TXN regulated redox state in cells is important for metabolism and signal transduction. While not wishing to be limited to a particular theory, given the close identity of ARRDC3 (5q14) to TXNIP (1q21) it is reasonable to believe that their activities overlap to some extent. The nonsense mutation in TXNIP is a loss of function mutation (transcript levels are reduced and amino acids essential for binding to TXN are missing), resulting in a 2-fold increase in liver content of triglycerides (Bodnar et al. (supra)). This was accompanied by reduced flux of fatty acids (most likely blocked by raised NADH/NAD+) oxidized via the TCA cycle. Thus, the mutation results in the expansion of fatty acid pools that are available for triglyceride synthesis. Moreover, since regulation of the TXNIP system is linked to nutrient (glucose, vitamin D3, free fatty acids) availability, it is reasonable to believe that increased activity of its homologue ARRDC3 would shift fuel metabolism from storage to oxidation and act as a promising therapeutic target for obesity and associated conditions including those described herein. Furthermore, based on the patterns of expression and clustering data, shown in Examples 2 through 1, for different functional categories of genes whose expression correlates with that of ARRDC3 (see Table 3), a functional role for ARRDC3 can be proposed. FIG. 13 highlights a role in fuel oxidation and/or desensitization of G-protein coupled receptors (GPCRs) for ARRDC3. Regarding the role it may play in the development of obesity, FIG. 13 shows how ARRDC3 may regulate fatty acid oxidation and fatty acid biosynthesis. In addition, the data suggest that ARRDC3 may affect GPCR sensitization, thereby affecting the signaling cascade associated with GPCR activity important to obesity related traits.
METHODS OF THERAPY BY REGULATING ARRDC3 ACTIVITY AND EXPRESSION
As described and exemplified herein, ARRDC3 is a gene that is linked to obesity. The ARRDC3 protein has sequence homology to thioredoxin interacting protein (TXNIP). TXNIP binds reduced, but not oxidized, thioredoxin (TXN), and inhibits its reducing activity. TXN regulated redox state in cells is important for metabolism and signal transduction. Given the sequence similarity between
ARRDC3 and TXNIP, it is reasonable to believe that their activities overlap to some extent. Because regulation of the TXNIP system is linked to nutrient (glucose, vitamin D3, free fatty acids) availability, increased activity of its homologue ARRDC3 may act by shifting fuel metabolism from storage to oxidation and act as a promising therapeutic target for obesity and obesity-associated conditions, such as those described herein. In one embodiment, the invention is a method of treating or preventing obesity and/or an obesity-associated condition in a subject comprising administering to the subject an agonist (e.g., a promoter) of ARRDC3. As used herein, "an agonist of ARRDC3" refers to an agent (compound) that increases the expression or biological activity or function of ARRDC3 (e.g., the inhibition of TXN reducing activity). Such agonists include proteins, fusion proteins, polypeptides, peptidomimetics, prodrugs, receptors, binding agents, antibodies, small molecules or other drugs, and ribozymes. Test agents (compounds) can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the 'one-bead one-compound' library method; and synthetic library methods using affinity chromatography selection. The biological library approach is limited to polypeptide libraries, while the other four approaches are applicable to polypeptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K.S., Anticancer Drug Des., 12:145 (1997)). Agents that increase the expression or biological activity or function of ARRDC3 can be identified, for example, using an in vitro ARRDC3-specific substrate processing assay. For example, an assay can be performed wherein ARRDC3 and a substrate of ARRDC3 (e.g., TXN) are contacted in the presence and absence of the agent to be tested. 
If the presence of the agent results in an increase in the binding of the substrate, by an amount that is statistically significant, then the agent is an agonist of ARRDC3. Methods for conducting such substrate processing assays are known to those of skill in the art. The present invention also relates to an assay for identifying agents (e.g., fusion proteins, polypeptides, peptidomimetics, prodrugs, receptors, binding agents, antibodies, small molecules or other drugs, or ribozymes) that alter (preferably increase) expression (e.g., transcription or translation) of the ARRDC3 gene, as well as agents identifiable by the assays. For example, a solution containing a nucleic acid encoding an ARRDC3 polypeptide can be contacted with an agent to be tested. The solution can comprise, for example, cells containing the nucleic acid or cell lysate containing the nucleic acid; alternatively, the solution can be another solution that comprises elements necessary for transcription/translation of the nucleic acid. Cells not suspended in solution can also be employed, if desired. The level and/or pattern of ARRDC3 expression (e.g., the level and/or pattern of mRNA or of protein expressed) is assessed, and is compared with the level and/or pattern of expression in a control (i.e., the level and/or pattern of the ARRDC3 expression in the absence of the agent to be tested). If the level and/or pattern in the presence of the agent differ (e.g., are increased), by an amount or in a manner that is statistically significant, from the level and/or pattern in the absence of the agent, then the agent is an agent that alters the expression of ARRDC3. 
In another embodiment of the invention, agents which alter (e.g., increase) the expression of the ARRDC3 gene can be identified using a cell, cell lysate, or solution containing a nucleic acid encoding the promoter region (or other 5' or 3' sequences flanking the ARRDC3 gene) of the ARRDC3 gene operably linked to a reporter gene. After contact with an agent to be tested, the level of expression of the reporter gene (e.g., the level of mRNA or of protein expressed) is assessed, and is compared with the level of expression in a control (i.e., the level of the expression of the reporter gene in the absence of the agent, to be tested). If the level in the presence of the agent differs (e.g., is increased), by an amount or in a manner that is statistically significant, from the level in the absence of the agent, then the agent is an agent that alters the expression of ARRDC3, as indicated by its ability to alter expression of a gene that is operably linked to the ARRDC3 gene promoter. Enhancement of the expression of the reporter indicates that the agent is an agonist of ARRDC3 expression and/or biological activity. Similarly, inhibition of the expression of the reporter indicates that the agent is an antagonist of ARRDC3 expression and/or biological activity. In another embodiment, the level of expression of the reporter in the presence of the agent to be tested is compared with a control level that has previously been established. A level in the presence of the agent that differs from the control level by an amount or in a manner that is statistically significant indicates that the agent alters ARRDC3 expression. Agents that increase ARRDC3 expression or biological activity are particularly useful for treating or preventing obesity or an obesity-associated condition. 
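The comparison of expression levels in the presence and absence of a test agent can be sketched as follows. This is a minimal Python example that uses an exact two-sided permutation test on the difference in group means; the choice of test, the function name, and the reporter readings are assumptions for illustration only, since the document requires statistical significance without naming a particular method.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(treated, control):
    """Two-sided exact permutation p-value for the difference in group means.

    Every way of splitting the pooled readings into groups of the original sizes
    is enumerated, so this is only practical for small replicate numbers."""
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    indices = range(len(pooled))
    group_size = len(treated)
    as_extreme = 0
    total = 0
    for chosen in combinations(indices, group_size):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen_set]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance so that floating-point ties count as "at least as extreme".
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            as_extreme += 1
        total += 1
    return as_extreme / total


# Hypothetical reporter readings (arbitrary units), four replicates per condition.
with_agent = [10, 11, 12, 13]
without_agent = [1, 2, 3, 4]
p = exact_permutation_p_value(with_agent, without_agent)
print(f"p = {p:.4f}")
```

With four replicates per group there are only 70 possible splits, so the smallest attainable two-sided p-value is 2/70; more replicates would be needed to reach a stricter threshold.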
ARRDC3 agonists identified as described herein can be used not only to treat or prevent obesity and/or an obesity-associated condition, but also to reduce triglyceride levels in a subject, or to increase fatty acid oxidation in a subject, or to decrease the interaction between TXN and ARRDC3 in an individual in need thereof. Other agents that regulate the activity of ARRDC3 indirectly (e.g., agents that regulate the activity of ARRDC3 by regulating the activity of other proteins which in turn regulate the activity of ARRDC3) are also encompassed by the invention. The invention also encompasses agents (e.g., agonists of ARRDC3) that are identified by the methods of the invention. Accordingly, it is within the scope of this invention to further use an agent identified as described herein in an appropriate animal model. For example, an agent identified as described herein can be used in an animal model to determine the efficacy, toxicity, or side effects of treatment with such an agent. Alternatively, an agent identified as described herein can be used in an animal model to determine the mechanism of action of such an agent. Furthermore, this invention pertains to uses of novel agents identified by the above-described screening assays for treatments as described herein. In addition, an agent identified as described herein can be used to alter activity of a polypeptide encoded by ARRDC3, or to alter expression of ARRDC3, by contacting the polypeptide or the gene (or contacting a cell comprising the polypeptide or the gene) with the agent identified as described herein.
PHARMACEUTICAL COMPOSITIONS
The present invention also pertains to pharmaceutical compositions comprising agents (compounds) described herein, and/or an agent that alters (e.g., enhances or inhibits) ARRDC3 gene expression or ARRDC3 polypeptide biological activity as described herein. 
For instance, a polypeptide, a protein, or an agent that alters ARRDC3 gene expression or biological activity can be formulated with a physiologically acceptable carrier or excipient to prepare a pharmaceutical composition. The carrier and composition can be sterile. The formulation should suit the mode of administration. Suitable pharmaceutically acceptable carriers include but are not limited to water, salt solutions (e.g., NaCl), saline, buffered saline, alcohols, glycerol, ethanol, gum arabic, vegetable oils, benzyl alcohols, polyethylene glycols, gelatin, carbohydrates such as lactose, amylose or starch, dextrose, magnesium stearate, talc, silicic acid, viscous paraffin, perfume oil, fatty acid esters, hydroxymethylcellulose, polyvinyl pyrrolidone, etc., as well as combinations thereof. The pharmaceutical preparations can, if desired, be mixed with auxiliary agents, e.g., lubricants, preservatives, stabilizers, wetting agents, emulsifiers, salts for influencing osmotic pressure, buffers, coloring, flavoring and/or aromatic substances and the like which do not deleteriously react with the active agents. The composition, if desired, can also contain minor amounts of wetting or emulsifying agents, or pH buffering agents. The composition can be a liquid solution, suspension, emulsion, tablet, pill, capsule, sustained release formulation, or powder. The composition can be formulated as a suppository, with traditional binders and carriers such as triglycerides. Oral formulation can include standard carriers such as pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, polyvinyl pyrrolidone, sodium saccharine, cellulose, magnesium carbonate, etc. Methods of introduction of these compositions include, but are not limited to, intradermal, intramuscular, intraperitoneal, intraocular, intravenous, subcutaneous, topical, oral and intranasal. 
Other suitable methods of introduction can also include gene therapy (as described below), rechargeable or biodegradable devices, particle acceleration devices ("gene guns") and slow release polymeric devices. The pharmaceutical compositions of this invention can also be administered as part of a combinatorial therapy with other agents. The composition can be formulated in accordance with the routine procedures as a pharmaceutical composition adapted for administration to human beings. For example, compositions for intravenous administration typically are solutions in sterile isotonic aqueous buffer. Where necessary, the composition may also include a solubilizing agent and a local anesthetic to ease pain at the site of the injection. Generally, the ingredients are supplied either separately or mixed together in unit dosage form, for example, as a dry lyophilized powder or water-free concentrate in a hermetically sealed container such as an ampule or sachette indicating the quantity of active agent. Where the composition is to be administered by infusion, it can be dispensed with an infusion bottle containing sterile pharmaceutical grade water, saline or dextrose/water. Where the composition is administered by injection, an ampule of sterile water for injection or saline can be provided so that the ingredients may be mixed prior to administration. For topical application, nonsprayable forms, viscous to semi-solid or solid forms comprising a carrier compatible with topical application and having a dynamic viscosity preferably greater than water, can be employed. Suitable formulations include but are not limited to solutions, suspensions, emulsions, creams, ointments, powders, enemas, lotions, sols, liniments, salves, aerosols, etc., which are, if desired, sterilized or mixed with auxiliary agents, e.g., preservatives, stabilizers, wetting agents, buffers or salts for influencing osmotic pressure, etc. The agent may be incorporated into a cosmetic formulation. 
For topical application, also suitable are sprayable aerosol preparations wherein the active ingredient, preferably in combination with a solid or liquid inert carrier material, is packaged in a squeeze bottle or in admixture with a pressurized volatile, normally gaseous propellant, e.g., pressurized air. Agents described herein can be formulated as neutral or salt forms. Pharmaceutically acceptable salts include those formed with free amino groups such as those derived from hydrochloric, phosphoric, acetic, oxalic, tartaric acids, etc., and those formed with free carboxyl groups such as those derived from sodium, potassium, ammonium, calcium, ferric hydroxides, isopropylamine, triethylamine, 2-ethylamino ethanol, histidine, procaine, etc. The agents are administered in a therapeutically effective amount. The amount of agents which will be therapeutically effective in the treatment of a particular disorder or condition will depend on the nature of the disorder or condition, and can be determined by standard clinical techniques. In addition, in vitro or in vivo assays may optionally be employed to help identify optimal dosage ranges. The precise dose to be employed in the formulation will also depend on the route of administration, and the seriousness of the symptoms of obesity, and should be decided according to the judgment of a practitioner and each patient's circumstances. Effective doses may be extrapolated from dose-response curves derived from in vitro or animal model test systems. The invention also provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the pharmaceutical compositions of the invention. Optionally associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration. 
The pack or kit can be labeled with information regarding mode of administration, sequence of drug administration (e.g., separately, sequentially or concurrently), or the like. The pack or kit may also include means for reminding the patient to take the therapy. The pack or kit can be a single unit dosage of the combination therapy or it can be a plurality of unit dosages. In particular, the agents can be separated, mixed together in any combination, or present in a single vial or tablet. Agents assembled in a blister pack or other dispensing means are preferred. For the purpose of this invention, unit dosage is intended to mean a dosage that is dependent on the individual pharmacodynamics of each agent and administered in FDA-approved dosages in standard time courses. The therapeutic agent(s) are administered in a therapeutically effective amount (i.e., an amount that is sufficient to treat the disease (e.g., obesity or an obesity-associated condition), such as by ameliorating symptoms associated with the disease, preventing or delaying the onset of the disease, and/or also lessening the severity or frequency of symptoms of the disease). The amount which will be therapeutically effective in the treatment of a particular individual's disorder or condition will depend on the symptoms and severity of the disease, and can be determined by standard clinical techniques. In addition, in vitro or in vivo assays may optionally be employed to help identify optimal dosage ranges. The precise dose to be employed in the formulation will also depend on the route of administration, and the seriousness of the disease or disorder, and should be decided according to the judgment of a practitioner and each patient's circumstances. Effective doses may be extrapolated from dose-response curves derived from in vitro or animal model test systems.
SCREENING FOR AND IDENTIFICATION OF HAPLOTYPES
In one embodiment, haplotypes can be used to identify individuals at risk for obesity and associated conditions. Haplotypes are a combination of genetic markers, e.g., particular alleles at polymorphic sites. Markers can include, for example, SNPs and microsatellites. The haplotypes can comprise a combination of various genetic markers; therefore, detecting haplotypes can be accomplished by methods known in the art for detecting sequences at polymorphic sites. For example, standard techniques for genotyping for the presence of SNPs and/or microsatellite markers can be used, such as fluorescent based techniques (Chen, et al., Genome Res. 9, 492 (1999)), PCR, LCR, Nested PCR and other techniques for nucleic acid amplification. These markers and SNPs can be identified in at-risk haplotypes. Certain methods of identifying relevant markers and SNPs include the use of linkage disequilibrium (LD) and/or LOD scores.
Linkage Disequilibrium
Linkage Disequilibrium (LD) refers to a non-random assortment of two genetic elements. For example, if a particular genetic element (e.g., "alleles" at a polymorphic site; see below) occurs in a population at a frequency of 0.25 and another occurs at a frequency of 0.25, then the predicted occurrence of a person's having both elements is 0.0625 (0.25 x 0.25), assuming a random distribution of the elements. However, if it is discovered that the two elements occur together at a frequency higher than 0.0625, then the elements are said to be in linkage disequilibrium since they tend to be inherited together at a higher rate than what their independent allele frequencies would predict. Roughly speaking, LD is generally correlated with the frequency of recombination events between the two elements. Many different measures have been proposed for assessing the strength of linkage disequilibrium (LD). Most capture the strength of association between pairs of biallelic sites. Two important pairwise measures of LD are r² (sometimes denoted Δ²) and |D'|. Both measures range from 0 (no disequilibrium) to 1 ('complete' disequilibrium), but their interpretation is slightly different. |D'| is defined in such a way that it is equal to 1 if just two or three of the possible haplotypes are present, and it is <1 if all four possible haplotypes are present. So, a value of |D'| that is <1 indicates that historical recombination has occurred between two sites (recurrent mutation can also cause |D'| to be <1, but for single nucleotide polymorphisms (SNPs) this is usually regarded as being less likely than recombination). The measure r² represents the statistical correlation between two sites, and takes the value of 1 if only two haplotypes are present. It is arguably the most relevant measure for association mapping, because there is a simple inverse relationship between r² and the sample size required to detect association between susceptibility loci and SNPs.
These measures are defined for pairs of sites, but for some applications a determination of how strong LD is across an entire region that contains many polymorphic sites might be desirable (e.g., testing whether the strength of LD differs significantly among loci or across populations, or whether there is more or less LD in a region than predicted under a particular model). Measuring LD across a region is not straightforward, but one approach is to use the measure ρ, which was developed in population genetics. Roughly speaking, ρ measures how much recombination would be required under a particular population model to generate the LD that is seen in the data. This type of method can potentially also provide a statistically rigorous approach to the problem of determining whether LD data provide evidence for the presence of recombination hotspots. For the methods disclosed herein, a significant r² value can be 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1.0.
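The pairwise measures discussed above can be illustrated with a short calculation. The following is a minimal sketch, assuming two biallelic sites with known (phased) haplotype and allele frequencies; the function name `pairwise_ld` and its arguments are hypothetical and are not part of NEMO or any other program named herein.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic sites.

    p_ab : frequency of the haplotype carrying allele A (site 1) and B (site 2)
    p_a  : marginal frequency of allele A at site 1
    p_b  : marginal frequency of allele B at site 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both sites must be polymorphic")
    # D is the excess of the observed haplotype frequency over the
    # frequency expected if the alleles assorted independently.
    d = p_ab - p_a * p_b
    # |D'| rescales D by the largest value it could take given the
    # marginal allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    # r^2 is the squared correlation between the two sites.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2
```

For instance, with allele frequencies 0.25 and 0.5 and a joint haplotype frequency of 0.25, only three of the four possible haplotypes are present, so |D'| equals 1 while r² is 1/3, in line with the interpretation given above.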
Haplotypes and LOD Score Definition of a Susceptibility Locus
In certain embodiments, haplotype analysis involves defining a candidate susceptibility locus using LOD scores. The defined regions are then ultra-fine mapped with microsatellite markers with an average spacing between markers of less than 100 kb. All usable microsatellite markers that are found in public databases and mapped within that region can be used. In addition, microsatellite markers identified within the deCODE genetics sequence assembly of the human genome can be used. The frequencies of haplotypes in the patient and the control groups can be estimated using an expectation-maximization algorithm (Dempster A. et al., 1977. J. R. Stat. Soc. B, 39:1-38). An implementation of this algorithm that can handle missing genotypes and uncertainty with the phase can be used. Under the null hypothesis, the patients and the controls are assumed to have identical frequencies. Using a likelihood approach, an alternative hypothesis is tested, where a candidate at-risk haplotype, which can include the markers described herein, is allowed to have a higher frequency in patients than controls, while the ratios of the frequencies of other haplotypes are assumed to be the same in both groups. Likelihoods are maximized separately under both hypotheses and a corresponding 1-df likelihood ratio statistic is used to evaluate the statistical significance. To look for at-risk haplotypes in the 1-LOD drop, for example, association of all possible combinations of genotyped markers is studied, provided those markers span a practical region. The combined patient and control groups can be randomly divided into two sets, equal in size to the original group of patients and controls.
The haplotype analysis is then repeated and the most significant p-value registered is determined. This randomization scheme can be repeated, for example, over 100 times to construct an empirical distribution of p-values. In a preferred embodiment, a p-value of <0.05 is indicative of an at-risk haplotype. A detailed discussion of haplotype analysis follows.
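The expectation-maximization estimate of haplotype frequencies mentioned above can be sketched for the simplest case. This is an illustrative sketch only, assuming two biallelic SNPs, complete genotypes (no missing data) and Hardy-Weinberg proportions; it is not the NEMO implementation, and the names `compatible_pairs` and `em_haplotype_freqs` are hypothetical. Each genotype is coded as a pair giving the number of copies (0, 1 or 2) of allele "1" at each SNP.

```python
from collections import Counter
from itertools import product

# The four possible two-SNP haplotypes, coded by allele (0 or 1) at each SNP.
HAPS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def compatible_pairs(genotype):
    """All ordered haplotype pairs consistent with an unphased genotype."""
    return [(h1, h2) for h1, h2 in product(HAPS, repeat=2)
            if all(h1[i] + h2[i] == genotype[i] for i in range(2))]

def em_haplotype_freqs(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies from unphased two-SNP genotypes."""
    freqs = {h: 0.25 for h in HAPS}          # uniform starting point
    counts = Counter(genotypes)
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        expected = {h: 0.0 for h in HAPS}
        # E-step: share each genotype among its compatible haplotype
        # pairs in proportion to the current pair probabilities.
        for g, n in counts.items():
            pairs = compatible_pairs(g)
            weights = [freqs[h1] * freqs[h2] for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                expected[h1] += n * w / total
                expected[h2] += n * w / total
        # M-step: new frequencies are expected counts per chromosome.
        new = {h: expected[h] / n_chrom for h in HAPS}
        delta = max(abs(new[h] - freqs[h]) for h in HAPS)
        freqs = new
        if delta < tol:
            break
    return freqs
```

Only double heterozygotes, coded (1, 1), are ambiguous in phase here; every other genotype determines its two haplotypes exactly, so the E-step only redistributes the double-heterozygote counts.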
Haplotype analysis
One general approach to haplotype analysis involves using likelihood-based inference applied to NEsted MOdels. The method is implemented in the program NEMO, which allows for many polymorphic markers, SNPs and microsatellites. The method and software are specifically designed for case-control studies where the purpose is to identify haplotype groups that confer different risks. It is also a tool for studying LD structures. When investigating haplotypes constructed from many markers, apart from looking at each haplotype individually, meaningful summaries often require putting haplotypes into groups. A particular partition of the haplotype space is a model that assumes haplotypes within a group have the same risk, while haplotypes in different groups can have different risks. Two models/partitions are nested when one, the alternative model, is a finer partition compared to the other, the null model, i.e., the alternative model allows some haplotypes assumed to have the same risk in the null model to have different risks. The models are nested in the classical sense that the null model is a special case of the alternative model. Hence traditional generalized likelihood ratio tests can be used to test the null model against the alternative model. Note that, with a multiplicative model, if haplotypes hi and hj are assumed to have the same risk, it corresponds to assuming that fi/pi = fj/pj, where f and p denote haplotype frequencies in the affected population and the control population respectively. One common way to handle uncertainty in phase and missing genotypes is a two-step method of first estimating haplotype counts and then treating the estimated counts as the exact counts, a method that can sometimes be problematic (e.g., see the information measure section below) and may require randomization to properly evaluate statistical significance.
In NEMO, maximum likelihood estimates, likelihood ratios and p-values are calculated directly, with the aid of the EM algorithm, for the observed data treating it as a missing-data problem. NEMO allows complete flexibility for partitions. For example, the first haplotype problem described in the Methods section on Statistical analysis considers testing whether h1 has the same risk as the other haplotypes h2, ..., hk. Here the alternative grouping is [h1], [h2, ..., hk] and the null grouping is [h1, ..., hk]. The second haplotype problem in the same section involves three haplotypes h1 = GO, h2 = GX and h3 = AX, and the focus is on comparing h1 and h2. The alternative grouping is [h1], [h2], [h3] and the null grouping is [h1, h2], [h3]. If composite alleles exist, one could collapse these alleles into one at the data processing stage, and perform the test as described. This is a perfectly valid approach, and indeed, whether we collapse or not makes no difference if there were no missing information regarding phase. But, with the actual data, if each of the alleles making up a composite correlates differently with the SNP alleles, this will provide some partial information on phase. Collapsing at the data processing stage will unnecessarily increase the amount of missing information. A nested-models/partition framework can be used in this scenario. Let h2 be split into h2a, h2b, ..., h2e, and h3 be split into h3a, h3b, ..., h3e. Then the alternative grouping is [h1], [h2a, h2b, ..., h2e], [h3a, h3b, ..., h3e] and the null grouping is [h1, h2a, h2b, ..., h2e], [h3a, h3b, ..., h3e]. The same method can be used to handle composite haplotypes where collapsing at the data processing stage is not even an option since Lc represents multiple haplotypes constructed from multiple SNPs.
Alternatively, a 3-way test with the alternative grouping of [h1], [h2a, h2b, ..., h2e], [h3a, h3b, ..., h3e] versus the null grouping of [h1, h2a, h2b, ..., h2e, h3a, h3b, ..., h3e] could also be performed. Note that the generalized likelihood ratio test-statistic would have two degrees of freedom instead of one.
Measuring information
Even though likelihood ratio tests based on likelihoods computed directly for the observed data, which have captured the information loss due to uncertainty in phase and missing genotypes, can be relied on to give valid p-values, it would still be of interest to know how much information had been lost due to the information being incomplete. Interestingly, one can measure information loss by considering a two-step procedure for evaluating statistical significance that appears natural but happens to be systematically anti-conservative. Suppose we calculate the maximum likelihood estimates for the population haplotype frequencies under the alternative hypothesis that there are differences between the affected population and control population, and use these frequency estimates as estimates of the observed frequencies of haplotype counts in the affected sample and in the control sample. Suppose we then perform a likelihood ratio test treating these estimated haplotype counts as though they are the actual counts. We could also perform a Fisher's exact test, but we would then need to round off these estimated counts since they are in general non-integers. This test will in general be anti-conservative because treating the estimated counts as if they were exact counts ignores the uncertainty with the counts, overestimates the effective sample size and underestimates the sampling variation. It means that the chi-square likelihood-ratio test statistic calculated this way, denoted by Λ*, will in general be bigger than Λ, the likelihood-ratio test-statistic calculated directly from the observed data as described in methods. But Λ* is useful because the ratio Λ/Λ* happens to be a good measure of information, or 1 - (Λ/Λ*) is a measure of the fraction of information lost due to missing information.
This information measure for haplotype analysis is described in Nicolae and Kong, Technical Report 537, Department of Statistics, University of Chicago, Revised for Biometrics (2003) as a natural extension of information measures defined for linkage analysis, and is implemented in NEMO.
Statistical analysis
For single marker association to the disease, the Fisher exact test can be used to calculate two-sided p-values for each individual allele. All p-values are presented unadjusted for multiple comparisons unless specifically indicated. The presented frequencies (for microsatellites, SNPs and haplotypes) are allelic frequencies as opposed to carrier frequencies. To minimize any bias due to the relatedness of the patients who were recruited as families for the linkage analysis, first- and second-degree relatives can be eliminated from the patient list. Furthermore, the test can be repeated for association correcting for any remaining relatedness among the patients, by extending a variance adjustment procedure (e.g., as described in Risch, N. & Teng, J., "The relative power of family-based and case-control designs for linkage disequilibrium studies of complex human diseases I. DNA pooling," Genome Res. 8:1278-1288 (1998)) for sibships so that it can be applied to general familial relationships, and present both adjusted and unadjusted p-values for comparison. The differences are in general very small as expected. To assess the significance of single-marker association corrected for multiple testing we carried out a randomisation test using the same genotype data. Cohorts of patients and controls can be randomized and the association analysis redone multiple times (e.g., up to 500,000 times) and the p-value is the fraction of replications that produced a p-value for some marker allele that is lower than or equal to the p-value we observed using the original patient and control cohorts. For both single-marker and haplotype analyses, relative risk (RR) and the population attributable risk (PAR) can be calculated assuming a multiplicative model (haplotype relative risk model), (Terwilliger, J.D. & Ott, J., Hum Hered, 42, 337-46 (1992) and Falk, C.T. 
& Rubinstein, P., Ann Hum Genet 51 (Pt 3), 227-33 (1987)), i.e., that the risks of the two alleles/haplotypes a person carries multiply. For example, if RR is the risk of A relative to a, then the risk of a person homozygote AA will be RR times that of a heterozygote Aa and RR² times that of a homozygote aa. The multiplicative model has a nice property that simplifies analysis and computations - haplotypes are independent, i.e., in Hardy-Weinberg equilibrium, within the affected population as well as within the control population. As a consequence, haplotype counts of the affecteds and controls each have multinomial distributions, but with different haplotype frequencies under the alternative hypothesis. Specifically, for two haplotypes hi and hj, risk(hi)/risk(hj) = (fi/pi)/(fj/pj), where f and p denote respectively frequencies in the affected population and in the control population. While there is some power loss if the true model is not multiplicative, the loss tends to be mild except for extreme cases. Most importantly, p-values are always valid since they are computed with respect to the null hypothesis. In general, haplotype frequencies are estimated by maximum likelihood and tests of differences between cases and controls are performed using a generalized likelihood ratio test (Rice, J.A., Mathematical Statistics and Data Analysis, 602 (International Thomson Publishing, 1995)). deCODE's haplotype analysis program called NEMO, which stands for NEsted MOdels, can be used to calculate all the haplotype results. To handle uncertainties with phase and missing genotypes, it is emphasized that we do not use a common two-step approach to association tests, where haplotype counts are first estimated, possibly with the use of the EM algorithm (Dempster, A.P., Laird, N.M. 
& Rubin, D.B., Journal of the Royal Statistical Society B, 39, 1-38 (1977)) and then tests are performed treating the estimated counts as though they are true counts, a method that can sometimes be problematic and may require randomisation to properly evaluate statistical significance. Instead, with NEMO, maximum likelihood estimates, likelihood ratios and p-values are computed with the aid of the EM-algorithm directly for the observed data, and hence the loss of information due to uncertainty with phase and missing genotypes is automatically captured by the likelihood ratios. Even so, it is of interest to know how much information is retained, or lost, due to incomplete information. Described herein is such a measure that is natural under the likelihood framework. For a fixed set of markers, the simplest tests performed compare one selected haplotype against all the others. Call the selected haplotype h1 and the others h2, ..., hk. Let p1, ..., pk denote the population frequencies of the haplotypes in the controls, and f1, ..., fk denote the population frequencies of the haplotypes in the affecteds. Under the null hypothesis, fi = pi for all i. The alternative model we use for the test assumes h2, ..., hk to have the same risk while h1 is allowed to have a different risk. This implies that while p1 can be different from f1, fi/(f2+...+fk) = pi/(p2+...+pk) = βi for i = 2, ..., k. Denoting f1/p1 by r, and noting that β2+...+βk = 1, the test statistic based on generalized likelihood ratios is $\Lambda = 2\,[\ell(\hat{r}, \hat{p}_1, \hat{\beta}_2, \ldots, \hat{\beta}_{k-1}) - \ell(1, \tilde{p}_1, \tilde{\beta}_2, \ldots, \tilde{\beta}_{k-1})]$, where ℓ denotes log-e likelihood and ~ and ^ denote maximum likelihood estimates under the null hypothesis and alternative hypothesis respectively. Λ has asymptotically a chi-square distribution with 1-df, under the null hypothesis. Slightly more complicated null and alternative hypotheses can also be used. For example, let h1 be GO, h2 be GX and h3 be AX. 
When comparing GO against GX (i.e., the test which gives an estimated RR of 1.46 and p-value = 0.0002), the null assumes GO and GX have the same risk but AX is allowed to have a different risk. The alternative hypothesis allows, for example, three haplotype groups to have different risks. This implies that, under the null hypothesis, there is a constraint that f1/p1 = f2/p2, or w = [f1/p1]/[f2/p2] = 1. The test statistic based on generalized likelihood ratios is $\Lambda = 2\,[\ell(\hat{p}_1, \hat{f}_1, \hat{p}_2, \hat{w}) - \ell(\tilde{p}_1, \tilde{f}_1, \tilde{p}_2, 1)]$, which again has asymptotically a chi-square distribution with 1-df under the null hypothesis. If there are composite haplotypes (for example, h2 and h3), that is handled in a natural manner under the nested models framework.
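The simplest test described above (one selected haplotype against all others) can be illustrated in the special case where haplotype counts are known exactly. In that case the shared β terms cancel between the null and alternative likelihoods, and the statistic reduces to a comparison of two binomial proportions. The sketch below assumes exact counts with no phase uncertainty and no missing genotypes, which is precisely the situation NEMO is designed not to require; the function names are hypothetical.

```python
import math

def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def binom_loglik(k, n, p):
    """Binomial log-likelihood, omitting the constant binomial coefficient."""
    return _xlogy(k, p) + _xlogy(n - k, 1 - p)

def haplotype_lrt(case_h1, case_total, ctrl_h1, ctrl_total):
    """1-df likelihood ratio test of haplotype h1 versus all others.

    Counts are chromosome counts. Returns (Lambda, p_value, RR), where RR
    is the risk of h1 relative to the other haplotypes under the
    multiplicative model. Assumes 0 < count < total in both groups.
    """
    f1 = case_h1 / case_total                       # frequency in affecteds
    p1 = ctrl_h1 / ctrl_total                       # frequency in controls
    pooled = (case_h1 + ctrl_h1) / (case_total + ctrl_total)
    ll_alt = (binom_loglik(case_h1, case_total, f1)
              + binom_loglik(ctrl_h1, ctrl_total, p1))
    ll_null = (binom_loglik(case_h1, case_total, pooled)
               + binom_loglik(ctrl_h1, ctrl_total, pooled))
    lam = 2.0 * (ll_alt - ll_null)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(max(lam, 0.0) / 2.0))
    # risk(h1)/risk(other) = (f1/p1) / ((1 - f1)/(1 - p1))
    rr = (f1 / (1 - f1)) / (p1 / (1 - p1))
    return lam, p_value, rr
```

With estimated rather than exact counts, this calculation corresponds to the two-step statistic Λ* discussed in the section on measuring information, which is why it tends to overstate significance when phase is uncertain.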
Linkage Disequilibrium using NEMO
LD between pairs of SNPs can also be calculated using the standard definition of D' and r² (Lewontin, R., Genetics 49, 49-67 (1964) and Hill, W.G. & Robertson, A. Theor. Appl. Genet. 38, 226-231 (1968)). Using NEMO, frequencies of the two marker allele combinations are estimated by maximum likelihood and deviation from linkage equilibrium is evaluated by a likelihood ratio test. The definitions of D' and r² are extended to include microsatellites by averaging over the values for all possible allele combinations of the two markers weighted by the marginal allele probabilities. When plotting all marker combinations to elucidate the LD structure in a particular region, we plot D' in the upper left corner and the p-value in the lower right corner. In the LD plots the markers can be plotted equidistant rather than according to their physical location, if desired.
Statistical Methods for Linkage Analysis
Multipoint, affected-only allele-sharing methods can be used in the analyses to assess evidence for linkage. Results, both the LOD-score and the non-parametric linkage (NPL) score, can be obtained using the program Allegro (Gudbjartsson et al., Nat. Genet. 25:12-3, 2000). Our baseline linkage analysis uses the S_pairs scoring function (Whittemore, A.S., Halpern, J. (1994), Biometrics 50:118-27; Kruglyak L, et al. (1996), Am J Hum Genet 58:1347-63), the exponential allele-sharing model (Kong, A. and Cox, N.J. (1997), Am J Hum Genet 61:1179-88) and a family weighting scheme that is halfway, on the log-scale, between weighting each affected pair equally and weighting each family equally. The information measure we use is part of the Allegro program output and the information value equals zero if the marker genotypes are completely uninformative and equals one if the genotypes determine the exact amount of allele sharing by descent among the affected relatives (Gretarsdottir et al., Am. J. Hum. Genet., 70:593-603, (2002)).
We computed the P-values two different ways and here report the less significant result. The first P-value can be computed on the basis of large sample theory; the distribution of $Z_{lr} = \sqrt{2\,[\log_e(10)\,\mathrm{LOD}]}$ approximates a standard normal variable under the null hypothesis of no linkage (Kong, A. and Cox, N.J. (1997), Am J Hum Genet 61:1179-88). The second P-value can be calculated by comparing the observed LOD-score with its complete data sampling distribution under the null hypothesis (e.g., Gudbjartsson et al., Nat. Genet. 25:12-3, 2000). When the data consist of more than a few families, these two P-values tend to be very similar.
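The first, large-sample P-value described above follows directly from the LOD score. A minimal sketch is given below; it assumes a non-negative LOD score and a one-sided test (an assumption, since the text does not state sidedness), and the function name `lod_to_p_value` is hypothetical and is not part of Allegro.

```python
import math

def lod_to_p_value(lod):
    """Large-sample one-sided P-value for a non-negative LOD score.

    Z_lr = sqrt(2 * ln(10) * LOD) is treated as a standard normal
    variable under the null hypothesis of no linkage.
    """
    if lod < 0:
        raise ValueError("LOD score must be non-negative")
    z_lr = math.sqrt(2.0 * math.log(10.0) * lod)
    # Upper tail of the standard normal distribution.
    p_value = 0.5 * math.erfc(z_lr / math.sqrt(2.0))
    return z_lr, p_value
```

Under these assumptions a LOD score of 3 gives a Z value of about 3.72 and a P-value close to 1 x 10^-4.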
Haplotypes and "Haplotype Block" Definition of a Susceptibility Locus
In certain embodiments, haplotype analysis involves defining a candidate susceptibility locus based on "haplotype blocks." It has been reported that portions of the human genome can be broken into series of discrete haplotype blocks containing a few common haplotypes; for these blocks, linkage disequilibrium data provided little evidence indicating recombination (see, e.g., Wall, J.D. and
Pritchard, J.K., Nature Reviews Genetics 4:587-597 (2003); Daly, M. et al., Nature Genet. 29:229-232 (2001); Gabriel, S.B. et al., Science 296:2225-2229 (2002); Patil, N. et al., Science 294:1719-1723 (2001); Dawson, E. et al., Nature 418:544-548 (2002); Phillips, M.S. et al., Nature Genet. 33:382-387 (2003)). There are two main methods for defining haplotype blocks: blocks can be defined as regions of DNA that have limited haplotype diversity (see, e.g., Daly, M. et al., Nature Genet. 29:229-232 (2001); Patil, N. et al., Science 294:1719-1723 (2001); Dawson, E. et al., Nature 418:544-548 (2002); Zhang, K. et al., PNAS USA 99:7335-7339 (2002)), or as regions between transition zones having extensive historical recombination, identified using linkage disequilibrium (see, e.g., Gabriel, S.B. et al., Science 296:2225-2229 (2002); Phillips, M.S. et al., Nature Genet. 33:382-387 (2003); Wang, N. et al., Am. J. Hum. Genet. 71:1227-1234 (2002);
Stumpf, M.P., and Goldstein, D.B., Curr. Biol. 13:1-8 (2003)). As used herein, the term "haplotype block" includes blocks defined by either characteristic. Representative methods for identification of haplotype blocks are set forth, for example, in U.S. Published Patent Applications 20030099964; 20030170665; 20040023237; 20040146870. Haplotype blocks can be used readily to map associations between phenotype and haplotype status. The main haplotypes can be identified in each haplotype block, and then a set of "tagging" SNPs or markers (the smallest set of SNPs or markers needed to distinguish among the haplotypes) can be identified. These tagging SNPs or markers can then be used in assessment of samples from groups of individuals, in order to identify association between phenotype and haplotype. If desired, neighboring haplotype blocks can be assessed concurrently, as there may also exist linkage disequilibrium among the haplotype blocks.
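The notion of a tagging set (the smallest set of SNPs needed to distinguish among the main haplotypes of a block) can be illustrated by exhaustive search. The sketch below is illustrative only and is not one of the block-identification methods of the cited applications; it assumes a small number of sites, since the search is exponential in block size, and the function name `minimal_tag_snps` is hypothetical.

```python
from itertools import combinations

def minimal_tag_snps(haplotypes):
    """Return the indices of a smallest set of sites distinguishing all haplotypes.

    haplotypes: list of equal-length allele strings, one per main haplotype.
    Ties between equally small sets are broken by taking the first set
    found in lexicographic order of site indices.
    """
    n_sites = len(haplotypes[0])
    n_distinct = len(set(haplotypes))
    for size in range(0, n_sites + 1):
        for sites in combinations(range(n_sites), size):
            # Project every haplotype onto the candidate sites and check
            # whether all distinct haplotypes remain distinguishable.
            signatures = {tuple(h[i] for i in sites) for h in haplotypes}
            if len(signatures) == n_distinct:
                return list(sites)
    return list(range(n_sites))
```

For the four hypothetical haplotypes ACGT, ACGA, GCTA and GTTA, no pair of sites separates all four, and the search returns the three sites at indices 0, 1 and 3; the third site is redundant because it carries the same information as the first.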
NUCLEIC ACIDS AND POLYPEPTIDES OF THE INVENTION
All nucleotide positions are relative to SEQ ID NO: 1 (FIGS. 10.1 to 10.122; Hs_Build34_ chromosome 5_90600642-90925734) as indicated. The nucleic acids, polypeptides and antibodies described herein can be used in methods of diagnosis of a predisposition or susceptibility to obesity and/or an obesity-associated condition, as well as in kits useful for such diagnosis. The reference amino acid sequence for human ARRDC3 (GenBank Accession No.: NP_065852) is described by SEQ ID NO: 3 (FIG. 12). As used herein "associated with the ARRDC3 gene" means in proximity to the ARRDC3 gene as described herein. In one embodiment, a haplotype is within about 350 kb, 300 kb, 250 kb, 200 kb, 150 kb, 100 kb, 75 kb, 50 kb, 25 kb, 10 kb, 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of the ARRDC3 gene, and is thereby associated with the ARRDC3 gene. An "isolated" nucleic acid molecule, as used herein, is one that is separated from nucleic acids that normally flank the gene or nucleotide sequence (as in genomic sequences) and/or has been completely or partially purified from other transcribed sequences (e.g., as in an RNA library). For example, an isolated nucleic acid of the invention can be substantially isolated with respect to the complex cellular milieu in which it naturally occurs, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. In some instances, the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system or reagent mix. In other circumstances, the material can be purified to essential homogeneity, for example as determined by polyacrylamide gel electrophoresis (PAGE) or column chromatography (e.g., HPLC).
An isolated nucleic acid molecule of the invention can comprise at least about 50%, at least about 80% or at least about 90% (on a molar basis) of all macromolecular species present. With regard to genomic DNA, the term "isolated" also can refer to nucleic acid molecules that are separated from the chromosome with which the genomic DNA is naturally associated. For example, the isolated nucleic acid molecule can contain less than about 350 kb, 300 kb, 250 kb, 200 kb, 150 kb, 100 kb, 75 kb, 50 kb, 25 kb, 10 kb, 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of the nucleotides that flank the nucleic acid molecule in the genomic DNA of the cell from which the nucleic acid molecule is derived. The nucleic acid molecule can be fused to other coding or regulatory sequences and still be considered isolated. Thus, recombinant DNA contained in a vector is included in the definition of "isolated" as used herein. Also, isolated nucleic acid molecules include recombinant DNA molecules in heterologous host cells or heterologous organisms, as well as partially or substantially purified DNA molecules in solution. "Isolated" nucleic acid molecules also encompass in vivo and in vitro RNA transcripts of the DNA molecules of the present invention. An isolated nucleic acid molecule or nucleotide sequence can include a nucleic acid molecule or nucleotide sequence that is synthesized chemically or by recombinant means. Such isolated nucleotide sequences are useful, for example, in the manufacture of the encoded polypeptide, as probes for isolating homologous sequences (e.g., from other mammalian species), for gene mapping (e.g., by in situ hybridization with chromosomes), or for detecting expression of the gene in tissue (e.g., human tissue), such as by Northern blot analysis or other hybridization techniques. 
The invention also pertains to nucleic acid molecules that hybridize under high stringency hybridization conditions, such as for selective hybridization, to a nucleotide sequence described herein (e.g., nucleic acid molecules that specifically hybridize to a nucleotide sequence containing a polymorphic site associated with a haplotype described herein). In one embodiment, the invention includes variants that hybridize under high stringency hybridization and wash conditions (e.g., for selective hybridization) to a nucleotide sequence that comprises SEQ ID NO: 1 or a fragment thereof (or a nucleotide sequence comprising the complement of SEQ ID NO: 1 or a fragment thereof), wherein the nucleotide sequence comprises at least one polymorphic allele contained in the haplotypes (e.g., at-risk haplotypes) described herein. Such nucleic acid molecules can be detected and/or isolated by allele- or sequence-specific hybridization (e.g., under high stringency conditions). "Specific hybridization," as used herein, refers to the ability of a first nucleic acid to hybridize to a second nucleic acid in a manner such that the first nucleic acid does not hybridize to any nucleic acid other than to the second nucleic acid (e.g., when the first nucleic acid has a higher complementarity to the second nucleic acid than to any other nucleic acid in a sample wherein the hybridization is to be performed).
"Stringency conditions" for hybridization is a term of art that refers to the incubation and wash conditions, e.g., conditions of temperature and buffer concentration, that permit hybridization of a particular nucleic acid to a second nucleic acid; the first nucleic acid can be perfectly (i.e., 100%) complementary to the second, or the first and second can share some degree of complementarity that is less than perfect (e.g., 70%, 75%, 85%, 95%). For example, certain high stringency conditions can be used to distinguish perfectly complementary nucleic acids from those of less complementarity. "High stringency conditions", "moderate stringency conditions" and "low stringency conditions" for nucleic acid hybridizations are explained on pages 2.10.1-2.10.16 and pages 6.3.1-6.3.6 in Current Protocols in Molecular Biology (Ausubel, F. et al., "Current Protocols in Molecular Biology", John Wiley & Sons, (1998), the entire teachings of which are incorporated by reference herein). The exact conditions that determine the stringency of hybridization depend not only on ionic strength (e.g., 0.2XSSC, 0.1XSSC), temperature (e.g., room temperature, 42°C, 68°C) and the concentration of destabilizing agents such as formamide or denaturing agents such as SDS, but also on factors such as the length of the nucleic acid sequence, base composition, percent mismatch between hybridizing sequences and the frequency of occurrence of subsets of that sequence within other non-identical sequences. Thus, equivalent conditions can be determined by varying one or more of these parameters while maintaining a similar degree of identity or similarity between the two nucleic acid molecules. Typically, conditions are used such that sequences of at least about 60% identity, at least about 70% identity, at least about 80% identity, at least about 90% identity or at least about 95% identity remain hybridized to one another.
By varying hybridization conditions from a level of stringency at which no hybridization occurs to a level at which hybridization is first observed, conditions that will allow a given sequence to hybridize (e.g., selectively) with the most complementary sequences in the sample can be determined. Exemplary conditions that describe the determination of wash conditions for moderate or low stringency conditions are described in Kraus, M. and Aaronson, S., Methods Enzymol., 200:546-556 (1991); and in Ausubel, F. et al., "Current Protocols in Molecular Biology", John Wiley & Sons, (1998). Washing is the step in which conditions are usually set so as to determine a minimum level of complementarity of the hybrids. Generally, starting from the lowest temperature at which only homologous hybridization occurs, each °C by which the final wash temperature is reduced (holding SSC concentration constant) allows an increase by 1% in the maximum mismatch percentage among the sequences that hybridize. Generally, doubling the concentration of SSC results in an increase in Tm of about 17°C. Using these guidelines, the wash temperature can be determined empirically for high, moderate or low stringency, depending on the level of mismatch sought. For example, a low stringency wash can comprise washing in a solution containing 0.2XSSC/0.1% SDS for 10 minutes at room temperature; a moderate stringency wash can comprise washing in a pre-warmed (42°C) solution containing 0.2XSSC/0.1% SDS for 15 minutes at 42°C; and a high stringency wash can comprise washing in a pre-warmed (68°C) solution containing 0.1XSSC/0.1% SDS for 15 minutes at 68°C. Furthermore, washes can be performed repeatedly or sequentially to obtain a desired result, as is known in the art.
Equivalent conditions can be determined by varying one or more of the above parameters, as is known in the art, while maintaining a similar degree of complementarity between the target nucleic acid molecule and the primer or probe used (e.g., the sequence to be hybridized). The percent identity of two nucleotide or amino acid sequences can be determined by aligning the sequences for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first sequence). The nucleotides or amino acids at corresponding positions are then compared, and the percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % identity = # of identical positions/total # of positions x 100). In certain embodiments, the length of a sequence aligned for comparison purposes is at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90% or at least 95% of the length of the reference sequence. The actual comparison of the two sequences can be accomplished by well-known methods, for example, using a mathematical algorithm. A non-limiting example of such a mathematical algorithm is described in Karlin, S. and Altschul, S. (Proc. Natl. Acad. Sci. USA, 90:5873-5877 (1993)). Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0), as described in Altschul, S. et al. (Nucleic Acids Res., 25:3389-3402 (1997)). When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., NBLAST) can be used. See the website on the world wide web at ncbi.nlm.nih.gov. In one embodiment, parameters for sequence comparison can be set at score=100, wordlength=12, or can be varied (e.g., W=5 or W=20). Another non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, CABIOS (1989). 
Such an algorithm is incorporated into the ALIGN program (version 2.0), which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. Additional algorithms for sequence analysis are known in the art and include ADVANCE and ADAM as described in Torelli, A. and Robotti, C. (Comput. Appl. Biosci., 10:3-5 (1994)); and FASTA described in Pearson, W. and Lipman, D. (Proc. Natl. Acad. Sci. USA, 85:2444-8 (1988)). In another embodiment, the percent identity between two amino acid sequences can be accomplished using the GAP program in the GCG software package (Accelrys, Cambridge, UK) using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 12, 10, 8, 6, or 4 and a length weight of 2, 3, or 4. In yet another embodiment, the percent identity between two nucleic acid sequences can be accomplished using the GAP program in the GCG software package, using a gap weight of 50 and a length weight of 3. The present invention also provides isolated nucleic acid molecules that contain a fragment or portion that hybridizes under highly stringent conditions to a nucleic acid that comprises SEQ ID NO: 1 or a fragment thereof (or a nucleotide sequence comprising the complement of SEQ ID NO: 1 or a fragment thereof), wherein the nucleotide sequence comprises at least one polymorphic allele contained in the haplotypes (e.g., at-risk haplotypes) described herein. The invention also provides isolated nucleic acid molecules that contain a fragment or portion that hybridizes under highly stringent conditions to a nucleotide sequence encoding an amino acid sequence selected from SEQ ID NO: 3, a polymorphic variant thereof, or a fragment or portion thereof.
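The percent identity definition given above (% identity = # of identical positions/total # of positions x 100) can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned by one of the programs named above and are supplied as equal-length strings with "-" marking gaps; the function names and the choice to count every alignment column as a position are assumptions made for this example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.
    A column counts as identical only when both residues match and
    neither is a gap; every alignment column counts toward the total."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    total = len(aligned_a)
    if total == 0:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / total


def aligned_fraction_of_reference(aligned_length, reference_length):
    """Fraction of the reference sequence length covered by the aligned
    region (the text mentions thresholds from at least 30% to at least 95%)."""
    return aligned_length / reference_length
```

For example, `percent_identity("ACGT", "ACGA")` evaluates to 75.0, since three of the four aligned positions are identical.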
The nucleic acid fragments of the invention are at least about 15, at least about 18, 20, 23 or 25 nucleotides, and can be 30, 40, 50, 100 or 200 or more nucleotides in length. Longer fragments, for example, 30 or more nucleotides in length, which encode antigenic polypeptides described herein, are particularly useful, such as for the generation of antibodies as described below. The nucleic acid fragments of the invention are used as probes or primers in assays such as those described herein. "Probes" or "primers" are oligonucleotides that hybridize in a base-specific manner to a complementary strand of a nucleic acid molecule. In addition to DNA and RNA, such probes and primers include polypeptide nucleic acids (PNA), as described in Nielsen, P. et al, (Science, 254:1497-1500 (1991)). A probe or primer comprises a region of nucleotide sequence that hybridizes to at least about 15, typically about 20-25, and in certain embodiments about 40, 50 or 75, consecutive nucleotides of a nucleic acid molecule comprising a contiguous nucleotide sequence from SEQ ID NO: 1 and comprising at least one allele contained in one or more haplotypes described herein, and the complement thereof. The invention also provides isolated nucleic acid molecules that contain a fragment or portion that hybridizes under highly stringent conditions to a nucleotide sequence encoding an amino acid sequence selected from SEQ ID NO: 3, a polymorphic variant thereof, or a fragment or portion thereof. In particular embodiments, a probe or primer can comprise 100 or fewer nucleotides; for example, in certain embodiments from 6 to 50 nucleotides, or, for example, from 12 to 30 nucleotides. In other embodiments, the probe or primer is at least 70% identical, at least 80% identical, at least 85% identical, at least 90% identical or at least 95% identical to the contiguous nucleotide sequence or to the complement of the contiguous nucleotide sequence. 
In another embodiment, the probe or primer is capable of selectively hybridizing to the contiguous nucleotide sequence or to the complement of the contiguous nucleotide sequence. Often, the probe or primer further comprises a label, e.g., a radioisotope, a fluorescent label, an enzyme label, an enzyme co-factor label, a magnetic label, a spin label, an epitope label. The nucleic acid molecules of the invention, such as those described above, can be identified and isolated using standard molecular biology techniques and the sequence information provided in SEQ ID NO: 1. For example, nucleic acid molecules can be amplified and isolated by the polymerase chain reaction using synthetic oligonucleotide primers that are designed based on the sequence provided in SEQ ID NO: 1 (and optionally comprising at least one allele contained in one or more haplotypes described herein) and/or the complement thereof. See generally PCR Technology: Principles and Applications for DNA Amplification (ed. H.A. Erlich, Freeman Press, NY, NY, 1992); PCR Protocols: A Guide to Methods and Applications (Eds. Innis, et al., Academic Press, San Diego, CA, 1990); Mattila, P. et al., Nucleic Acids Res., 19:4967-4973 (1991); Eckert, K. and Kunkel, T., PCR Methods and Applications, 1:11-24 (1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Patent No. 4,683,202, the entire teachings of all of which are incorporated herein by reference. The nucleic acid molecules can be amplified using cDNA, mRNA or genomic DNA as a template, cloned into an appropriate vector and characterized by DNA sequence analysis. Other suitable amplification methods include the ligase chain reaction (LCR; see Wu, D. and Wallace, R., Genomics, 4:560-569 (1989); Landegren, U. et al., Science, 241:1077-1080 (1988)), transcription amplification (Kwoh, D. et al., Proc. Natl. Acad. Sci. USA, 86:1173-1177 (1989)), and self-sustained sequence replication (Guatelli, J. et al., Proc. Natl. Acad. Sci.
USA, 87:1874-1878 (1990)) and nucleic acid based sequence amplification (NASBA). The latter two amplification methods involve isothermal reactions based on isothermal transcription, which produce both single-stranded RNA (ssRNA) and double-stranded DNA (dsDNA) as the amplification products in a ratio of about 30 and 100 to 1, respectively. The amplified DNA can be labeled (e.g., radiolabeled) and used as a probe for screening a cDNA library derived from human cells. The cDNA can be derived from mRNA and contained in ZAP Express (Stratagene, La Jolla, CA), ZIPLOX (Gibco BRL, Gaithersburg, MD) or other suitable vector. Corresponding clones can be isolated, DNA can be obtained following in vivo excision, and the cloned insert can be sequenced in either or both orientations by art-recognized methods to identify the correct reading frame encoding a polypeptide of the appropriate molecular weight. For example, the direct analysis of the nucleotide sequence of nucleic acid molecules of the present invention can be accomplished using well-known methods that are commercially available. See, for example, Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd Ed., CSHP, New York 1989); Zyskind et al., Recombinant DNA Laboratory Manual (Acad. Press, 1988). Additionally, fluorescence methods are also available for analyzing nucleic acids (Chen, X. et al., Genome Res., 9:492-498 (1999)) and polypeptides. Using these or similar methods, the polypeptide and the DNA encoding the polypeptide can be isolated, sequenced and further characterized. In general, the isolated nucleic acid sequences of the invention can be used as molecular weight markers on Southern gels, and as chromosome markers that are labeled to map related gene positions.
The nucleic acid sequences can also be used to compare with endogenous DNA sequences in patients to identify genetic disorders (e.g., obesity, a susceptibility to obesity, an obesity-associated condition and/or a susceptibility to an obesity-associated condition), and as probes, such as to hybridize and discover related DNA sequences or to subtract out known sequences from a sample (e.g., subtractive hybridization). The nucleic acid sequences can further be used to derive primers for genetic fingerprinting, to raise anti-polypeptide antibodies using immunization techniques, and/or as an antigen to raise anti-DNA antibodies or elicit immune responses. In another aspect of the invention, small double-stranded interfering RNA (RNA interference (RNAi)) can be used. RNAi is a post-transcriptional process in which double-stranded RNA is introduced and sequence-specific gene silencing results through catalytic degradation of the targeted mRNA. See, e.g., Elbashir, S.M., et al., Nature, 411:494-498 (2001); Lee, N.S., Nature Biotech., 19:500-505 (2002); Lee, S-K. et al., Nature Medicine, 8(7):681-686 (2002), the entire teachings of these references are incorporated herein by reference. RNAi is used routinely to investigate gene function in a high throughput fashion or to modulate gene expression in human diseases (Chi, et al., Proc. Natl. Acad. Sci. USA, 100(11):6343-6346 (2003)). Introduction of long double-stranded RNA leads to sequence-specific degradation of homologous gene transcripts. The long double-stranded RNA is metabolized to small 21-23 nucleotide siRNA (small interfering RNA). The siRNA then binds to the protein complex RISC (RNA-induced silencing complex) with dual function helicase. The helicase has RNase activity and is able to unwind the RNA. The unwound siRNA allows an antisense strand to bind to a target. This results in sequence-dependent degradation of cognate mRNA.
Aside from endogenous RNAi, exogenous RNAi, chemically synthesized or recombinantly produced, can also be used. As used herein, two polypeptides (or a region of the polypeptides) are substantially homologous or identical when the amino acid sequences are at least about 45-55% homologous or identical. In other embodiments, two polypeptides (or a region of the polypeptides) are substantially homologous or identical when they are at least about 70-75%, at least about 80-85%, at least about 90%, at least about 95% or identical. A substantially homologous amino acid sequence, according to the present invention, will be encoded by a nucleic acid molecule comprising SEQ ID NO: 1 or a portion thereof, and further comprising at least one polymorphism as shown in Table 1 or Table 2, wherein the encoding nucleic acid will hybridize to SEQ ID NO: 1 under stringent conditions as more particularly described above. A substantially homologous amino acid sequence will also be encoded by a nucleic acid molecule hybridizing to a nucleic acid sequence encoding SEQ ID NO: 3 or a portion thereof, or a polymorphic variant thereof, under stringent conditions as more particularly described above. A variant polypeptide can differ in amino acid sequence by one or more substitutions, deletions, insertions, inversions, fusions, and truncations or a combination of any of these. Further, variant polypeptides can be fully functional or can lack function in one or more activities. Fully functional variants typically contain only conservative variation or variation in non-critical residues or in non-critical regions. Functional variants can also contain substitution of similar amino acids that result in no change or an insignificant change in function. Alternatively, such substitutions can positively or negatively affect function to some degree.
Nonfunctional variants typically contain one or more non-conservative amino acid substitutions, deletions, insertions, inversions, or truncation or a substitution, insertion, inversion, or deletion in a critical residue or critical region. Amino acids that are essential for function can be identified by methods known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham, B. and Wells, J., Science, 244:1081-1085 (1989)). The latter procedure introduces single alanine mutations at every residue in the molecule. The resulting mutant molecules are then tested for biological activity (e.g., using an in vitro assay). Sites that are critical for polypeptide activity can also be determined by structural analysis, for example, by crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith, L. et al, J. Mol. Biol, 224:899-904 (1992); de Vos, A. et al, Science, 255:306-312 (1992)). The isolated polypeptide can be purified from cells that naturally express it, purified from cells that have been altered to express it (recombinant), or synthesized using known protein synthesis methods. In one embodiment, the polypeptide is produced by recombinant DNA techniques. For example, a nucleic acid molecule encoding the polypeptide is cloned into an expression vector, the expression vector introduced into a host cell and the polypeptide expressed in the host cell. The polypeptide can then be isolated from the cells by an appropriate purification scheme using standard protein purification techniques. In general, polypeptides of the present invention can be used as a molecular weight marker on SDS-PAGE gels or on molecular sieve gel filtration columns using art-recognized methods. The polypeptides of the present invention can be used to raise antibodies or to elicit an immune response. 
The polypeptides can also be used as a reagent, e.g., a labeled reagent, in assays to quantitatively determine levels of the polypeptide or a molecule to which it binds (e.g., a receptor or a ligand) in biological fluids. The polypeptides can also be used as markers for cells or tissues in which the corresponding polypeptide is preferentially expressed, either constitutively, during tissue differentiation, or in a diseased state. The polypeptides can be used to isolate a corresponding binding partner, e.g., receptor or ligand, such as, for example, in an interaction trap assay, and to screen for peptide or small molecule antagonists or agonists of the binding interaction.
ANTIBODIES OF THE INVENTION
Polyclonal and/or monoclonal antibodies that specifically bind one form of the gene product but not to the other form of the gene product are also provided. Antibodies are also provided that bind a portion of either the variant or the reference gene product that contains the polymorphic site or sites. The invention provides antibodies to polypeptides having an amino acid sequence of SEQ ID NO: 3 or a variant ARRDC3 polypeptide. The term "antibody" as used herein refers to immunoglobulin molecules and immunologically active portions of immunoglobulin molecules, i.e., molecules that contain an antigen binding site that specifically binds an antigen. A molecule that specifically binds to a polypeptide of the invention is a molecule that binds to that polypeptide or a fragment thereof, but does not substantially bind other molecules in a sample, e.g., a biological sample that naturally contains the polypeptide. Examples of immunologically active portions of immunoglobulin molecules include Fv, Fab, Fab' and F(ab')2 fragments. Such fragments can be produced by enzymatic cleavage or by recombinant techniques. For example, papain or pepsin cleavage can generate Fab or F(ab')2 fragments, respectively. Other proteases with the requisite substrate specificity can also be used to generate Fab or F(ab')2 fragments. The invention provides polyclonal and monoclonal antibodies that bind to a polypeptide of the invention. The term "monoclonal antibody" or "monoclonal antibody composition", as used herein, refers to a population of antibody molecules that contain only one species of an antigen binding site capable of immunoreacting with a particular epitope of a polypeptide of the invention. A monoclonal antibody composition thus typically displays a single binding affinity for a particular polypeptide of the invention with which it immunoreacts.
Polyclonal antibodies can be prepared as described above by immunizing a suitable subject with a desired immunogen, e.g., polypeptide of the invention or fragment thereof. The antibody titer in the immunized subject can be monitored over time by standard techniques, such as with an enzyme linked immunosorbent assay (ELISA) using an immobilized polypeptide. If desired, the antibody molecules directed against the polypeptide can be isolated from the mammal (e.g., from the blood) and further purified by well-known techniques, such as protein A chromatography (e.g., to obtain the IgG fraction). At an appropriate time after immunization, e.g., when the antibody titers are highest, antibody-producing cells can be obtained from the subject and used to prepare monoclonal antibodies by standard techniques, such as the hybridoma technique (Kohler, G. and Milstein, C, Nature, 256:495-497 (1975)), the human B cell hybridoma technique (Kozbor, D. et al, Immunol Today, 4:72 (1983)), the EBV-hybridoma technique (Cole et al,
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96 (1985)) or trioma techniques. Technology for producing hybridomas is well known (see generally Current Protocols in Immunology (1994) Coligan et al. (eds.) John Wiley & Sons, Inc., New York, NY). Briefly, an immortal cell (typically a myeloma) is fused to a lymphocyte (typically a splenocyte) from a mammal immunized with an immunogen as described above, and the culture supernatants of the resulting hybridoma cells are screened to identify a hybridoma producing a monoclonal antibody that binds a polypeptide of the invention. Any of the many well known protocols used for fusing lymphocytes and immortalized cells can be applied for the purpose of generating a monoclonal antibody to a polypeptide of the invention (see, e.g., Current Protocols in Immunology, supra; Galfre, G. et al., Nature, 266:550-552 (1977); Kenneth, R., in Monoclonal Antibodies: A New Dimension In Biological Analyses, Plenum Publishing Corp., New York, New York (1980); and Lerner, E., Yale J. Biol. Med., 54:387-402 (1981)). Moreover, the person of ordinary skill in the art will appreciate that there are many variations of such methods that also would be useful. As an alternative, or in addition, to preparing monoclonal antibody-secreting hybridomas, a monoclonal antibody to a polypeptide of the invention can be identified and isolated by screening a recombinant combinatorial immunoglobulin library (e.g., an antibody phage display library) with the polypeptide to thereby isolate immunoglobulin library members that bind the polypeptide. Kits for generating and screening phage display libraries are commercially available (e.g., Pharmacia Recombinant Phage Antibody System, Catalog No. 27-9400-01; Stratagene SurfZAP™ Phage Display Kit, Catalog No. 240612). Additionally, examples of methods and reagents particularly amenable for use in generating and screening an antibody display library can be found in, for example, U.S. Patent No.
5,223,409; published PCT Application Nos. WO 92/18619, WO 91/17271, WO 92/20791, WO 92/15679, WO 93/01288, WO 92/01047, WO 92/09690 and WO 90/02809; Fuchs, P. et al., Biotechnology (NY), 9:1369-1372 (1991); Hay, B. et al., Hum. Antibodies Hybridomas, 3:81-85 (1992); Huse, W. et al., Science, 246:1275-1281 (1989); and Griffiths, A. et al., EMBO J., 12:725-734 (1993). Additionally, recombinant antibodies, such as chimeric and humanized antibodies (e.g., antibodies comprising both human and non-human portions), which can be made using standard recombinant DNA techniques, are within the scope of the invention. Such chimeric and humanized antibodies (e.g., monoclonal antibodies) can be produced by recombinant DNA techniques known in the art. In general, antibodies of the invention (e.g., a monoclonal antibody) can be used to detect a polypeptide (e.g., in a cellular lysate, cell supernatant, tissue sample) in order to evaluate the abundance and pattern of expression of the polypeptide. Antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, for example, to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling the antibody to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials.
Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include 125I, 131I, 35S, 32P, 33P, 14C or 3H. The invention will be further described by the following non-limiting example. The teachings of all publications cited herein not previously incorporated by reference are hereby incorporated by reference in their entirety.
EXAMPLES
Example 1: Identification of at-Risk Haplotypes Associated with the ARRDC3 Gene and with Obesity
Genome-scans were carried out with Icelandic family material that included > 2 affecteds and 5 or 6 meiotic events per cluster (connectivity ascertained by the Icelandic genealogy database). Subjects suffering from clinical obesity (BMI > 30) were initially ascertained based on various end-point complications including type 2 diabetes, hypertension, stroke, myocardial infarction, familial combined hyperlipidemia and peripheral arterial occlusive disease. Different overlapping phenotypes (e.g., BMI >27, >30, >31....>35) were then examined. For each linkage study, three different measurement-based phenotypes were considered: all, female-only or male-only phenotypes.
Genome-wide Scans for Obesity Susceptibility Loci
The genome-wide scan was performed using a set of 1100 framework markers (microsatellites) with an average intermarker distance of 4 cM. For the markers in the framework map the genetic locations were taken from the recently published high-resolution genetic map (HRGM), constructed at deCODE genetics (Kong et al., Nature Genet., 31:241-247 (2002)). All data, including logarithm of odds (LOD) scores and nonparametric linkage (NPL) scores, were produced using the Allegro software (Gudbjartsson et al., Nature Genet., 25:12-13 (2000)) and statistical significance was determined by applying model-free affecteds-only allele sharing methods. The Allegro program produces LOD scores on the basis of multipoint calculations. The baseline linkage analysis utilizes the Spairs scoring function (Whittemore and Halpern, Biometrics, 50:118-127 (1994); and Kruglyak et al., Am. J. Hum. Genet., 58:1347-1363 (1996)), the exponential allele-sharing model, and a family-weighting scheme that is half-way on the log scale between weighting each affected pair equally and weighting each family equally (Gretarsdottir et al., Am. J. Hum. Genet., 70:593-603 (2002)).
The genome-wide scan was initiated using 51 pedigrees containing 124 affected males and 257 relatives. Here the framework scan lod score was 3.05 (p=0.00001, information 0.78). The unrestricted analysis (all) used 168 families containing 520 affected persons and 883 relatives and the lod score in the peak region was 2.15. In an analysis using female-only runs there were 97 families containing 245 affected females and 612 relatives and the lod score in the peak region was 0.80. By genotyping an additional 13 markers in the peak region the lod score increased to 4.56 (p=0.0000023, information 0.93) using male affecteds only. The one-lod drop region is 7 cM. Adding markers to the entire pedigree set and female-only runs did not affect the lod scores for that analysis. Two hundred and thirty markers covering the one-lod drop interval 82-92 (NCBI Build 33) have been genotyped on 930 obese males and their first-degree relatives plus a random sample of over 400 subjects. Genome scans for obesity revealed a major locus at 5q14 with evidence for linkage (lod score = 4.56, p-value = 0.0000023, information 0.93) using male affecteds only (FIG. 1).
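The relationship between a lod score and a p-value can be illustrated with a short Python sketch. The text does not state how the reported p-values were derived; the sketch assumes the commonly used asymptotic conversion, in which 2 ln(10) x lod is compared against a chi-square distribution with one degree of freedom and the tail probability is halved for a one-sided test. The function name is hypothetical.

```python
import math


def lod_to_p_value(lod):
    """One-sided asymptotic p-value for a lod score (an assumed conversion,
    not necessarily the one used for the values reported in the text).

    The statistic 2*ln(10)*lod is treated as chi-square with 1 degree of
    freedom; for 1 d.f. the upper tail equals erfc(sqrt(x / 2)), so the
    halved tail is 0.5 * erfc(sqrt(ln(10) * lod))."""
    if lod < 0:
        raise ValueError("lod score must be non-negative")
    return 0.5 * math.erfc(math.sqrt(math.log(10.0) * lod))
```

Under this assumption, a lod score of 4.56 corresponds to a p-value on the order of 2 x 10^-6, in line with the value reported for the male-only analysis.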
Ultra-fine Mapping and Association Studies Across the 1-lod Drop
The above-described markers were used to assess the linkage disequilibrium (LD) across the one-lod drop interval, for defining haplotypes, and for testing their association with obesity. Linkage disequilibrium (LD) between pairs of biallelic markers was estimated using the standard definition of D' (Lewontin, Genetics, 49:49-67 (1964)) and the deviation from linkage equilibrium was tested using a likelihood ratio test. The definition of D' was extended to include microsatellites by averaging over the values for all possible allele combinations of the two markers weighted by the marginal allele probabilities. The corresponding p-value was defined as the minimum p-value for the pair of markers over the same combinations, provided the joint probability was higher than 0.05. The LD assessment was carried out across the one-lod drop interval showing a near-uniform distribution of LD blocks. The marker density at the 5q14 locus was one marker per 42 kb. Haplotype frequency in patients and random controls was estimated using an EM (expectation-maximization) algorithm as follows. To assess the correct significance levels, multiple testing was adjusted for using randomization methods. For haplotypes consisting of one, two, or three markers, all possible haplotype combinations were tested for association with the disease. The combined patient and control group was then randomly divided into two groups, equal in size to the original patient and control groups, and the association analysis was repeated. The randomization step was repeated N times (N=100 to 1000) and an empirical distribution of lowest p-values was constructed. This provided an adjusted p-value. For haplotypes with more than three markers an iterative procedure was used (for computational reasons). The most significant 3-marker haplotypes were extended to 4 markers, by including the remaining markers one-by-one, and those haplotypes tested for association.
This procedure was then iterated, i.e., selecting at each step the most significant haplotypes and adding other markers, until haplotypes including (typically) 6 to 10 markers were reached. To assess the significance of the p-values observed this way, this iterative procedure was repeated several (100) times using randomized patient and control sets, and again the empirical distribution of p-values was constructed. This approach was used to identify the best haplotypes at the 20 Mb 1-lod drop at 5q14 that associated with obesity, and resulted in a highly specific and significant association pointing to a 280 kb region. A haplotype based on 4 microsatellite markers that strikingly associates with severe obesity (BMI top 5% or BMI>35) with relative risk (RR) of 3.2 (FIG. 2) was found. This haplotype involves microsatellite markers DG5S744, DG5S740, DG5S1387 and DG5S748 (haplotype I; see Table 1). For the microsatellite alleles shown in FIG. 2 (as well as in FIGS. 6, 7.1 and 7.2), the CEPH sample 1347-02 (CEPH genomics repository) is used as a reference. The lower allele of each microsatellite in this sample is set at 0 and all other alleles in other samples are numbered accordingly in relation to this reference. Thus, allele 1 is 1 bp longer than the lower allele in the CEPH sample 1347-02, allele 2 is 2 bp longer than the lower allele in the CEPH sample 1347-02, allele 3 is 3 bp longer than the lower allele in the CEPH sample 1347-02, allele 4 is 4 bp longer than the lower allele in the CEPH sample 1347-02, allele -1 is 1 bp shorter than the lower allele in the CEPH sample 1347-02, allele -2 is 2 bp shorter than the lower allele in the CEPH sample 1347-02, and so on. Note that this same CEPH sample is a standard that is widely used throughout the world for calibration and comparison of alleles. The most significant haplotype described above spans a region of 280 kb that encompasses several annotated sequences (FIG. 3.1).
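The randomization step used to adjust for multiple testing can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure actually used: haplotypes are represented as 0/1 carrier indicators per subject, and the largest absolute difference in carrier frequency between cases and controls stands in for the lowest p-value across the tested haplotypes. All names are hypothetical.

```python
import random


def carrier_freq_diff(carriers, labels):
    """Absolute difference in carrier frequency between cases (label 1)
    and controls (label 0) for one haplotype."""
    cases = [c for c, lab in zip(carriers, labels) if lab == 1]
    controls = [c for c, lab in zip(carriers, labels) if lab == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def adjusted_p_value(carrier_matrix, labels, n_perm=1000, seed=0):
    """Empirical p-value adjusted for testing several haplotypes.

    carrier_matrix holds one list of 0/1 carrier indicators per haplotype.
    Shuffling the labels redivides the combined group into two groups of
    the original sizes; the best statistic over all haplotypes is recorded
    for each shuffle and compared with the observed best statistic."""
    rng = random.Random(seed)
    observed = max(carrier_freq_diff(row, labels) for row in carrier_matrix)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        best = max(carrier_freq_diff(row, shuffled) for row in carrier_matrix)
        if best >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Because the best statistic over all haplotypes is recorded in every shuffle, the resulting p-value accounts for the number of haplotypes tested.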
Analysis of all expressed and nonexpressed sequences that are annotated in the 280 kb region resulted in the exclusion of all but two genes (FIG. 3.2). The exclusion was based on the lack of splice sites, no match to nonhuman sequences, non-identity to other human sequences, being a repeat element or a pseudogene. The KIAA1376 protein (e.g., GenBank Accession No.: NP_065852), one of the two included genes, shows a relatively strong identity (40-60%) to the human thioredoxin interacting protein (TXNIP) located on chromosome 1q21 (FIG. 4). TXNIP is a major locus for familial combined hyperlipidemia in various populations (Pajukanta et al.
(Mammalian Genome, 12:238-245 (2001)); and Bodnar et al. (supra)). Here, a null mutation in the synonymous mouse gene (Hyplip 1) results in hyperlipidemia (Bodnar et al. (supra)). Therefore, KIAA1376 was named "TXNIP homologue" (ARRDC3). Further characterization revealed arrestin-like motifs in the ARRDC3 protein.
Assessment of LD Blocks Using SNPs and Microsatellites In and Around the ARRDC3 Gene
A number of SNPs in and around the ARRDC3 gene were identified by sequencing 282 obese male humans. Using microsatellite markers and SNPs, a strong LD block of 120 kb was identified. The LD block encompasses the 14 kb ARRDC3 (KIAA1376) gene and 106 kb in the 5' UTR region including the ARRDC3 (KIAA1376) promoter. The LD block does not encompass any genes other than ARRDC3 (KIAA1376) (FIG. 5).
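The D' statistic used to assess LD between pairs of biallelic markers can be illustrated with a minimal Python sketch. It uses the standard definition (D divided by its maximum attainable magnitude given the allele frequencies) and assumes haplotype and allele frequencies are already known; in practice they would be estimated, for example with an EM algorithm. The function name and argument names are hypothetical, and the extension to multiallelic microsatellites described earlier is not covered.

```python
def d_prime(p_ab, p_a, p_b):
    """Lewontin's D' for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A at marker 1
          and allele B at marker 2
    p_a:  frequency of allele A at marker 1
    p_b:  frequency of allele B at marker 2
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    if d_max == 0:
        # A monomorphic marker carries no LD information.
        return 0.0
    return d / d_max
```

A value of 1 (or -1) indicates that at least one of the four possible haplotypes is absent, while 0 indicates linkage equilibrium.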
Association Tests Using SNPs In and Around the ARRDC3 Gene
A haplotype based on microsatellites and SNPs was defined that showed a striking association with obesity in males (FIG. 6). For SNP alleles identified in FIG. 6 (as well as in FIGS. 7.1, 7.2, and 8), A = 0, C = 1, G = 2, and T = 3. For microsatellites, alleles are identified with respect to the CEPH sample 1347-02, as described above. The haplotype shown in FIG. 6 (haplotype II) involves the markers DG5S745, SG05S41, SG05S422, DG5S741, SG05S32, SG05S31, SG05S30 and SG05S651 (See Table 1 and Table 2). Using SNPs in combination with microsatellite markers, a haplotype that associates strikingly (p = 7.9 × 10⁻⁹) with obesity in males was identified, with carrier frequency in affecteds (males with BMI in the top 10% of the distribution) of 27% vs 11% in population-based controls with a relative risk (RR) of 2.7 and a population attributable risk (PAR) of 16.5%. The number of affecteds was 755, while the number of controls was 406. All results are significant after correction for multiple testing (p=0.0001). This SNP-microsatellite haplotype was found to be independent of haplotype I, as described above, which was based on microsatellites only. The haplotype risk comparison results in a combined nominal p-value of 1.25 × 10⁻⁹.
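The two allele-coding conventions used in the figures (SNP bases coded as A = 0, C = 1, G = 2, T = 3, and microsatellite alleles numbered by their length in bp relative to the lower allele of the CEPH reference sample) can be expressed as a brief Python sketch. The function names are hypothetical and the fragment lengths in the usage note are made-up illustrative values.

```python
# SNP allele codes as defined in the text.
SNP_CODES = {"A": 0, "C": 1, "G": 2, "T": 3}


def code_snp_allele(base):
    """Return the numeric code for a SNP allele (case-insensitive)."""
    return SNP_CODES[base.upper()]


def code_microsatellite_allele(length_bp, reference_lower_allele_bp):
    """Number a microsatellite allele relative to the lower allele of the
    reference sample, which is defined as allele 0. Positive values are
    longer than the reference allele, negative values are shorter."""
    return length_bp - reference_lower_allele_bp
```

For instance, if the lower reference allele of a marker were 150 bp long, a 152 bp allele would be coded as 2 and a 148 bp allele as -2.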
Single Marker Association
Single markers (5 SNPs and 1 microsatellite marker) were also found to associate strongly with obesity in males (FIG. 7.1). This haplotype (haplotype III) includes the markers SG05S40, SG05S37, SG05S421, SG05S31, SG05S30 and DG5S743. For instance, three SNPs that reside within the LD block (see above) were found to show a strong single-marker association (p = 2 × 10⁻⁶ to 6 × 10⁻⁶) with obesity in males. Moreover, these markers all reside within the LD block and are highly correlated (FIG. 7.2). A SNP-only haplotype was also identified. This haplotype (haplotype IV) includes the markers SG05S40, SG05S436, SG05S435, SG05S433, SG05S37, SG05S422, SG05S444, SG05S421, SG05S35, SG05S33, SG05S32, SG05S31, SG05S30 and SG05S651 (see Table 1 and Table 2), with a nominal p-value of 9 × 10⁻⁶ (FIG. 8). Association of markers outside the LD block was non-significant. Additional information regarding the above-described haplotypes, including the location of the markers in SEQ ID NO: 1 (Hs_Build34_ chromosome 5_90600642-90925734), the type of marker, and the polymorphism of the SNP, is provided in Table 2.
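The numeric SNP allele coding stated above for FIGS. 6 to 8 (A = 0, C = 1, G = 2, T = 3) can be applied mechanically when reading those haplotypes. The following is a minimal sketch of the mapping in both directions. The function names are hypothetical, and microsatellite alleles, which are defined relative to the CEPH reference sample, are not handled.

```python
# SNP allele coding as stated in the text: A = 0, C = 1, G = 2, T = 3.
ALLELE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_ALLELE = {code: base for base, code in ALLELE_CODE.items()}


def encode_alleles(bases):
    """Convert a string of SNP alleles (e.g. 'ACGT') to numeric codes."""
    return [ALLELE_CODE[base.upper()] for base in bases]


def decode_alleles(codes):
    """Convert numeric allele codes back to a string of bases."""
    return "".join(CODE_ALLELE[code] for code in codes)


print(encode_alleles("ACGT"))
print(decode_alleles([2, 0, 3]))
```

Any character outside A, C, G and T raises a KeyError, which is deliberate here: an unexpected allele symbol is better surfaced than silently coded.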
Table 2. Sequence information of markers associated with obesity and obesity-associated conditions.
[Table 2 appears as an image (imgf000068_0001) in the original publication.]
Specific sequence information on the markers is as follows (an uppercase letter indicates an exon or SNP, and lowercase lettering is flanking sequence, except in the case of a microsatellite, which is indicated by all capital lettering):
DG5S748: CCGATCAGGATCTCATTTAATCTGTCCATATTAAATGCAATAGCCTCCTC AGTATAAAGATGTGTGTGTATATGAAATGCATATGTCGTGTGTGTGTGTG TGTGTGTGTGTGTGTACATATATATAGGATAAAGGTTCCGACAGCT (SEQ ID NO: 4)
SG05S648: catgcatttatttaatgcagacctaggcattatgttctaagggacgctgaagataccgaaatgaaaatactaagtatgggca tcaaaatgttcatatggaaaggggaaaaagacatttaaaagacatgtaacacacaaatacaatattcttaccactaatgagg actcatttccctccgtagtcaggcctctgggtcttaggataatagaaatgaagatcgggaaggcagagacatgtggccta cataatcttagaagattttattatcatttatatctaagagtcaggccacgttttcagaccaattgccaaattgcaatgacttctct agtctcctcccaccaatttcatcattctctcaacatactactcatgcaaaactgagagggtgagaggaaatatactcaggag ggcttgctacttttgctctttatcccatgaggtgtctgatcctatctggtcatttccagtlttattgacatattatgatactttttaaG cacctagcaacattgaagggataatctggtgttcttccccctcaacatcactgctggatattgtacaggctattgttcctgtat cccagacccaccctctattctggaggctttctagtgttctgggcccttaaagagaggtctcttgttcatgtggtttgactcaga tttggttgtgcatgctcagagcataggagacaccaatacacaatgtggtggctgtatcctgtggccagtcacggcctacat caaggaaaaggacagctgccataattcaggttttagattgttgcatcccaagcagaaaatggaatcatttcagtactgactc tgcacgaactcagctcttatcttcagcaggggcagatactttgcagcaggatacagaaagactgattccagcctccctcca atcacatcacctcaaatttttctgtcttacacctacaccaagtccttgaacttagatacaaataattaatgcatctgtcctgtggt tg (SEQ ID NO: 5)
SG05S649: ttcattctagxτtctattgaaatgtgtatttctcttaaaggccttccctgacatcctatccaaaatacatccttttatgcattatatatt tatagaacttatcactacttcaccttatattacgtatttgttcctgttgattgccaatatgccctattaaaatgtaagctctatgag gacaagtactttgtttagctcatgtttgcattcccaaaacctagaacatgccttcatatagtaaatgctcaataaatatttattga atgtacatttcgcctctccttcttaatcacagacattgccagagggacataggaacagtaagatgtggataaccctttcttgt ctgcataaattatcatgggaatatccccagctttgggaatcagaaccccagaggcattcagacaatcccaacctatgactg agggctcaaatatcttgatatcacaagttcctcttgttggagaataaaaaaatctatttctgttggttttcatAaaattccatttgt ataagataaaaataattagcttctggactcttggggttgagaggataaagaaacactttaaaaagtacttcttggagtggaat tgaaaatgggcctaggaccagggtcagaaaattacttgatatttggtagcattctatatttcttaattattttatctatgtgcata aatcatttttaaattttaaaaggagaaatggtttacaataagatgaaatattttaaatatattctggaaaataatacatattaaact tgttaacatctactttttcccctctatttccccttacaagtaaggtagcatcttcatatttttaacaaaagggcactaactcttttaa aacaccaagattcatttttgttcactaagtttcccaaggaagtcagagactccttgtcatatatccttctttcttctcattccaag atcaaatagtacatatttacgtagttattctactcttatactctgcctttgcagatt (SEQ ID NO: 6)
SG05S650: ggtgcgcctgtagtcccagctacttgggaggctgaggcaggagaatctcttgtacccgggaggcagagtttgctgtgag ccgaggtcgtgccactgcactccagccagggcgacggagccagactctgtctcaaaaaactaaaaactgxTtttttaaaa aatgttgttcagcatgtgagcaaatactttaccacacgcaaaaccagcttaatagcaacctcattttactttcttcttcctccag agcaactgaacaaacccggggacaggtccctaagcctcttcatgcccagcctttctttccttggcttcacaatattacagat gctaagatcatgaaatttgaatttaaatagacgtagattttattcttatctcctaaatgtgatctatgtgaccaggaaacattgttt aacctctataaattttagctatcttactgatgaaatggagctaatccttacctcatagagttttcgggggattaaattggatgat gcAtaagaagcaccacagagcttggtccggagagattgtttaatgtgtttggtagtttgctttggtaattttcttccactgaat tcacaagcagacaaatatcctgccagttcccaaatattcaacatatatgccaagctgtgtaactttactcagtttgtaaatttac acagccaagctgctccctttggttagaataccctttcttctgggcccagctgaagtatatcctctcccagaaagaaaggctg ccttaataataatgtctcttacatttctcctttctccagttcctagagcagctatttggttcttgacatgacctgccttgtttcacta ataccccttccaactagattttacactgctttagggcaggacccagggtccagcactgaacagtttcatataataggtactc agcagacaccagctgtttataattttgaagtttggacgttgtgaagttcaggtgtatggcattcccttaggcaagggtataa (SEQ ID NO: 7) SG05S651: ttctgccctctgagtcatccctctttcttcatgagaatagtcatctttgcaaataagcagggttlttttttaacttgtctcctgcga gttaggaatacaaagaatttccttttcattttgtatcatctcatccctttaagtcaaacxτtgtgtaactcctttagaagctttgtgg gcctcctgtgtgtcaaattataatccactccactcaacaaaattcacaccccaatttctttctaaaaatacttggcccaagcat caaagtccatggactttagggtacttagaagccttttttctagtttaaaatatctaagagtcacactcttaagatctttagggga cctattttctatcagaaagggcatgcattaatatttttggaagccctaagcacacccttaaacttttcttaggtcacaacaaac gtcttacaggcaaacccttgaggttttaatctgagactatgccttattgatagcatgctggatctgatcattAttcatttttctgg tgtcatcactgattttgaaaatcttttgttgggagaggctaaagatgagaaacaltttttttaacacagaaagtcccagcttctt catgtttcctctaattttgcttaacaactattccttaaatttatctaaagtctttttaaacacagtttttaaatttcattctgaaag^ aaaatgcaaacattttcatttggaaaattctataatcaagaactaatgctaatcttttggaagacagtacaaacaagtaagag aacagtgtatgaatttcactgcctattctttctggtaattggtctgtagctacaacaattgtgtcctaatttcaagacacaccaa ggacactggctaataaatacttttctcaaaaaaaacattaatctatctatgtctttatctctctagaagaaacatccccatgactt 
tcagatctggctgcactgtcatggttgtaaagcttctcccattgctggttattctccc (SEQ ID NO: 8)
SG05S30: tctattgtggactatttacttgtatttctcactcacttaaatcttatagacaatataaagtgctggatttggccaggcgcattgcct cacatctgtaatcccagcacttttggaggccgaggcaggcagatcacctgaggtcaggagttcaagaccagcctggcca acatggtgaaaccccatgtcttctaaaaatacaaaaattagccaggcatggtggcacgagtctgtaatcccagctactcgg gaggctgaagcaggagaattgctggaacccaggaggcggaggttgcagtgagccaagatcacaccattgcactccag cctgggtgacaagagcaagactccatctcaaaaaaaaatgctggattcataggatggcctgtaagtattagaggaattca aaaagaccaatgcaggtgaattagtcaagtgtggcttcaggaagaaatcaccatcggtttttttcttaaccagacccatcttt tctaaacaactggCttttgcatagagccagaatctaatcaggctctctctgtttccttgtaaagtcaactttgatatttatctaaa cattttatagctccttctgctaatttacatgctgtttgtattacacaggtgaagtcaaacctttttcaatgaatatcataatagtga agtggttctggcttacaaagttgtttctgtgttccacttgacattcataggttcaaaattttagtaacttggttataataaactaaa caatggactatctagtacacaggaatccttttctgaaagccatctacatggtcccattccttacagagaaagaccatgataat gtgttggtacattgcaatgaagggagagaaggaagggaaggaggaagaaagaaagaggaaatgaaagagggaaag aagggaaaaagggaggaaggagagaaggaaggaaaaaagggagggagaattgaattgaactaagatgtttacacttt ctattttgtcatctcatag (SEQ ID NO: 9) SG05S31: gatttgtgagcttttggtgcacccatcaccccagcaatatacactgtttatccctcatccccttccctccctttccctctgagtc cccaaagtccattgtatcattcttttgcctttgcatcttcatagctt cactactcagttactttacttaggataatagtctccaatcccacccaggttgctgtgaatgccattatttcattctttttt gagtagtattccatcatatatacataccacagtttattttaatccactcattgattgatgggcatttgggctggttccacatttttg caactgcgaattgtgctgctatatgcatataggccttgaagtatttctctatgcgtttgtaactatctagctgtctggagcatat gagtttgcaacctcttgattagagtccaatattgtaggcctggaaccacttacaaggtgagtatgtacGtatctgcaatgtca tctattatactatttttaaaatttaactctcagacaat^ ctgacaatgtaagctgtcaaaggggaccagagcactggagttgggggaagagaggattaagcctatctcctacttctccc tccccaaaactctcaagcaactaagaggattttggggtattacatgattctgtaggctatggatatctgggtatggtgaaata gcccgtctgttatataacttcataaaagaaactcagggagctaaaactcatcttttaacctgggaacctgaagctcaaaagg cctttaaccagctaaagcaagtcttgctcaaggcaccagctctcagccttcctgtagggaaggtcttcaatctgtatgtatca gaaaggaagggaatagccctgggagttttaacacaggctggaggaccagctcaaccagtgggttatcta (SEQ ID NO: 10)
SG05S32: ggccttatttaaaccaaactctccctctcctaaaaaaaattcaacaagaaatatgacaatctatgaaaaacatccttaaaaga agataacgaaagataaaacacagaataattcctacxTtaagtttcactcttggttcgcttgagttgctgtcacctctgggccat ttgtagaaatagacttaaacagtaagtattgtgaaggatagatacaagattataaattctctgagaacaaaagagaggctgc agtttgccaaagcaggaactgatgtttgtcttctgaggttaggtaatgttattatcaagtgctacaaatgtaaatatatacagct agacatacggcacacatacatacatatgtttatgttatatattag atgcttcctaaaaagacaaaccatacatagccatctttg gatcagagtactctgtgaagtatcacgtacctagtcctcaacataaatgcctgagctatataacaatatttattaactgtcGtt ggctaaagacttactaggctagccaactgaacctacaacaacttactaggtagttgttatgacctgaattgtatccccaccc aaaatttatttτtttaaatcccaacccccactacctcagaatgtgactatttggatatagagcctttaaagtggtaattaaggta aaataaggccatatgggtgagccttcatccagtatgactggtgttcttataaggagagattacgacacagatacatagagg aaagacatgtgaagacaccagagaaccatggccatctacaaggtaaggagagaggccttagaagaaaccaaccctact cacaccttgatcttggaattctagcctctagaattgtaaagaaaataaatttcaattgtttaagccaccagtctgtggtacctgt aatggcaacactagaaaactatcatcataaagtagataaattaxtttatcctacttccggattcl ttgaggtttaagaactggc ag (SEQ ID NO: 11) DG5S741:
CTTCTGCCTTTCTCCCTCCTTTCCTTTGCTTCCTTCCGTTACTGACTTGTGC GTGCTCTCTCTT ATCTTTCTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGT GTGTGTGTGTGTGTGTATGTGTATGTGTCTTTCTGTTAAAACCCATTGAGT GCCTGCTACCTATATGTATAATGCTTCTCTTGGACACTGAAGATACAGCA GTGAACAAAAGTGTAGTAGCACAGAACACAAGCAATCTTG (SEQ ID NO: 12)
DG5S743: ATTCTTGGGACAACCAATGCCAGAAATTTTGACAAATATTCTCATAAGTG CTAAAGCAGATGAGTGAGGAAGCACAAGGTGTTAGTTAAAGAGTAACTG AAAGCAAGGAGATTTTATCTGGGTGTCACCAGAAGTGCAGCATTTGTAT TGAAGTAGCTGGAGGACATTAGAAAAAACAGACAGACAGACAGGAATA AATGAGGATGTTAAGGAAGAATTTTACAAAATCTCTACTAAGTTTCCCA GATCAGCAAACATTTTCAGAGACAGAGAGAGAGAGAGAGAGAGAGAGA GAGAGAGTGTGTGTGTGTGTGTGTGTGTGTGTTGGAGAGAAGTGGGACG GGTGGGATAGTGGAAAGAAATCAATCCTCAAAGTTGCCAACTCACTTTT CCCAAGGAAAAi TTTTTAAAATTCTTCCATCAACTATCAACTCAATTTT ATGAAGAATATTTTATCACAAAAAGCTAAAATAATATTGGAATTTTAATT TTGAATAATT TAGCTTTATACAACCCTAACTTTATCATGTTACTGTGGAC CAGTAAAAACTTTGTCACAAGCCAGTTTCTGTGAATAGCTGACCTGAAAT GGAAAGTCCAAATTGAGATTAATAATTTCATCC (SEQ ID NO: 13)
SG05S33: gggatggagctggaagaaggaatctagaagaagtgagaacttccaaaatgaccctgatgtaaccctgaaacttatgggg acctagattaggatgcagacagagaaaaaggagaggagtattttagaaggaaaaccaaaaggaaaccttaaaaataag cctattttcccactggaggaaaggtttagtttgttttggtttggtttggttttaatggtttgtataacaaacag^ taacacatcaaaatggtccatagatcttgaaaaaatgctacataaacaaagatataatccagtaatgagaaagaacagtatt tcaacatcttccgcgggttaggggctattacttaaaccttcatctggtagacttgacccctggggaaggcagcaaactgat cttgtttatgtaactctgatctcacttttgtttttactτttatcatgacccattcaaccacgcagctacatgaaaacaa aacagCcatgtactaaagtatgagatggttatttgtacctttgcaactctgaagaaggaaagcattacttagcttagactaca agtatttatcaagaattattgaaactggccaggcgcagtggctcacgcctgtaatcccagcactttgggaggccgaggtg ggtgaatcacctgaggtcaggagttcgggaccagcctggccaacatagtgaaacccccgtctctactaaaaatatgaaa attagtcggggtggtggcaggaacctgtaatcccagctattcgggaggctgaggcatgagaattgcttgaacccaggag gcggggttgcagtgagccgagatcatgccactgcactccagcctgggccacagtgtgagactctgtctcaaaaaaaaaa aagaagaagaagaatttattaaactaagagcatattttatctgcaagtagttttaacctaagtaaatcacagtacttttctttgta gtaaaataatactttatttt (SEQ ID NO: 14)
SG05S34: ctgctactccagcacctgcaactgctcacactgtctctgtgctttagtctctttgtagcttctccactacctgaaattacattcta agcttacaggttagcctgttcattattcaactgtaccattaaaaatataaatttcatgagggaaggcaatttgtgtctgttttttac cacttgatcctcagtgcctaggaagtggctggaacatacaatgaattcaataaatatttgttaaatgaataaatagcttaaata atatattcactctgtgcagctaaacataaaactcagctttcattcaagaactattaaataattcactcacacaaaaatøxrtttgg gcaatgttgattttactaaagattttgtgtaggaaatggagaaaggatttatactggcactcttcattaaaaattgctcaacctc atagataacaaaaagaatgcaaattaaaagtgcaccaagaagcctgaggtattagaggttatgaggagtgaaGagaact tctgctacagattagagtgtggaggagggagttgccaagcacaaaacccaagtgcctgacataaagtatcaaagtgaga aaggtatttatgtaggaggaggcaacagtgtgagacttgttacctacagggatgttgatgaaatatgtaagtacattaagaa taacatgagccaggacttcgactgtgagagaagagtgttaaaaatatggaaagggagaaaattagaatggatctggtgat gttggatagtagttgaaggaataggtatcaactagtagttgtttatatatggagatagtaaccaatgtcagtatgtgtctgttta ctaagagggccttggaaaagacacaccccaatagccagcaacacacatagaattcaggcttcttl ttttcttttcctttttctt tτttttttctcattccgttgcccaggctggagtgccgtggtgcgatctcggctcactgcaccctccatctcccaggttcaagg (SEQ ID NO: 15) SG05S35: ctctaatattcctcctccctgacaaaggagtcagatactaataaagggagctgaggagattgtatttgaatcagacttaagat tgaatttttacactggagtgaaattagtttggctttgttttgtgtttttaataactaaaaatgatgagaaagtt^ aagaaattgtcttgagagactctctacaaagcactattgagaattgtgattggaggaaaaacaaacattactgtattactatat actaagtagaatgtactcaataatactggttatactaaaatgttgactttctaaagcaggctggaaggggtgtgtttttgggg aaaagttatacatagtcttcataatagtagacaatggaaaaaaacaacaacaacaaaaccaaagtgaaatatagggcttat gtaagcataxrcaaaataagxτaaaactatattcatatttgattgcattatataagttgaagacacaaacttatctatatAtttct caaaaggagacattxTtagtactataaagcttggctgtgtatatgatcttagtxτtcaaagcaccatttcatacttgtacttatga gcaagagaacaaactctcatttcatatccaccgtgtacgcgacggaatctagaccatagaaaatgttcagacagtgatttc aaattaatctatcacttcctacctgcttctgccctttgctatcatatatagcaatattttggtcattaatccatattaattaaagatc agagaaggtggatagxlttaattaxτtttacaatgtacatgtattaltttgtaattcagaaaaattaaactttaaaaagaattagtat tttattcaatgtctccctcttcataccaactcatgctttatttgcataagctcagtacccacacacaaaatctacaaatgctgag 
catctgttggttttgccagtgttgcttccattctccttcttctg^aaacggcatctctgttttcctt (SEQ ID NO: 16)
SG05S420: ttactctgcaaatctagccgtaggtggtgtcttaaaaggactttttcttccttcgaggaagtgatcaagcttcccccagcccc cacccgtgtcatcctggaaacaacttggttgtttttcggtgctaattaagcaatgtaggagaaagaagttgaggagacagc aaagcctattgattacattgtacttgcttctctacctcgctcttcaaaggattaaggttcctttgacctcgactgagttcatttcc aacctgcctacaaagtactggtttaaaaaaaaaaaatccaagcaaagtatgctcttgtctatgaacaatatttcttgtccggtt ttattacgttttctttggaaattcgttagattatgctcaacaatcggccgcttgcccaaccggtttctcccttcccccatctctctt cacacatcactttttcctcctcccatccaggcagttggtaaaacgctgcagtgccggtaacgcacgcttcgggctggtcG ctgcagtttatcctaacttggctttgcaaactccccacccagcccttgtaccctcgcaccccactctgccattatctttctagta gagtgcgctcggtaacgggctccagatagcacgtcgagggtatacggtagcctgcctaccaataggagcccacaatga agagaagtcaggttcagccggccagcagctccagtatctccttcccctagctctacacctactgtacccgggagcgagc gggcaagggagcgagcgcggcgcggcgcggcgcgggagggggcgcgcagggagggcggacagcagccgcgg cctgcgcctgcgcactggggttgtttttcacaaagctgctcttaaagtgagctggcagcttttagccgcccgtctttgtaagg agaccactgagacgagcgggagcgcggagcagcagcctctgctgccctgactttttaagaaatctcaatgaactatttgt agagaatcactgatccggcctgca (SEQ ID NO: 17) SG05S421: agccgtaggtggtgtcttaaaaggactttttcttccttcgaggaagtgatcaagcttcccccagcccccacccgtgtcatcc tggaaacaacttggttgtttttcggtgctaattaagcaatgtaggagaaagaagttgaggagacagcaaagcctattgatta cattg acttgcttctctacctcgctcttcaaaggattaaggttcctttgacctcgactgagttcatttccaacctgcctacaaa gtactggtttaaaaaaaaaaaatccaagcaaagtatgctcttgtctatgaacaatatttcttgtccggttttattacgttttc gaaattcgttagattatgctcaacaatcggccgcttgcccaaccggtttctcccttcccccatctctcttcacacatcacttttt cctcctcccatccaggcagttggtaaaacgctgcagtgccggtaacgcacgcttcgggctggtcgctgcagtttatcctA acttggctttgcaaactccccacccagcccttgtaccctcgcaccccactctgccattatctttctagtagagtgcgctcggt aacgggctccagatagcacgtcgagggtatacggtagcctgcctaccaataggagcccacaatgaagagaagtcaggt tcagccggccagcagctccagtatctccttcccctagctctacacctactgtacccgggagcgagcgggcaagggagc gagcgcggcgcggcgcggcgcgggagggggcgcgcagggagggcggacagcagccgcggcctgcgcctgcgc actggggttgtttttcacaaagctgctcttaaagtgagctggcagcttttagccgcccgtctttgtaaggagaccactgaga 
cgagcgggagcgcggagcagcagcctctgctgccctgactttttaagaaatctcaatgaactatttgtagagaatcactga tccggcctgcaagcattttgcacggc (SEQ ID NO: 18)
SG05S36: gtttcaatgacttgaaactgggagctctcaaactagcctggtaactgagaccccgccccgggcgcccgggataggcgg agccaggtttgttagcctgttgctaaggcgatggcaggcgctgattggattgggcgcgggcgggtggggagagtgagc tcgcgccgcggctggggcgaggctaggaaaccggcgtgcgccggtaaaggttcctccagagcgacccccacccagc ccgtactcacgtttgttcccgggcttctgggggcgcccagcccgtgtcaacatcatgttctgtcgcccctcttttgggcctc ctaagggccatcgaaaggaaacgacttcaggctttaactgagtagagggcagacgctcctgccggccttacaggtgctg gatggccatggacccaacxTtagcagaaaatccttttcagatggaagttggatgggaagggttggcaggacccccgttag ctatcttccgagtgggaggtgtcagggGactcctgtgaacgcttgttttccccttagctgaaacaaagcttgttccccgacc tctccagtcgccacactgtctaccttgagttgacctgaggggacagagcaaaggctggcccgaggaagaaggaaagtg attccaatatcgaaatgttgggggaggggggggcttcgacgggtatcctctgccccctcccccatcgttttccgaagcctt cgggcccccgcctccccttcgccaccccaatatcccttgccgaaactaaaggtgtcagagtggcccgagccagatgga accggtcgccggctggggcgtgtaaagcggaagcgccgcggctcgctcgccccgccccccggccagcaggctagg ttgcgcgggcacgggcgcgagcgccgcgcgacgtatgccgtatgtatagcggggcgcgcgggcggcagcggcggc ggcggcggcgcgcgggcaggccggctgggatctgctcgcgccggtttaggccggtcctc (SEQ ID NO: 19) SG05S444: ttgtgatcattttagatcaataaataattatttttagttcaataaataattatttttagttataaatctgacattaaaatcaatgtgtgg cttttLttLLttLLLtttttactccgtctaxTtgacaaggttttctagtggtxτtggagtatctattctattcacatctggtattattattccc tataaactgacatattttaagagggttgcatgaacagtttttctttgatacattatagtcttaatgtgataaatttaagcaaaccgt tggagacaaaggatagcattgtcttaccatgatgtgagacccgagccttttgagtatatatactattttataaacagatgcttt gcagtttgggtcattttccattagaaaaaataaaaagcctgtggggtgtgagtattcattatccccattttacagataaggca gctgacgacaggatttttattttgtccagaggaaatgatgaagttagaattagaactcagatCtctcagactccaaagcttgt tttgtggtcttaattgttatgtaagcatctttaaagttatgccaccctgcggagtatgtaaatgttgaataataaattctttgttttct ttttatgtttgttttagatgatgataattccgaagaaggcttccacactattcattcaggaaggcatgaatatgcattcagcttcg agcttccacagacgtaagtattataaataataggcatttttatagatacttatgatcagagcattattttaaaattaagtattctaa alttaaacaxτtttagttgaaaaaaaaacccttttcattactgaaacttgctagaggaaaacaatatattaaacataagaagcta 
ttttttaatctagcgtggcaaaattatagaaagcatactcaattttcagtattctttcctaaagtttatgaggxτtr^ tagggtttcctttaacatgagtgaattggaagttgctagtcgaat (SEQ ID NO: 20)
SG05S422: atcaagaggactatatcxtttcttttgaagcccgtaaagttagagaacagccaaacaagaattagaacatgagtaaactga gagatacaaaagtagxτtttcagtaaacttaggaaatgtgaaatttttgaaactataatgactattccaggccttttctagacctt agttagttgctattaaaagatgataattcccggctaatgagaagtggag actatagggaatttggtctgaatgtttaaccagt gtctagaatttatcagtagaaatagttaaaccattctcaactttgtgtttgaattcttggccttttcaaaaaatgcccattaaactt tcatgattttgccaccatcattaaaagtagcctcctcttagtttaaatggattgtattcttctggctttttaaagcaaaattttaaag ataxτtattgtttattcttagttggaaataggaagttttttccccagtgacttatccctttggaaattttttaAattatatgtttgatac ctcattggaacataaacaatacaatatattaacacttctattctgacctactgtgtaattgttagaaagaggtataaaattaatat attgacctttatattctgcacagaccactcgctacctcattcgaaggccgacatggcagtgtgcgctattgggtgaaagcc gaattgcacaggccttggctactaccagtaaaattaaagaaggaatttacagtctttgagcatatagatatcaacactccttc attactggtaagaattgactgaattatttattctttgtttttggcatataaatattaaagaataatcaatcacctaaactatgcagtt gtcataatttaaaaatactggttttcttttaaaatttcataaatatacattttcttaaatacagtcaagaacactgaaattaatgaaa aatatgctcatatttttgacattgtagtatacctttaagtaaatataaaactaagacc (SEQ ID NO: 21)
SG05S423: tcaagaggactatatcxτtttcttttgaagcccgtaaagttagagaacagccaaacaagaattagaacatgagtaaactgag agatacaaaagtagtttttcagtaaacttaggaaatgtgaaatttttgaaactataatgactattccaggccttttctagacctta gttagttgctattaaaagatgataattcccggctaatgagaagtggagtactatagggaatttggtctgaatgtttaaccagt gtctagaatttatcagtagaaatagttaaaccattctcaactttgtgtttgaattcttggccttttcaaaaaatgcccattaaactt tcatgatlttgccaccatcattaaaagtagcctcctcttagtttaaatggattgtattcttctggcxτtttaaagcaaaattttaaag ataxxattgtttattcttagttggaaataggaagttttttccccagtgacttatccctttggaaattttttaaAttatatgt^ ctcattggaacataaacaatacaatatattaacacttctattctgacctactgtgtaattgttagaaagaggtataaaattaatat attgacctttatattctgcacagaccactcgctacctcattcgaaggccgacatggcagtgtgcgctattgggtgaaagcc gaattgcacaggccttggctactaccagtaaaattaaagaaggaatttacagtctttgagcatatagatatcaacactccttc attactggtaagaattgactgaattatttattctttgtttttggcatataaatattaaagaataatcaatcacctaaactatgcagtt gtcataatttaaaaatactggttttcttttaaaatttcataaatatacattttcttaaatacagtcaagaacactgaaattaatgaaa aatatgctcatatttttgacattgtagtatacctttaagtaaatataaaactaagacca (SEQ ID NO: 22)
SG05S424: agccaaacaagaattagaacatgagtaaactgagagatacaaaagtagtxτttcagtaaacttaggaaatgtgaaatttttg aaactataatgactattccaggccttttctagaccttagttagttgctattaaaagatgataattcccggctaatgagaagtgg agtactatagggaatttggtctgaatgtttaaccag gtctagaatttatcagtagaaatagttaaaccattctcaactttgtgtt tgaattcttggccttttcaaaaaatgcccattaaactttcatgattttgccaccatcattaaaagtagcctcctcttagtttaaatg gattgtattcttctggctttttaaagcaaaattttaaagatatttattgtttattcttagttggaaataggaagttttttccccagtg cttatccctttggaaattttttaaattatatgtttgatacctcattggaacataaacaatacaatatattaaCacttctattctgacc tactgtgtaattgttagaaagaggtataaaattaatatattgacctttatattctgcacagaccactcgctacctcattcgaagg ccgacatggcagtgtgcgctattgggtgaaagccgaattgcacaggccttggctactaccagtaaaattaaagaaggaat ttacagtctttgagcatatagatatcaacactccttcattactggtaagaattgactgaattatttattctttgtttttggcatataa atattaaagaataatcaatcacctaaactatgcagttgtcataatttaaaaatactggttttcttttaaaatttcataaatatacatt ttcttaaatacagtcaagaacactgaaattaatgaaaaatatgctcatatttttgacattgtagtatacctttaagtaaatataaa actaagaccagtatatgatatgaagtatctcttactcagagtatgacagccataacaa (SEQ ID NO: 23)
SG05S425: agagcagtagtatcttttattataatacaatatcacttacttgacaatatcataatctaaaatttgtactatactttgttattccaata gagggatagttgtcaaataatggggaatcattagtttcattøaaaaactccttttttaaagaaatctatxxttaatt gaaaataattcaaatttgacttaaaaclttatttgaxτtaaacaaatggtaacatctttaaagcacataagaaatgggcattttcc ttttttacatattgaxrttttaatagtctcagagaaagatgttaacatttggaattataatcagattgcttt^ gcagl taaatggaatgtggaaagggttttttttattt^ gtfratgggataggctgtaaaccctttaccttttttttctaattgaatggaaatttatatGaaaataaaagtaatggtataattcaa aaccatcggttøgttaaaaaataagaagtatgtctggatacaaaaaaaaagcaaaaccagttgtaatttttaatt^ tttagggttatattttaaaatattctggtattgattgtaccaatccatacactgacctttttctcttaatttttttcttacagtcacccc aagcaggcacaaaagaaaagacactctgttgctggttctgtacctcaggcccaatatccttaagtgccaaaattgaaagg aagggctataccccaggtatgtaagaggtagattcccacaaaaaacttttcttcacttagcxlttgtgtttaaatttataatatga ctactaaatataaacataatatag1:attcttttgtgaaataacaaattgtgcttatccaxτtgtttttttcacaaagtagaatgtagtt tcctttttcaaatagctactgattagttcatttctatgactcaag (SEQ ID NO: 24)
SG05S37: aggcagxttaaatggaatgtggaaagggttxτttttax tggttggtt^ tatgXTatgggataggctgtaaaccctttacctttttrttctaattgaatggaaatttatatgaaaataaaagtaatggtataattc aaaaccatcggttagttaaaaaataagaagtatgtctggatacaaaaaaaaagcaaaaccagttgtaatttttaattttccata aatttagggttatattttaaaatattctggtattgattgtaccaatccatacactgacctttttctcttaatttxτttctt ccaagcaggcacaaaagaaaagacactctgttgctggttctgtacctcaggcccaatatccttaagtgccaaaattgaaa ggaagggctataccccagg^atg aagaggtagattcccacaaaaaacttttcttcacttagcttttgtgtTtaaatttataat atgactactaaatataaacataatatagtattcttttgtgaaataacaaattgtgcttatccatttgtttttttcacaaagtagaatg tag1ttccxtttcaaatagctactgattagttcatttctatgactcaagtta1ttgcagtt tgcactttcttaclttagctttaagaatcttgacttcttttgtt atcaattcagatatttgctgagattgagaactgctcttcccgaatggtggtgccaaaggcagccatttaccaaacacaggc cttctatgccaaagggaaaatgaaggaagtaaaacagcttgtggctaacttgcgtggggaatccttatcatctggaaagac agagacgtggaatggcaagttgctgaaaattccaccagtttctccctctatcctcg (SEQ ID NO: 25) SG05S426: gtaaatagtgtgagctaaaaaactatgtttatgggataggctgtaaaccctttaccttttttttctaattgaatggaaatttatatg aaaataaaagtaatggtataattcaaaaccatcggttagttaaaaaataagaagtatgtctggatacaaaaaaaaagcaaa accagttgtaatttttaattttccataaatttagggttatattttaaaatattctggtattgattgtaccaatccatacactgacctttt tctcttaatttttttcttacagtcaccccaagcaggcacaaaagaaaagacactctgttgctggttctgtacctcaggcccaat atccttaagtgccaaaattgaaaggaagggctataccccaggtatgtaagaggtagattcccacaaaaaacttttcttcact tagcttttgtgtttaaatttataatatgactactaaatataaacataatatagtattcttttgtgaaataacaaatTgtgcttatcca tttg ttttttcacaaagtagaatgtagtttcctttttcaaatagctactgattagttcatttctatgactcaagtt gggacatcttttcatgtatgttagttgcactttcttactttagctttaagaatcttgacttcttttgttgXTtttttctaacagtagatat gtatatatttttccctctctaggtgaatcaattcagatatttgctgagattgagaactgctcttcccgaatggtggtgccaaagg cagccatttaccaaacacaggccttctatgccaaagggaaaatgaaggaagtaaaacagcttgtggctaacttgcgtggg gaatccttatcatctggaaagacagagacgtggaatggcaagttgctgaaaattccaccagtttctccctctatcctcgact gtagtataatccgcgtggaatattcactaatggtatgtacacatttaagaggttttttcct (SEQ ID NO: 26)
SG05S38: ttataccttatatgtaaatgtactatacagaattatacataaaagagaaactxttcatgtatgtaagtttaaaaatgaagtaaatg ggggtttcaaataacattaaaattggttatgagtttttgaaaaggaaatcatacttggcattctaaacttaatatttctttgcaatg tttaggtatatgtggatattcctggagctatggatttatttcttaatttgccacttgtcatcggtaccattcctctacatccatttgg tagcagaacctcaagtgtaagcagtcagtgtagcatgaatatgaactggctcagtttatcacttcctgaaagacctgaaggt aatttgataatacacgtctaagcttaatctcttactactattatcaagaaaattattttgcctctgatattttatgaccctagatgaa cataatcl ttcctaaagaaagtagatgtgcccctaatattcagttttgattatatgtcttcatagtttTgatcaccaaacaataa ttattttaccttgacgttgcxτtttaatacctactattctgtgtaxrttgacatgcataagaacatgctgttcgaattgcgtg atcttt ttccttcagcaccacccagctatgcagaagtggtaacagaggaacaaaggcggaacaatcttgcaccagtgagtgcttgt gatgactttgagagagcccttcaaggaccactgtttgcatatatccaggagtttcgattcttgcctccacctctttattcagag gtaagtacctactccttaagtatgaaatgaactggctatcttgggaaatagcattgcaggcalttttttttcttaatgttattagat cttgattatcatgttaacagaacaaataatttaaaacaacagattaaaagttaatattaggttcatcaatatttactaagtttgat gcagaggtaggactgttcacttattagggaattaagataatgatctaacctcatgtt (SEQ ID NO: 27) SG05S39: agtgtaagcagtcagtgtagcatgaatatgaactggctcagtttatcacttcctgaaagacctgaaggtaatttgataataca cgtctaagcttaatctcttactactattatcaagaaaattattttgcctctgatattttatgaccctagatgaacataatcltttccta aagaaagtagatgtgcccctaatattcagttttgattatatgtcttcatagttttgatcaccaaacaataattattttaccttgacg ttgcxτtttaatacctactattctgtgtattttgacatgcataagaacatgctgttcgaattgcgtgtatcttttt cccagctatgcagaagtggtaacagaggaacaaaggcggaacaatcttgcaccagtgagtgcttgtgatgactttgaga gagcccttcaaggaccactgtttgcatatatccaggagtttcgattcttgcctccacctctttattcagaggtaAgtacctact ccttaagtatgaaatgaactggctatcttgggaaatagcattgcaggcatttttttttcttaatgttattagatcttgattatcatgt taacagaacaaataatttaaaacaacagattaaaagttaatattaggttcatcaatatttactaagtttgatgcagaggtagga ctgttcacttattagggaattaagataatgatctaacctcatgttgaattactaatttttttttctcttaaacttttatcttgagacac attttttgaaggaagaattttgtatatccaagcatcaataggaagaactagcaxτttgcaaacactgattaaatagtagaattta tcttattcatgatgcttttccaattgcatatxrtgacattctcagtcataagaaataatgtatttgtgataacatcttaaaacaacat 
tcccacatggttagactattttcgttcatctcatattgatggcagctaagcacttaa (SEQ ID NO: 28)
SG05S427: agcttaatctcttactactattatcaagaaaattattttgcctctgatattttatgaccctagatgaacataatcttttcctaaagaa agtagatgtgcccctaatattcagttttgattatatgtcttcatagttttgatcaccaaacaataattattttaccttgacgttgcttt ttaatacctactattctgfgtattttgacatgcataagaacatgctgttcgaattgcgtgtatctttttccttcagcaccacccag ctatgcagaagtggtaacagaggaacaaaggcggaacaatcttgcaccagtgagtgcttgtgatgactttgagagagcc cttcaaggaccactgtttgcatatatccaggagtttcgattcttgcctccacctctttattcagaggtaagtacctactccttaa gtatgaaatgaactggctatcttgggaaatagcattgcaggcatttttttttcttaatgttattagatcttgaTtatcatgttaaca gaacaaataatttaaaacaacagattaaaagttaatattaggttcatcaatatttactaagtttgatgcagaggtaggactgtt cacttattagggaattaagataatgatctaacctcatgttgaattactaatttttttttcto^ gaaggaagaattttgtatatccaagcatcaataggaagaactagcattttgcaaacactgattaaatagtagaatttatcttat tcatgatgcltttccaattgcatattttgacattctcagtcataagaaataatgtatttgtgataacatcttaaaacaacattccca catggttagactattttcgttcatctcatattgatggcagctaagcacttaaatgtaaataaaacatggcccttaggagaaacc cacagattactgaggaaaacaagctaagaagcaaaatattatgtaataatgtgataa (SEQ ID NO: 29) SG05S428: accctagatgaacataatcxτttcctaaagaaagtagatgtgcccctaatattcagttttgattatatgtcttcatagttttgatca ccaaacaataattattttaccttgacgttgctttttaatacctactattctgtgtattttgacatgcataagaacatgctgttcgaat tgcgtgtatctttttccttcagcaccacccagctatgcagaagtggtaacagaggaacaaaggcggaacaatcttgcacca gtgagtgcttgtgatgacx tgagagagcccttcaaggaccactgtttgcatatatccaggagtttcgattcttgcctccacct clτtattcagaggtaagtacctactccttaagtatgaaatgaactggctatcttgggaaatagcattgcaggcatttttttttctt aatgttattagatcttgattatcatgttaacagaacaaataatttaaaacaacagattaaaagttaatattaggTtcatcaatatt tactaagtttgatgcagaggtaggactgttcacttattagggaattaagataatgatctaacctcatgttgaattactaattttttt ttctcttaaacttttatcttgagacacatxttttgaaggaagaattttgtatatccaagcatcaataggaagaactagcattttgc aaacactgattaaatagtagaatttatcttattcatgatgcttttccaattgcatattttgacattctcagtcataagaaataatgta tttgtgataacatcttaaaacaacattcccacatggttagactattttcgttcatctcatattgatggcagctaagcacttaaatg taaataaaacatggcccttaggagaaacccacagattactgaggaaaacaagctaagaagcaaaatattatgtaataatgt 
gataaaatgcacagctcatttgagaagcagcttattaggtcttgggcatcaggtggggctt (SEQ ID NO: 30)
SG05S429: ctttcagtaaaaaatatattaaacccaacaatttgctlttctatgaccattctctttgtacttgagaagcaacttttcattaaaaata tgtgcatataaaagtcactctgaagataaagaaggtgggaggaaacctcttgagtgttttgggttcagcagttttatttcctg atacactgcgtttttggccctctg gggattttacacccatgagttattacctcacataagcacattgttcatgtgagtaaacaa attgtttgcacattgtataaacaaagagcattgtttatacgatttcatttttccaaagtcagaggaagcgatgagttaactttttt gtagaaggatgaaaggggtgctttacaaaagaagatgattgctatgaagatattctacgttgtgtatttaggtcttaattaagt actactaaagtactgagtaaaagtcttcattcactgataaaatτttactgttaatttttgtgaaataggggactAttcagacatg cctttataaagcttgttaatctagtaattaaatatcaggtactgtaggtaccaacaccctttcctccccacatagcatttcacgt atgaaaagagaataattggatataatatgctaacaagctgtgtttgctctctgattacatttcttgactagtattcagaattgtag tcaxτtgctacattcttcctagacctcatgaaataactctgtaaactttttttctcatttctttctagattgatccaaatcctgatcag tcagcagatgatagaccatcctgcccctctcgttgaaggaacacttggttgaatcaagttgatgtgggttccgaactgtatct cttccggctgaggacagagaagtatcttggagacacgtttcagaggaagtggaattacttttgcccagaaaaatggcgaa tacatgaaacaaccagtgatcatgctttagaagcctacagcaacattctgagactgctccaacatgc (SEQ ID NO: 31) SG05S430: ataagcacattgttcatgtgagtaaacaaattgtttgcacattgtataaacaaagagcattgtttatacgatttcatttttccaaa gtcagaggaagcgatgagttaacttttttgtagaaggatgaaaggggtgctttacaaaagaagatgattgctatgaagatat tctacgttgtgtatttaggtcttaattaagtactactaaagtactgagtaaaagtcttcattcactgataaaattttactgttaaiω tgtgaaataggggactattcagacatgcctttataaagcttgttaatctagtaattaaatatcaggtactgtaggtaccaacac ccxrcctccccacatagcatttcacgtatgaaaagagaataattggatataatatgctaacaagctgtgtttgctctctgatta catttcttgactagtattcagaattgtagtcatitgctacattcttcctagaccte^ ctttctagattgatccaaatcctgatcagtcagcagatgatagaccatcctgcccctctcgttgaaggaacacttggttgaat caagttgatgtgggttccgaactgtatctcttccggctgaggacagagaagtatcttggagacacgtttcagaggaagtgg aattacttttgcccagaaaaatggcgaatacatgaaacaaccagtgatcatgctttagaagcctacagcaacattctgaga ctgctccaacatgcttgaagatctaagcttttctcttttaaaactggcacatactcagagcagtcttcttagcctatggtcgtac gtgtcaagacatcacgttgtaaagagggatgatttccttcttttgatttgaaaatttgcacatgctcaatgcttacattgtgcgg 
ttcgacgtcactacagcttctttttttttttttttttttttctatttttgccagactcttgatactcttaa (SEQ ID NO: 32)
SG05S431: tgtatttaggtcttaattaag^actactaaagtactgagtaaaagtcttcattcactgataaaattttactgttaatttttgtgaaat aggggactattcagacatgcctttataaagcttgttaatctagtaattaaatatcaggtactgtaggtaccaacaccctttcct ccccacatagcaxl cacgtatgaaaagagaataattggatataatatgctaacaagctgtgtttgctctctgattacatttctt gactagtattcagaattgtagtcatttgctacattcttcctagacctcatgaaataactctgtaaacttttxrtctcatttctttctag attgatccaaatcctgatcagtcagcagatgatagaccatcctgcccctctcgttgaaggaacacttggttgaatcaagttg atgtgggttccgaactgtatctcttccggctgaggacagagaagtatcttggagacacgtttcagaggaagtGgaattact tttgcccagaaaaatggcgaatacatgaaacaaccagtgatcatgctttagaagcctacagcaacattctgagactgctcc aacatgcttgaagatctaagcttttctcttttaaaactggcacatactcagagcagtcttcttagcctatggtcgtacgtgtcaa gacatcacgttgtaaagagggatgatttccttcttttgatttgaaaatttgcacatgctcaatgcttacattgtgcggttcgacg tcactacagcttctltttttttttttttLtLtttctal tttgccagactcttgatactcttaaaacttgtttgtggtcagcacaacaagga acaaaacaaagctttgaaaaaactttaacatgaaaaaacgcactgacatttttttttatttaatatagcctggactttacctgcg tatgcacatgctcagaattgtctactaggctgactatgtatcacctcttcagcttggatccaat (SEQ ID NO: 33) SG05S432: gaaatttggaaacgggacatacacaaaagttacacacccacattccctttttatcatgacatacaagaagaaactagcaga gctaagaatggagtgaagaaaggcagtatggcaggcaccagcaaagagttgagggctgttgctcttaaaaattatttttttt attattattttgaaagtatggaagttttccattcactggggaaaggagggaaaagtgcatttax tttatacagagttacttaatt acctccaaaacacatatgttggaaatcgcttttgctggtgcaaagtatattaatgagcaggaatacatacattgaggttatga atagagagctcaatttgtacctttgctgtcttgctcaagcttggtatggcatgaaaactcgactttattccaaaagtaacttcaa aatttaaaatactagaacgtttgctgcgataaatcx ttggattttt acaaxτtgctaaacatgagaaatcactcactttgattatgtatagattacataggaagaacaatcacatcagtaagttatagttt atattaaaggtaattttctgttggctcataacaaatataccagcattcatgatagcatttcagcattttccaaggtaccaagtgt acttattttgttgttgttgttgttgttgtattttagaaggaattcag acatacgtgtaaaatgggtgttacatctatcctgccatttaaccccacagttaataaagtggctgaaaataatagtagctctg gcttggtgcttgacctggttaaatactgtcttaaagctcatacaaaacaaataggcttttccataagtggcctttaagaaaaca tggaagacaattcatgtttgacaaatgctgacagggtgaagaaagcccagtgtaaaaatgaatcgcgtttt (SEQ ID NO: 34)
SG05S433: agcaggaatacatacattgaggttatgaatagagagctcaatttgtacctttgctgtcttgctcaagcttggtatggcatgaa aactcgactttattccaaaagtaacttcaaaatttaaaatactagaacgtttgctgcgataaatcttttggatttttgtgtttttcta atgagaatactgtttttcattacctaaagaacaatttgctaaacatgagaaatcactcactttgattatgtatagattacatagg aagaacaatcacatcagtaagttatagtttatattaaaggtaattttctgttggctcataacaaatataccagcattcatgatag catttcagcattttccaaggtaccaagtgtacttattttgttgttgttgttgttgttgtattttagaaggaattcagctc^ aaagaaaaccagcatctctgatgttgcaacatacgtgtaaaatgggtgttacatctatcctgccatttaAccccacagttaa taaagtggctgaaaataatagtagctctggcttggtgcttgacctggttaaatactgtcttaaagctcatacaaaacaaatag gcttttccataagtggcctttaagaaaacatggaagacaattcatgtttgacaaatgctgacagggtgaagaaagcccagt gtaaaaatgaatcgcgttttaagtgattcggttaaagagtttgggctcccgtagcaaactaatactagataataaggaaatg ggggtgaaatatttttttattgttgaatcattttgtgaatgtccccctcaaaaaaagctaatggaatatttggcataaagggcat ttggtggttttaxTtttgtttgagggggattgtcagaaaatcccttttctctcttacgtctaactgactagggaacaattgttgata tgcatagcattggaatacttgtcattatatactcttacaaataacacatgaagcaagaatgaccaatatt (SEQ ID NO: 35) SG05S434: ttacctaaagaacaatttgctaaacatgagaaatcactcactttgattatgtatagattacataggaagaacaatcacatcagt aagttatagtttatattaaaggtaattttctgttggctcataacaaatataccagcattcatgatagcatttcagcattttccaag gtaccaagtgtacttattttgttgttgttgttgttgttgtax^agaaggaattcagctctgatgtttttaaagaaaaccagcatct ctgatgttgcaacatacgtgtaaaatgggtgttacatctatcctgccatttaaccccacagttaataaagtggctgaaaataat agtagctctggcttggtgcttgacctggttaaatactgtcttaaagctcatacaaaacaaataggcttttccataagtggcctt taagaaaacatggaagacaattcatgtttgacaaatgctgacagggtgaagaaagcccagtgtaaaaatgaatcGcgttt taagtgattcggttaaagagtttgggctcccgtagcaaactaatactagataataaggaaatgggggtgaaatatltttttatt gttgaatcattttgtgaatgtccccctcaaaaaaagctaatggaatatttggcataaagggcatttggtggttttatttttgtttg agggggattgtcagaaaatcccttttctctcttacgtctaactgactagggaacaattgttgatatgcatagcattggaatact tgtcattatatactcttacaaataacacatgaagcaagaatgaccaatattctgataattggcactggatcacaaaatgtgata aaactttaaatgtataaaactttatcaaataaagttttattttcccctttaaaatgtatttctttagaggcattacttttttaaaa 
ggtcaattcctgacataagatgtgaggttcacagttgtattccagtattcaagatagattcctgat (SEQ ID NO: 36)
SG05S435: ctgatgttgcaacatacgtgtaaaatgggtgttacatctatcctgccatttaaccccacagttaataaagtggctgaaaataat agtagctctggcttggtgcttgacctggttaaatactgtcttaaagctcatacaaaacaaataggcttttccataagtggcctt taagaaaacatggaagacaattcatgtttgacaaatgctgacagggtgaagaaagcccagtgtaaaaatgaatcgcgtttt aagtgattcggttaaagagtttgggctcccgtagcaaactaatactagataataaggaaatgggggtgaaatatttttttattg ttgaatcaxτttgtgaatgtccccctcaaaaaaagctaatggaatatttggcataaagggcatttggtggttttatttttgttt gggggattgtcagaaaatcccttttctctcttacgtctaactgactagggaacaattgttgatatgcatagcattggaataCtt gtcattatatactcttacaaataacacatgaagcaagaatgaccaatattctgataattggcactggatcacaaaatgtgata aaactttaaatgtataaaactttatcaaataaagttttattttccccxτtaaaatgta^ ggtcaattcctgacataagatgtgaggttcacagttgtattccagtattcaagatagattcctgatttttcaattaggaaaagta aaatccaaaatgttagcaaaacaaagtgcaatattaaatgtttgctttatagattatattctatggctgtttgtaatttctcxτttttt cctttxτtaxτtggtgctgaatatgtccttgtaggctctgttttaagaaaacaatatgtgggaaatgatttaatttttcctattgctct tccttgtggaaaataaagtgttttgtttttttctgttttgtataattgtttggagatttat (SEQ ID NO: 37) SG05S436: gtgcttgacctggttaaatactgtcttaaagctcatacaaaacaaataggcttttccataagtggcctttaagaaaacatgga agacaattcatgXTtgacaaatgctgacagggtgaagaaagcccagtgtaaaaatgaatcgcgttttaagtgattcggttaa agagx^gggctcccgtagcaaactaatactagataataaggaaatgggggtgaaatatxttttattgttgaatcattttgtga atgtccccctcaaaaaaagctaatggaatatttggcataaagggcalxtggtggttttatttttgtttgagggggattgtcaga aaatcccttttctctcttacgtctaactgactagggaacaattgttgatatgcatagcattggaatacttgtcattatatactctta caaataacacatgaagcaagaatgaccaatattctgataattggcactggatcacaaaatgtgataaaactttaaatgtAta aaactttatcaaataaagttttattttcccctttaaaatgtattt^ taagatgtgaggttcacagttgtattccagtattcaagatagattcctgatttttcaattaggaaaagtaaaatccaaaatgtta gcaaaacaaagtgcaatattaaatgtttgctttatagattatattctatggctgtttgta^ gaatatgtccttgtaggctctgttttaagaaaacaatatgtgggaaatgatttaal tttcctattgctcttccttgtggaaaataa agtgttttgtttttttctgxτttgtataattgtttggagattt^ aaattaaactattaaactctattttaagccattctgggtaagaattgtatcta (SEQ ID NO: 38)
SG05S441: tactgtcttaaagctcatacaaaacaaataggcttttccataagtggcctttaagaaaacatggaagacaattcatgtttgac aaatgctgacagggtgaagaaagcccagtgtaaaaatgaatcgcgttttaagtgattcggttaaagagtttgggctcccgt agcaaactaatactagataataaggaaatgggggtgaaatatttttttattgttgaatcattttgtgaatgtccccctcaa agctaatggaatatttggcataaagggcatttggtggttttattx gtttgagggggattgtcagaaaatccctttt gtctaactgactagggaacaattgttgatatgcatagcattggaatacttgtcattatatactcttacaaataacacatgaagc aagaatgaccaatattctgataattggcactggatcacaaaatgtgataaaactttaaatgtataaaacrttatcaaataAag ttttattttcccctttaaaatgtatttctttagaggcattacttttttaaaaatattggtcaattcctgacataagatgtgaggttcac agttgtattccag attcaagatagattcctgatttttcaattaggaaaagtaaaatccaaaatgttagcaaaacaaagtgcaa tattaaatgtttgctttatagattatattctatggctgtttgtaatttctctttttttccltttttatttggtgctgaatatgtcc ctctgttttaagaaaacaatatgtgggaaatgatttaatttttcctattgctcttccttgtggaaaataaagtgtt^ tttgtataattgtttggagalttatttgaatcttgatcatattagtaactcaccatacatgcaaacacattaaattaaactattaaac tctattttaagccattctgggtaagaattgtatctaggtctgtactaaagaaaa (SEQ ID NO: 39) SG05S442: gttcacagttgtattccagtattcaagatagattcctgatttttcaattaggaaaagtaaaatccaaaatgttagcaaaacaaa gtgcaatattaaatgtttgctttatagattatattctatggctgtttgtaatttctctτttttt tgtaggctctgtrttaagaaaacaatatgtgggaaatgatttaatt^ tttctgttttgtataattgtttggagalttatttgaatcttgatcatattagtaactcaccatacatgcaaacacattaaattaaacta ttaaactctat ttaagccattctgggtaagaattgtatctaggtctgtactaaagaaaattactgaccaactcctaggctaaa gtagtcctcctg^ctcaacctccgaagtagctaggactacagg gcacaccaccacacccaCctaaxτttctaatttttttgt agtcgtggtctcactatgttacccaagctgatctcaaactcgggcctcaagcgatcctcccaacttggcttcacaaagcact gggattacaggcgtgagacaccatgtccagactctaaatttcaatcttaaatatgaagcaaacttaaccaggtaaaataatt cactttcaggcaccaattccttctcactgtgttgtgttcaaattaccagtattt^ ttttaagagagtgtctcactttgtcacccaggctgcagtgtagtggcaggatctcggcccacttcagccttgacttcccagg ttcaagcaatcctcctgcctcagacccccaagtagctgagactacaggcatgccccaccatgcccagcttax tttgtgtttt ttgtggagatgagattttgccatgttgcccaggctggtctcaaactcctggactcaagtgat (SEQ ID NO: 40)
SG05S443: tcacagttgtattccagtattcaagatagattcctgal tttcaattaggaaaagtaaaatccaaaatgttagcaaaacaaagt gcaatattaaatgtttgctttatagattatattctatggctgtttgtaattt^ taggctctgttttaagaaaacaatatgtgggaaatgaxrtaatxTttcctattgctcttccttgtggaaaataaagtgtltt tctgttttgtataattgtttggagatttatttgaatcttgatcatattagtaactcaccatacatgcaaacacattaaattaaactatt aaactctattttaagccattctgggtaagaattgtatctaggtctgtactaaagaaaattactgaccaactcctaggctaaagt agtcctcctgtctcaacctccgaagtagctaggactacaggtgcacaccaccacacccaccTaattttctaatttttttgtag tcgtggtctcactatgttacccaagctgatctcaaactcgggcctcaagcgatcctcccaacttggcttcacaaagcactgg gattacaggcgtgagacaccatgtccagactctaaatttcaatcttaaatatgaagcaaacttaaccaggtaaaataattca ctttcaggcaccaattccttctcactgtgttgtgttcaaattaccagtatttcaagagtacataaacataacatagttttttttttttt taagagagtgtctcactttgtcacccaggctgcagtgtagtggcaggatctcggcccacttcagccttgacttcccaggttc aagcaatcctcctgcctcagacccccaagtagctgagactacaggcatgccccaccatgcccagcttatttttgtgttttttg tggagatgagattttgccatgttgcccaggctggtctcaaactcctggactcaagtgatcc (SEQ ID NO: 41) SG05S40: ttccattacaaaataatgacataattctacctcttcattttttgatatcaaagtcaatacagaagggagacaaatgtttccaact agacaaaaacl tattaatatagtttattttgacttttgagaaatataatctgcaatattttaacaaaaaataacagaaaaataat ggggaaatgaaataaatgaatgaatatggaaggatcactgcaaatccaagacaactagtggtttgaaaatccaaaggtat gtctacctggaaatggaaccagtatcactctatttgactgttagacagcataaagttaaacatataaggaaaaaaaattttttg aaccaccttcgcaataccaacttaaatatatgatatcaagatagtatatcttgacatagcaaaaaagcaataaagttgctttg gaatagtttctagcatttacatgtcaaaaatatttatcctaaggtaattcaagaaaacaaatgtttacgatgtttatattacAaat ctctaatacagcatagctaattctaactagaaatgattttttgctacttctgaattcaagacctagataagttctcacatcaatttc ctgtccaaataataattactacttctggcattggtaggtctctggggaagttaataagacagtaagttagaacataactttgta caatttagaagtctgtttactagttacataaaacagtcttaccagtgaatagaaggaaatgagagattaaaatgtactctctta attttltttttttttttttttgacagagtctcgctctgtcacccaggctggagtgcagtggcgcaatcttggctcactgcaacctct gcctcccaggttcaagcaattctcctgcctcagcctcctgttagctgggactacaggtgccagccaccacgcatggctaat 
ttttgtattttcagtagagacggggtttcaccatgttggtcaggctggtctcgaactcctgacctcaggtgatc (SEQ ID NO: 42)
DG5S1396:
ACCCAAGGGTCAGAAGGTGTTTGTTTGTTTTCTTCTGTGTGTATGTATGG TI JΠΓTGTTTGTTTGTTTGTTTGTTTGTTTGAGACAGAGTCTTGCTCTCTCA CCAAGGCTGGAGGACAGTGGCATTATCTCGGCTCACTGCTACCTCCACCT CCTGGGTTCAAGCAATTCTCCTGTCTCAGCTTCCTGAGTAGCTGGGACTA CAGGTGCCTGCCACCATGCCCGGCTAATTTTTGTATTTTTAGTGGAGACA GGGTTTCAACATATTGGTCAGGCTGGTCTCAAACTCCTGACCTCAGGTGA TCTGCCCACCTTGGCCTGCCAAAGTGCTGAGATTACAGGTGTGAGCCACC GCGCCTGGCCATATGTGTGTGTTTTTAAATAAATTTTGAACACAAACAGA GAAAAGAGAATGGGTGGTGCAG (SEQ ID NO: 43)
DG5S744:
CCCTCTGGAAACATCAGCTTGTGTTACTGGATTTTCCAACTTTTTTCAAA AGCTGGAAATGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGT GTGTGTGTTGAGAAAGAGAGAGAGAGAGATCAGA (SEQ ID NO: 44) SG05S41: ggaggtggctggggaggagagacctgcagcaaaaatataactggaagtgatctgttaaggagatatagaaattctctatg acggggactaattagtagactttagtagagtcacagtaatgattttgaacttttttccagtactgtgtagctctgcaacccttta cagttaatacaaattccagtaaaacttacacccttgaccttgtattagtggtgctttgtgaaatgggagtaatttaggttgtgtc caatgttgggtacaggaggcctggtgcggtggctcatgctgtctctactaaaaataaaaaaagtagccaggcatggtggt gcatgcctgtggttccagctacttgggaggctgaggcacaagaatcgcttgaacctgggaggcggagattgcagtgag ctgaggtagcgccactacactccagcctgggtgacagagcgagattctgtctctaaataaataaataaatgttgggaaca ggatggaaaaaaGgcatcagagagggcagagccagagatgccaacaaataacagcaaattggagtagaatgactaa aatgttagccactcgaggattactgggatagactctgagttttcactgtgttgtaaaaagggcacgagttatttaaatattaga agtctaattgcctccaaacaggtgacaggcactggtatccagattacataacaatgtaaaatgaataaagaaaggattgag tcacatgatatctaagttctcx tgtttttgaaatttgctttcttcattctaacatcattaacaccattaagcataaatctgagaat^ ttgcttttaatttttaatg tgaatgtgtatacatttcctctcatatgtccaagtgtttttcttcttttgtttt acttttttgagacagttttgctctgtcgcctgggctgagtgcagtggcaccatcttggctatctttaacattcatctgacagaca a (SEQ ID NO: 45)
DG5S1851:
TGAAGGCAGTAGCAAACCTCTTAAACTCTAAAAACTCTAAAAGTGAAAT AGGCTGTAATCTGGGAGGCTGAGGTGGGAGGATGGCTTGTGTCCAGGAG TTCAAGACCCGCCTGGGCAGAATAGCAAGACCTCAGTCTCTAAAAAAAT AAAAAATAAGCCGGTGCTATGGCGTGCACCTGTAGTCCCAGCTACTCAG GAGGCTGAGGTGGGAGGATGGTTCGAGCCCAGGAGGTCGAGGCTGCAG TAAGCAACAGAGTGAGACCCTGTCTCATAAATAAATAAATAAATAAATA AATGTGAAATAGACTTAGTGGTTCTGGCCCACCTCATTT (SEQ ID NO: 46)
DG5S740:
GAGGCCCAAGAATTGATTGAACTTGGCAGGCAGAGGTTGTACTCCAGCC TGGGTGACAGAGCCAGAACTTGTCTCAAAAACAAGTAATTAATTAATTA AAAATTAAAAAATTAAAGTAAGTTCTCTACCTTGTTCACCCTTCCTGTGC CTAGAACTTAGGAGGTGCACAATAAATATTTGCTAATTCACACACACAC ACACACACACACACACACACAGACACACACACACAGACACAGACACAC ACACACACATACAGTATTTGGCAGCGGAAAGAGATTATCACGGTGATTC TAAAAATCCGTAGCACTCATAGGAGTATCTGTTGACCCAAATTT (SEQ ID NO: 47)
SG05S652: gagaagaaagggttttaaagctgcataaaactaagaagaacctgaaacaggcagaacaccagagtttctatcagagagt aggtaaaatggctagttttttcaccactaagaaaggagataaaagtttctccagcttaaaggagggagggattaaaagaga cctgattagtgtctacatgctgcttgcatcccagagatgggcagcaggaaccccaaaaagctttaagaagatggagaata tggggagttttgtttctattctcattgtatagccaggggaaactgcaagtctttcagagaaatgccttgtccagcacagtgga tgaggaaagatctgagtgggtggaactgagaggagcctcacccctatgtcagaggcagaggcgtttaaaccatagcaa ctccatcttgaataggggctgggtaaaataaggctgagagctgctgggctgcattcccagtaagttaggcattttaagaca caggatgagataggaggtCataaagaccttgttgataaaacagcatgcggtaaagaagctggctaaatcccaccaaaa ccaagatggcaacaaaagggacctctggtagtcctcactgctcattatacgttaattataatatattagcatgctaaaagaca ctcccactagcaccatgacactttacaaatgccatggcatttaccatatatggtctaaaatggggaggagcccacagttcc aggaattgctcaccaggtacccagaaaattcatgaataatctaccccttgtttagcatataatcaagaagttacaggaagtat aagcagctgcgcagcccatgtctggctatggag agctattcttttatttatttactttcttaatgaacttcctttcactttactcta tggacttgcccccagttctttcttgcacaaggtccaagaaccctctcttgaggtcttgatgcggacctctttccagtaacacc tacgtagggcaggagggcatcca (SEQ ID NO: 48)
SG05S43: ttataggcttagagaagctaagtatcaagtccaagttcacaccgtcataagatgaatttgggtttttgaactgaagtctgactt ggtaaatgccaaggcaatatccatatggatatgctaagaataatgaatcagaatcaggcacagtttttaaaacatcaaatga tgtatgtaattacttaaaatcttaaatttttatgcaataatttgacttttt^ ctagttcaagtatatgagattccatttctggtccatattactcattttgctttcttgggctctgatcctctctcgtattttcatgt^ tatttgtatttgfttggggcttttctctcctita^ aattacttcagtctccttgcctgcaagaaaagcagtcaaactacacgaacactctattgttttgcatCatttcacacgtacac agggataaggagggaaataattctcccaatgaagcagaaccaccggtaacagctggcacaacttctaaagacccagtct gtcttgaaaaacagccgcaaattgtccctaaataataattaccccacctgtatcaggactatctcactgactagctgcacttc atgacacaatacctgctattaactctcagcatggatataaacattgccataatattttttaaaaccaacaatgtatcatttgcac attaggacaacagaatgttagttaaggcacagaaaacacaaaggcatggctaccctgttctttggtgactccatatctaatt attaccacacatgtctccagtctttagacaaal tagttattataagggtggttggacaaatagcaaagatttgattttcattaac aaagggcacatcagaaccgagttaatccttcccagttacccaacattcaaagcttctagtgcagtcactt (SEQ ID NO: 49)
SG05S653: tttagtagagacagggttttgccatgttggctgggctggtctcgaactcctgaccacaggtgatccacccgccttggcctta cccaaagtgctggaattacaggtgtgagccacttcacctggcctaxτacttattcttttgtagagatggggtcttaccatgtt gcccaggctggtcttgaactcctgggctcaagcgatcccctcatctaggcctcctcaaatgctgggattacaggcatgag ctactgcacctggccccaataatctttatgttgtctccaaatggttttcatgataattgtgatattaagattaataacaagtgctc caaggttcctcattagagcgagtttaccaagggtctaattggcataatagaaacagagtcattgttatgccaggctaaatgt gatcttttttctgcaagcaaagaaataatattagcctaaagctttcatattgactaacactaattgtgtgtgctttgagttctcct GataaaaxTagcatttaxttatttattattcagtctctgtagagatggtcaaaaxttgctaatgctggagtctgggctca^ tcacacctgtaatcctagcacttcgggaggccaaagtgggaggatcgcttgagcccaggatttcgagaccagcttgggc cgcatggtaaaccctatctctacaaaaaatatgaaaattattcgggcatggtgttaggtgtttgtagtcccagctacttggga ggctaaagtgggaggatcacctgagcctgggaggttgaggttgtggtgagctgtgactataccactgcactccagcctg ggtgacagag1;cagaccatttcttaaacaaacaaacaaaaaattgccaatactgactatatcgcaagtcattacagtgaagt attgggaacagttttcaattaataataattgcacattggataaactgcattggctaccatactgccactgctcaaagcttcctg ataaaatt (SEQ ID NO: 50) DG5S1387:
AGGGAATCCAAGGTCATTCAACATCTGAGAGGATCCACTTATGTGAAAG ACAGAGAGGGCAACAAGTAGATTAAAACAACATTGAAGGCATGGATAC AGGCAGGAGGAGAAAAGGGAAATACTTTCATATCTCTGATGGAGAAGA GAGAATGCTGTATCCATAAAAGTGAACAGTTTGATGATCTATCATCTATC TATCTATCTATCTATCATCATCTATCTCTCTCTCTATCATCTATTTATCTGT CTCTAAAAAGAGTAAGAAAGACATCCTTAATATTCAAATCACGAGAATA TATATTAAAAAAATACCCAATAGAAGAGTTGAAGGTCGGGC (SEQ ID NO: 51)
DG5S745:
CCCAACACACCTCTACAGAAGATTTAAAAATTAGCCAGGCATGGTGGTG CACGCCTGTCATCCCAGCTAACTGCAAGGCTGAGATTGGAGGATTGGTT GGGCCTGGGAGGCTGAGGCTGCAGTGAGCTGTGATTGTGCCACTGCACT CCATCCTGGGTGACAGAGTGAGACCCTGTCTCAAAAACAAACAAACAAA CAAACAAACAAAATTCCTCTGTAACACTACTGATGGCTACTGAGCTTCTA CTTGTATACTTCCCATACTGATATATTTTCTATTCTATTCAGCAACTTATT CTGTTGCCAGTAAGTGTTAAAGTTCTTCCTTATGTGGAATTAAAATTTGC TTCCTTGGATTTTAAAAACATCTATCTGTTTATGCTCTGCTCCGTG (SEQ ID NO: 52)
Example 2: The ARRDC3 human obesity linkage region is syntenic to a murine obesity linkage region
Following positional cloning of the male obesity susceptibility locus described in Example 1, a syntenic murine obesity susceptibility locus was identified. As used herein, the term "syntenic" describes a group of two or more chromosomal regions, found at one locus in a first species, that are also found at a homologous locus in a second species. FIG. 1 highlights the human chromosome 5 linkage to BMI in obese males supporting the ARRDC3 locus. The existence of two distinct loci controlling BMI in males is supported by the broad linkage region. Further, while the linkage is clearly significant, with a peak lod score greater than 4, the region supported by the linkage is quite large. One method that has proven useful in narrowing such linkage regions is "intersecting" them with linkages in other species. Referring to FIG. 14 (the vertical axis shows lod score; the horizontal axis shows relative distance in centimorgans along the entire chromosome), the left panel depicts several peaks corresponding to linkages on murine chromosome 13 to obesity-related traits in the BXD mouse cross described by Drake et al., Physiol Genomics (2001) Apr 27; 5(4): 205-15. The traits represented include insulin levels, leptin levels, and fat mass. For most of the traits listed, there are two distinct but closely linked quantitative trait loci (QTL) controlling the trait. The panel on the right represents the syntenic mapping between the mouse and human genomes in the mouse genome region supporting the linkages shown in the left panel. The mapping of the mouse lod score curves to the human lod score curves shows that the two peaks defined in both species overlap, with the peak lod score locations in each species falling within a 1 lod score drop of the peaks in the other species. ARRDC3 is a candidate for the distal peak in both species identified through positional cloning.
Given that the genomic sequences for the B6 and DBA strains of mice used in the construction of the BXD cross are known, the ARRDC3 gene region was examined to determine whether DNA variations in this gene could be found that were polymorphic between these two strains. In total, three single nucleotide polymorphisms were identified between the transcription START and STOP sites, with two falling in the 3'UTR and one in an intron. Further, ARRDC3 is located in a region that was determined not to be identical by descent between the B6 and DBA strains of mice, using the existing genomic sequence for these strains. These data are consistent with ARRDC3 as a positional candidate for the murine QTL (i.e., we would expect the B6 and DBA strains to carry different haplotypes for this gene if the gene were the gene carrying the QTL).
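The "1 lod score drop" comparison used above can be sketched in a few lines. The following is a minimal illustration only, assuming both lod curves have already been mapped onto a common coordinate axis; the function name, the positions, and the lod values are hypothetical and are not taken from FIG. 14.

```python
from typing import List, Tuple


def lod_drop_interval(positions_cm: List[float], lod: List[float],
                      drop: float = 1.0) -> Tuple[float, float, float]:
    """Return (peak position, left bound, right bound) of the contiguous
    region around the maximum lod score where lod >= peak - drop."""
    if not lod or len(positions_cm) != len(lod):
        raise ValueError("positions and lod scores must be non-empty and equal in length")
    peak_idx = max(range(len(lod)), key=lod.__getitem__)
    threshold = lod[peak_idx] - drop
    left = peak_idx
    while left > 0 and lod[left - 1] >= threshold:
        left -= 1
    right = peak_idx
    while right < len(lod) - 1 and lod[right + 1] >= threshold:
        right += 1
    return positions_cm[peak_idx], positions_cm[left], positions_cm[right]


# Hypothetical lod curves on a shared axis (illustrative values only).
positions = [0, 5, 10, 15, 20, 25, 30]
human_lod = [0.5, 1.8, 3.2, 4.1, 3.6, 2.0, 0.7]
mouse_lod = [0.4, 1.1, 2.9, 3.3, 3.8, 2.5, 0.9]

human_peak, human_lo, human_hi = lod_drop_interval(positions, human_lod)
mouse_peak, mouse_lo, mouse_hi = lod_drop_interval(positions, mouse_lod)

# Each species' peak should fall inside the other species' 1-lod-drop interval.
peaks_agree = (mouse_lo <= human_peak <= mouse_hi) and (human_lo <= mouse_peak <= human_hi)
print(peaks_agree)
```

The interval is taken as the contiguous run of markers around the peak, so a second, separate peak above the threshold would not be merged into it; that choice is an assumption of this sketch.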
Example 3: ARRDC3 is expressed in diverse tissues
The general expression of ARRDC3 was assessed to determine whether it was expressed in various tissue types in both human and mouse populations.
Because little is known regarding the function of this gene, knowledge of the gene expression distribution may be useful in establishing whether the function of this gene is specific to a given tissue or whether its action may be more general, whether it is expressed in tissues during development, whether its levels of expression vary significantly between different tissues, etc. FIG. 15 depicts the mean expression levels of ARRDC3 in over 80 human tissues and cell lines. This expression atlas was constructed by examining the relative transcript abundances in multiple samples for each of these tissue types for approximately 23,000 genes. ARRDC3 is seen to be more active in peripheral tissues such as blood, adipose and liver tissue than in more central tissues such as hypothalamus that are known to be involved in obesity-related traits. In fact, effectively no expression of ARRDC3 was seen in hypothalamus tissue, suggesting that its function related to obesity may be specific to the peripheral tissues where it is more highly expressed.
Example 4: ARRDC3 expression in blood is linked to a hotspot eLOD region on chromosome 9
The regulation of ARRDC3 expression in tissues where it is moderately expressed was further explored. In this experiment, blood samples were collected from roughly 450 individuals recruited from 51 families. The families were chosen from among those that contribute most to any of the three top obesity linkage regions identified in the Icelandic population using nonparametric linkage analysis (NPL). The contribution of an individual family is quantified by its NPL score, which measures the amount of excess sharing of genetic material in the region (i.e., material inherited from a common ancestor) among the obese family members. For this study, an NPL score greater than 2 was considered sufficiently high for a family to be included in the collection. Each individual completed a "lifestyle" survey and was scored with respect to a number of traits, including weight, height, full plasma lipid profile, plasma glucose levels, plasma leptin levels, and plasma insulin levels. RNA was isolated from the blood samples of each individual and hybridized to a gene expression microarray representing 23,000+ genes, including ARRDC3. A genome-wide scan for linkage was conducted for the ARRDC3 gene expression trait. Four female-specific linkages were detected with lod scores greater than 1.5: a chromosome 6 linkage with a lod score of 1.64, a chromosome 20 linkage with a lod score of 2.03, a chromosome 4 linkage with a lod score of 2.78, and a chromosome 9 linkage with a lod score of 3.42. No eLOD was detected for ARRDC3 expression in the region of the genome supporting the physical location of this gene (i.e., no cis eLOD was detected for ARRDC3 expression). The lod score curve for the chromosome 9 linkage is shown in FIG. 16. This linkage achieves genome-wide significance. The clinical traits measured in this experiment were tested point-wise for linkage to the chromosome 9 locus.
There were 2 point-wise significant linkages and a third with a lod score greater than 1.0 linked to the same chromosome 9 locus: bioimpedance (Ohms), mean corpuscular hemoglobin (MCH), and mean corpuscular volume (MCV). In addition to the three clinical traits linked to the chromosome 9 locus, of the 39,800 linkages with lod scores greater than 1.5 detected for the 23,000+ genes profiled in this experiment, 447 fell within a 3 cM window of the ARRDC3 peak lod score on chromosome 9 (four would have been expected by chance). Therefore, the chromosome 9 locus is a hotspot for eLOD activity, with the expression of hundreds of genes in blood partially explained by this locus. These data further suggest that if all of these expression linkages are driven by the same gene, then one of the primary actions may be on the ARRDC3 gene, which in turn affects expression of a number of these other genes that show less significant linkage to this locus. A single GO molecular function category (see Table 3) was enriched in this set of genes: enzyme activator activity (p-value = 0.0015).
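To give a sense of how extreme 447 observed linkages are against an expectation of four, the sketch below computes an upper-tail probability in log space. The Poisson null model is an assumption made for this illustration only; the text does not state which null distribution was used, and the function name is hypothetical.

```python
import math


def log10_poisson_tail(observed: int, expected: float, extra_terms: int = 200) -> float:
    """log10 of P(X >= observed) for X ~ Poisson(expected).

    Terms are summed in log space to avoid floating-point underflow. The
    series is truncated after `extra_terms` terms, which is adequate when
    the terms are already decaying quickly (observed well above expected).
    """
    log_terms = [
        -expected + k * math.log(expected) - math.lgamma(k + 1)
        for k in range(observed, observed + extra_terms)
    ]
    largest = max(log_terms)
    log_sum = largest + math.log(sum(math.exp(t - largest) for t in log_terms))
    return log_sum / math.log(10)


# Counts quoted in the text: 447 linkages observed in the 3 cM window, 4 expected.
hotspot_log10_p = log10_poisson_tail(447, 4.0)
print(hotspot_log10_p)
```

Working with the logarithm matters here because the probability itself is far below the smallest positive double-precision number.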
Table 3. Proteome Gene Ontology (GO) categories over represented in the 250 genes most correlated with ARRDC3 in adipose tissue (the genes making up the cluster in Figure 6). The GO functional categories associated with genes represented on the array are tested using the Fisher Exact Test to determine if the categories are over represented in the 250 gene set. A significant p-value suggests possible functional roles for the cluster.
Proteome GeneOntology Functional Categories	P-value
nucleus	1.91E-08
transcription factor activity	2.10E-05
small nucleolar ribonucleoprotein complex	0.000162887
negative regulation of G-protein coupled receptor protein signaling pathway	0.000222109
transcriptional activator activity	0.000347302
regulation of transcription, DNA-dependent	0.000413329
nucleosome spacing	0.000759217
DNA binding	0.000769646
response to heat	0.001036909
apoptosis inhibitor activity	0.001145881
Sin3 complex	0.001220332
nucleolar ribonuclease P complex	0.001255927
thioredoxin pathway	0.001255927
ribonuclease P activity	0.001869855
acute-phase response	0.001891725
anti-apoptosis	0.002554905
transcription regulator activity	0.002852859
tRNA binding	0.003089684
regulation of transcription from Pol II promoter	0.003225976
transcription from Pol II promoter	0.003460811
mitochondrial membrane	0.004235764
MAPKKK cascade	0.005380449
ER-Golgi intermediate compartment	0.006605173
nucleolus	0.006835429
rRNA processing	0.009079681
nucleoplasm	0.009314119
response to nutrients	0.009755765
The group of genes co-localizing to the chromosome 9 locus that strongly controls ARRDC3 expression in females is significantly enriched for genes that are significantly correlated with obesity-related traits. This is of note because the ARRDC3 gene has been implicated in obesity for males, with much less of an effect observed in females. Therefore, it may be that the control of ARRDC3 in females is different than in males, so that even though females may carry the specific alleles that lead to obesity when carried in males, there may be compensatory factors at play in females that result in less exposure to disease. The female-specific genetic control of the expression of this gene observed in the family pilot study is consistent with these observations.
Example 5: ARRDC3 expression in adipose tissue is significantly correlated with BMI in males but not females
The expression of ARRDC3 was further explored to examine patterns of expression in adipose tissue associated with obesity-related traits. In this pilot study, subcutaneous and omental fat tissue samples were collected from approximately 80 individuals. The same survey/phenotyping protocol employed for the experiment described in Example 4 was employed in this study. RNA was isolated from the adipose samples of each individual and hybridized to a gene expression microarray representing 23,000+ genes, including ARRDC3. FIG. 17 highlights a significant correlation between ARRDC3 expression and BMI for those individuals participating in this pilot study. Of interest is the significant correlation in males between the expression of ARRDC3 and BMI, with no significant correlation observed in females. Combined with the male-specific linkage of ARRDC3 to BMI and the strong association of haplotypes in ARRDC3 to BMI in males, these data offer direct experimental evidence that ARRDC3 is in fact involved in obesity-related traits. The correlation patterns observed in FIG. 17 did not hold for any of the genes flanking ARRDC3, again supporting ARRDC3 as the gene underlying the obesity traits at that locus.
Example 6: Patterns of adipose tissue expression associated with ARRDC3 expression discriminate high BMI individuals from low BMI individuals
Two hundred and fifty genes (the GenBank accession numbers of the genes used in this experiment are given in Table 4) representing the genes most correlated with ARRDC3 expression in adipose tissue were clustered using 2-dimensional, unsupervised agglomerative hierarchical clustering, where in both dimensions the heuristic criteria parameter was set to Average Link and the similarity measure was the error-weighted Pearson correlation coefficient. The clusters were compared to those formed by the adipose samples based on age and BMI.
The two sample clusters had significantly different mean BMIs (p-value = 0.0007) and significantly different ages (p-value = 0.04). There was no apparent association with gender in this cluster. The pattern of expression supports the association to BMI more strongly than the correlation of ARRDC3 expression alone. FIG. 18 shows the pattern of expression associated with ARRDC3 expression in adipose tissue: the adipose tissue cluster of the 250 genes most correlated with ARRDC3 expression in adipose tissue, with the genes clustered along the x-axis and the adipose samples clustered along the y-axis. The two primary clusters formed by the genes expressed in the adipose tissue thus associate significantly with BMI and age, but not gender. That is, the genes most strongly associated with ARRDC3 expression in adipose tissue collectively discriminate high BMI individuals from low BMI individuals well, indicating a significant perturbation in the transcriptional network associated with obesity-related traits. These data again support ARRDC3 as a gene predisposing to obesity-related traits. Not only does the significant association between the pattern of expression of ARRDC3-associated genes and BMI indicate that this pattern is biologically meaningful, but the over-representation of Gene Ontology (GO) functional categories in the set of genes used in the clustering procedure also supports the significance of the cluster as it relates to obesity traits and suggests possible functional roles for ARRDC3. If the set of genes whose expression was associated with ARRDC3 expression were an artifact, we would not expect to see significant over-representation of the GO categories. A list of GO categories over represented in the 250 gene set is given in Table 3.
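The clustering step described in this Example can be sketched as follows. This is a minimal, pure-Python illustration of average-link agglomerative clustering along one dimension (the samples) with 1 minus the plain Pearson correlation as the distance; the error-weighted Pearson coefficient named in the text is not reproduced here, and all expression and BMI values are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain (unweighted) Pearson correlation; profiles must not be constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_link_clusters(profiles: List[List[float]], n_clusters: int = 2) -> List[List[int]]:
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    dist = {
        (i, j): 1.0 - pearson(profiles[i], profiles[j])
        for i, j in combinations(range(len(profiles)), 2)
    }

    def linkage(a: List[int], b: List[int]) -> float:
        # Average of all pairwise distances between members of a and b.
        total = sum(dist[(min(i, j), max(i, j))] for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return [sorted(c) for c in clusters]


# Hypothetical data: six samples, four genes each, and one BMI value per sample.
samples = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [0.9, 1.8, 3.2, 3.9],
    [4.0, 3.0, 2.0, 1.0],
    [4.2, 2.9, 2.1, 0.8],
    [3.9, 3.1, 1.9, 1.1],
]
bmi = [31.0, 33.5, 30.2, 22.1, 24.0, 23.3]

sample_clusters = average_link_clusters(samples, n_clusters=2)
for members in sorted(sample_clusters):
    mean_bmi = sum(bmi[i] for i in members) / len(members)
    print(members, round(mean_bmi, 1))
```

Running the same function on the transposed matrix would cluster the genes, which gives the second dimension of a 2-dimensional clustering. Comparing the mean BMI of the two sample clusters would then call for a formal test, which this sketch leaves out.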
Table 4. List of the 250 genes in adipose tissue whose expression is most correlated with the ARRDC3 expression, as described in Example 6. Human Gene ID Contig34880_RC AK023162 NM_014306 NM_006022 NM 81846 NM_020749 NM_152260 AB002337 NM_022453 Contig42328_RC Contig6064_RC NM_002923 NM_032302 AB002324 NM_003244 NM 38417 AK022043 NM_006295 NM_020529 NM_014205 AK000144 NM_003467 NM_020698 Contig34080_RC NM_024741 NM_030645 NM_002612 (human PDK 4) Contig53406_RC NM_022803 NM_020651 NM_017810 NM_017897 Contig56777_RC AK026286 NM_001781 NM_005858 N _020313 NM_016511 NM_153003 NM_138487 AK026373 NM_018948 AL080280 NM_024644 NM_001810 NM_003076 NM_025024 Contig46591_RC NM_005354 AK025797 NM_014725 NM_006887 Contig21997_RC NM_182793 NM_006337 NM_182627 NM_177442 NM_018380 NM_054028 NM_002690 NM_032626 NMJJ06973 NM_018941 AF041037 NM_152485 NM_002467 NM_005837 Contig19324_RC AB051463 NM_013382 AK021634 NM_030817 Contig29955_RC Contig53260_RC AK024121 Contig36420_RC NM 52411 Contig18780_RC NM_021830 NM_002434 AK024366 NM_032588 NM_012208 NM_024297 Contig5556_RC NM_022454 NM_017846 NM_001198 Contig6719_RC Contig26988_RC NM_013286 Contig44383_RC NM_182972 NM_021184 NM 144718 NM_004288 AK024035 NM_005194 NM_182763 NM_005195 NM_152595 NM_001363 AK026181 AK000954 NM_005452 NM_006469 NMJ305239 NM_144568 NM_007350 NM_022775 NM_021216 NM_002922 NM_012198 Contig40272_RC NM_032889 NM_016500 NM_013449 NM_012406 NM_030815 NM_005627 NM_015654 NM_170695 NM_177983 AK054933 NM 006831 NM_002727 NM_152307 NM_021981 NM_004623 Contig52296_RC NM_032290 NM_032807 NM_005204 Contig49270_RC NM_006079 NM_004178 NM_002809 NM_015683 NM 75617 NM_006414 NM_176870 NM_021640 Contig33975_RC Contig41277_RC NM_080665 NM_032848 NM_017829 NM_018081 NM_152652 Contig14068_RC NM_053045 AL137403 NM_020130 NM_020457 Contig44211_RC NM_003855 NM_004657 NM_004593 NM_033112 NM 006290 NM 003731 NM 004634 NM_004417 NM_024602
NMJD01924 NM_031426
NM_152567 NM_173354
NM_030938 Contig25429_RC
NM_002310 NM_004704
NM_006321 NM_005399
NM_032786 AK023474
NM_005793 NM_017825
Contig37142_RC Contig81_RC
NM_173686 NM 45014
NM_032522 NM_016270
NM_052850 NM_003883
Contig5432_RC NM_006096
NM_033397 NM_032891
NM_025203 NM_004424
AK057874 NM_032792
NM_014678 NM_175622
NM_023012 NM_152461
BC006136 NM_153339
NM 322752 NMJ317652
NM_003206 Contig34327
AL050185 NM_000601
NM_024784 NM_013299
NM_003481 NM_004241
NM_033480 NM_007034
NM_006763 NM_182531
ENST00000259939 ENST00000295439
AF085960 NM_152873
NM_003864 NM_006242
AK000535 Contig57101_RC
AB023164
NM_017807
NM_173548
Contig48436_RC
NM_145029
Contig45465_RC
NM_139016
NM_002600
NM_005842
NM_004592
NM_004952
NM_001165
AL117595
AF035318
NM_014134
Contig40091_RC
NM_006458
X66087
NM_152832
AF088045
NM_001166
NM_001400
AK001125
NM_005068
NM_001206
AK021459
Contig24843_RC
NM_022734
Example 7: Genes associated with ARRDC3 expression in adipose tissue are also associated with ARRDC3 expression in blood
It was of interest to determine whether genes interacting with ARRDC3 expression in adipose tissue were also interacting with ARRDC3 expression in other tissues. Lists of genes associated with ARRDC3 in blood and adipose were constructed and the overlap between the lists was examined to determine if there was a statistically significant enrichment. Of the top 950 genes most significantly associated with ARRDC3 in blood and adipose tissues, there were 116 genes in the overlap, a very significant enrichment, given there were more than 23,000 genes represented on the array. Table 5 gives the GenBank accession numbers for the 116 genes. This set of genes potentially represents a more "core" set of genes most strongly interacting with ARRDC3, given their interaction with ARRDC3 is seen in multiple tissues.
Table 5. List of 116 genes most highly correlated with ARRDC3 expression in blood and adipose tissue, as described in Example 7.
Human Gene ID
Contig34880_RC NM_152571 AK025081 NM_003467 NM_178815 NM_000399 NM_024741 NM_033027 NM_006186 AK026286 AK026820 Contig28001_RC AL080280 NM_024846 NM_024947 AK024035 NM_020307 NM_033389 NM_030645 NM_003407 NM_033514 NM_152485 AK023574 NMJD03821 NM_012208 AF130079 NM_030815 NM_004235 AK024121 NM_004860 NM_024297 Contig34108_RC NM_182972 Contig42395_RC Contig40272_RC NM_001945 NM_005627 NM_004927 Contig41277_RC NM_004071 NM_032786 NM_007076 NM_024784 AF009267 NM_006763 NM_031419 NM_004952 NM_001964 NM_014134 AL110175
AK021459 Contig25703_RC NM_017825 Contig36062_RC NM_006242 Contig16654_RC NM_152268 NM_032349 Contig13282_RC NM_004430 NM_014685 AK023526 NM_014797 NM_019110 Contig6514_RC AL512701 NM_004226 AF074998 NM_025026 NM_000963 NM_012143 NM_003192 NM_005067 ENST00000257780 NM_032690 NMJ314077 Contig13480_RC Contig49591_RC Contig22995_RC Contig33762_RC Contig24602_RC NM_014793 NM_014154 AF090916 NM_032325 NM_003824 Contig40128_RC NM_006925 Contig26573_RC NM_133280 Contig22758_RC AL137331 NM_002984 NM_152747 Contig44593_RC Contig31928_RC NM_080752 Contig6146_RC NM_018029 Contig15566_RC ENST00000300365 AK023892 Contig39129_RC Contig26150_RC
Contig44534_RC NM_006748 NM_181708 U00945
AK022351 AL389956
AK022185 NM_005738
AK021981 Contig22022_RC
Contig41877_RC AF086341
As shown in FIG. 19 and FIG. 20, after clustering was performed in a manner similar to the one described above in Example 6, these 116 genes showed significant association to BMI in adipose tissue, but there was no apparent pattern in the cluster of these 116 genes over the family-based blood samples, and no detectable association to BMI from these blood-derived patterns of expression. Referring to FIG. 19, of the top 900 genes most significantly correlated with ARRDC3 in adipose tissue, 116 were also among the top 900 genes most significantly correlated with ARRDC3 in blood tissue. This is a statistically significant enrichment for genes associated with ARRDC3 in adipose also being associated with ARRDC3 in blood (given there are more than 23,000 genes represented on the array), with a p-value that is less than 10^-25 (using the Fisher Exact Test). As with the two primary clusters indicated in FIG. 18, the two primary clusters here separate into higher and lower BMI groups. Referring to FIG. 20, there is no pattern evident in this plot that breaks the blood samples into meaningful clusters (i.e., clusters that associate with clinical traits like BMI).
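The overlap enrichment quoted above can be checked with a one-sided Fisher exact (hypergeometric) calculation. The sketch below is illustrative only: it assumes exactly 23,000 genes on the array (the text says "more than 23,000") and two lists of 900 genes each, and the function name is hypothetical. With those figures the overlap expected by chance is 900 x 900 / 23,000, or about 35 genes, against 116 observed.

```python
from math import comb


def overlap_enrichment_p(n_genes: int, list_a: int, list_b: int, overlap: int) -> float:
    """One-sided Fisher exact p-value: the probability of seeing at least
    `overlap` shared genes when list_b genes are drawn at random from an
    array of n_genes, of which list_a belong to the first list."""
    upper = min(list_a, list_b)
    # Exact integer arithmetic; Python integers do not overflow.
    favourable = sum(
        comb(list_a, k) * comb(n_genes - list_a, list_b - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(n_genes, list_b)


# Assumed inputs: 23,000 genes on the array, two top-900 lists, 116 shared genes.
p_overlap = overlap_enrichment_p(23_000, 900, 900, 116)
print(p_overlap < 1e-25)
```

The counts are kept as exact integers until the final division because the individual binomial coefficients are far too large for floating-point arithmetic.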
Example 8: Expression pattern of ARRDC3 gene in human subcutaneous fat distinguishes fed and fasted states

In order to further define the role of ARRDC3 in human adipose tissue metabolism, the gene expression pattern in adipose tissue was examined in fasted and fed individuals. Samples of RNA were collected from two groups of individuals. In Group 1, two biopsies of subcutaneous adipose tissue were collected one week apart from ten healthy donors. All had been fasting overnight. In Group 2, two biopsies of subcutaneous adipose tissue were collected one week apart from ten healthy donors. However, in Group 2 the donors had either fasted or been fed. In the latter case, the biopsy was taken two hours after a meal. FIG. 21 illustrates the experimental design for Group 2.

Following the biopsy, RNA was isolated from the fat tissue samples and hybridized to DNA microarrays using 24,000 gene-specific probes. Expression levels were analyzed using ANalysis Of VAriance (ANOVA), a calculation procedure that partitions the variation in the data and determines whether it is significant or caused by random noise. (See, for example, Miller, R. G. Beyond ANOVA: Basics of Applied Statistics. Boca Raton, FL: Chapman & Hall, 1997 and Eric W. Weisstein. "ANOVA." From MathWorld at URL http://mathworld.wolfram.com/ANOVA.html.)

Intra-individual comparison between the two fasting data points (week 1 and week 2) in Group 1 (n=10) yielded 0 genes responding to feeding at ANOVA p-value <0.001 and 4 genes responding to feeding at ANOVA p-value <0.01. However, intra- and inter-individual comparison yielded 114 genes responding to feeding at ANOVA p-value <0.001 and 402 genes at ANOVA p-value <0.01. The 402 genes found to respond to feeding in the combined intra- and inter-individual comparisons at ANOVA p-value <0.01 were subjected to clustering analysis similar to that described in Example 6. As shown in FIG. 22 and FIG. 23, after clustering, these 402 genes showed significant association with the fasted/fed state.

It was discovered that ARRDC3 was among the top 10 genes responding to feeding. As shown in FIG. 24.1 for Group 2 only and FIG. 24.2 for the combined pool of Group 1 and Group 2 (vertical axis shows log ratio of the expression level), the transcript levels of ARRDC3 in adipose tissue were reduced upon food intake (ANOVA p-value = 0.000001).

It was further discovered that pyruvate dehydrogenase kinase isoenzyme 4 (PDK4, GenBank Accession No. 002612) responds to feeding with ANOVA p-value = 0.002. PDK4 is also found among the genes that cluster with ARRDC3 gene expression in visceral fat, a cluster that discriminates subjects with high BMI from subjects with low BMI. As shown in FIGs. 25.1 and 25.2 for Group 2 (vertical axis shows log ratio of the expression level), the transcript levels of PDK4 in adipose tissue were reduced upon food intake.

In summary, there is an easily recognizable fasting signature in human subcutaneous fat that distinguishes between fasted and fed states. Among the top 10 responders (of 24,000 genes tested, 402 showed a significant response at ANOVA p-value < 0.01) was the human obesity gene ARRDC3. Obesity is a nutritionally related disease, and genes that are important in the transition between feeding and fasting are important candidate genes for obesity and metabolism in general. The finding that ARRDC3 levels are strongly and rapidly (within 2 hours) affected by the transition between these states in a tissue of major relevance to obesity lends strong and independent support for a role of this gene in obesity.

While this invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
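As a rough illustration of the ANOVA comparison used in Example 8, the sketch below computes a one-way ANOVA F statistic for a single probe measured in fasted and fed samples. It is an assumption-laden sketch rather than the analysis actually performed: the text does not specify the ANOVA model, the samples are treated here as independent groups, and the log-ratio values are invented purely for illustration. Converting the F statistic to a p-value would additionally require the F distribution, which is omitted.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over `groups`."""
    all_values = [value for group in groups for value in group]
    grand_mean = fmean(all_values)
    # Variation explained by group membership (for example fasted versus fed).
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Residual variation of each sample around its own group mean.
    ss_within = sum((value - fmean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within


# HYPOTHETICAL log-ratio expression values for one probe; not data from the study.
fasted = [0.30, 0.25, 0.40, 0.35]
fed = [-0.20, -0.10, -0.30, -0.25]

f_stat, df_b, df_w = one_way_anova_f([fasted, fed])
print(f"F = {f_stat:.1f} on ({df_b}, {df_w}) degrees of freedom")
```

A large F statistic indicates that the difference between the fasted and fed group means is large relative to the scatter within each group. Repeating the calculation for each of the 24,000 probes and applying a p-value cutoff would give a gene list of the kind described above.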

Claims

What is claimed is:
1. A method of diagnosing a predisposition or susceptibility to obesity in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the presence of the at-risk haplotype associated with the ARRDC3 gene is indicative of a predisposition or susceptibility to obesity.
2. The method of Claim 1, wherein the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2.
3. The method of Claim 2, wherein the at-risk haplotype is selected from the group consisting of: haplotype I, haplotype II, haplotype III, haplotype IV, and a combination of haplotype I, haplotype II, haplotype III, and haplotype IV.
4. The method of Claim 3, wherein the at-risk haplotype associated with the ARRDC3 gene comprises haplotype I.
5. The method of Claim 3, wherein the at-risk haplotype associated with the ARRDC3 gene comprises haplotype II.
6. The method of Claim 3, wherein the at-risk haplotype associated with the ARRDC3 gene comprises haplotype III.
7. The method of Claim 3, wherein the at-risk haplotype associated with the ARRDC3 gene comprises haplotype IV.
8. The method of Claim 1, wherein determining the presence or absence of the at-risk haplotype comprises enzymatic amplification of a nucleic acid from the subject.
9. The method of Claim 8, wherein determining the presence or absence of the at-risk haplotype further comprises electrophoretic analysis.
10. The method of Claim 1, wherein determining the presence or absence of the at-risk haplotype comprises nucleic acid sequence analysis.
11. The method of Claim 1, wherein the at-risk haplotype associated with the ARRDC3 gene is more frequently present in an individual having a predisposition or susceptibility to obesity, compared to an individual who does not have a predisposition or susceptibility to obesity.
12. The method of Claim 1, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 1.5.
13. The method of Claim 1, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 3.0.
14. The method of Claim 1, wherein the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1 × 10⁻⁵ or less.
15. The method of Claim 1, wherein the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1 × 10⁻⁶ or less.
16. The method of Claim 1, wherein the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1 × 10⁻⁷ or less.
17. The method of Claim 1, wherein the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1 × 10⁻⁸ or less.
18. A method of diagnosing a predisposition or susceptibility to an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the presence of the at-risk haplotype associated with the ARRDC3 gene is indicative of a predisposition or susceptibility to an obesity-associated condition.
19. The method of Claim 18, wherein the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2.
20. The method of Claim 19, wherein the at-risk haplotype is selected from the group consisting of: haplotype I, haplotype II, haplotype III, haplotype IV, and a combination of haplotype I, haplotype II, haplotype III, and haplotype IV.
21. The method of Claim 20, wherein the at-risk haplotype associated with the ARRDC3 gene comprises haplotype I.
22. The method of Claim 20, wherein the at-risk haplotype associated with the ARRDC3 gene comprises haplotype II.
23. The method of Claim 20, wherein the at-risk haplotype associated with the ARRDC3 gene comprises haplotype III.
24. The method of Claim 20, wherein the at-risk haplotype associated with the ARRDC3 gene comprises haplotype IV.
25. The method of Claim 18, wherein the obesity-associated condition is selected from the group consisting of diabetes, coronary artery disease, peripheral arterial occlusive disease, myocardial infarction, dyslipidemias, stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis, infertility, miscarriages and cancer.
26. The method of Claim 18, wherein determining the presence or absence of the at-risk haplotype comprises enzymatic amplification of a nucleic acid from the subject.
27. The method of Claim 26, wherein determining the presence or absence of the at-risk haplotype further comprises electrophoretic analysis.
28. The method of Claim 18, wherein determining the presence or absence of the at-risk haplotype comprises nucleic acid sequence analysis.
29. The method of Claim 18, wherein the at-risk haplotype associated with the ARRDC3 gene is more frequently present in a subject having a predisposition or susceptibility to an obesity-associated condition, compared to a subject who does not have a predisposition or susceptibility to an obesity-associated condition.
30. The method of Claim 18, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 1.5.
31. The method of Claim 18, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 3.0.
32. The method of Claim 18, wherein the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1 × 10⁻⁵ or less.
33. The method of Claim 18, wherein the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1 × 10⁻⁶ or less.
34. The method of Claim 18, wherein the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1 × 10⁻⁷ or less.
35. The method of Claim 18, wherein the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1 × 10⁻⁸ or less.
36. A method of diagnosing a predisposition or susceptibility to obesity in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype I, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to obesity.
37. The method of Claim 36, wherein determining the presence or absence of the at-risk haplotype comprises enzymatic amplification of a nucleic acid from the subject.
38. The method of Claim 37, wherein determining the presence or absence of the at-risk haplotype further comprises electrophoretic analysis.
39. The method of Claim 36, wherein determining the presence or absence of the at-risk haplotype comprises nucleic acid sequence analysis.
40. The method of Claim 36, wherein the at-risk haplotype associated with the ARRDC3 gene is more frequently present in a subject having a predisposition or susceptibility to obesity, compared to a subject who does not have a predisposition or susceptibility to obesity.
41. The method of Claim 36, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 1.5.
42. The method of Claim 36, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 3.0.
43. A method of diagnosing a predisposition or susceptibility to an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype I, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to an obesity-associated condition.
44. The method of Claim 43, wherein the obesity-associated condition is selected from the group consisting of diabetes, coronary artery disease, peripheral arterial occlusive disease, myocardial infarction, dyslipidemias, stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis, infertility, miscarriages and cancer.
45. The method of Claim 43, wherein determining the presence or absence of the at-risk haplotype comprises enzymatic amplification of a nucleic acid from the subject.
46. The method of Claim 45, wherein determining the presence or absence of the at-risk haplotype further comprises electrophoretic analysis.
47. The method of Claim 43, wherein determining the presence or absence of the at-risk haplotype comprises nucleic acid sequence analysis.
48. The method of Claim 43, wherein the at-risk haplotype associated with the ARRDC3 gene is more frequently present in a subject having a predisposition or susceptibility to an obesity-associated condition, compared to a subject who does not have a predisposition or susceptibility to an obesity-associated condition.
49. The method of Claim 43, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 1.5.
50. The method of Claim 43, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 3.0.
51. A method of diagnosing a predisposition or susceptibility to obesity in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype II, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to obesity.
52. The method of Claim 51, wherein determining the presence or absence of the at-risk haplotype comprises enzymatic amplification of a nucleic acid from the subject.
53. The method of Claim 52, wherein determining the presence or absence of the at-risk haplotype further comprises electrophoretic analysis.
54. The method of Claim 51, wherein determining the presence or absence of the at-risk haplotype comprises nucleic acid sequence analysis.
55. The method of Claim 51, wherein the at-risk haplotype associated with the ARRDC3 gene is more frequently present in a subject having a predisposition or susceptibility to obesity, compared to a subject who does not have a predisposition or susceptibility to obesity.
56. The method of Claim 51, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 1.5.
57. The method of Claim 51, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 3.0.
58. A method of diagnosing a predisposition or susceptibility to an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype II, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to an obesity-associated condition.
59. The method of Claim 58, wherein the obesity-associated condition is selected from the group consisting of diabetes, coronary artery disease, peripheral arterial occlusive disease, myocardial infarction, dyslipidemias, stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis, infertility, miscarriages and cancer.
60. The method of Claim 58, wherein determining the presence or absence of the at-risk haplotype comprises enzymatic amplification of a nucleic acid from the subject.
61. The method of Claim 60, wherein determining the presence or absence of the at-risk haplotype further comprises electrophoretic analysis.
62. The method of Claim 60, wherein determining the presence or absence of the at-risk haplotype comprises nucleic acid sequence analysis.
63. The method of Claim 60, wherein the at-risk haplotype associated with the ARRDC3 gene is more frequently present in a subject having a predisposition or susceptibility to an obesity-associated condition, compared to a subject who does not have a predisposition or susceptibility to an obesity-associated condition.
64. The method of Claim 58, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 1.5.
65. The method of Claim 58, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 3.0.
66. A method of diagnosing a predisposition or susceptibility to obesity in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype III, wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to obesity.
67. The method of Claim 66, wherein determining the presence or absence of the at-risk haplotype comprises enzymatic amplification of a nucleic acid from the subject.
68. The method of Claim 67, wherein determining the presence or absence of the at-risk haplotype further comprises electrophoretic analysis.
69. The method of Claim 66, wherein determining the presence or absence of the at-risk haplotype comprises nucleic acid sequence analysis.
70. The method of Claim 66, wherein the at-risk haplotype associated with the ARRDC3 gene is more frequently present in a subject having a predisposition or susceptibility to obesity, compared to a subject who does not have a predisposition or susceptibility to obesity.
71. The method of Claim 66, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 1.5.
72. The method of Claim 66, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 2.5.
73. A method of diagnosing a predisposition or susceptibility to an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype III, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to an obesity-associated condition.
74. The method of Claim 73, wherein the obesity-associated condition is selected from the group consisting of diabetes, coronary artery disease, peripheral arterial occlusive disease, myocardial infarction, dyslipidemias, stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis, infertility, miscarriages and cancer.
75. The method of Claim 73, wherein determining the presence or absence of the at-risk haplotype comprises enzymatic amplification of a nucleic acid from the subject.
76. The method of Claim 75, wherein determining the presence or absence of the at-risk haplotype further comprises electrophoretic analysis.
77. The method of Claim 73, wherein determining the presence or absence of the at-risk haplotype comprises nucleic acid sequence analysis.
78. The method of Claim 73, wherein the at-risk haplotype associated with the ARRDC3 gene is more frequently present in a subject having a predisposition or susceptibility to an obesity-associated condition, compared to a subject who does not have a predisposition or susceptibility to an obesity-associated condition.
79. The method of Claim 73, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 1.5.
80. The method of Claim 73, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 2.5.
81. A method of diagnosing a predisposition or susceptibility to obesity in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype IV, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to obesity.
82. The method of Claim 81, wherein determining the presence or absence of the at-risk haplotype comprises enzymatic amplification of a nucleic acid from the subject.
83. The method of Claim 82, wherein determining the presence or absence of the at-risk haplotype further comprises electrophoretic analysis.
84. The method of Claim 81, wherein determining the presence or absence of the at-risk haplotype comprises nucleic acid sequence analysis.
85. The method of Claim 81, wherein the at-risk haplotype associated with the ARRDC3 gene is more frequently present in a subject having a predisposition or susceptibility to obesity, compared to a subject who does not have a predisposition or susceptibility to obesity.
86. The method of Claim 81, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 1.5.
87. A method of diagnosing a predisposition or susceptibility to an obesity-associated condition in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene, wherein the at-risk haplotype comprises haplotype IV, and wherein the presence of the at-risk haplotype is indicative of a predisposition or susceptibility to an obesity-associated condition.
88. The method of Claim 87, wherein the obesity-associated condition is selected from the group consisting of diabetes, coronary artery disease, peripheral arterial occlusive disease, myocardial infarction, dyslipidemias, stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis, infertility, miscarriages and cancer.
89. The method of Claim 87, wherein determining the presence or absence of the at-risk haplotype comprises enzymatic amplification of a nucleic acid from the subject.
90. The method of Claim 89, wherein determining the presence or absence of the at-risk haplotype further comprises electrophoretic analysis.
91. The method of Claim 87, wherein determining the presence or absence of the at-risk haplotype comprises nucleic acid sequence analysis.
92. The method of Claim 87, wherein the at-risk haplotype associated with the ARRDC3 gene is more frequently present in a subject having a predisposition or susceptibility to an obesity-associated condition, compared to a subject who does not have a predisposition or susceptibility to an obesity-associated condition.
93. The method of Claim 87, wherein the at-risk haplotype associated with the ARRDC3 gene has a relative risk of at least about 1.5.
94. A method of treating obesity or an obesity-associated condition or preventing obesity or an obesity-associated condition in a subject, comprising administering a compound that increases the expression or biological activity of ARRDC3 to the subject, in a therapeutically effective amount.
95. The method of Claim 94, wherein the subject has an at-risk haplotype associated with the ARRDC3 gene.
96. The method of Claim 95, wherein the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2.
97. The method of Claim 96, wherein the at-risk haplotype associated with the ARRDC3 gene is haplotype I, haplotype II, haplotype III, or haplotype IV.
98. The method of Claim 94, wherein the obesity-associated condition is selected from the group consisting of diabetes, coronary artery disease, peripheral arterial occlusive disease, myocardial infarction, dyslipidemias, stroke, chronic venous abnormalities, orthopedic problems, sleep apnea disorders, esophageal reflux disease, hypertension, arthritis, infertility, miscarriages and cancer.
99. The method of Claim 94, wherein the subject has decreased ARRDC3 expression or activity.
100. The method of Claim 94, wherein the subject has increased thioredoxin expression or activity.
101. A method of reducing triglyceride levels in a subject, comprising administering a compound that increases the expression or biological activity of ARRDC3 to the subject, in a therapeutically effective amount.
102. The method of Claim 101, wherein the subject has an at-risk haplotype associated with the ARRDC3 gene.
103. The method of Claim 102, wherein the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2.
104. The method of Claim 103, wherein the at-risk haplotype associated with the ARRDC3 gene is haplotype I, haplotype II, haplotype III, or haplotype IV.
105. The method of Claim 101, wherein the subject has decreased ARRDC3 expression or activity.
106. The method of Claim 101, wherein the subject has increased thioredoxin expression or activity.
107. A method of increasing fatty acid oxidation in a subject, comprising administering a compound that increases the expression or biological activity of ARRDC3 to the subject, in a therapeutically effective amount.
108. The method of Claim 107, wherein the subject has an at-risk haplotype associated with the ARRDC3 gene.
109. The method of Claim 108, wherein the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2.
110. The method of Claim 109, wherein the at-risk haplotype associated with the ARRDC3 gene is haplotype I, haplotype II, haplotype III, or haplotype IV.
111. The method of Claim 107, wherein the subject has decreased ARRDC3 expression or activity.
112. The method of Claim 107, wherein the subject has increased thioredoxin expression or activity.
113. A method of assessing a subject for an increased risk of obesity or an obesity-associated condition, comprising assessing the interaction between ARRDC3 and thioredoxin in the subject, wherein an increased level of interaction is indicative of a decreased risk of obesity or an obesity-associated condition.
114. The method of Claim 113, wherein the subject has an at-risk haplotype associated with the ARRDC3 gene.
115. The method of Claim 114, wherein the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2.
116. The method of Claim 115, wherein the at-risk haplotype associated with the ARRDC3 gene is haplotype I, haplotype II, haplotype III, or haplotype IV.
117. The method of Claim 113, wherein the subject has decreased ARRDC3 expression or activity.
118. The method of Claim 113, wherein the subject has increased thioredoxin expression or activity.
119. A kit for assaying a sample from a subject to detect a predisposition or susceptibility to obesity in a subject, wherein the kit comprises one or more reagents for detecting an at-risk haplotype associated with the ARRDC3 gene.
120. The kit of Claim 119, wherein the one or more reagents comprise at least one contiguous nucleotide sequence that is completely complementary to a region comprising at least one of the markers of the at-risk haplotype.
121. The kit of Claim 119, wherein the one or more reagents comprise one or more nucleic acids that are capable of detecting one or more specific markers of an at-risk haplotype associated with the ARRDC3 gene.
122. The kit of Claim 119, wherein the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2.
123. The kit of Claim 122, wherein the at-risk haplotype comprises a haplotype selected from the group consisting of: haplotype I, haplotype II, haplotype III, haplotype IV, and a combination of haplotype I, haplotype II, haplotype III, and haplotype IV.
124. A kit for assaying a sample from a subject to detect a predisposition or susceptibility to an obesity-associated condition in a subject, wherein the kit comprises one or more reagents for detecting an at-risk haplotype associated with the ARRDC3 gene.
125. The kit of Claim 124, wherein the one or more reagents comprise at least one contiguous nucleotide sequence that is completely complementary to a region comprising at least one of the markers of the at-risk haplotype.
126. The kit of Claim 124, wherein the one or more reagents comprise one or more nucleic acids that are capable of detecting one or more specific markers of an at-risk haplotype associated with the ARRDC3 gene.
127. The kit of Claim 124, wherein the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2.
128. The kit of Claim 127, wherein the at-risk haplotype comprises a haplotype selected from the group consisting of: haplotype I, haplotype II, haplotype III, haplotype IV, and a combination of haplotype I, haplotype II, haplotype III, and haplotype IV.
129. A kit for assaying a sample from a subject to detect a predisposition or susceptibility to obesity in a subject, wherein the kit comprises: a) one or more labeled nucleic acids capable of detecting one or more specific markers of an at-risk haplotype associated with the ARRDC3 gene; and b) reagents for detection of the label.
130. The kit of Claim 129, wherein the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2.
131. The kit of Claim 130, wherein the at-risk haplotype comprises a haplotype selected from the group consisting of: haplotype I, haplotype II, haplotype III, haplotype IV, and a combination of haplotype I, haplotype II, haplotype III, and haplotype IV.
132. A kit for assaying a sample from a subject to detect a predisposition or susceptibility to an obesity-associated condition in a subject, wherein the kit comprises: a) one or more labeled nucleic acids capable of detecting one or more specific markers of an at-risk haplotype associated with the ARRDC3 gene; and b) reagents for detection of the label.
133. The kit of Claim 132, wherein the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2.
134. The kit of Claim 133, wherein the at-risk haplotype comprises a haplotype selected from the group consisting of: haplotype I, haplotype II, haplotype III, haplotype IV, and a combination of haplotype I, haplotype II, haplotype III, and haplotype IV.
135. A method of diagnosing a predisposition or susceptibility to obesity in a subject, comprising detecting the presence or absence of a genetic marker associated with the ARRDC3 gene, the marker having a p-value of 1 × 10⁻⁵ or less, wherein the presence of the marker associated with the ARRDC3 gene is indicative of a predisposition or susceptibility to obesity.
136. The method of Claim 135, wherein the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1 × 10⁻⁶ or less.
137. The method of Claim 135, wherein the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1 × 10⁻⁷ or less.
138. The method of Claim 135, wherein the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1 × 10⁻⁸ or less.
139. A method of diagnosing a predisposition or susceptibility to an obesity-associated condition in a subject, comprising detecting the presence or absence of a genetic marker associated with the ARRDC3 gene, the marker having a p-value of 1 × 10⁻⁵ or less, wherein the presence of the marker associated with the ARRDC3 gene is indicative of a predisposition or susceptibility to an obesity-associated condition.
140. The method of Claim 139, wherein the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1 × 10⁻⁶ or less.
141. The method of Claim 139, wherein the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1 × 10⁻⁷ or less.
142. The method of Claim 139, wherein the at-risk haplotype associated with the ARRDC3 gene has a p-value of 1 × 10⁻⁸ or less.
143. A method of diagnosing a predisposition or susceptibility to obesity in a subject, comprising detecting the presence or absence of an at-risk haplotype associated with the ARRDC3 gene expressed in adipose tissue, wherein the presence of the at-risk haplotype associated with the ARRDC3 gene is indicative of a predisposition or susceptibility to obesity.
144. The method of Claim 143, wherein the at-risk haplotype comprises a haplotype comprising one or more markers selected from Table 2.
145. A method of diagnosing a predisposition or susceptibility to an obesity-associated condition in a subject, comprising detecting the presence or absence of a genetic marker associated with the ARRDC3 gene expressed in adipose tissue, the marker having a p-value of 1 × 10⁻⁵ or less, wherein the presence of the marker associated with the ARRDC3 gene is indicative of a predisposition or susceptibility to an obesity-associated condition.
PCT/US2005/013900 2004-04-30 2005-04-22 Haplotypes in the human thioredoxin interacting protein homologue (arrdc3) gene associated with obesity WO2005111239A2 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US56698204P 2004-04-30 2004-04-30
US60/566,982 2004-04-30
US64989505P 2005-02-02 2005-02-02
US60/649,895 2005-02-02

Publications (2)

Publication Number Publication Date
WO2005111239A2 (en) 2005-11-24
WO2005111239A3 (en) 2006-05-04

Family

ID=35394748

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2005/013900 WO2005111239A2 (en) 2004-04-30 2005-04-22 Haplotypes in the human thioredoxin interacting protein homologue (arrdc3) gene associated with obesity

Country Status (1)

Country Link
WO (1) WO2005111239A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016005793A1 (en) * 2014-07-09 2016-01-14 Suisse Life Science S.A. Cosmetic method.

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998056804A1 (en) * 1997-06-13 1998-12-17 Human Genome Sciences, Inc. 86 human secreted proteins
EP1042346A1 (en) * 1997-06-13 2000-10-11 Human Genome Sciences, Inc. 86 human secreted proteins
WO2002046465A2 (en) * 2000-12-08 2002-06-13 Oxford Biomedica (Uk) Limited Method for identification of genes involved in specific diseases
WO2003097872A2 (en) * 2002-05-21 2003-11-27 Mtm Laboratories Ag G - protein coupled receptor marker molecules associated with colorectal lesions

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BODNAR J S ET AL: "Positional cloning of the combined hyperlipidemia gene Hyplip1" NATURE GENETICS, NATURE AMERICA, NEW YORK, US, vol. 30, no. 1, January 2002 (2002-01), pages 110-116, XP002226295 ISSN: 1061-4036 *
KONG A ET AL: "A high-resolution recombination map of the human genome" NATURE GENETICS, NATURE AMERICA, NEW YORK, US, vol. 31, no. 3, July 2002 (2002-07), pages 241-247, XP002979996 ISSN: 1061-4036 cited in the application *
PAJUKANTA PAEVI ET AL: "Familial combined hyperlipidemia is associated with upstream transcription factor 1 (USF1)" NATURE GENETICS, NEW YORK, NY, US, vol. 36, no. 4, April 2004 (2004-04), pages 371-376, XP002291153 ISSN: 1061-4036 *

Also Published As

Publication number Publication date
WO2005111239A3 (en) 2006-05-04

Similar Documents

Publication Publication Date Title
Graessler et al. Association of the human urate transporter 1 with reduced renal uric acid excretion and hyperuricemia in a German Caucasian population
Timmermann et al. β-2 adrenoceptor genetic variation is associated with genetic predisposition to essential hypertension: The Bergen Blood Pressure Study
AU2006260477B2 (en) Genetic variants in the TCF7L2 gene as diagnostic markers for risk of type 2 diabetes mellitus
US8367333B2 (en) Genetic variants as markers for use in diagnosis, prognosis and treatment of eosinophilia, asthma, and myocardial infarction
US20070059722A1 (en) Novel genes and markers associated to type 2 diabetes mellitus
EP2155907B1 (en) Genetic variants useful for risk assessment of coronary artery disease and myocardial infarction
Cheyssac et al. Analysis of common PTPN1 gene variants in type 2 diabetes, obesity and associated phenotypes in the French population
EP2501826A1 (en) Nutrigenetic biomarkers for obesity and type 2 diabetes
Say The association of insertions/deletions (INDELs) and variable number tandem repeats (VNTRs) with obesity and its related traits and complications
Permutt et al. Searching for type 2 diabetes genes in the post-genome era
US20100255475A1 (en) Diagnostics and therapeutics for osteoporosis
US20100240539A1 (en) Human Autism Susceptibility Gene Encoding PRKCB1 and Uses Thereof
EP1732941A2 (en) Human type ii diabetes gene-kv channel-interacting protein (kchip1) located on chromosome 5
Meshkani et al. 1484insG polymorphism of the PTPN1 gene is associated with insulin resistance in an Iranian population
WO2005111239A2 (en) Haplotypes in the human thioredoxin interacting protein homologue (arrdc3) gene associated with obesity
US20080194419A1 (en) Genetic Association of Polymorphisms in the Atf6-Alpha Gene with Insulin Resistance Phenotypes
US20060057612A1 (en) Methods for diagnosing osteoporosis or a susceptibility to osteoporosis based on haplotype association
US20090208482A1 (en) Human obesity susceptibility gene encoding a member of the neurexin family and uses thereof
WO2005035793A2 (en) Cckar markers and haplotypes associated with extreme weight conditions
US20100047807A1 (en) Genetic variants associated with periodic limb movements and restless legs syndrome
WO2004041193A2 (en) HUMAN TYPE II DIABETES GENE-Kv CHANNEL-INTERACTING PROTEIN (KChIP1) LOCATED ON CHROMOSOME 5
WO2010083294A2 (en) Diagnostics and therapeutics for osteoporosis
Li Variation in insulin-like growth factor-2 binding protein 2 interacts with adiposity to alter insulin sensitivity in Mexican Americans
WO2009037295A1 (en) Method for testing psoriasis susceptibility

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase in:

Ref country code: DE

WWW WIPO information: withdrawn in national office

Country of ref document: DE

122 EP: PCT application non-entry in European phase