WO2007131202A2 - Genomics of in-stent restenosis - Google Patents

Genomics of in-stent restenosis

Info

Publication number
WO2007131202A2
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
acid sample
increased risk
gene
haplotype
Prior art date
Application number
PCT/US2007/068293
Other languages
French (fr)
Other versions
WO2007131202A3 (en
Inventor
Santhi Ganesh
Elizabeth G. Nabel
Original Assignee
The Government Of The United States Of America As Represented By The Secretary Of The Department Of Health And Human Services
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Government Of The United States Of America As Represented By The Secretary Of The Department Of Health And Human Services
Publication of WO2007131202A2
Publication of WO2007131202A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/136: Screening for pharmacological compounds
    • C12Q2600/156: Polymorphic or mutational markers
    • C12Q2600/158: Expression markers
    • C12Q2600/172: Haplotypes

Definitions

  • This disclosure relates to the diagnosis and treatment of restenosis, particularly in-stent restenosis. More particularly, it relates to the identification of single nucleotide polymorphisms (SNPs) and haplotypes linked to in-stent restenosis that are useful for the diagnosis and treatment of this disease.
  • SNPs single nucleotide polymorphisms
  • Coronary artery disease is the leading cause of death in industrialized countries, and stent implantation is a mainstay of revascularization therapy for atherosclerosis.
  • ISR in-stent restenosis
  • ISR is a discrete focal vascular disease, characterized by a fibroproliferative response to the vascular "injury" induced by placement of a stent in a diseased artery. The response to injury follows a continuum in human arteries; some degree of cell proliferation occurs in all patients and can be thought of as a wound healing process.
  • ISR is primarily an inflammatory and proliferative disease, with distinct defined roles for various cell cycle proteins, growth factors, and inflammatory cytokines (Farb et al. Circulation 99(1):44-52, 1999; Ganesh et al. Pharmacogenomics 5(7):952-1004, 2004; Libby and Ganz N. Engl. J. Med. 337(6):418-419, 1997; Simon et al. J. Clin. Invest.
  • Atherosclerosis which can be characterized as the culmination of more indolent vascular injury processes.
  • The heritability of atherosclerosis and CAD is well established (Watkins and Farrall. Nat. Rev. Genet. 7(3):163-173, 2006). What is not known, however, is the genetic basis of a patient's response to stent deployment and susceptibility to the development of ISR.
  • the clinical phenotype of ISR is unique, in that the exact timing of injury to the vascular wall by the stent is known in each patient, and well-defined clinical endpoints exist for the determination of ISR within a defined timeframe after stenting.
  • Described herein is the identification of sixteen candidate susceptibility loci linked to in-stent restenosis (ISR). Of these regions, seven contain the following genes: NOV, ARNTL, TAF4B, PKP4, EPHB1, ST18 and FLJ21986, which encodes a hypothetical protein.
  • methods for identifying a subject having an increased risk of developing restenosis comprising obtaining a nucleic acid sample from the subject and determining the nucleotide present at the chromosomal positions identified herein as part of a haplotype associated with an increased risk of developing restenosis. Further provided is a method comprising determining nucleotide(s) present at the chromosomal positions identified herein in two or more of the group consisting of the PKP4 gene, the FLJ21986 gene, the NOV gene, the ARNTL gene, the TAF4B gene, the EPHB1 gene, the ST18 gene, cytoband 2p16.1, cytoband 4q31.21, cytoband 7p21.2, cytoband 1p31.1, cytoband 2p24.1, cytoband 2q22.3, cytoband 13q14.3, cytoband 15q25.1 and cytoband 18q22.3.
  • Figure 1 is a series of diagrams showing linkage disequilibrium (LD) plotted as pairwise D' values for each of the eight regions identified in the regional haplotype analysis.
  • LD linkage disequilibrium
  • β-actin is included as a loading control.
  • Figure 2B is a series of images showing Movat pentachrome and immunostaining of ARNTL, NOV, PKP4 and TAF4B in a morphologically normal coronary artery, in an artery with atherosclerosis and in lung as a control tissue. Images are shown at 2x magnification.
  • Figure 2C is a series of images showing a human coronary artery with ISR stained with Movat pentachrome. Removal of stent wires, which is necessary for sectioning of tissues, results in some disruption of overall arterial architecture, with the neointima intact. Further immunostaining of this coronary artery was performed in the same manner as for other arteries. Staining in the neointima is shown at 10x, 20x and 40x magnification.
  • Figure 3 shows quantitative trait analysis of the eight regions identified in the primary regional haplotype analysis: 11p15.2 (Figure 3A); 18q11.2 (Figure 3B); 2q24.1
  • Figure 4 is a flowchart showing a representative procedure for data cleanup/filtering prior to allelic association test of each SNP.
  • Figures 5 and 6 are flowcharts showing a representative procedure used for the genome-wide and regional haplotype association analysis for searching genetic variants of restenosis.
  • Figure 7 shows p-value plots for each of ABT2 (Figure 7A), ABT3 (Figure 7B), MN0 (Figure 7C), MN1 (Figure 7D), MN2 (Figure 7E), TT0 (Figure 7F), TT1 (Figure 7G) and TT2 (Figure 7H).
  • the results of allele-based tests of association are presented using a 2-group strategy, in which all cases are compared to controls, and a 3-group strategy, in which the two case groups are treated as ordinal outcomes and compared against controls.
  • two SNPs pass the significance threshold, which is adjusted using a Bonferroni correction for 96,767 tests.
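As a rough illustration of the correction mentioned above: under a Bonferroni correction, the per-test threshold is the family-wise significance level divided by the number of tests. The sketch below assumes a family-wise level of 0.05, a conventional choice that the text does not state; the function name is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


N_TESTS = 96_767  # number of tests given in the text
ALPHA = 0.05      # ASSUMPTION: family-wise level is not stated in the text

threshold = bonferroni_threshold(ALPHA, N_TESTS)
print(f"per-test significance threshold: {threshold:.3e}")
```

A SNP would pass only if its unadjusted p-value falls below this per-test threshold.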
  • the results of further allele-based tests of association are presented using various approaches.
  • Figure 8 shows quantitative trait analysis for the following loci: 2q24.1 (Figure 8A), 7q31.31 (Figure 8B), 8q24.12 (Figure 8C), 11p15.2 (Figure 8D), 18q11.2 (Figure 8E), 2p16.1 (Figure 8F), 4q31.21 (Figure 8G), and 7p21.2 (Figure 8H).
  • Figure 9 is a series of graphic chromosome maps, illustrating the position of each of five candidate ISR susceptibility genes (NOV, ARNTL, PKP4, TAF4B, and FLJ21986) identified herein.
  • nucleic and amino acid sequences listed herein are shown using standard letter abbreviations for nucleotide bases, and three letter code for amino acids, as defined in 37 C.F.R. 1.822. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood as included by any reference to the displayed strand. All sequence database accession numbers referenced herein are understood to refer to the version of the sequence identified by that accession number as it was available on the designated date.
  • SEQ ID NO: 1 is the nucleotide sequence of the ARNTL 5' RT-PCR primer.
  • SEQ ID NO: 2 is the nucleotide sequence of the ARNTL 3' RT-PCR primer.
  • SEQ ID NO: 3 is the nucleotide sequence of the FLJ21986 5' RT-PCR primer.
  • SEQ ID NO: 4 is the nucleotide sequence of the FLJ21986 3' RT-PCR primer.
  • SEQ ID NO: 5 is the nucleotide sequence of the NOV 5' RT-PCR primer.
  • SEQ ID NO: 6 is the nucleotide sequence of the NOV 3' RT-PCR primer.
  • SEQ ID NO: 7 is the nucleotide sequence of the PKP4 5' RT-PCR primer.
  • SEQ ID NO: 8 is the nucleotide sequence of the PKP4 3' RT-PCR primer.
  • SEQ ID NO: 9 is the nucleotide sequence of the TAF4B 5' RT-PCR primer.
  • SEQ ID NO: 10 is the nucleotide sequence of the TAF4B 3' RT-PCR primer.
  • SEQ ID NO: 11 is the nucleotide sequence of the β-actin 5' RT-PCR primer.
  • SEQ ID NO: 12 is the nucleotide sequence of the β-actin 3' RT-PCR primer.
  • SEQ ID NO: 13 is the nucleotide sequence of ARNTL (GenBank No. AF044288).
  • SEQ ID NO: 14 is the nucleotide sequence of FLJ21986 (GenBank No. AL832619.2).
  • SEQ ID NO: 15 is the nucleotide sequence of NOV (GenBank No. AY082381.1).
  • SEQ ID NO: 16 is the nucleotide sequence of PKP4 (GenBank No. BC048013.1).
  • SEQ ID NO: 17 is the nucleotide sequence of TAF4B (GenBank No. Y09321.1).
  • SEQ ID NO: 18 is the nucleotide sequence of β-actin (GenBank No. NM_001101).
  • SEQ ID NO: 19 is the nucleotide sequence of GAPDH (GenBank No. M17851.1).
  • SEQ ID NOs: 20-23 are the nucleotide sequences of the regions spanning four SNPs that are part of a significant haplotype block in the PKP4 gene.
  • SEQ ID NOs: 24-26 are the nucleotide sequences of the regions spanning three SNPs that are part of a significant haplotype block in the FLJ21986 gene.
  • SEQ ID NOs: 27-39 are the nucleotide sequences of the regions spanning thirteen SNPs that are part of a significant haplotype block in the NOV gene.
  • SEQ ID NOs: 40-44 are the nucleotide sequences of the regions spanning five SNPs that are part of a significant haplotype block in the ARNTL gene.
  • SEQ ID NOs: 45 and 46 are the nucleotide sequences of the regions spanning two SNPs that are part of a significant haplotype block in the TAF4B gene.
  • SEQ ID NOs: 47-49 are the nucleotide sequences of the regions spanning three SNPs that are part of a significant haplotype block in cytoband 2p16.1.
  • SEQ ID NOs: 50-52 are the nucleotide sequences of the regions spanning three SNPs that are part of a significant haplotype block in cytoband 4q31.21.
  • SEQ ID NOs: 53-59 are the nucleotide sequences of the regions spanning seven SNPs that are part of a significant haplotype block in cytoband 7p21.2.
  • SEQ ID NO: 60 is the nucleotide sequence of EPHB1 (GenBank No. NM_004441).
  • SEQ ID NO: 61 is the nucleotide sequence of ST18 (GenBank No. AB011107).
  • Allele: Any one of a number of viable DNA codings of the same gene (sometimes the term refers to a non-gene sequence) occupying a given locus (position) on a chromosome.
  • An individual's genotype for that gene will be the set of alleles it happens to possess.
  • two alleles make up the individual's genotype.
  • a diploid organism which has two different alleles of the gene is said to be heterozygous.
  • the process of "detecting alleles" may be referred to as “genotyping, determining or identifying an allele or polymorphism,” or any similar phrase.
  • the allele actually detected will be manifest in the genomic DNA of a subject, but may also be detectable from RNA or protein sequences transcribed or translated from this region.
  • Amplification: The use of a technique that increases the number of copies of a nucleic acid molecule in a sample.
  • An example of in vitro amplification is the polymerase chain reaction (PCR), in which a biological sample obtained from a subject is contacted with a pair of oligonucleotide primers, under conditions that allow for hybridization of the primers to a nucleic acid molecule in the sample.
  • the primers are extended under suitable conditions, dissociated from the template, and then re-annealed, extended, and dissociated to amplify the number of copies of the nucleic acid molecule.
  • the product of amplification can be characterized by such techniques as electrophoresis, restriction endonuclease cleavage patterns, oligonucleotide hybridization or ligation, and/or nucleic acid sequencing.
  • amplification methods include strand displacement amplification, as disclosed in U.S. Patent No. 5,744,311; transcription-free isothermal amplification, as disclosed in U.S. Patent No. 6,033,881; repair chain reaction amplification, as disclosed in PCT Publication No. WO 90/01069; ligase chain reaction amplification, as disclosed in EP-A-320,308; gap filling ligase chain reaction amplification, as disclosed in U.S. Patent No. 5,427,930; and NASBA™ RNA transcription-free amplification, as disclosed in U.S. Patent No. 6,025,134.
  • An amplification method can be modified, including for example by additional steps or coupling the amplification with another protocol.
  • Double-stranded DNA has two strands, a 5' -> 3' strand, referred to as the plus strand, and a 3' -> 5' strand (the reverse complement), referred to as the minus strand. Because RNA polymerase adds nucleic acids in a 5' -> 3' direction, the minus strand of the DNA serves as the template for the RNA during transcription. Thus, the RNA formed will have a sequence complementary to the minus strand and identical to the plus strand (except that U is substituted for T).
  • Antisense molecules are molecules that are specifically hybridizable or specifically complementary to either RNA or the plus strand of DNA.
  • Sense molecules are molecules that are specifically hybridizable or specifically complementary to the minus strand of DNA.
  • Antigene molecules are either antisense or sense molecules directed to a dsDNA target.
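The strand relationships described above can be sketched in a few lines. This is only an illustration of the plus strand, minus strand and transcript definitions; the function names and the example sequence are hypothetical and do not come from the disclosure.

```python
# Base-pairing table for DNA (A pairs with T, G pairs with C).
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(plus_strand: str) -> str:
    """Minus strand, written 5' -> 3', for a plus strand written 5' -> 3'."""
    return plus_strand.upper().translate(_DNA_COMPLEMENT)[::-1]


def transcribe(plus_strand: str) -> str:
    """RNA transcript: identical to the plus strand, with U substituted for T."""
    return plus_strand.upper().replace("T", "U")


plus = "ATGC"                      # hypothetical plus-strand sequence
minus = reverse_complement(plus)   # template strand used during transcription
rna = transcribe(plus)

print(plus, minus, rna)
```

Taking the reverse complement of the minus strand returns the plus strand, which is why the transcript matches the plus strand apart from the U-for-T substitution.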
  • Array: An arrangement of molecules, particularly biological macromolecules (such as polypeptides or nucleic acids) or cell or tissue samples, in addressable locations on or in a substrate.
  • a "microarray” is an array that is miniaturized so as to require or be aided by microscopic examination for evaluation or analysis. These arrays are sometimes called DNA chips, or generally, biochips; though more formally they are referred to as microarrays, and the process of testing the gene patterns of an individual is sometimes called microarray profiling.
  • DNA array fabrication chemistry and structure is varied, typically made up of 400,000 different features, each holding DNA from a different human gene, but some employing a solid-state chemistry to pattern as many as 780,000 individual features.
  • the array of molecules makes it possible to carry out a very large number of analyses on a sample at one time.
  • one or more molecules such as an oligonucleotide probe
  • the number of addressable locations on the array can vary, for example from a few (such as three) to at least 50, at least 100, at least 200, at least 250, at least 300, at least 500, at least 600, at least 1000, at least 10,000, or more.
  • an array includes nucleic acid molecules, such as oligonucleotide sequences that are at least 15 nucleotides in length, such as about 15-40 nucleotides in length, such as at least 18 nucleotides in length, at least 21 nucleotides in length, or even at least 25 nucleotides in length.
  • the molecule includes oligonucleotides attached to the array via their 5'- or 3'-end.
  • each arrayed sample is addressable, in that its location can be reliably and consistently determined within the at least two dimensions of the array.
  • the feature application location on an array can assume different shapes.
  • the array can be regular (such as arranged in uniform rows and columns) or irregular.
  • the location of each sample is assigned to the sample at the time when it is applied to the array, and a key may be provided in order to correlate each location with the appropriate target or feature position.
  • ordered arrays are arranged in a symmetrical grid pattern, but samples could be arranged in other patterns (such as in radially distributed lines, spiral lines, or ordered clusters).
  • Addressable arrays usually are computer readable, in that a computer can be programmed to correlate a particular address on the array with information about the sample at that position (such as hybridization or binding data, including for instance signal intensity).
  • the individual features in the array are arranged regularly, for instance in a Cartesian grid pattern, which can be correlated to address information by a computer.
  • Binding or stable binding: An association between two substances or molecules, such as the hybridization of one nucleic acid molecule to another (or itself) and the association of an antibody with a peptide.
  • An oligonucleotide molecule binds or stably binds to a target nucleic acid molecule if a sufficient amount of the oligonucleotide molecule forms base pairs or is hybridized to its target nucleic acid molecule, to permit detection of that binding.
  • Binding can be detected by any procedure known to one skilled in the art, such as by physical or functional properties of the target:oligonucleotide complex. For example, binding can be detected functionally by determining whether binding has an observable effect upon a biosynthetic process such as expression of a gene, DNA replication, transcription, translation, and the like. Physical methods of detecting the binding of complementary strands of nucleic acid molecules include, but are not limited to, such methods as DNase I or chemical footprinting, gel shift and affinity cleavage assays, Northern blotting, dot blotting and light absorption detection procedures.
  • one method involves observing a change in light absorption of a solution containing an oligonucleotide (or an analog) and a target nucleic acid at 220 to 300 nm as the temperature is slowly increased. If the oligonucleotide or analog has bound to its target, there is a sudden increase in absorption at a characteristic temperature as the oligonucleotide (or analog) and target disassociate from each other, or melt.
  • the method involves detecting a signal, such as a detectable label, present on one or both complementary strands.
  • Tm: The binding between an oligomer and its target nucleic acid is frequently characterized by the temperature (Tm) at which 50% of the oligomer is melted from its target.
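One way to read a melting temperature off a light-absorption melting curve, as a minimal sketch: rescale absorbance between its lowest value (taken as fully bound) and its highest value (taken as fully melted), then interpolate the temperature at which the melted fraction reaches 50%. The rescaling is a simplifying assumption, and the function name and data points are hypothetical.

```python
def melting_temperature(temps, absorbances):
    """Estimate the temperature at which 50% of the oligomer has melted.

    ASSUMPTION: the melted fraction is approximated by rescaling absorbance
    linearly between its minimum (bound) and maximum (melted) values.
    """
    if len(temps) != len(absorbances) or len(temps) < 2:
        raise ValueError("need two or more matching temperature/absorbance points")
    low, high = min(absorbances), max(absorbances)
    if high == low:
        raise ValueError("absorbance never changes; no melting transition")
    fraction = [(a - low) / (high - low) for a in absorbances]
    for i in range(len(temps) - 1):
        f0, f1 = fraction[i], fraction[i + 1]
        if f0 <= 0.5 <= f1 and f1 > f0:
            t0, t1 = temps[i], temps[i + 1]
            # Linear interpolation between the two bracketing measurements.
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("curve never crosses 50% melted")


# Hypothetical melting curve: temperature in degrees C, absorbance in arbitrary units.
temps = [40.0, 50.0, 60.0, 70.0, 80.0]
absorbances = [1.00, 1.02, 1.10, 1.18, 1.20]
tm = melting_temperature(temps, absorbances)
print(f"estimated Tm: {tm:.1f} C")
```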
  • a labeled target molecule "binds" to a nucleic acid molecule in a spot on an array if, after incubation of the (labeled) target molecule (usually in solution or suspension) with or on the array for a period of time (usually 5 minutes or more, for instance 10 minutes, 20 minutes, 30 minutes, 60 minutes, 90 minutes, 120 minutes or more, for instance over night or even 24 hours), a detectable amount of that molecule associates with a nucleic acid feature of the array to such an extent that it is not removed by being washed with a relatively low stringency buffer (such as higher salt (such as 3 x SSC or higher), room temperature washes).
  • Washing can be carried out, for instance, at room temperature, but other temperatures (either higher or lower) also can be used.
  • Targets will bind probe nucleic acid molecules within different features on the array to different extents, based at least on sequence homology, and the term "bind" encompasses both relatively weak and relatively strong interactions. Thus, some binding will persist after the array is washed in a more stringent buffer (such as lower salt (such as about 0.5 to about 1.5 x SSC), 55-65°C washes).
  • probe and target molecules are both nucleic acids
  • binding of the test or reference molecule to a feature on the array can be discussed in terms of the specific complementarity between the probe and the target nucleic acids.
  • protein-based arrays where the probe molecules are or comprise proteins, and/or where the target molecules are or comprise proteins, and arrays comprising nucleic acids to which proteins/peptides are bound, or vice versa.
  • cDNA: A DNA molecule lacking internal, non-coding segments (such as introns) and regulatory sequences that determine transcription. By way of example, cDNA may be synthesized in the laboratory by reverse transcription from messenger RNA extracted from cells.
  • Complementarity and percentage complementarity: Molecules with complementary nucleic acids form a stable duplex or triplex when the strands bind (hybridize) to each other by forming Watson-Crick, Hoogsteen or reverse Hoogsteen base pairs. Stable binding occurs when an oligonucleotide molecule remains detectably bound to a target nucleic acid sequence under the required conditions.
  • Complementarity is the degree to which bases in one nucleic acid strand base pair with the bases in a second nucleic acid strand. Complementarity is conveniently described by percentage, that is, the proportion of nucleotides that form base pairs between two strands or within a specific region or domain of two strands.
  • For example, if 10 nucleotides of a 15-nucleotide oligonucleotide form base pairs with a targeted region of a DNA molecule, that oligonucleotide is said to have 66.67% complementarity to the region of DNA targeted.
  • sufficient complementarity means that a sufficient number of base pairs exist between an oligonucleotide molecule and a target nucleic acid sequence to achieve detectable binding.
  • the percentage complementarity that fulfills this goal can range from as little as about 50% complementarity to full (100%) complementarity.
  • sufficient complementarity is at least about 50%, for example at least about 75% complementarity, at least about 90% complementarity, at least about 95% complementarity, at least about 98% complementarity, or even at least about 100% complementarity.
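The 15-nucleotide example above can be reproduced with a short calculation. In this sketch the oligonucleotide and target region are supplied pre-aligned, position by position, and only Watson-Crick pairs are counted; the sequences and the function name are hypothetical.

```python
# Watson-Crick pairs only; Hoogsteen and wobble pairs are deliberately not counted.
_PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def percent_complementarity(oligo: str, target: str) -> float:
    """Percentage of oligo positions that base pair with the aligned target.

    `oligo` is read 5' -> 3' and `target` 3' -> 5', so position i of one
    faces position i of the other.
    """
    if not oligo or len(oligo) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum((a, b) in _PAIRS for a, b in zip(oligo.upper(), target.upper()))
    return 100.0 * paired / len(oligo)


oligo = "ACGTACGTACGTACG"    # hypothetical 15-nucleotide oligonucleotide
target = "TGCATGCATGGTACG"   # pairs with the first 10 positions only
pct = percent_complementarity(oligo, target)
print(f"{pct:.2f}% complementarity")
```

Whether a given percentage counts as sufficient still depends on whether detectable binding is achieved under the conditions used.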
  • DNA deoxyribonucleic acid
  • the repeating units in DNA polymers are four different nucleotides, each of which includes one of the four bases (adenine, guanine, cytosine and thymine) bound to a deoxyribose sugar to which a phosphate group is attached.
  • Triplets of nucleotides, referred to as codons, in DNA molecules code for amino acids in a polypeptide.
  • the term codon is also used for the corresponding (and complementary) sequences of three nucleotides in the mRNA into which the DNA sequence is transcribed.
  • Fluorophore: A chemical compound, which when excited by exposure to a particular wavelength of light, emits light (i.e., fluoresces), for example at a different wavelength. Fluorophores can be described in terms of their emission profile, or "color." Green fluorophores, for example Cy3, FITC, and Oregon Green, are characterized by their emission at wavelengths generally in the range of 515-540 nm. Red fluorophores, for example Texas Red, Cy5 and tetramethylrhodamine, are characterized by their emission at wavelengths generally in the range of 590-690 nm. Examples of fluorophores that may be used are provided in U.S. Patent No. 5,866,366 to Nazarenko et al. and include, for instance: 4-acetamido-4'-isothiocyanatostilbene-2,2'disulfonic acid, acridine and derivatives such as acridine and acridine isothiocyanate, 5-(2'-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS), 4-amino-N-[3-(vinylsulfonyl)phenyl]naphthalimide-3,5 disulfonate (Lucifer Yellow VS), N-(4-anilino-1-naphthyl)maleimide, anthranilamide, Brilliant Yellow, coumarin and derivatives such as coumarin, 7-amino-4-methylcoumarin (AMC, Coumarin 120), 7-amino-4-trifluoromethylcoumarin (Coumarin 151); cyanosine; 4',6-di
  • rhodamine and derivatives such as 6-carboxy-X-rhodamine (ROX), 6-carboxyrhodamine (R6G), lissamine rhodamine B sulfonyl chloride, rhodamine (Rhod), rhodamine B, rhodamine 123, rhodamine X isothiocyanate, sulforhodamine B, sulforhodamine 101 and sulfonyl chloride derivative of sulforhodamine 101 (Texas Red); N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA); tetramethyl rhodamine; tetramethyl rhodamine isothiocyanate (TRITC); riboflavin; rosolic acid and terbium chelate derivatives.
  • fluorophores include GFP (green fluorescent protein), Lissamine™, diethylaminocoumarin, fluorescein chlorotriazinyl, naphthofluorescein, 4,7-dichlororhodamine and xanthene and derivatives thereof.
  • GFP green fluorescent protein
  • Haplotype: The genetic constitution of an individual chromosome. In diploid organisms, a haplotype contains one member of the pair of alleles for each site. A haplotype can refer to only one locus or to an entire genome. Haplotype can also refer to a set of single nucleotide polymorphisms (SNPs) found to be statistically associated on a single chromatid.
  • haplotypes described in the regional haplotype analysis results are defined using "1" and "2" coding for alleles, wherein "1" corresponds to allele "A," and "2" corresponds to allele "B."
  • a haplotype defined as 1122 corresponds to a haplotype AABB.
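The allele coding described above amounts to a simple character mapping. The following sketch, with a hypothetical function name, converts a coded haplotype string into its A/B form.

```python
# Coding stated in the text: "1" corresponds to allele A, "2" to allele B.
CODE_TO_ALLELE = {"1": "A", "2": "B"}


def decode_haplotype(coded: str) -> str:
    """Translate a 1/2-coded haplotype string into its A/B allele string."""
    try:
        return "".join(CODE_TO_ALLELE[c] for c in coded)
    except KeyError as exc:
        raise ValueError(f"unexpected allele code {exc.args[0]!r}") from None


decoded = decode_haplotype("1122")
print(decoded)
```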
  • nucleic acid consists of nitrogenous bases that are either pyrimidines (cytosine (C), uracil (U), and thymine (T)) or purines (adenine (A) and guanine (G)). These nitrogenous bases form hydrogen bonds between a pyrimidine and a purine, and the bonding of the pyrimidine to the purine is referred to as "base pairing." More specifically, A will hydrogen bond to T or U, and G will bond to C. "Complementary" refers to the base pairing that occurs between two distinct nucleic acid sequences or two distinct regions of the same nucleic acid sequence.
  • "Specifically hybridizable" and "specifically complementary" are terms that indicate a sufficient degree of complementarity such that stable and specific binding occurs between the oligonucleotide (or its analog) and the DNA or RNA target.
  • the oligonucleotide or oligonucleotide analog need not be 100% complementary to its target sequence to be specifically hybridizable.
  • An oligonucleotide or analog is specifically hybridizable when binding of the oligonucleotide or analog to the target DNA or RNA molecule interferes with the normal function of the target DNA or RNA, and there is a sufficient degree of complementarity to avoid non-specific binding of the oligonucleotide or analog to non-target sequences under conditions where specific binding is desired, for example under physiological conditions in the case of in vivo assays or systems. Such binding is referred to as specific hybridization.
  • Hybridization conditions resulting in particular degrees of stringency will vary depending upon the nature of the hybridization method of choice and the composition and length of the hybridizing nucleic acid sequences. Generally, the temperature of hybridization and the ionic strength (especially the Na+ and/or Mg++ concentration) of the hybridization buffer will determine the stringency of hybridization, though wash times also influence stringency. Calculations regarding hybridization conditions required for attaining particular degrees of stringency are discussed by Sambrook et al. (ed.), Molecular Cloning: A
  • stringent conditions encompass conditions under which hybridization will only occur if there is less than 25% mismatch between the hybridization molecule and the target sequence.
  • Stringent conditions may be broken down into particular levels of stringency for more precise definition.
  • “moderate stringency” conditions are those under which molecules with more than 25% sequence mismatch will not hybridize;
  • conditions of "medium stringency” are those under which molecules with more than 15% mismatch will not hybridize, and
  • conditions of "high stringency” are those under which sequences with more than 20% mismatch will not hybridize.
  • Conditions of "very high stringency” are those under which sequences with more than 10% mismatch will not hybridize. The following is an exemplary set of hybridization conditions and is not meant to be limiting:
  • Hybridization: 5x SSC at 65°C for 16 hours; wash twice: 2x SSC at room temperature (RT) for 15 minutes each
  • Hybridization: 5x-6x SSC at 65°C-70°C for 16-20 hours
  • Hybridization: 6x SSC at RT to 55°C for 16-20 hours
  • In vitro amplification: Techniques that increase the number of copies of a nucleic acid molecule in a sample or specimen.
  • An example of in vitro amplification is the polymerase chain reaction, in which a biological sample collected from a subject is contacted with a pair of oligonucleotide primers, under conditions that allow for the hybridization of the primers to nucleic acid template in the sample.
  • the primers are extended under suitable conditions, dissociated from the template, and then re-annealed, extended, and dissociated to amplify the number of copies of the nucleic acid.
  • the product of in vitro amplification may be characterized by electrophoresis, restriction endonuclease cleavage patterns, oligonucleotide hybridization or ligation, and/or nucleic acid sequencing, using standard techniques.
  • Other examples of in vitro amplification techniques include strand displacement amplification (see U.S. Patent No. 5,744,311); transcription-free isothermal amplification (see U.S. Patent No. 6,033,881); repair chain reaction amplification (see WO 90/01069); ligase chain reaction amplification (see EP-A-320 308); gap filling ligase chain reaction amplification (see U.S. Patent No. 5,427,930); coupled ligase detection and PCR (see U.S. Patent No. 6,027,889); and NASBA™ RNA transcription-free amplification (see U.S. Patent No. 6,025,134).
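  • As a rough illustration of the repeated anneal, extend and dissociate cycles described for the polymerase chain reaction, the following Python sketch estimates the number of copies present after a given number of cycles. The function name `copies_after_cycles` and the `efficiency` parameter are illustrative assumptions and are not taken from the disclosure; real reactions plateau and rarely achieve perfect doubling.

```python
def copies_after_cycles(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized PCR yield.

    Each cycle multiplies the template count by (1 + efficiency), where an
    efficiency of 1.0 means every template is copied once per cycle.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# One template molecule, ten ideal cycles: 2**10 copies.
ideal_yield = copies_after_cycles(1, 10)
```

  • Lowering `efficiency` below 1.0 models incomplete primer extension in each cycle under the same simplified assumptions.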
  • Isolated An "isolated" biological component (such as a nucleic acid molecule, protein, or organelle) has been substantially separated or purified away from other biological components in the cell of the organism in which the component naturally occurs, such as other chromosomal and extra-chromosomal DNA and RNA, proteins and organelles.
  • Nucleic acid molecules and proteins that have been "isolated” include nucleic acid molecules and proteins purified by standard purification methods. The term also embraces nucleic acid molecules and proteins prepared by recombinant expression in a host cell as well as chemically synthesized nucleic acid molecules and proteins.
  • Label Detectable marker or reporter molecules, which can be attached to nucleic acids. Typical labels include fluorophores, radioactive isotopes, ligands, chemiluminescent agents, metal sols and colloids, and enzymes. Methods for labeling and guidance in the choice of labels useful for various purposes are discussed, e.g., in Sambrook et al., in Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press (1989) and Ausubel et al., in Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley-Intersciences (1987).
  • Linkage disequilibrium (LD) The non-random association of alleles at two or more loci, not necessarily on the same chromosome.
  • LD describes a situation in which some combinations of alleles or genetic markers occur more or less frequently in a population than would be expected from a random formation of haplotypes from alleles based on their frequencies.
  • the expected frequency of occurrence of two alleles that are inherited independently is the frequency of the first allele multiplied by the frequency of the second allele. Alleles that co-occur at expected frequencies are said to be in linkage equilibrium.
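  • The comparison between observed and expected haplotype frequencies described above can be sketched in Python as follows. The coefficient D (observed haplotype frequency minus the product of the allele frequencies) and the normalized measure r² are standard population-genetics quantities; the function names and the example frequencies are illustrative assumptions and do not come from the disclosure.

```python
def ld_coefficient(p_a: float, p_b: float, p_ab: float) -> float:
    """Return D = p_AB - p_A * p_B.

    p_a and p_b are the frequencies of the two alleles, and p_ab is the
    observed frequency of the haplotype carrying both. D == 0 corresponds
    to linkage equilibrium.
    """
    return p_ab - p_a * p_b


def ld_r_squared(p_a: float, p_b: float, p_ab: float) -> float:
    """Return r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B))."""
    denominator = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denominator == 0.0:
        raise ValueError("both loci must be polymorphic")
    return ld_coefficient(p_a, p_b, p_ab) ** 2 / denominator


# Hypothetical frequencies: the two alleles always travel together.
d_value = ld_coefficient(0.5, 0.5, 0.5)
r2_value = ld_r_squared(0.5, 0.5, 0.5)
```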
  • Locus The position of a gene (or other significant sequence) on a chromosome.
  • Mutation Any change of the DNA sequence within a gene or chromosome. In some instances, a mutation will alter a characteristic or trait (phenotype), but this is not always the case.
  • Types of mutations include base substitution point mutations (for example, transitions or transversions), deletions, and insertions. Missense mutations are those that introduce a different amino acid into the sequence of the encoded protein; nonsense mutations are those that introduce a new stop codon. In the case of insertions or deletions, mutations can be in-frame (not changing the frame of the overall sequence) or frame shift mutations, which may result in the misreading of a large number of codons (and often lead to abnormal termination of the encoded product due to the presence of a stop codon in the alternative frame).
  • This term specifically encompasses variations that arise through somatic mutation, for instance those that are found only in disease cells, but not constitutionally, in a given individual. Examples of such somatically-acquired variations include the point mutations that frequently result in altered function of various genes that are involved in development of cancers. This term also encompasses DNA alterations that are present constitutionally, that alter the function of the encoded protein in a readily demonstrable manner, and that can be inherited by the children of an affected individual. In this respect, the term overlaps with "polymorphism,” as defined herein, but generally refers to the subset of constitutional alterations.
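  • The distinction drawn above between in-frame and frame shift insertions or deletions depends only on whether the net length change is a multiple of three. A minimal Python sketch follows, assuming the variant lies entirely within a coding sequence; the function name `classify_indel` and the reference/alternate allele representation are illustrative assumptions.

```python
def classify_indel(reference_allele: str, alternate_allele: str) -> str:
    """Classify a coding-sequence variant by its net length change.

    A length change that is a multiple of three keeps the reading frame;
    any other length change shifts it.
    """
    delta = len(alternate_allele) - len(reference_allele)
    if delta == 0:
        return "substitution"
    kind = "insertion" if delta > 0 else "deletion"
    frame = "in-frame" if delta % 3 == 0 else "frameshift"
    return f"{frame} {kind}"


# Hypothetical alleles: three inserted bases preserve the frame,
# two deleted bases do not.
three_base_insertion = classify_indel("A", "ACGT")
two_base_deletion = classify_indel("ACG", "A")
```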
  • Nucleic acid molecule A polymeric form of nucleotides, which may include both sense and anti-sense strands of RNA, cDNA, genomic DNA, and synthetic forms and mixed polymers of the above.
  • a nucleotide refers to a ribonucleotide, deoxynucleotide or a modified form of either type of nucleotide.
  • a "nucleic acid molecule” as used herein is synonymous with “nucleic acid” and “polynucleotide.”
  • a nucleic acid molecule is usually at least 10 bases in length, unless otherwise specified. The term includes single and double stranded forms of DNA.
  • a polynucleotide may include either or both naturally occurring and modified nucleotides linked together by naturally occurring and/or non-naturally occurring nucleotide linkages.
  • Nucleotide Includes, but is not limited to, a monomer that includes a base linked to a sugar, such as a pyrimidine, purine or synthetic analogs thereof, or a base linked to an amino acid, as in a peptide nucleic acid (PNA).
  • a nucleotide is one monomer in a polynucleotide.
  • a nucleotide sequence refers to the sequence of bases in a polynucleotide.
  • Oligonucleotide A nucleic acid molecule generally comprising a length of 300 bases or fewer.
  • the term often refers to single stranded deoxyribonucleotides, but it can refer as well to single or double stranded ribonucleotides, RNA:DNA hybrids and double stranded DNAs, among others.
  • oligonucleotide also includes oligonucleosides (that is, an oligonucleotide minus the phosphate) and any other organic base polymer. In some examples, oligonucleotides are about 10 to about 90 bases in length, for example, 12, 13, 14, 15, 16, 17, 18, 19 or 20 bases in length.
  • Oligonucleotides are about 25, about 30, about 35, about 40, about 45, about 50, about 55, about 60 bases, about 65 bases, about 70 bases, about 75 bases or about 80 bases in length. Oligonucleotides may be single stranded, for example, for use as probes or primers, or may be double stranded, for example, for use in the construction of a mutant gene. Oligonucleotides can be either sense or anti sense oligonucleotides. An oligonucleotide can be modified as discussed above in reference to nucleic acid molecules. Oligonucleotides can be obtained from existing nucleic acid sources (for example, genomic or cDNA), but can also be synthetic (for example, produced by laboratory or in vitro oligonucleotide synthesis).
  • a first nucleic acid sequence is operably linked with a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence.
  • a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence.
  • operably linked DNA sequences are contiguous and, where necessary to join two protein-coding regions, in the same reading frame.
  • ORF Open reading frame
  • PNA Peptide Nucleic Acid
  • the nature of the carrier will depend on the particular mode of administration being employed.
  • parenteral formulations usually comprise injectable fluids that include pharmaceutically and physiologically acceptable fluids such as water, physiological saline, balanced salt solutions, aqueous dextrose, glycerol or the like as a vehicle.
  • non-toxic solid carriers can include, for example, pharmaceutical grades of mannitol, lactose, starch, or magnesium stearate.
  • pharmaceutical compositions to be administered can contain minor amounts of non-toxic auxiliary substances, such as wetting or emulsifying agents, preservatives, and pH buffering agents and the like, for example sodium acetate or sorbitan monolaurate.
  • Polymorphism A variation in the gene sequence. The polymorphisms can be those variations (DNA sequence differences) which are generally found between individuals or different ethnic groups and geographic locations which, while having a different sequence, produce functionally equivalent gene products.
  • polymorphisms also encompass variations which can be classified as alleles and/or mutations which can produce gene products which may have an altered function. Polymorphisms also encompass variations which can be classified as alleles and/or mutations which either produce no gene product or an inactive gene product or an active gene product produced at an abnormal rate or in an inappropriate tissue or in response to an inappropriate stimulus. Further, the term is also used interchangeably with allele as appropriate.
  • Polymorphisms can be referred to, for instance, by the nucleotide position at which the variation exists, by the change in amino acid sequence caused by the nucleotide variation, or by a change in some other characteristic of the nucleic acid molecule or protein that is linked to the variation.
  • Nucleic acid probes and primers can be readily prepared based on the nucleic acid molecules provided as indicators of susceptibility to in-stent restenosis or a related disease, condition or disorder. It is also appropriate to generate probes and primers based on fragments or portions of these nucleic acid molecules, particularly in order to distinguish between and among different alleles and haplotypes within a single gene. Also appropriate are probes and primers specific for the reverse complement of these sequences, as well as probes and primers to 5' or 3' regions.
  • a probe comprises an identifiable, isolated nucleic acid that recognizes a target nucleic acid sequence.
  • Probes include a nucleic acid that is attached to an addressable location, a detectable label or other reporter molecule and that hybridizes to a target sequence.
  • Typical labels include radioactive isotopes, enzyme substrates, co-factors, ligands, chemiluminescent or fluorescent agents, haptens, and enzymes. Methods for labeling and guidance in the choice of labels appropriate for various purposes are discussed, for example, in Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989 and Ausubel et al. Short Protocols in Molecular Biology, 4th ed., John Wiley & Sons, Inc., 1999.
  • Primers are short nucleic acid molecules, for instance DNA oligonucleotides 10 nucleotides or more in length, that hybridize, for example, to contiguous complementary nucleotides or to a sequence to be amplified. Longer DNA oligonucleotides may be about 15, 20, 25, 30 or 50 nucleotides or more in length. Primers can be annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA strand, and then the primer extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid sequence, for example, by the PCR or other nucleic-acid amplification methods known in the art, as described below.
  • Amplification primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, © 1991, Whitehead Institute for Biomedical Research, Cambridge, MA).
  • probes and primers can be selected that include at least 20, 25, 30, 35, 40, 45, 50 or more consecutive nucleotides of a target nucleotide sequence.
  • Nucleic acid molecules may be selected that comprise at least 10, 15, 20, 25, 30, 35,
  • nucleic acid molecules might comprise at least 10 consecutive nucleotides of a nucleic acid sequence shown in any one of the sequences discussed or described herein, and more particularly any 10 consecutive nucleotides overlapping one of the SNPs illustrated in any of these sequences. More particularly, probes and primers in some embodiments are selected so that they overlap or reside adjacent to at least one of the SNPs indicated in the Sequence Listing or one of the Tables (such as Table 5, 6, or 7).
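  • The selection of consecutive nucleotides overlapping a SNP, and of probes specific for the reverse complement, can be sketched in Python as follows. This is only an illustration under stated assumptions: the example sequence is hypothetical (it is not taken from the Sequence Listing), the SNP position is given as a 0-based index, and the function names are not part of the disclosure.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


def windows_overlapping_snp(sequence: str, snp_index: int, length: int = 10) -> List[str]:
    """Return every run of `length` consecutive bases that includes the SNP.

    `snp_index` is the 0-based position of the SNP within `sequence`.
    """
    if not 0 <= snp_index < len(sequence):
        raise ValueError("snp_index lies outside the sequence")
    first_start = max(0, snp_index - length + 1)
    last_start = min(snp_index, len(sequence) - length)
    return [sequence[start:start + length] for start in range(first_start, last_start + 1)]


# Hypothetical 21-base sequence with the SNP at index 10.
example_sequence = "ACGTACGTACTTACGTACGTA"
sense_candidates = windows_overlapping_snp(example_sequence, 10)
antisense_candidates = [reverse_complement(candidate) for candidate in sense_candidates]
```

  • Each candidate would still need to be checked for hybridization behavior under the chosen stringency conditions; the sketch only enumerates sequences.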
  • a purified nucleic acid preparation is one in which the specified nucleic acid is more enriched than the nucleic acid is in its generative environment, for instance within a cell or in a biochemical reaction chamber.
  • a preparation of substantially pure nucleic acid may be purified such that the desired nucleic acid represents at least 50% of the total nucleic acid content of the preparation.
  • a substantially pure nucleic acid will represent at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, or at least 95% or more of the total nucleic acid content of the preparation.
  • a recombinant nucleic acid is one that has a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two otherwise separated segments of sequence. This artificial combination can be accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, such as by genetic engineering techniques.
  • Restenosis The reoccurrence of stenosis (an abnormal narrowing in a blood vessel or other tubular organ or structure). As used herein, this is generally restenosis of an artery, or other blood vessel, but possibly any hollow organ that has been "unblocked”. If restenosis occurs at the site where a stent has been placed in an artery, it is called in-stent restenosis (ISR).
  • ISR in-stent restenosis
  • RNA A typically linear polymer of ribonucleic acid monomers, linked by phosphodiester bonds. Naturally occurring RNA molecules fall into three classes, messenger (mRNA, which encodes proteins), ribosomal (rRNA, components of ribosomes), and transfer (tRNA, molecules responsible for transferring amino acid monomers to the ribosome during protein synthesis). Total RNA refers to a heterogeneous mixture of all three types of RNA molecules.
  • Sample A sample obtained from a plant or animal subject.
  • biological samples include all samples useful for genetic analysis in subjects, including, but not limited to: cells, tissues, and bodily fluids, such as blood; derivatives and fractions of blood (such as serum or plasma); extracted galls; biopsied or surgically removed tissue, including tissues that are, for example, unfixed, frozen, fixed in formalin and/or embedded in paraffin; tears; milk; skin scrapes; surface washings; urine; sputum; cerebrospinal fluid; prostate fluid; pus; bone marrow aspirates; BAL; saliva; cervical swabs; vaginal swabs; and oropharyngeal wash.
  • Sequence identity The similarity between two nucleic acid sequences, or two amino acid sequences, is expressed in terms of the similarity between the sequences, otherwise referred to as sequence identity. Sequence identity is frequently measured in terms of percentage identity (or similarity or homology); the higher the percentage, the more similar the two sequences are. Homologs or orthologs of nucleic acid or amino acid sequences will possess a relatively high degree of sequence identity when aligned using standard methods. This homology will be more significant when the orthologous proteins or nucleic acids are derived from species which are more closely related (such as human and chimpanzee sequences), compared to species more distantly related (such as human and C. elegans sequences).
  • orthologs are at least 50% identical at the nucleotide level and at least 50% identical at the amino acid level when comparing human orthologous sequences.
  • Methods of alignment of sequences for comparison are well known.
  • Various programs and alignment algorithms are described in: Smith & Waterman, Adv. Appl. Math. 2:482, 1981; Needleman & Wunsch, J. Mol. Biol. 48:443, 1970; Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85:2444, 1988; Higgins & Sharp, Gene, 73:237-44, 1988; Higgins & Sharp, CABIOS 5:151-3, 1989; Corpet et al., Nuc. Acids Res.
  • NCBI Basic Local Alignment Search Tool (BLAST) (Altschul et al., J. Mol. Biol. 215:403-10, 1990) is available from several sources, including the National Center for Biotechnology Information (NCBI, Bethesda, MD) and on the Internet, for use in connection with the sequence analysis programs blastp, blastn, blastx, tblastn and tblastx. Each of these sources also provides a description of how to determine sequence identity using this program.
  • Homologous sequences are typically characterized by possession of at least 60%, 70%, 75%, 80%, 90%, 95% or at least 98% sequence identity counted over the full length alignment with a sequence using the NCBI Blast 2.0, gapped blastp set to default parameters. Queries searched with the blastn program are filtered with DUST (Hancock and Armstrong, Comput. Appl. Biosci. 10:67-70, 1994). It will be appreciated that these sequence identity ranges are provided for guidance only; it is entirely possible that strongly significant homologs could be obtained that fall outside of the ranges provided. Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences, due to the degeneracy of the genetic code. It is understood that changes in nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein.
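  • As an illustration of percentage identity counted over an alignment, the following Python sketch scores two sequences that have already been aligned (for example, by one of the programs cited above). It is not the BLAST calculation itself; the function name, the use of "-" as the gap character, and the choice to count gapped columns as mismatches are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences agree.

    Both inputs must be the same length and use '-' for gaps. Columns in
    which both sequences have a gap are ignored; columns in which only one
    has a gap count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical alignments: one substitution, then one gap, in four columns.
substitution_identity = percent_identity("ACGT", "ACGA")
gap_identity = percent_identity("AC-T", "ACGT")
```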
  • Single Nucleotide Polymorphism A DNA sequence variation, occurring when a single nucleotide (adenine (A), thymine (T), cytosine (C) or guanine (G)) in the genome differs between members of the species.
  • single nucleotide polymorphism or SNP includes mutations and polymorphisms. SNPs may fall within coding sequences (CDS) of genes or between genes (intergenic regions). SNPs within a CDS change the codon, which may or may not change the amino acid in the protein sequence. The former may constitute different alleles. The latter are called silent mutations and typically occur in the third position of the codon (called the wobble position).
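  • Whether a SNP within a coding sequence changes the encoded amino acid can be checked against the standard genetic code, as in the Python sketch below. The function name `classify_coding_snp`, the 0-based codon position, and the example codons are illustrative assumptions; the codon table is the standard nuclear genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' is a stop.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def classify_coding_snp(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base change at `position` (0, 1 or 2) of `codon`."""
    codon = codon.upper()
    variant = codon[:position] + new_base.upper() + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[variant]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    if before == "*":
        return "stop-loss"
    return "missense"


# A third-position (wobble) change in a glycine codon leaves the protein unchanged.
wobble_change = classify_coding_snp("GGA", 2, "C")
```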
  • Specific binding agent An agent that binds substantially only to a defined target.
  • a protein-specific binding agent binds substantially only the specified protein.
  • the term "X-protein specific binding agent” includes anti-X protein antibodies (and functional fragments thereof) and other agents (such as soluble receptors) that bind substantially only to the X protein (where "X" is a specified protein, or in some embodiments a specified domain or form of a protein, such as a particular allelic form of a protein).
  • Anti-X protein antibodies may be produced using standard procedures described in a number of texts, including Harlow and Lane (Antibodies, A Laboratory Manual, CSHL, New York, 1988). The determination that a particular agent binds substantially only to the specified protein may readily be made by using or adapting routine procedures.
  • One suitable in vitro assay makes use of the Western blotting procedure (described in many standard texts, including Harlow and Lane [Antibodies, A Laboratory Manual, CSHL, New York, 1988]).
  • Western blotting may be used to determine that a given protein binding agent, such as an anti-X protein monoclonal antibody, binds substantially only to the X protein.
  • Shorter fragments of antibodies can also serve as specific binding agents.
  • Fabs, Fvs, and single-chain Fvs (SCFvs) that bind to a specified protein would be specific binding agents.
  • These antibody fragments are defined as follows: (1) Fab, the fragment which contains a monovalent antigen-binding fragment of an antibody molecule produced by digestion of whole antibody with the enzyme papain to yield an intact light chain and a portion of one heavy chain; (2) Fab', the fragment of an antibody molecule obtained by treating whole antibody with pepsin, followed by reduction, to yield an intact light chain and a portion of the heavy chain; two Fab' fragments are obtained per antibody molecule; (3) (Fab')2, the fragment of the antibody obtained by treating whole antibody with the enzyme pepsin without subsequent reduction; (4) F(ab')2, a dimer of two Fab' fragments held together by two disulfide bonds; (5) Fv, a genetically engineered fragment containing the variable region of the light chain and the variable region of the
  • Subject Living multi-cellular vertebrate organisms, a category that includes human and non-human mammals (such as veterinary subjects).
  • Described herein is the identification of sixteen candidate susceptibility loci linked to ISR. Of these regions, seven contain the genes NOV, ARNTL, TAF4B, PKP4, FLJ21986 (encoding a hypothetical protein), EPHB1 and ST18. The remaining nine susceptibility loci, located in cytobands 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 13q14.3,
  • haplotypes associated with an increased risk of developing restenosis comprising obtaining a nucleic acid sample from the subject and determining the nucleotide present at the chromosomal positions identified herein as part of a haplotype associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 159314989, 159324358, 159328041 and 159328524 of the PKP4 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 1121 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 120477232, 120478404 and 120498610 of the FLJ21986 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 112 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 120490715, 120495854, 120496829, 120504993, 120505089, 120505599, 120506127, 120506187, 120506423, 120513265, 120513339 and 120515067 of the NOV gene in the nucleic acid sample, wherein the presence of a haplotype comprising 112221212111 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 13240687, 13241857, 13250481, 13254501 and 13255095 of the ARNTL gene in the nucleic acid sample, wherein the presence of a haplotype comprising 21211 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 22175498 and 22176091 of the TAF4B gene in the nucleic acid sample, wherein the presence of a haplotype comprising 21 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 57171286, 57186783 and 57187625 of cytoband 2p16.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 211 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 145650644, 145658039 and 145682952 of cytoband 4q31.21 in the nucleic acid sample, wherein the presence of a haplotype comprising 221 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 13506254, 13506374, 13507300, 13507512, 13507965, 13513798 and 13514074 of cytoband 7p21.2 in the nucleic acid sample, wherein the presence of a haplotype comprising 1222211 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 75060532, 75060768, 75063372 and 75064286 of cytoband 1p31.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 1112 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 57113139, 57115011, 57128636 and 57129478 of cytoband 2p16.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 2211 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 22093576 and 22113252 of cytoband 2p24.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 11 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 146511704 and 146552402 of cytoband 2q22.3 in the nucleic acid sample, wherein the presence of a haplotype comprising 22 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 146613745 and 146620324 of cytoband 2q22.3 in the nucleic acid sample, wherein the presence of a haplotype comprising 11 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 159141311, 159144184, 159151183, 159152825, 159155564, 159163035, 159180909 and 159181800 of the PKP4 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 22112122 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 136151449 and 136152557 of the EPHB1 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 21 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 53402598 and 53403770 of the ST18 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 12 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 13240687, 13241857, 13250481, 13254501, 13255061 and 13255095 of the ARNTL gene in the nucleic acid sample, wherein the presence of a haplotype comprising 212111 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 53592086, 53592319 and 53598855 of cytoband 13q14.3 in the nucleic acid sample, wherein the presence of a haplotype comprising 211 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 78367393, 78368329, 78370985 and 78371562 of cytoband 15q25.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 1221 is associated with an increased risk of developing restenosis.
  • the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 68396720, 68396879, 68396951 and 68397041 of cytoband 18q22.3 in the nucleic acid sample, wherein the presence of a haplotype comprising 1111 is associated with an increased risk of developing restenosis.
  • the nucleotide present at a particular chromosomal position can be determined using any one of a number of methods described herein and/or well known in the art, such as by PCR, in situ hybridization, Southern blotting, allele-specific hybridization or using an array.
  • the nucleic acid sample can be obtained from any one of a number of sources, including, but not limited to cells, tissues, bodily fluid, blood, blood derivatives, blood fractions (such as serum or plasma), extracted galls, biopsied or surgically removed tissue, tears, milk, skin scrapes, surface washings, urine, sputum, cerebrospinal fluid, prostate fluid, pus, bone marrow aspirates, BAL, saliva, cervical swabs, vaginal swabs and oropharyngeal wash.
  • the nucleic acid sample is obtained from a bodily fluid of the subject, such as blood or a blood fraction.
  • the nucleic acid sample is obtained from cells or tissue of the subject.
  • a method comprising determining nucleotide(s) present at the chromosomal positions disclosed herein in two or more of the group consisting of the PKP4 gene, the FLJ21986 gene, the NOV gene, the ARNTL gene, the TAF4B gene, the EPHB1 gene, the ST18 gene, cytoband 2p16.1, cytoband 4q31.21, cytoband 7p21.2, cytoband 1p31.1, cytoband 2p24.1, cytoband 2q22.3, cytoband 13q14.3, cytoband 15q25.1 and cytoband 18q22.3.
  • nucleotide sequences for haplotypes defined using "1" and “2" or “A” and “B” can be precisely ascertained by cross-referencing the haplotype data to NCBI genomic sequence data and HapMap haplotype data. This process requires manual review of SNPs on the forward and reverse DNA strands.
  • Haplotypes described herein can be definitively linked to a precise nucleotide sequence according to previously described methods (see, for example, Cargill et al. Am. J. Hum. Genet. 80(2):273-90, 2007).
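  • Once alleles have been coded as "1" or "2", checking a subject for one of the haplotypes listed above reduces to a string comparison over the listed chromosomal positions. The Python sketch below uses the two TAF4B positions and the haplotype "21" given above; it assumes that the alleles have already been coded and phased onto a single chromosome, and the function name and input format are hypothetical. The mapping from actual nucleotides to "1" and "2" is not reproduced here and must come from the cross-referencing described above.

```python
from typing import Dict, List

# Positions and risk haplotype for the TAF4B gene as listed above.
TAF4B_POSITIONS = [22175498, 22176091]
TAF4B_RISK_HAPLOTYPE = "21"


def carries_haplotype(
    coded_alleles: Dict[int, str], positions: List[int], haplotype: str
) -> bool:
    """Return True if the coded alleles at `positions` spell out `haplotype`.

    `coded_alleles` maps a chromosomal position to the allele code ("1" or
    "2") observed on one phased chromosome of the subject.
    """
    if len(positions) != len(haplotype):
        raise ValueError("one allele code is required per position")
    observed = "".join(coded_alleles[position] for position in positions)
    return observed == haplotype


# Hypothetical subject data for one chromosome.
subject_chromosome = {22175498: "2", 22176091: "1"}
at_increased_risk = carries_haplotype(
    subject_chromosome, TAF4B_POSITIONS, TAF4B_RISK_HAPLOTYPE
)
```

  • For a diploid subject, the check would be applied to each phased chromosome separately.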
  • ISR in-stent restenosis
  • Affymetrix 100K SNP chips which contain 116,204 single-nucleotide polymorphisms
  • SNPs associated with restenosis were identified, and regions with more than one SNP within 250 kb (or 1 Mb) were selected for further analysis.
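  • One way to pick out regions containing more than one associated SNP within a fixed distance is sketched below in Python. This is an interpretation, not the procedure used in the study: it assumes all positions lie on the same chromosome and chains together neighboring SNPs that are no more than `window` bases apart. The function name and the example positions are hypothetical.

```python
from typing import List


def regions_with_multiple_snps(positions: List[int], window: int = 250_000) -> List[List[int]]:
    """Group SNP positions on one chromosome into regions.

    Consecutive SNPs separated by at most `window` bases are chained into
    the same region; regions containing only a single SNP are dropped.
    """
    regions: List[List[int]] = []
    current: List[int] = []
    for position in sorted(positions):
        if current and position - current[-1] > window:
            if len(current) > 1:
                regions.append(current)
            current = []
        current.append(position)
    if len(current) > 1:
        regions.append(current)
    return regions


# Hypothetical positions of associated SNPs on one chromosome.
example_regions = regions_with_multiple_snps(
    [100, 200_000, 900_000, 1_000_000, 5_000_000]
)
```

  • Passing `window=1_000_000` applies the wider 1 Mb criterion under the same assumptions.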
  • Haplotypes within these blocks were determined and tested for association between case and control haplotypes of restenosis. Those haplotypes with a population frequency of greater than 10% and p-value for association ≤ 0.05 were analyzed further, and ranked by population frequency. Eight regions were found to be significant, five of which contained genes: NOV, ARNTL, PKP4, TAF4B and FLJ21986.
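  • The filtering and ranking step described above can be sketched in Python as follows, assuming an inclusive comparison for the p-value threshold. The `HaplotypeResult` record, the function name, and all of the example values are hypothetical placeholders and are not results from the study.

```python
from typing import List, NamedTuple


class HaplotypeResult(NamedTuple):
    label: str
    frequency: float  # population frequency as a fraction between 0 and 1
    p_value: float    # p-value for association with restenosis


def select_haplotypes(
    results: List[HaplotypeResult],
    min_frequency: float = 0.10,
    alpha: float = 0.05,
) -> List[HaplotypeResult]:
    """Keep haplotypes passing both thresholds, most frequent first."""
    kept = [
        result
        for result in results
        if result.frequency > min_frequency and result.p_value <= alpha
    ]
    return sorted(kept, key=lambda result: result.frequency, reverse=True)


# Placeholder data for illustration only.
example_results = [
    HaplotypeResult("hapA", 0.30, 0.01),
    HaplotypeResult("hapB", 0.05, 0.001),  # too rare
    HaplotypeResult("hapC", 0.45, 0.04),
    HaplotypeResult("hapD", 0.20, 0.20),   # not significant
]
ranked = select_haplotypes(example_results)
```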
  • the haplotype analysis of SNP data was as follows: i.
  • Biomarkers identified in the CardioGene Study can be used in various ways, including, but not limited to, as diagnostic or predictive indicators to identify patients that are more or less likely to develop in-stent restenosis; as targets for therapeutic agents to reduce or prevent in-stent restenosis; and in the development of drug-eluting stents, based on compounds developed in light of the newly identified targets.
  • Specific applications include methods for predicting in-stent restenosis that employ a specific SNP or other sequence variation, or a specific collection of such sequences (a haplotype), such as those described herein; and screening methods for identifying compounds that influence restenosis, based on an identified biomarker.
  • NOV NOV (SEQ ID NO: 15; also known as CCN3 and nephroblastoma overexpressed gene; NCBI Accession No. AY082381.1; Gene Map Locus 8q24.1; molecular weight 39,164 Da) is a member of the CCN family of genes, which encode cysteine-rich, secreted proteins associated with the extracellular matrix (ECM) and function as regulatory proteins.
  • ECM extracellular matrix
  • CCNl and CCN2 have been extensively described and are known to support cell adhesion, stimulate adhesive signaling and induce focal adhesion complexes, with important contributions to vascular homeostasis (Mo et al. MoI. Cell Biol.
  • NOV is highly expressed in vascular smooth muscle cells of adult arterial media. Alterations in expression patterns have been defined in a rat model of vascular injury (Ellis et al. Arterioscler. Thromb. Vasc. Biol. 20:1912-1919, 2000).
  • ARNTL (SEQ ID NO: 13; also known as MOP3, Bmal-1; NCBI Accession No. AF044288; Gene Map Locus 11p15; molecular weight 68,766 Da) is a basic helix-loop-helix protein that forms a heterodimer with CLOCK. ARNTL has primarily been described in its role as a regulator of circadian rhythms in mammals, with deletion of this gene in mice resulting in a loss of circadian rhythmicity and decreased locomotor activity (Bunger et al. Cell 103(7):1009-1017, 2000).
  • ARNTL has 29% sequence homology to its namesake ARNT, which is also known as hypoxia inducible factor-1β (HIF-1β). Hypoxia inducible factors have several well-defined roles in angiogenesis and vascular remodeling processes.
  • PKP4 (SEQ ID NO: 16; also known as p0071 and plakophilin 4; NCBI Accession No. BC048013.1; Gene Map Locus 2q23-q3; molecular weight 1134.3 kDa) is a member of the armadillo gene family and functions as part of the junctional plaque, which serves to cluster cadherins that in turn mediate cell to cell contact (Calkins et al. J. Biol. Chem. 278(3):1774-1783, 2003; Setzer et al. J. Invest. Dermatol. 123(3):426-433, 2004).
  • Cadherins have been described extensively in vascular homeostasis, atherogenesis and vascular remodeling (Tzima et al. Nature 437(7057):426-431, 2005; Lambeng et al. Circ. Res. 96(3):384-391, 2005; Carmeliet et al. Cell 98(2):147-157, 1999; Freiman et al. Science 293(5537):2084-2087, 2001). As disclosed herein, this gene shows increased expression in atherosclerosis (see Figure 2).
  • FLJ21986 (SEQ ID NO: 14) is annotated as a hypothetical protein. No prior functional data have been reported. The disclosure provided herein of vascular expression of FLJ21986 in human coronary arteries is the first specific report of expression in human tissues.
  • TAF4B (SEQ ID NO: 17; also known as TAFII105; NCBI Accession No. Y09321.1; Gene Map Locus 7q31.32; predicted molecular weight 117,595 Da) is a TATA box binding protein (TBP)-associated factor.
  • TAF4B is a cell type-specific subunit of the transcription factor TFIID that is known to be expressed in gonadal tissues and B cells. It is thought to mediate transcription of a subset of genes required for folliculogenesis in the ovary, as well as being involved in the regulation of spermatogonial stem cell specification and proliferation (Falender et al. Genes Dev. 19:794-803, 2005).
  • EPHB1 (SEQ ID NO: 60; also known as ephrin receptor type B1, tyrosine-protein kinase receptor EPH-2, ELK and HEK6; NCBI Accession No. NM_004441, deposited May 9, 1999; Gene Map Locus 3q22.1; predicted molecular weight 109,885 Da) is a type I membrane protein.
  • EPHB1 functions as the receptor for members of the ephrin-B family (ephrin-B1, -B2 and -B3). It is thought to be involved in cell-cell interactions in the nervous system. The ligand-activated form interacts with GRB2, GRB10 and NCK through their respective SH2 domains.
  • ST18 (SEQ ID NO: 61; also known as suppression of tumorigenicity protein 18, zinc finger protein 387 and KIAA0535; NCBI Accession No. AB011107, deposited April 10, 1998; Gene Map Locus 8q11.23; predicted molecular weight 115,155 Da) is a breast cancer tumor suppressor gene encoding a zinc-finger DNA-binding protein with six fingers of the C2HC type.
  • ISR in-stent restenosis
  • Serum or other blood fractions can be prepared in the conventional manner. For example, about 200 µL of serum can be used for the extraction of DNA for use in amplification reactions.
  • the sample can be used directly, concentrated (for example by centrifugation or filtration), purified, or any combination thereof, and an amplification reaction optionally performed.
  • rapid DNA preparation can be performed using a commercially available kit (such as the InstaGene Matrix, BioRad, Hercules, CA; the NucliSens isolation kit, Organon Teknika, Netherlands).
  • the DNA preparation method yields a nucleotide preparation that is accessible to, and amenable to, nucleic acid amplification.
  • variant elements including SNPs and haplotypes
  • SNPs and haplotypes are useful as markers, for instance to identify genetic material as being derived from a particular individual or in making assessments regarding the propensity of an individual to develop a particular disorder or condition, the ability of an individual to respond to a certain course of treatment, or in other diagnostic, prognostic and other methods described in more detail herein.
  • nucleic acids such as genomic DNA, RNA, and cDNA
  • Genetic material suitable for use in such methods can be generated or derived from a variety of sources.
  • nucleic acid molecules preferably genomic DNA
  • Cells can be obtained from biological samples, for instance from tissue samples or from bodily fluid samples that include cells (such as blood, urine, semen, exudates or saliva).
  • Detection methods of the disclosure can be used to detect variant elements in DNA in a biological sample in intact cells (for instance, using in situ hybridization) or in extracted DNA (for instance, using Southern blot hybridization).
  • the nucleic acid samples obtained from the subject may be amplified from the clinical sample prior to detection.
  • DNA sequences are amplified.
  • RNA sequences are amplified.
  • Any nucleic acid amplification method can be used.
  • polymerase chain reaction PCR
  • TMA transcription-mediated amplification
  • PASA polymerase chain reaction of specific alleles
  • ligase chain reaction and nested polymerase chain reaction.
  • a pair of primers can be utilized in the amplification reaction.
  • One or both of the primers can be labeled, for example with a detectable radiolabel, fluorophore, or biotin molecule.
  • the pair of primers may include an upstream primer (which binds 5' to the downstream primer) and a downstream primer (which binds 3' to the upstream primer).
  • the pair of primers used in the amplification reaction can be selective primers which permit amplification of a nucleic acid involved in ISR.
  • primers can be included in the amplification reaction as an internal control.
  • these primers can be used to amplify a "housekeeping" nucleic acid molecule and serve to provide confirmation of appropriate amplification.
  • a target nucleic acid molecule including primer hybridization sites can be constructed and included in the amplification reaction.
  • Amplification products may be assayed in a variety of ways, including size analysis, restriction digestion followed by size analysis, detecting specific tagged oligonucleotide primers in the reaction products, allele-specific oligonucleotide (ASO) hybridization, sequencing, hybridization, and the like.
  • size analysis restriction digestion followed by size analysis
  • ASO allele-specific oligonucleotide
  • PCR-based detection assays include multiplex amplification of a plurality of polymorphisms simultaneously. For example, it is well known in the art to select PCR primers to generate PCR products that do not overlap in size and can be analyzed simultaneously. Alternatively, it is possible to amplify different polymorphisms with primers that are differentially labeled and thus can each be detected. Other techniques are known in the art to allow multiplex analyses of a plurality of polymorphisms. A fragment of a gene may be amplified to produce copies and it may be determined whether copies of the fragment contain the particular protective polymorphism or genotype.
  E. Detecting Single Nucleotide Alterations
  • Single nucleotide alterations can be detected by a variety of techniques in addition to merely sequencing the target sequence.
  • Constitutional single nucleotide alterations can arise either from new germline mutations, or can be inherited from a parent who possesses a SNP or mutation in their own germline DNA.
  • the techniques used in evaluating either somatic or germline single nucleotide alterations include hybridization using allele-specific oligonucleotides (ASOs) (Wallace et al, CSHL Symp. Quant. Biol. 57:257-261, 1986; Stoneking et al., Am. J. Hum. Genet.
  • ASOs allele-specific oligonucleotides
  • Allele-specific oligonucleotide hybridization involves hybridization of probes to the sequence, stringent washing and signal detection.
  • Other new methods include techniques that incorporate more robust scoring of hybridization. Examples of these procedures include the ligation chain reaction (ASOH plus selective ligation and amplification), as disclosed in Wu and Wallace (Genomics 4:560-569, 1989); mini-sequencing (ASOH plus a single base extension) as discussed in Syvanen (Meth. Mol. Biol. 95:291-298, 1998); and the use of DNA chips (miniaturized ASOH with multiple oligonucleotide arrays) as disclosed in Lipshutz et al. (BioTechniques 19:442-447, 1995).
  • ASOH with single- or dual-labeled probes can be merged with PCR, as in the 5'-exonuclease assay (Heid et al., Genome Res. 6:986-994, 1996), or with molecular beacons (as in Tyagi and Kramer, Nat. Biotechnol. 14:303-308, 1996).
  • DASH dynamic allele-specific hybridization
  • a target sequence is amplified by PCR in which one primer is biotinylated.
  • the biotinylated product strand is bound to a streptavidin-coated microtiter plate well, and the non-biotinylated strand is rinsed away with alkali wash solution.
  • An oligonucleotide probe specific for one allele is hybridized to the target at low temperature. This probe forms a duplex DNA region that interacts with a double strand- specific intercalating dye.
  • When subsequently excited, the dye emits fluorescence proportional to the amount of double-stranded DNA (probe-target duplex) present.
  • the sample is then steadily heated while fluorescence is continually monitored. A rapid fall in fluorescence indicates the denaturing temperature of the probe-target duplex.
  • T m melting temperature
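The melting step of DASH described above, in which a rapid fall in fluorescence marks the denaturing temperature of the probe-target duplex, can be illustrated with a short sketch. The function name and the finite-difference approach are assumptions made for this example; it simply reports the midpoint of the temperature interval in which fluorescence drops most steeply.

```python
def melting_temperature(temps, fluorescence):
    """Estimate Tm as the midpoint of the steepest fall in fluorescence.

    `temps` must be strictly increasing and the same length as
    `fluorescence`. Returns None if fluorescence never decreases.
    """
    if len(temps) != len(fluorescence) or len(temps) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    best_slope, best_tm = 0.0, None
    for i in range(1, len(temps)):
        dt = temps[i] - temps[i - 1]
        if dt <= 0:
            raise ValueError("temperatures must be strictly increasing")
        # Change in fluorescence per degree over this interval.
        slope = (fluorescence[i] - fluorescence[i - 1]) / dt
        if slope < best_slope:
            best_slope = slope
            best_tm = (temps[i] + temps[i - 1]) / 2
    return best_tm
```

In practice the readings would be denser and usually smoothed before differentiation; that preprocessing is omitted here.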
  • oligonucleotides can then be labeled radioactively with isotopes (such as 32P) or non-radioactively, with tags such as biotin (Ward and Langer et al., Proc. Natl. Acad. Sci. USA 78:6633-6657, 1981), and hybridized to individual DNA samples immobilized on membranes or other solid supports by dot-blot or transfer from gels after electrophoresis.
  • These specific sequences are visualized by methods such as autoradiography, fluorometric reactions (Landegren et al., Science 242:229-237, 1989) or colorimetric reactions (Gebeyehu et al., Nu
  • with an ASO specific for a normal allele, the absence of hybridization would indicate a mutation in the particular region of the gene, or a deleted gene.
  • if an ASO specific for a mutant allele hybridizes to a sample, that would indicate the presence of a mutation in the region defined by the ASO.
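The interpretation of ASO hybridization results described above can be summarized as a small decision function. Combining a normal-allele probe and a mutant-allele probe on the same sample to call zygosity is an assumption made for this sketch, as are the function name and the returned labels.

```python
def interpret_aso(normal_probe_hybridizes: bool, mutant_probe_hybridizes: bool) -> str:
    """Interpret paired ASO results for one sample at one site."""
    if normal_probe_hybridizes and mutant_probe_hybridizes:
        return "heterozygous: normal and mutant alleles present"
    if normal_probe_hybridizes:
        return "normal allele only"
    if mutant_probe_hybridizes:
        return "mutant allele only: mutation present in the region defined by the ASO"
    # Neither probe binds: the text notes that absence of the normal signal
    # can reflect a mutation in the region or a deleted gene; a failed assay
    # is an additional possibility assumed here.
    return "no signal: possible deletion, other mutation in the region, or failed assay"
```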
  • Additional methods include fluorescence polarization methods (see, for example, Kwok, Hum. Mutat., 19(4):315-23, 2002); microbead methods (such as those described by Oliphant et al. Biotechniques 2002:56-58, 60-61, 2001; and Shen et al., Genet. Eng. News, 23(6), 2003); and mass spectrometry methods (for example, see Jurinke et al., Methods Mol. Biol. 187:179-92, 2002; Amexis et al. Proc. Natl. Acad. Sci. U.S.A. 98(21):12097-102, 2001; Jurinke et al. Adv. Biochem.
  • the oligonucleotide ligation assay (OLA), as described by Nickerson et al. (Proc. Natl. Acad. Sci. U.S.A. 87:8923-8927, 1990), allows the differentiation between individuals who are homozygous versus heterozygous for alleles indicated herein.
  • OLA oligonucleotide ligation assay
  • Sections of non-coding nucleic acid identified herein, particularly those identified herein as including a variant can be tested for functionality or changes in functionality between two or more alleles.
  • segments of DNA can be amplified separately from individuals homozygous for risk alleles and from individuals homozygous for non-risk alleles.
  • Each segment is cloned upstream of a reporter gene (such as luciferase), the resulting constructs are transfected into various cell lines, such as endothelial cells or vascular smooth muscle cells, and the relative amounts of luciferase reporter expression are compared.
  • a reporter gene such as luciferase
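The reporter comparison described above can be sketched as a simple ratio of mean reporter readings between constructs carrying the risk allele and those carrying the non-risk allele. The function name and the use of raw replicate readings are assumptions for this illustration; a real assay would typically also normalize for transfection efficiency, which is not shown.

```python
from statistics import mean


def relative_reporter_expression(risk_readings, non_risk_readings):
    """Return mean risk-allele signal divided by mean non-risk-allele signal."""
    if not risk_readings or not non_risk_readings:
        raise ValueError("both constructs need at least one reading")
    baseline = mean(non_risk_readings)
    if baseline <= 0:
        raise ValueError("non-risk baseline must be positive")
    return mean(risk_readings) / baseline
```

A ratio well above or below 1 in a given cell line would suggest that the variant alters the regulatory activity of the cloned segment.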
  • Additional possible susceptibility SNPs in the region defined herein also can be identified. By way of example, this can be done by surveying public databases of SNPs, and by sequencing DNA from subjects developing ISR and from controls. These SNPs can then be tested for evidence of association with ISR by genotyping cases and controls, for instance using methods like those described herein. SNPs that show the strongest evidence for association may be better candidates for the causative SNP. This genotype data can also be used to test haplotypes for evidence of association with disease, to help determine whether as yet unidentified SNPs may be more strongly associated.
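Testing a candidate SNP for association by genotyping cases and controls, as described above, can be illustrated with an allele-count chi-square test. This is a minimal sketch rather than the statistical method of the disclosure: the function name and the 2x2 allele-count layout are assumptions, and the p-value uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x/2)).

```python
import math


def allelic_association(case_risk, case_other, control_risk, control_other):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (chi_square, p_value, odds_ratio).
    """
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column total must be non-zero")
    chi_square = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    odds_ratio = (a * d) / (b * c) if b * c else math.inf
    return chi_square, p_value, odds_ratio
```

For example, 60 risk and 40 other alleles in cases against 40 risk and 60 other alleles in controls gives a chi-square of 8.0 and an odds ratio of 2.25.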
  • the findings reported herein can be further strengthened by collecting and testing additional case-control samples for evidence of association of the identified SNPs and haplotype with ISR.
  • the locations of all the identified SNPs can be compared to segments of DNA conserved across species, because SNPs located in these segments are believed to be more likely to affect gene expression or function.
  • SNPs found to be linked to susceptibility to ISR affect the ability of protein(s) to bind to the surrounding segment of DNA.
  • Methods for determining binding are well known in the art, including, but not limited to, methods described herein.
  V. Representative Uses of SNPs and Haplotypes
  • the variants (including individual SNPs and haplotypes) described herein are useful as markers or indicators in a variety of different methods. They can be used, for instance, as diagnostic or predictive indicators to identify patients that are more or less likely to suffer restenosis or in-stent restenosis; as targets for therapeutic agents to reduce or prevent in-stent restenosis; in the development of drug-eluting stents based on compounds developed in light of the targets; and in monitoring clinical trials for the purposes of predicting outcomes of developing or ongoing therapeutic or treatment regimens.
  • results of such methods can be used to develop or recommend a course of prophylactic treatment for an individual who is identified as having a specific SNP or combination of SNPs (or a haplotype), to prescribe or develop a course of therapy after identification that a subject has or suffers from a disease or disorder, or to alter or adapt an ongoing therapeutic regimen.
  • the SNPs and/or haplotypes may also be used in risk stratification of patients. For example, patients may be made aware of their chance of developing ISR based on whether or not they have one or more risk SNPs or haplotypes. In addition, further research can be carried out to determine what additional steps might be taken to help prevent the development of ISR in those patients at increased risk.
  • Certain embodiments therefore include diagnostic methods for detecting one or more SNPs or a haplotype in a biological sample, to thereby determine whether a subject is at risk of developing a disorder or disease or condition linked to one or more of the SNPs or the haplotypes described herein, or whether the subject is afflicted with the disease, condition or disorder.
  • the subject methods also can be used to determine whether a subject is at risk for passing on the susceptibility to develop a disease, condition or disorder to their offspring.
  • prognostic, predictive methods for determining whether a subject is at risk of developing a disease, condition or disorder that affects endothelial cell or vascular smooth muscle cell proliferation or migration including for instance restenosis, such as ISR.
  • SNP sequences or haplotypes can be assayed in a biological sample from a subject.
  • Such assays can be used for prognostic, diagnostic, or predictive purposes to prophylactically or therapeutically treat an individual prior to or after the onset of a disorder, disease or condition (such as ISR) associated with one or more of the SNPs/haplotypes described herein, specifically those located at cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 and 18q22.3, such as those discussed herein.
  • a disorder, disease or condition such as ISR
  • nucleotide variants including individual SNPs and haplotypes
  • the nucleotide variants also can be used for generating polynucleotide reagents. Methods are also provided for identifying or screening for compounds useful for treating or influencing or preventing a disease, disorder or condition associated with a SNP or haplotype located at cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 and 18q22.3, such as those discussed herein.
  • proteins such as an ARNTL, NOV, PKP4, TAF4B or FLJ21986 variant protein
  • purified protein may be used for functional analyses, antibody production, diagnostics, and patient therapy. Studies such as these will improve understanding of the process of restenosis, which can then be translated into measures that improve patient care.
  • DNA sequences of the ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 variant cDNAs and regulatory regions, or gene or expressed sequence tag (EST) sequences contained within the genomic region described herein, can be manipulated in studies to understand the expression of the gene and the function of its product.
  • Variant or allelic forms of a human ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 gene may be isolated based upon information contained herein, and may be studied in order to detect alterations in expression patterns in terms of relative quantities, tissue specificity and functional properties of the encoded ARNTL, NOV, PKP4, TAF4B or FLJ21986 variant protein (such as influence on endothelial cell or vascular smooth muscle cell proliferation or migration).
  • Partial or full-length cDNA sequences that encode the subject protein may be ligated into bacterial expression vectors.
  • Methods for expressing large amounts of protein from a cloned gene introduced into Escherichia coli (E. coli) or more preferably baculovirus/Sf9 cells may be utilized for the purification, localization and functional analysis of proteins.
  • fusion proteins consisting of amino terminal peptides encoded by a portion of a gene native to the cell in which the protein is expressed (for example, an E. coli lacZ or trpE gene for bacterial expression) linked to a variant protein may be used to prepare polyclonal and monoclonal antibodies against these proteins. Thereafter, these antibodies may be used to purify proteins by immunoaffinity chromatography, in diagnostic assays to quantitate the levels of protein and to localize proteins in tissues and individual cells by immunofluorescence.
  • Intact native protein may also be produced in large amounts for functional studies. Methods and plasmid vectors for producing fusion proteins and intact native proteins in culture are well known in the art, and specific methods are described in Sambrook et al (In Molecular Cloning: A Laboratory Manual, Ch. 17, CSHL, New York, 1989). Such fusion proteins may be made in large amounts, are easy to purify, and can be used to elicit antibody response. Native proteins can be produced in bacteria by placing a strong, regulated promoter and an efficient ribosome-binding site upstream of the cloned gene. If low levels of protein are produced, additional steps may be taken to increase protein production; if high levels of protein are produced, purification is relatively easy. Suitable methods are presented in Sambrook et al.
  • Vectors suitable for the production of intact native proteins include pKC30 (Shimatake and Rosenberg, Nature 292:128, 1981), pKK177-3 (Amann and Brosius, Gene 40:183, 1985) and pET-3 (Studier and Moffatt, J. Mol. Biol. 189:113, 1986).
  • Fusion proteins may be isolated from protein gels, lyophilized, ground into a powder and used as an antigen.
  • the DNA sequence can also be transferred from its existing context to other cloning vehicles, such as other plasmids, bacteriophages, cosmids, animal viruses and yeast artificial chromosomes (YACs) (Burke et al., Science 236:806-812, 1987).
  • YACs yeast artificial chromosomes
  • vectors may then be introduced into a variety of hosts including somatic cells, and simple or complex organisms, such as bacteria, fungi (Timberlake and Marshall, Science 244:1313-1317, 1989), invertebrates, plants (Gasser and Fraley, Science 244:1293, 1989), and animals (Pursel et al., Science 244:1281-1288, 1989), in which cells or organisms are rendered transgenic by the introduction of the heterologous cDNA.
  • the cDNA sequence may be ligated to heterologous promoters, such as the simian virus (SV) 40 promoter in the pSV2 vector (Mulligan and Berg, Proc. Natl. Acad. Sci. USA 78:2072-2076, 1981), and introduced into cells, such as monkey COS-1 cells (Gluzman, Cell 23:175-182, 1981), to achieve transient or long-term expression.
  • SV simian virus
  • the stable integration of the chimeric gene construct may be maintained in mammalian cells by biochemical selection, such as neomycin (Southern and Berg, J. Mol. Appl. Genet. 1:327-341, 1982) and mycophenolic acid (Mulligan and Berg, Proc. Natl. Acad. Sci. U.S.A. 78:2072-2076, 1981).
  • DNA sequences can be manipulated with standard procedures such as restriction enzyme digestion, fill-in with DNA polymerase, deletion by exonuclease, extension by terminal deoxynucleotide transferase, ligation of synthetic or cloned DNA sequences, site-directed sequence-alteration via single-stranded bacteriophage intermediate or with the use of specific oligonucleotides in combination with PCR or other in vitro amplification.
  • the cDNA sequence (or portions derived from it) or a minigene (a cDNA with an intron and its own promoter) may be introduced into eukaryotic expression vectors by conventional techniques. These vectors are designed to permit the transcription of the cDNA in eukaryotic cells by providing regulatory sequences that initiate and enhance the transcription of the cDNA and ensure its proper splicing and polyadenylation. Vectors containing the promoter and enhancer regions of the SV40 or long terminal repeat (LTR) of the Rous Sarcoma virus and polyadenylation and splicing signal from SV40 are readily available (Mulligan et al., Proc. Natl. Acad. Sci. U.S.A.
  • the level of expression of the cDNA can be manipulated with this type of vector, either by using promoters that have different activities (for example, the baculovirus pAC373 can express cDNAs at high levels in S. frugiperda cells (Summers and Smith, In Genetically Altered Viruses and the Environment, Fields et al.
  • the expression of the cDNA can be monitored in the recipient cells 24 to 72 hours after introduction (transient expression).
  • some vectors contain selectable markers such as the gpt (Mulligan and Berg, Proc. Natl. Acad. Sci. U.S.A. 78:2072-2076, 1981) or neo (Southern and Berg, J. Mol. Appl. Genet. 1:327-341, 1982) bacterial genes. These selectable markers permit selection of transfected cells that exhibit stable, long-term expression of the vectors (and therefore the cDNA).
  • the vectors can be maintained in the cells as episomal, freely replicating entities by using regulatory elements of viruses such as papilloma (Sarver et al., Mol. Cell Biol.
  • the transfer of DNA into eukaryotic, in particular human or other mammalian cells is now a conventional technique.
  • the vectors are introduced into the recipient cells as pure DNA, for example, by transfection using precipitation with calcium phosphate (Graham and van der Eb, Virology 52:466, 1973) or strontium phosphate (Brash et al., Mol. Cell Biol. 7:2013, 1987); electroporation (Neumann et al., EMBO J. 1:841, 1982); lipofection (Felgner et al., Proc. Natl. Acad. Sci. U.S.A. 84:7413, 1987); DEAE dextran (McCutchan et al., J. Natl. Cancer Inst.
  • the cDNA, or fragments thereof can be introduced by infection with virus vectors.
  • Systems are developed that use, for example, retroviruses (Bernstein et al, Gen. Engr'g 7:235, 1985), adenoviruses (Ahmad et al, J. Virol. 57:267, 1986), or Herpes virus (Spaete et al, Cell 30:295, 1982).
  • Protein encoding sequences can also be delivered to target cells in vitro via non-infectious systems, for instance liposomes.
  • eukaryotic expression systems can be used for studies of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 variant encoding nucleic acids and mutant forms of these molecules, ARNTL, NOV, PKP4, TAF4B or FLJ21986 variant proteins and mutant forms of these proteins, as well as altered regulatory sequences of these genes or variants of the other genes or ESTs or other sequence located in the region of cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 and 18q22.3, discussed herein.
  • the eukaryotic expression systems may also be used to study the function of the normal complete protein, specific portions of the protein, or of naturally occurring or artificially produced mutant proteins, as well as regulatory regions.
  • the expression vectors containing an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 sequence or cDNA (or a sequence or cDNA corresponding to a gene or EST or other sequence located in the region at cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 and 18q22.3, described herein), or fragments or variants or mutants thereof, can be introduced into human cells, mammalian cells from other species or non-mammalian cells as desired.
  • monkey COS cells (Gluzman, Cell 23:175-182, 1981)
  • monkey COS cells that produce high levels of the SV40 T antigen and permit the replication of vectors containing the SV40 origin of replication
  • Chinese hamster ovary CHO
  • mouse NIH 3T3 fibroblasts or human fibroblasts or lymphoblasts may be used.
  • the present disclosure thus encompasses recombinant vectors that comprise all or part of an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 variant gene or cDNA sequences, or a regulatory sequence thereof, for expression in a suitable host.
  • the DNA is operatively linked in the vector to an expression control sequence in the recombinant DNA molecule so that a polypeptide can be expressed, or the regulatory sequence is operatively linked to a reporter gene.
  • the expression control sequence may be selected from the group consisting of sequences that control the expression of genes of prokaryotic or eukaryotic cells and their viruses and combinations thereof.
  • the expression control sequence may be specifically selected from the group consisting of the lac system, the trp system, the tac system, the trc system, major operator and promoter regions of phage lambda, the control region of fd coat protein, the early and late promoters of SV40, promoters derived from polyoma, adenovirus, retrovirus, baculovirus and simian virus, the promoter for 3-phosphoglycerate kinase, the promoters of yeast acid phosphatase, the promoter of the yeast alpha-mating factors and combinations thereof.
  • the host cell, which may be transfected with the vector of this disclosure, may be selected from the group consisting of E. coli, Pseudomonas, Bacillus subtilis, Bacillus stearothermophilus or other bacilli; other bacteria; yeast; fungi; insect; mouse or other animal; or plant hosts; or human tissue cells. It is appreciated that for mutant or variant ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 DNA sequences, similar systems are employed to express and produce the mutant product.
  • fragments of an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein can be expressed essentially as detailed above. Such fragments include individual ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein domains or sub-domains, as well as shorter fragments such as peptides. Protein fragments having therapeutic properties may be expressed in this manner also, including for instance substantially soluble fragments.
  • Monoclonal or polyclonal antibodies may be produced to either a wildtype or reference protein or specific allelic forms of these proteins, for instance particular portions that contain a differential amino acid encoded by a SNP and therefore may provide a distinguishing epitope, for instance antibodies produced to an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein or peptide.
  • an antibody generated to a specified target protein or a fragment thereof would recognize and bind that protein and would not substantially recognize or bind to other proteins found in target cells, for instance human cells.
  • an antibody is specific for (or measurably preferentially binds to) an epitope in a variant protein (such as an allele of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 as described herein) versus the reference protein, or vice versa.
  • The determination that an antibody specifically detects a target protein or form of the target protein is made by any one of a number of standard immunoassay methods; for instance, the western blotting technique (Sambrook et al., In Molecular Cloning: A Laboratory Manual, CSHL, New York, 1989).
  • a given antibody preparation such as one produced in a mouse
  • total cellular protein is extracted from human cells (for example, lymphocytes) and electrophoresed on a sodium dodecyl sulfate-polyacrylamide gel.
  • the proteins are then transferred to a membrane (for example, nitrocellulose) and the antibody preparation is incubated with the membrane.
  • an anti-mouse antibody conjugated to an enzyme such as alkaline phosphatase.
  • an enzyme such as alkaline phosphatase.
  • an alkaline phosphatase substrate 5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium results in the production of a dense blue compound by immunolocalized alkaline phosphatase.
  • Antibodies that specifically detect the target protein will, by this technique, be shown to bind to the target protein band (which will be localized at a given position on the gel determined by its molecular weight). Non-specific binding of the antibody to other proteins may occur and may be detectable as a weak signal on the Western blot. The non-specific nature of this binding will be recognized by one skilled in the art by the weak signal obtained on the Western blot relative to the strong primary signal arising from the specific antibody-target protein binding.
  • Substantially pure ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein or protein fragment (peptide) suitable for use as an immunogen may be isolated from the transfected or transformed cells as described above. Concentration of protein or peptide in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms per milliliter. Monoclonal or polyclonal antibody to the protein can then be prepared as follows:
  A. Monoclonal Antibody Production by Hybridoma Fusion
  • Monoclonal antibody to epitopes of the target protein identified and isolated as described can be prepared from murine hybridomas according to the classical method of Kohler and Milstein ⁇ Nature 256:495-497, 1975) or derivative methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein over a period of a few weeks. The mouse is then sacrificed, and the antibody-producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess un-fused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media).
  • the successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued.
  • Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall (Meth. Enzymol. 70:419-439, 1980), and derivative methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Harlow and Lane (Antibodies, A Laboratory Manual, CSHL, New York, 1988).
  • Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by immunizing suitable animals with the expressed protein, which can be unmodified or modified to enhance immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and the host species. For example, small molecules tend to be less immunogenic than others and may require the use of carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with either inadequate or excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis et al. (J. Clin. Endocrinol. Metab. 33:988-991, 1971).
  • Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony et al. (In Handbook of Experimental Immunology, Weir, D. (ed.) chapter 19. Blackwell, 1973). Plateau concentration of antibody is usually in the range of about 0.1 to 0.2 mg/ml of serum (about 12 μM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, for example, by Fisher (Manual of Clinical Immunology, Ch. 42, 1980).
  • a third approach to raising antibodies against a specific protein or peptide is to use one or more synthetic peptides synthesized on a commercially available peptide synthesizer based upon the predicted amino acid sequence of the protein or peptide.
  • Polyclonal antibodies can be generated by injecting these peptides into, for instance, rabbits or mice.
  • Antibodies may be raised against proteins and peptides by subcutaneous injection of a DNA vector that expresses the desired protein or peptide, or a fragment thereof, into laboratory animals, such as mice. Delivery of the recombinant vector into the animals may be achieved using a hand-held form of the Biolistic system (Sanford et al., Particulate Sci. Technol. 5:27-37, 1987) as described by Tang et al. (Nature 356:152-154, 1992).
  • Expression vectors suitable for this purpose may include those that express a protein-encoding sequence (for instance, a protein encoding ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986) under the transcriptional control of either the human β-actin promoter or the cytomegalovirus (CMV) promoter.
  • Antibody preparations prepared according to these protocols are useful in quantitative immunoassays which determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively or qualitatively to identify the presence of antigen in a biological sample; or for immunolocalization of the specified protein.
  • antibodies such as ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986-specific monoclonal antibodies
  • Antibodies with a desired binding specificity can be commercially humanized (Scotgene, Scotland, UK; Oxford Molecular, Palo Alto, CA).
  • Antibodies can be produced that specifically recognize protein variants (and peptides derived therefrom).
  • production of antibodies (and fragments and engineered versions thereof) that recognize at least one variant protein with a higher affinity than they recognize a corresponding protein is beneficial, as the resultant antibodies can be used in analysis, diagnosis and treatment (for example, inhibition or enhancement of protein action, such as, for instance, inhibition or enhancement of a biological activity of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986), as well as in study and examination of the proteins themselves.
  • a peptide taken from a variation-specific region of the target protein includes any peptide (usually four or more amino acids in length) that overlaps with one or more of SNP-encoded variants in a coding sequence described herein. Longer peptides also can be used, and in some instances will produce a stronger or more reliable immunogenic response. Thus, it is contemplated in some embodiments that more than four amino acids are used to elicit the immune response, for instance, at least 5, at least 6, at least 8, at least 10, at least 12, at least 15, at least 18, at least 20, at least 25, or more, such as 30, 40, 50, or even longer peptides. Also, it will be understood by those of ordinary skill that it is beneficial in some instances to include adjuvants and other immune response enhancers, including passenger peptides or proteins when using peptides to induce an immune response for production of antibodies.
  • Embodiments are not limited to antibodies that recognize epitopes containing the actual mutation identified in each variant. Instead, it is contemplated that variant-specific antibodies also may each recognize an epitope located anywhere throughout the specified variant molecule, which epitopes are changed in conformation and/or availability because of the mutation. Antibodies directed to any of these variant-specific epitopes are also encompassed herein.
  • the following references provide descriptions of methods for making antibodies specific to mutant proteins: Hills et al. (Int. J. Cancer 63: 537-543, 1995); Reiter & Maihle (Nucleic Acids Res. 24: 4050-4056, 1996); Okamoto et al. (Br. J.
  • Similar methods can be employed to generate antibodies specific to specific protein variants, including variants of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 or another protein encoded by a gene or EST or other sequence in the region at cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 and 18q22.3 discussed herein.
  • methods for detecting a polymorphism in the ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 genes use the arrays disclosed herein.
  • Such arrays can include nucleic acid molecules.
  • the array includes nucleic acid oligonucleotide probes that can hybridize to polymorphic ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 gene sequences, such as those polymorphisms discussed herein.
  • Certain of such arrays (as well as the methods described herein) can include other polymorphisms associated with risk or protection from developing ISR, as well as other sequences, such as one or more probes that recognize one or more housekeeping genes.
  • ISR detection arrays are used to determine the genetic susceptibility of a subject to developing ISR.
  • a set of oligonucleotide probes is attached to the surface of a solid support for use in detection of a polymorphism in the ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 genes, such as those amplified nucleic acid sequences obtained from the subject.
  • an oligonucleotide probe can be included to detect the presence of this amplified nucleic acid molecule.
  • the oligonucleotide probes bound to the array can specifically bind sequences amplified in an amplification reaction (such as under high stringency conditions). Oligonucleotides comprising at least 15, 20, 25, 30, 35, 40 or more consecutive nucleotides of the ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 genes may be used.
  • oligonucleotides form base-paired duplexes with nucleic acid molecules that have a complementary base sequence.
  • the stability of the duplex is dependent on a number of factors, including the length of the oligonucleotides, the base composition, and the composition of the solution in which hybridization is effected.
  • the effects of base composition on duplex stability may be reduced by carrying out the hybridization in particular solutions, for example in the presence of high concentrations of tertiary or quaternary amines.
  • the thermal stability of the duplex is also dependent on the degree of sequence similarity between the sequences.
  • each oligonucleotide sequence employed in the array can be selected to optimize binding of target ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 nucleic acid sequences.
  • An optimum length for use with a particular ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 nucleic acid sequence under specific screening conditions can be determined empirically.
  • the length for each individual element of the set of oligonucleotide sequences included in the array can be optimized for screening.
  • oligonucleotide probes are from about 20 to about 35 nucleotides in length or about 25 to about 40 nucleotides in length.
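  • As a rough illustration of how probe length and base composition bear on duplex stability, the following minimal Python sketch estimates a melting temperature for candidate probes and checks them against the about 20 to about 35 nucleotide range given above. The formula used (64.9 + 41 × (G + C − 16.4) / N) is a common textbook rule of thumb and is an assumption of this sketch, not a method taken from this disclosure; it ignores salt concentration, tertiary or quaternary amines, and mismatches, so an empirically determined optimum may differ. The function names are hypothetical.

```python
def estimate_tm(probe: str) -> float:
    """Rough melting temperature (deg C) from length and G+C count.

    Assumption: the rule-of-thumb formula 64.9 + 41 * (G + C - 16.4) / N,
    which is not taken from the disclosure and ignores buffer composition.
    """
    probe = probe.upper()
    if not probe:
        raise ValueError("probe must not be empty")
    gc = probe.count("G") + probe.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(probe)


def in_length_range(probe: str, shortest: int = 20, longest: int = 35) -> bool:
    """Check a probe against the 'about 20 to about 35 nucleotides' range."""
    return shortest <= len(probe) <= longest


if __name__ == "__main__":
    # Placeholder sequences for illustration only; not real gene sequences.
    for candidate in ("GC" * 10, "AT" * 10, "ACGT" * 6):
        print(candidate, in_length_range(candidate), round(estimate_tm(candidate), 2))
```

    A GC-rich 20-mer and an AT-rich 20-mer of the same length give very different estimates, which is the base-composition effect that hybridization in amine-containing solutions is described as reducing.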
  • the oligonucleotide probe sequences forming the array can be directly linked to the support, for example via the 5'- or 3'-end of the probe.
  • the oligonucleotides are bound to the solid support by the 5' end.
  • one of skill in the art can determine whether the use of the 3' end or the 5' end of the oligonucleotide is suitable for bonding to the solid support.
  • the internal complementarity of an oligonucleotide probe in the region of the 3' end and the 5' end determines binding to the support.
  • oligonucleotide probes can be attached to the support by non-ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 sequences such as oligonucleotides or other molecules that serve as spacers or linkers to the solid support.
  • an array includes protein sequences, which include at least one
  • proteins or antibodies forming the array can be directly linked to the support.
  • the proteins or antibodies can be attached to the support by spacers or linkers to the solid support.
  • Abnormalities in ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and FLJ21986 proteins can be detected using, for instance, an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and/or FLJ21986 protein-specific binding agent, which in some instances will be detectably labeled.
  • detecting an abnormality includes contacting a sample from the subject with an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and/or FLJ21986 protein-specific binding agent; and detecting whether the binding agent is bound by the sample and thereby measuring the levels of the ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and/or FLJ21986 protein present in the sample, in which a difference in the level of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and/or FLJ21986 protein in the sample, relative to the level of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and/or FLJ21986 protein found in an analogous sample from a subject not predisposed to developing ISR, or a standard ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and/or FLJ21986 protein level in analogous samples from a
  • the microarray material is formed from glass (silicon dioxide).
  • Suitable silicon dioxide types for the solid support include, but are not limited to: aluminosilicate, borosilicate, silica, soda lime, zinc titania and fused silica (for example see Schena, Microarray Analysis. John Wiley & Sons, Inc, Hoboken, New Jersey, 2003).
  • the attachment of nucleic acids to the surface of the glass can be achieved by methods known in the art, for example by surface treatments that form from an organic polymer.
  • Particular examples include, but are not limited to: polypropylene, polyethylene, polybutylene, polyisobutylene, polybutadiene, polyisoprene, polyvinylpyrrolidine, polytetrafluoroethylene, polyvinylidene difluoride, polyfluoroethylene-propylene, polyethylenevinyl alcohol, polymethylpentene, polychlorotrifluoroethylene, polysulfones, hydroxylated biaxially oriented polypropylene, aminated biaxially oriented polypropylene, thiolated biaxially oriented polypropylene, ethyleneacrylic acid, ethylene methacrylic acid, and blends of copolymers thereof (see U.S. Patent No. 5,985,567, herein incorporated by reference), organosilane compounds that provide chemically active amine or aldehyde groups, epoxy or polylysine treatment of the microarray.
  • a solid support surface is polypropylene.
  • suitable characteristics of the material that can be used to form the solid support surface include: being amenable to surface activation such that upon activation, the surface of the support is capable of covalently attaching a biomolecule such as an oligonucleotide thereto; amenability to "in situ" synthesis of biomolecules; being chemically inert such that the areas on the support not occupied by the oligonucleotides are not amenable to non-specific binding, or when non-specific binding occurs, such materials can be readily removed from the surface without removing the oligonucleotides.
  • the surface treatment is amine-containing silane derivatives. Attachment of nucleic acids to an amine surface occurs via interactions between negatively charged phosphate groups on the DNA backbone and positively charged amino groups (Schena, Microarray Analysis. John Wiley & Sons, Inc, Hoboken, New Jersey, 2003).
  • reactive aldehyde groups are used as surface treatment. Attachment to the aldehyde surface is achieved by the addition of a 5'-amine group or amino linker to the DNA of interest. Binding occurs when the nonbonding electron pair on the amine linker acts as a nucleophile that attacks the electropositive carbon atom of the aldehyde group.
  • a wide variety of array formats can be employed in accordance with the present disclosure.
  • One example includes a linear array of oligonucleotide bands, generally referred to in the art as a dipstick.
  • Another suitable format includes a two-dimensional pattern of discrete cells (such as 4096 squares in a 64 by 64 array).
  • other array formats including, but not limited to slot (rectangular) and circular arrays are equally suitable for use (see U.S. Patent No. 5,981,185, herein incorporated by reference).
  • the array is formed on a polymer medium, which is a thread, membrane or film.
  • An example of an organic polymer medium is a polypropylene sheet having a thickness on the order of about 1 mil (0.001 inch) to about 20 mil, although the thickness of the film is not critical and can be varied over a fairly broad range.
  • Particularly disclosed for preparation of arrays are biaxially oriented polypropylene (BOPP) films; in addition to their durability, BOPP films exhibit a low background fluorescence.
  • the array is a solid phase, Allele-Specific Oligonucleotides (ASO) based nucleic acid array.
  • the array formats of the present disclosure can be included in a variety of different types of formats.
  • a "format" includes any format to which the solid support can be affixed, such as microtiter plates, test tubes, inorganic sheets, dipsticks, and the like.
  • the solid support is a polypropylene thread
  • one or more polypropylene threads can be affixed to a plastic dipstick-type device
  • polypropylene membranes can be affixed to glass slides.
  • the particular format is, in and of itself, unimportant. All that is necessary is that the solid support can be affixed thereto without affecting the functional behavior of the solid support or any biopolymer absorbed thereon, and that the format (such as the dipstick or slide) is stable to any materials into which the device is introduced (such as clinical samples and hybridization solutions).
  • the arrays of the present disclosure can be prepared by a variety of approaches.
  • oligonucleotide or protein sequences are synthesized separately and then attached to a solid support (see U.S. Patent No. 6,013,789, herein incorporated by reference).
  • sequences are synthesized directly onto the support to provide the desired array (see U.S. Patent No. 5,554,501, herein incorporated by reference).
  • Suitable methods for covalently coupling oligonucleotides and proteins to a solid support and for directly synthesizing the oligonucleotides or proteins onto the support are known to those working in the field; a summary of suitable methods can be found in Matson et al., Anal. Biochem.
  • the oligonucleotides are synthesized onto the support using conventional chemical techniques for preparing oligonucleotides on solid supports (see, for example, PCT Publication Nos. WO 85/01051 and WO 89/10977, or U.S. Patent No. 5,554,501, each of which is herein incorporated by reference).
  • a suitable array can be produced using automated means to synthesize oligonucleotides in the cells of the array by laying down the precursors for the four bases in a predetermined pattern.
  • a multiple-channel automated chemical delivery system is employed to create oligonucleotide probe populations in parallel rows (corresponding in number to the number of channels in the delivery system) across the substrate.
  • the substrate can then be rotated by 90° to permit synthesis to proceed within a second (2°) set of rows that are now perpendicular to the first set. This process creates a multiple-channel array whose intersection generates a plurality of discrete cells.
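  • The geometry of the two perpendicular synthesis passes can be sketched in a few lines of Python. This is only an illustration of how the intersections of two sets of rows define discrete, addressable cells; the function names and the zero-based addressing scheme are assumptions of this sketch and do not describe any particular delivery system.

```python
from itertools import product


def build_cells(n_channels: int) -> list[tuple[int, int]]:
    """Enumerate the cells formed where first-pass rows cross second-pass rows.

    Assumption: both passes use the same number of channels, so an
    n-channel system yields n * n discrete cells.
    """
    first_pass = range(n_channels)   # rows laid down before the rotation
    second_pass = range(n_channels)  # rows laid down after the 90 degree rotation
    return list(product(first_pass, second_pass))


def cell_index(row: int, col: int, n_channels: int) -> int:
    """Map a (row, col) address to a single zero-based cell number."""
    return row * n_channels + col


if __name__ == "__main__":
    cells = build_cells(64)
    print(len(cells))              # number of discrete cells for 64 channels
    print(cell_index(63, 63, 64))  # index of the last cell
```

    With 64 channels per pass the sketch yields 4096 cells, which matches the 64 by 64 two-dimensional format mentioned earlier.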
  • the oligonucleotide probes on the array include one or more labels that permit detection of oligonucleotide probe:target sequence hybridization complexes.
  • kits that can be used to determine whether a subject, such as an otherwise healthy human subject, is genetically predisposed to ISR. Such kits allow one to determine if a subject has one or more genetic mutations or polymorphisms in ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 gene sequences.
  • kits contain reagents useful for determining the presence or absence of at least one polymorphism in a subject's ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 genes, such as probes or primers that selectively hybridize to an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 polymorphic sequence identified herein.
  • Such kits can be used with the methods described herein to determine a subject's ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 genotype or haplotype.
  • Oligonucleotide probes and/or primers may be supplied in the form of a kit for use in detection of a specific ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 sequence, such as a SNP or haplotype described herein, in a subject.
  • an appropriate amount of one or more of the oligonucleotide primers is provided in one or more containers.
  • the oligonucleotide primers may be provided suspended in an aqueous solution or as a freeze-dried or lyophilized powder, for instance.
  • the container(s) in which the oligonucleotide(s) are supplied can be any conventional container that is capable of holding the supplied form, for instance, microfuge tubes, ampoules, or bottles.
  • pairs of primers may be provided in pre-measured single use amounts in individual, typically disposable, tubes or equivalent containers. With such an arrangement, the sample to be tested for the presence of an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 polymorphism can be added to the individual tubes and amplification carried out directly.
  • the amount of each oligonucleotide primer supplied in the kit can be any appropriate amount, depending for instance on the market to which the product is directed.
  • the amount of each oligonucleotide primer provided would likely be an amount sufficient to prime several PCR amplification reactions.
  • Those of ordinary skill in the art know the amount of oligonucleotide primer that is appropriate for use in a single amplification reaction. General guidelines may for instance be found in Innis et al. (PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc., San Diego, CA, 1990), Sambrook et al. (In Molecular Cloning: A Laboratory Manual. Cold Spring Harbor, New York, 1989), and Ausubel et al. (In Current Protocols in Molecular Biology. Greene Publ.
  • a kit may include more than two primers, in order to facilitate the in vitro amplification of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986-encoding sequences, for instance a specific target ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 gene or the 5' or 3' flanking region thereof.
  • kits may also include the reagents necessary to carry out nucleotide amplification reactions, including, for instance, DNA sample preparation reagents, appropriate buffers (such as polymerase buffer), salts (for example, magnesium chloride), and deoxyribonucleotides (dNTPs).
  • Kits may in addition include either labeled or unlabeled oligonucleotide probes for use in detection of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 polymorphisms or haplotypes.
  • these probes will be specific for a potential polymorphic site that may be present in the target amplified sequences.
  • the appropriate sequences for such a probe will be any sequence that includes one or more of the identified polymorphic sites, such that the sequence the probe is complementary to a polymorphic site and the surrounding ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or
  • such probes are of at least 6 nucleotides in length, and the polymorphic site occurs at any position within the length of the probe. It is often beneficial to use longer probes, in order to ensure specificity.
  • the probe is at least 8, at least 10, at least 12, at least 15, at least 20, at least 30 nucleotides or longer.
  • control sequences may comprise human (or non-human) ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 nucleic acid molecule(s) with known sequence at one or more target SNP positions, such as those described herein.
  • controls may also comprise non-ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 nucleic acid molecules.
  • kits may also include some or all of the reagents necessary to carry out RT-PCR in vitro amplification reactions, including, for instance, RNA sample preparation reagents (including for example, an RNase inhibitor), appropriate buffers (for example, polymerase buffer), salts (for example, magnesium chloride), and deoxyribonucleotides (dNTPs).
  • kits may in addition include either labeled or unlabeled oligonucleotide probes for use in detection of the in vitro amplified target sequences.
  • the appropriate sequences for such a probe will be any sequence that falls between the annealing sites of the two provided oligonucleotide primers, such that the sequence the probe is complementary to is amplified during the PCR reaction.
  • these probes will be specific for a potential polymorphism that may be present in the target amplified sequences.
  • control sequences for use in the RT-PCR reactions.
  • the design of appropriate positive control sequences is well known to one of ordinary skill in the appropriate art.
  • Kits for the detection or analysis of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein expression are also encompassed.
  • kits may include at least one target protein specific binding agent (for example, a polyclonal or monoclonal antibody or antibody fragment that specifically recognizes an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein, or a specific polymorphic form of an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein) and may include at least one control (such as a determined amount of target ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein, or a sample containing a determined amount of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein).
  • the ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986-protein specific binding agent and control may be contained in separate containers.
  • the antibodies may have the ability to distinguish between polymorphic forms of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein.
  • ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein or isoform expression detection kits may also include a means for detecting ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986:binding agent complexes, for instance the agent may be detectably labeled. If the detectable agent is not labeled, it may be detected by second antibodies or protein A, for example, which may also be provided in some kits in one or more separate containers. Such techniques are well known.
  • kits may include instructions for carrying out the assay. Instructions will allow the tester to determine ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 expression level. Reaction vessels and auxiliary reagents such as chromogens, buffers, enzymes, etc. may also be included in the kits.
  • the instructions can provide calibration curves or charts to compare with the determined (for example, experimentally measured) values.
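  • One way a tester might compare a measured value against a supplied calibration curve is by interpolating between the standards. The Python sketch below assumes the curve is given as (signal, concentration) pairs and uses simple linear interpolation between neighboring standards; the function name, the units, and the example values are hypothetical, and a real kit may specify a different curve model.

```python
from bisect import bisect_left


def interpolate_concentration(signal: float,
                              standards: list[tuple[float, float]]) -> float:
    """Estimate concentration from a measured signal.

    standards: (signal, concentration) pairs from a calibration chart.
    Assumption: piecewise-linear behavior between neighboring standards;
    values outside the calibrated range are rejected rather than extrapolated.
    """
    points = sorted(standards)
    signals = [s for s, _ in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal is outside the calibrated range")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return points[i][1]
    (s0, c0), (s1, c1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)


if __name__ == "__main__":
    # Hypothetical standards: (absorbance, concentration in ng/ml).
    curve = [(0.1, 0.0), (0.5, 10.0), (0.9, 20.0)]
    print(interpolate_concentration(0.3, curve))
```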
  • kits that allow differentiation between individuals who are homozygous versus heterozygous for specific SNPs (or haplotypes) of the ARNTL, NOV, PKP4, TAF4B or FLJ21986 genes as described herein.
  • kits provide the materials necessary to perform oligonucleotide ligation assays (OLA), as described in Nickerson et al., Proc. Natl. Acad. Sci. U.S.A. 87:8923-8927, 1990.
  • kits contain one or more microtiter plate assays, designed to detect polymorphism(s) in an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 sequence of a subject, as described herein. Instructions in these kits will allow the tester to determine whether a specified ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 allele is present, and whether it is homozygous or heterozygous. It may also be advantageous to provide in the kit one or more control sequences for use in the OLA reactions. The design of appropriate positive control sequences is well known to one of ordinary skill in the appropriate art.
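  • The homozygous versus heterozygous determination can be illustrated with a simple rule applied to two allele-specific signals, such as the readings from two wells of a microtiter plate. The Python sketch below is only one possible decision rule; the background cutoff and the minor-allele fraction threshold are hypothetical values chosen for illustration and would need to be set from the control sequences supplied with an actual assay.

```python
def call_genotype(signal_allele_1: float, signal_allele_2: float,
                  background: float = 0.1, het_fraction: float = 0.3) -> str:
    """Classify a sample from two allele-specific signals.

    Assumptions (illustrative only): signals at or below `background`
    are treated as noise, and a sample is called heterozygous when the
    weaker signal contributes at least `het_fraction` of the total.
    """
    a = max(signal_allele_1, 0.0)
    b = max(signal_allele_2, 0.0)
    total = a + b
    if total <= 0.0 or max(a, b) <= background:
        return "no call"
    if min(a, b) / total >= het_fraction:
        return "heterozygous"
    return "homozygous allele 1" if a > b else "homozygous allele 2"


if __name__ == "__main__":
    for pair in [(1.0, 0.05), (0.6, 0.5), (0.04, 0.9), (0.02, 0.03)]:
        print(pair, call_genotype(*pair))
```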
  • the kit may involve the use of a number of assay formats including those involving nucleic acid binding, such as binding to filters, beads, or microtiter plates and the like. Techniques may include dot blots, RNA blots, DNA blots, PCR, restriction fragment length polymorphism (RFLP), and the like.
  • Microarray-based kits are also provided. These microarray kits may be of use in genotyping analyses. In general, these kits include one or more oligonucleotides provided immobilized on a substrate, for example at an addressable location. The kit also includes instructions, usually written instructions, to assist the user in probing the array. Such instructions can optionally be provided on a computer readable medium.
  • Kits may additionally include one or more buffers for use during assay of the provided array.
  • buffers may include a low stringency wash, a high stringency wash, and/or a stripping solution. These buffers may be provided in bulk, where each container of buffer is large enough to hold sufficient buffer for several probing or washing or stripping procedures. Alternatively, the buffers can be provided in pre-measured aliquots, which would be tailored to the size and style of array included in the kit. Certain kits may also provide one or more containers in which to carry out array-probing reactions.
  • Kits may in addition include one or more containers of detector molecules, such as antibodies or probes (or mixtures of antibodies, mixtures of probes, or mixtures of the antibodies and probes), for detecting biomolecules captured on the array.
  • the kit may also include either labeled or unlabeled control probe molecules, to provide for internal tests of either the labeling procedure or probing of the array, or both.
  • the control probe molecules may be provided suspended in an aqueous solution or as a freeze-dried or lyophilized powder, for instance.
  • the container(s) in which the controls are supplied can be any conventional container that is capable of holding the supplied form, for instance, microfuge tubes, ampoules, or bottles.
  • control probes may be provided in pre-measured single use amounts in individual, typically disposable, tubes or equivalent containers.
  • the amount of each control probe supplied in the kit can be any particular amount, depending for instance on the market to which the product is directed. For instance, if the kit is adapted for research or clinical use, sufficient control probe(s) likely will be provided to perform several controlled analyses of the array. Likewise, where multiple control probes are provided in one kit, the specific probes provided will be tailored to the market and the accompanying kit.
  • a plurality of different control probes will be provided in a single kit, each control probe being from a different type of specimen found on an associated array (for example, in a kit that provides both eukaryotic and prokaryotic specimens, a prokaryote-specific control probe and a separate eukaryote-specific control probe may be provided).
  • kits may also include the reagents necessary to carry out one or more probe-labeling reactions.
  • the specific reagents included will be chosen in order to satisfy the end user's needs, depending on the type of probe molecule (for example, DNA or RNA) and the method of labeling (for example, radiolabel incorporated during probe synthesis, attachable fluorescent tag, etc.).
  • kits are provided for the labeling of probe molecules for use in assaying arrays provided herein. Such kits may optionally include an array to be assayed by the so labeled probe molecules.
  • a protein such as a reporter protein
  • Such mutants allow insight into the physiological and/or psychological role of this genomic region, and more particularly the role of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 (and/or another protein encoded by a gene or EST or other sequence in the region of cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 and 18q22.3, discussed herein) in a healthy and/or pathological organism.
  • mutant organisms are "genetically engineered," meaning that information in the form of nucleotides has been transferred into the mutant's genome at a location, or in a combination, in which it would not normally exist. Nucleotides transferred in this way are said to be "non-native." For example, a non-ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 promoter inserted upstream of an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986-encoding sequence would be non-native (as would an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 promoter inserted upstream of a non-ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 encoding sequence). An extra copy of a gene (or cDNA) on a plasmid, transformed into
  • Mutants may be, for example, produced from mammals, such as mice or rats, that either express, over-express, or under-express a specific allelic variant or haplotype or diplotype of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986, or that do not express ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 at all.
  • Over-expression mutants are made by increasing the number of specified genes in the organism, or by introducing a specific allele into the organism under the control of a constitutive or inducible or viral promoter such as the mouse mammary tumor virus (MMTV) promoter or the whey acidic protein (WAP) promoter or the metallothionein promoter.
  • Mutants that under-express a protein may be made by using an inducible or repressible promoter, or by deleting the target gene, or by destroying or limiting the function of the target gene, for instance by disrupting the gene by transposon insertion.
  • Antisense genes or molecules may be engineered into the organism, under a constitutive or inducible promoter, to decrease or prevent expression of a specific target gene (such as ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986), as known to those of ordinary skill in the art.
  • a mutant mouse over-expressing a heterologous protein (such as a variant ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein) may be made by constructing a plasmid having an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 allele encoding sequence driven by a promoter, such as the mouse mammary tumor virus (MMTV) promoter or the whey acidic protein (WAP) promoter.
  • expression of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 or another gene (such as a reporter gene) can be driven by regulatory sequences from ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986, including specifically regions developed based on the SNP-containing sequences described herein.
  • the oocytes are implanted into pseudopregnant females, and the litters are assayed for insertion of the transgene. Multiple strains containing the transgene are then available for study.
  • WAP is quite specific for mammary gland expression during lactation, and MMTV is expressed in a variety of tissues including mammary gland, salivary gland and lymphoid tissues. Many other promoters might be used to achieve various patterns of expression, such as the metallothionein promoter.
  • An inducible system may be created in which the subject expression construct is driven by a promoter regulated by an agent that can be fed to the mouse, such as tetracycline.
  • a mutant knockout animal for example, a mouse
  • a mutant knockout animal from which a specific gene is deleted can be made by removing all or some of the coding regions of the gene from embryonic stem cells.
  • the methods of creating deletion mutations by using a targeting vector have been described (Thomas and Capecchi, Cell 51:503-512, 1987).
  • In addition to knock-out systems, it is also beneficial to generate "knock-ins" that have lost expression of the native protein but have gained expression of a different, usually mutant or identified allelic form of the same protein, or expression of the native protein under control of an altered regulatory sequence.
  • ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 can be expressed in a knockout background in order to provide model systems for studying the effects of these mutants.
  • the resultant knock-in organisms provide systems for studying restenosis, particularly ISR, influence on endothelial or vascular smooth muscle cell proliferation, migration, and so forth.
  • knock-in organisms can be generated in which a reporter gene, or ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 itself, is expressed under the influence of a non-coding variant sequence described herein.
  • expression of the same protein is driven by risk alleles at SNPs compared to non-risk alleles.
  • Those of ordinary skill in the relevant art know methods of producing knock-in organisms. See, for instance, Rane et al. (Mol. Cell Biol. 22: 644-656, 2002); Sotillo et al. (EMBO J. 20: 6637-6647, 2001); Luo et al.
  • the following assays are designed to identify compounds that interact with (for example, bind to) a variant form of an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986; compounds that interact with (bind to) intracellular proteins that interact with such a variant form; compounds that interfere with the interaction of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 with transmembrane or intracellular proteins involved in signal transduction; and compounds which modulate the activity of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 (i.e., modulate the level of gene expression) or modulate the level of activity of a variant form of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986.
  • Assays may additionally be utilized which identify compounds which bind to ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 regulatory sequences (such as promoter sequences) and which may modulate gene expression (see, for example, Platt, J. Biol. Chem. 269:28558-28562, 1994).
  • these assays also can be used to identify compounds that interact in any of the ways listed above with another gene, regulatory sequence, gene corresponding with an EST, or protein encoded thereby, from cytoband 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 or 18q22.3, described herein as being linked to susceptibility to ISR.
  • the compounds which may be screened in accordance with the disclosure include, but are not limited to peptides, antibodies and fragments thereof, and other organic compounds (for example, peptidomimetics, small molecules) that bind to one or more variant sequences (including variant regulatory sequences or encoding sequences) as described herein and either mimic the activity triggered by the natural ligand (for example, agonists) or inhibit the activity triggered by the natural ligand (for example, antagonists); as well as peptides, antibodies or fragments thereof, and other organic compounds that mimic a variant (or a portion thereof) and bind to and "neutralize" the natural ligand.
  • Such compounds may include, but are not limited to, peptides such as, for example, soluble peptides, including, but not limited to members of random peptide libraries (see, for example, Lam et al., Nature 354:82-84, 1991; Houghten et al., Nature 354:84-86, 1991) and combinatorial chemistry-derived molecular libraries made of D- and/or L-configuration amino acids, phosphopeptides (including, but not limited to, members of random or partially degenerate, directed phosphopeptide libraries; see, for example, Songyang et al., Cell 72:767-778, 1993), antibodies (including, but not limited to, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab')2 and Fab expression library fragments, and epitope-binding fragments thereof), and small organic or inorganic molecules.
  • Other compounds which can be screened in accordance with the disclosure include, but are not limited to small organic molecules that are able to gain entry into an appropriate cell and affect the expression of an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 gene or some other gene involved in a related signal transduction pathway (such as by interacting with the regulatory region or transcription factors involved in gene expression); or such compounds that affect the activity of a variant ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 or the activity of some other intracellular factor involved in the signal transduction pathway.
  • Computer modeling and searching technologies permit identification of compounds, or the improvement of already identified compounds, that can modulate expression or activity of a variant target protein. Having identified such a compound or composition, the active/binding/effector sites or regions are identified. Such active sites typically might be ligand binding sites, such as the interaction domains of a molecule with a variant ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 itself or a sequence encoding the protein or regulating the expression thereof, or the interaction domains of a molecule with a specific allelic variant in comparison to the interaction domains of that molecule with another variant of the protein.
  • the active site can be identified using methods known in the art including, for example, from the amino acid sequences of peptides, from the nucleotide sequences of nucleic acids, or from study of complexes of the relevant compound or composition with its natural ligand. In the latter case, chemical methods can be used to find the active site by finding where on the factor the complexed ligand is found. Next, the three dimensional geometric structure of the active site is determined. This can be done by known methods that can determine a complete molecular structure. On the other hand, solid or liquid phase NMR can be used to determine certain intra-molecular distances. Any other experimental method of structure determination can be used to obtain partial or complete geometric structures, such as high resolution electron microscopy.
  • the geometric structures may be measured with a complexed ligand, natural or artificial, which may increase the accuracy of the active site structure determined.
  • the structure of the specified target protein is compared to that of a "variant" of the specified protein and, rather than solve the entire structure, the structure is solved for the protein domains that are changed. If an incomplete or insufficiently accurate structure is determined, the methods of computer based numerical modeling can be used to complete the structure or improve its accuracy. Any recognized modeling method may be used, including parameterized models specific to particular biopolymers such as proteins or nucleic acids, molecular dynamics models based on computing molecular motions, statistical mechanics models based on thermal ensembles, or combined models.
  • candidate modulating compounds can be identified by searching databases containing compounds along with information on their molecular structure. Such a search seeks compounds having structures that match the determined active site structure and that interact with the groups defining the active site. Such a search can be manual, but is preferably computer assisted. These compounds found from this search are potential ARNTL, NOV, PKP4, TAF4B, EPHBl, ST 18 or FLJ21986-modulating compounds.
  • these methods can be used to identify improved modulating compounds from an already known modulating compound or ligand.
  • the composition of the known compound can be modified and the structural effects of modification can be determined using the experimental and computer modeling methods described above applied to the new composition.
  • the altered structure is then compared to the active site structure of the compound to determine if an improved fit or interaction results. In this manner systematic variations in composition, such as by varying side groups, can be quickly evaluated to obtain modified modulating compounds or ligands of improved specificity or activity.
  • the structure of a specified protein or nucleic acid sequence, such as a regulatory sequence, is compared to that of a variant protein or sequence (encoded by a different allele of the same protein, or a variant non-coding nucleic acid sequence such as a regulatory sequence containing one or more SNPs). Then, potential inhibitors (or enhancers) are designed that bring about a structural change in the reference form so that it resembles the variant form. Or, potential mimics are designed that bring about a structural change in the variant form so that it resembles another variant form, or the form of the reference receptor.
  • the inhibitors, enhancers, or mimics may influence the binding of one or more other proteins to the nucleic acid sequence, for instance in a way that affects the transcription of an encoding sequence that is operably linked to that nucleic acid sequence.
  • CHARMM performs the energy minimization and molecular dynamics functions.
  • QUANTA performs the construction, graphic modeling and analysis of molecular structure. QUANTA allows interactive construction, modification, visualization, and analysis of the behavior of molecules with each other.
  • Compounds identified via assays such as those described herein may be useful, for example, in elaborating the biological function of a variant ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 gene product, and for designing therapeutic molecules useful in the diagnosis and/or treatment of restenosis, more specifically ISR, influence on endothelial or vascular smooth muscle cell proliferation or migration, and so forth.
  • In vitro systems may be designed to identify compounds capable of interacting with a variant protein or nucleic acid sequence identified by the SNPs described herein.
  • Compounds identified using such systems may be useful, for example, in modulating the activity of "wild type" (reference) and/or "variant" gene products (such as ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986); in elaborating the biological function of such proteins; in screens for identifying compounds that disrupt normal protein-protein or protein-nucleic acid interactions; or to study or characterize the regulation of gene expression, for instance expression of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 or a reporter protein linked to a regulatory sequence from ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 or another gene or EST or other sequence from cytoband 2q24.1, 7q31.31, 8q24.12, I
  • One type of assay that can be used to identify compounds that bind to a variant molecule involves preparing a reaction mixture of a variant molecule and a test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex which can be removed and/or detected in the reaction mixture.
  • the molecular species used can vary depending upon the goal of the screening assay.
  • the full length protein (for example, ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986)
  • a soluble truncated portion thereof or a fusion protein containing a variant peptide fused to a protein or polypeptide that affords advantages in the assay system (such as labeling, isolation of the resulting complex, etc.)
  • oligonucleotides corresponding to a variant sequence containing at least one SNP position as discussed herein
  • fusion nucleic acid molecules containing a variant sequence can be used.
  • the screening assays can be conducted in a variety of ways. For example, one method to conduct such an assay involves anchoring a variant molecule (such as a protein, polypeptide, peptide or fusion protein, or nucleic acid) or the test substance(s), onto a solid phase and detecting variant molecule/test compound complexes anchored on the solid phase at the end of the reaction.
  • the variant molecule(s) may be anchored onto a solid surface, and the test compound(s), which is not anchored, may be labeled, either directly or indirectly.
  • microtiter plates may conveniently be utilized as the solid phase.
  • the anchored component may be immobilized by non-covalent or covalent attachments.
  • Non-covalent attachment may be accomplished by simply coating the solid surface (or a portion thereof) with a solution containing the protein (or nucleic acid) and drying.
  • an immobilized specific binding agent such as an antibody, preferably a monoclonal antibody, specific for the protein to be immobilized may be used to anchor the protein to the solid surface.
  • the surfaces may be prepared in advance and stored. In order to conduct the assay, the nonimmobilized component is added to the coated surface containing the anchored component.
  • any complexes formed will remain immobilized on the solid surface.
  • the detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously nonimmobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously nonimmobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; for example, using a labeled antibody specific for the previously nonimmobilized component (the antibody, in turn, may be directly labeled or indirectly labeled with a labeled anti-Ig antibody).
  • a reaction can be conducted in a liquid phase, the reaction products separated from unreacted components, and complexes detected.
  • detection can involve using an immobilized binding agent specific for the variant molecule (such as an antibody or other binding agent specific for a variant protein, polypeptide, peptide or fusion protein (for instance, ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986)) or specific for the test compound, to anchor or capture any complexes formed in solution, and a labeled antibody (or other binding agent) specific for the other component of the possible complex to detect anchored complexes.
  • cell-based assays can be used to identify compounds that interact with a variant molecule.
  • cell lines that express a variant molecule such as a variant ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 encoding sequence or a regulatory sequence variant or other non-coding sequence variant (or combination of two or more variants) or cell lines (for example, COS cells, CHO cells, HEK293 cells, etc.) that have been genetically engineered to express a variant (for example, by transfection or transduction of protein encoding DNA) can be used.
  • Interaction of the test compound with, for example, a variant protein (such as ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986) expressed by the host cell, or a variant nucleic acid sequence present in the host cell can be determined by comparison or competition with a host cell not treated with the compound, or treated with another compound, or by examining one or more biological characteristics linked to the variant (such as endothelial or vascular smooth muscle cell proliferation or migration, for instance restenosis or ISR).
  • variant molecules such as a variant nucleic acid or polypeptide (such as those described herein) may be employed in a screening process for compounds which bind the variant molecule and which activate (agonists) or inhibit activation (antagonists) of the molecule or one linked thereto.
  • variant molecules described herein also may be used to assess the binding of small molecule substrates and ligands in, for example, cells, cell-free preparations, chemical libraries, and natural product mixtures.
  • substrates and ligands may be natural substrates and ligands or may be structural or functional mimetics (see Coligan et al. Current Protocols in Immunology 1 (2): Chapter 5, 1991).
  • such screening procedures involve providing appropriate cells that express a polypeptide of the present disclosure, or a reporter polypeptide operably linked to a non-coding variant nucleic acid found at cytoband 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 or 18q22.3, such as those discussed herein.
  • Such cells include cells from mammals, insects, yeast, and bacteria.
  • a polynucleotide regulatory sequence or polynucleotide encoding the polypeptide is employed to transfect cells to thereby express a variant molecule.
  • the cell expressing the variant polypeptide or variant nucleic acid is then contacted with a test compound to observe binding, stimulation or inhibition of a functional response.
  • the technique may also be employed for screening of compounds which activate a molecule of the present disclosure by contacting such cells with compounds to be screened and determining whether such compound generates a signal, i.e., activates the polypeptide or reporter polypeptide.
  • Another method involves screening for compounds which are antagonists, and thus inhibit activation of a molecule of the present disclosure by determining inhibition of binding of labeled ligand, such as a factor that binds to a nucleic acid of the disclosure, to cells expressing the variant molecule or a reporter gene operably linked to a non-coding nucleic acid (such as a regulatory region).
  • Such a method involves transfecting a eukaryotic cell with a DNA encoding a variant molecule such that the cell expresses the molecule (or expresses a reporter gene under the control of a non-coding region containing a variant SNP or haplotype as described herein).
  • the cell is then contacted with a potential antagonist in the presence of a labeled form of a ligand or binding factor.
  • the ligand/factor can be labeled, for example, with radioactivity.
  • the amount of labeled ligand/factor bound to the variant molecule is measured, for example, by measuring radioactivity associated with transfected cells or with a membrane or other fraction from these cells. If the compound binds to the variant molecule, the binding of labeled ligand/factor to the variant is inhibited, as determined by a reduction in the amount of labeled ligand/factor that binds.
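The readout described above, a reduction in bound labeled ligand/factor when a test compound is present, is commonly summarized as a percent inhibition. The following is a minimal sketch only: the function name, the use of raw counts, and the optional nonspecific-binding correction are illustrative assumptions and are not taken from the disclosure.

```python
def percent_inhibition(bound_with_compound: float,
                       bound_without_compound: float,
                       nonspecific: float = 0.0) -> float:
    """Percent reduction in specifically bound labeled ligand.

    All inputs are in the same arbitrary units (for example, counts per minute).
    `nonspecific` is an assumed background term; leave it at 0.0 if no
    background measurement is available.
    """
    specific_control = bound_without_compound - nonspecific
    if specific_control <= 0:
        raise ValueError("control binding must exceed nonspecific binding")
    specific_test = bound_with_compound - nonspecific
    return 100.0 * (1.0 - specific_test / specific_control)


# Hypothetical counts: 1100 without compound, 600 with compound, 100 background.
inhibition = percent_inhibition(600.0, 1100.0, nonspecific=100.0)
```

A value near zero would suggest that the compound does not compete with the labeled ligand/factor under the conditions tested.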
  • Therapeutic compounds and agents can be administered directly to the mammalian subject for modulation of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 activity or expression, or the activity or expression of another gene, EST, or protein encoded by a gene or EST found in cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 or 18q22.3, such as those discussed herein.
  • Administration is by any of the routes normally used for introducing a modulator compound into ultimate contact with the tissue to be treated.
  • the compounds or agents, alone or accompanied by one or more additional therapeutic agents, are administered in any suitable manner, optionally with pharmaceutically acceptable carrier(s). Suitable methods of administering such compounds/agents are available and well known to those of ordinary skill in the art, and, although more than one route can be used to administer a particular composition, a particular route can often provide a more immediate and more effective reaction than another route.
  • compositions of the present disclosure are determined in part by the particular composition being administered, as well as by the particular method used to administer the composition. Accordingly, there is a wide variety of suitable formulations of pharmaceutical compositions of the present disclosure (see, for example, Remington's Pharmaceutical
  • Formulations suitable for administration include aqueous and non-aqueous solutions, isotonic sterile solutions, which can contain antioxidants, buffers, bacteriostats, and solutes that render the formulation isotonic, and aqueous and non-aqueous sterile suspensions that can include suspending agents, solubilizers, thickening agents, stabilizers, and preservatives.
  • compositions can be administered, for example, orally, parenterally, intrathecally, and so forth.
  • the formulations of compounds can be presented in unit-dose or multi-dose sealed containers, such as ampoules and vials. Solutions and suspensions can be prepared from sterile powders, granules, and tablets of the kind previously described.
  • the compounds/agents also can be optionally administered as part of a prepared food or drug.
  • the dose administered to a subject should be sufficient to effect a beneficial response in the subject over time.
  • the dose will be determined by the efficacy of the particular compound/agent employed and the condition of the subject, as well as the body weight or surface area of the area to be treated, and whether the subject is being treated prophylactically or after the identification and diagnosis of a specific disease, condition, or disorder.
  • the size of the dose also may be influenced by the existence, nature, and extent of any adverse side-effects that accompany the administration of a particular compound in a particular subject.
  • a physician may evaluate circulating plasma levels of the modulator, modulator toxicities, and the production of anti-modulator antibodies.
  • the dose equivalent of a modulator is from about 1 ng/kg to 10 mg/kg for a typical subject.
  • therapeutic compounds of the present disclosure can be administered at a rate determined by the LD50 of the modulator, and the side effects of the inhibitor at various concentrations, as applied to the mass and overall health of the subject.
  • Administration can be accomplished via single or divided doses.
  • in-stent restenosis a case control association study was designed.
  • an association analysis was expected to be more powerful than linkage analysis for the detection of common disease alleles that confer modest disease risks for ISR (Risch and Merikangas Science 273 (5281): 1516-1517, 1996).
  • a genome-wide association approach was chosen, since the technologies are now available to assay SNPs at high density across the genome, and a truly unbiased approach would afford the best opportunity to identify genetic determinants of a complex disease in which the genetics are essentially unknown (Van Steen et al. Nat Genet. 37(7):683-691, 2005).
  • In-stent restenosis was clinically defined as any target vessel revascularization or the development of symptomatic ischemia in the target vessel territory (ischemic symptoms or a positive functional ischemia test), as previously described (Ganesh et al. Pharmacogenomics. 5(7):952-1004, 2004). Basic clinical parameters relevant to ISR outcomes were not different between the two groups (see Table 1).
  • TTRS time to restenosis
  • Genomic DNA was isolated from blood using commercially available methods to extract genomic DNA from leukocytes (Qiagen, Valencia, CA). Genomic DNA also was obtained from the Centre d'Etude du Polymorphisme Humain (CEPH) cohort, for use as a processing control and evaluation of genotype data reproducibility. Each individual was genotyped using the Affymetrix GeneChip Mapping 100K Set of microarrays, which consists of two chips (XbaI and HindIII) with approximately 50,000 SNPs on each chip. Genomic DNA (250 ng) was digested with two restriction enzymes and processed according to the Affymetrix protocol. Image analysis of each chip was obtained with a laser scanner and raw image data was processed using GeneChip™ DNA Analysis Software (GDAS; Affymetrix, Santa Clara, CA). The CEPH control sample was inserted in between patient samples and analyzed on 22 sets of chips. Using the default settings of GDAS to implement the dynamic model (DM) genotype-calling algorithm, the median call rate per chip was 98.47%. The GDAS algorithm allows for increased stringency of the genotype calls. For the analyses presented, the GDAS genotype calling p-value stringency was increased from the default value of 0.25 to 0.05 (see Table 2), and a call rate of 85% was needed on each chip from each patient for inclusion in the study.
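The per-chip inclusion rule (a call rate of at least 85%) can be sketched as follows. This is an illustration under stated assumptions, not the GDAS implementation: the data layout, the use of `None` for a no-call, and the function names are hypothetical.

```python
from typing import Dict, List, Optional

# Assumption: a genotype that could not be called is stored as None.
Genotype = Optional[str]


def call_rate(genotypes: List[Genotype]) -> float:
    """Fraction of SNPs on one chip that received a genotype call."""
    if not genotypes:
        return 0.0
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


def chips_passing(chips: Dict[str, List[Genotype]],
                  min_rate: float = 0.85) -> List[str]:
    """Return IDs of chips whose call rate meets the inclusion threshold."""
    return [chip_id for chip_id, calls in chips.items()
            if call_rate(calls) >= min_rate]


# Hypothetical toy data: chip_a has 3 of 4 calls, chip_b has 9 of 10.
example_chips = {
    "chip_a": ["AA", "AB", None, "BB"],
    "chip_b": ["AA"] * 9 + [None],
}
passing = chips_passing(example_chips)
```

Because each patient is typed on two chips, a patient-level filter would apply the same check to both chips and keep the patient only if both pass.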
  • non-Caucasian patients were excluded from analysis. This left 407 patients for analysis, with 150 cases (51 patients from case 1 and 99 patients from case 2) and 257 controls.
  • a secondary analysis was conducted using the newer Bayesian Robust Linear Modeling using Mahalanobis distance (BRLMM) algorithm (Rabbee and Speed, Bioinformatics 22(1):7-12, 2006) to call genotypes from the raw chip image data, which improves heterozygote call rates.
  • the median call rate using this algorithm was 99.19%, and all subsequent analysis steps were conducted using the same methods used in the primary analysis of the DM-based data set.
  • Quality control of the genotype data was performed in a step-wise manner. First, patients with a chip call rate <85% were excluded. Next, SNPs were removed from the analysis if the SNP was called in less than 25% of CEPH control replicate chips (22 sets of chips; 82 SNPs removed). To remove uninformative SNPs, SNPs with a minor allele frequency <0.8% in the cohort were removed (6963 SNPs), monomorphic SNPs (855 SNPs) were removed, and SNPs with no homozygotes in the cohort were removed (9 SNPs).
  • SNPs with a minor allele frequency <1% were removed (7816 SNPs) and SNPs on the X chromosome were excluded.
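The minor-allele-frequency filter in the quality-control steps can be illustrated with a short sketch. The function names and the genotype-count inputs are assumptions made for the example; the 1% default simply mirrors one of the thresholds mentioned above.

```python
def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Minor allele frequency from counts of the AA, AB and BB genotypes."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no called genotypes for this SNP")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)


def keep_snp(n_aa: int, n_ab: int, n_bb: int, min_maf: float = 0.01) -> bool:
    """Keep a SNP only if its minor allele frequency reaches the threshold.

    Monomorphic SNPs have a minor allele frequency of 0 and are dropped.
    """
    return minor_allele_frequency(n_aa, n_ab, n_bb) >= min_maf


# Hypothetical counts: 90 AA, 10 AB, 0 BB gives a minor allele frequency of 5%.
example_maf = minor_allele_frequency(90, 10, 0)
```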
  • 104,581 SNPs were available for analysis.
  • HWE Hardy-Weinberg equilibrium
  • Univariate tests of Allelic Association In the primary analysis, the 99,523 SNPs from 407 patients' chips were analyzed using genotypes determined by the DM algorithm. First, univariate tests of allelic association were conducted using a training/test set approach. The cohort was randomly divided into two subsets, and tests of allelic association of each SNP were conducted in the training subset. The test set was used as an independent validation set of findings identified in the training set. Association of each SNP with disease status in the training set was tested by constructing a 2x2 contingency table of allele frequencies and conducting a chi-squared test.
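The allelic test described above (a 2x2 table of allele counts by case/control status, followed by a chi-squared test) can be sketched in a few lines. This is an illustrative Pearson chi-squared test with one degree of freedom and no continuity correction; whether the original SAS analysis applied a correction is not stated, and the function name and inputs are hypothetical.

```python
import math
from typing import Tuple


def allelic_chi_squared(case_a: int, case_b: int,
                        control_a: int, control_b: int) -> Tuple[float, float]:
    """Pearson chi-squared statistic and p-value for a 2x2 allele-count table.

    Rows are cases and controls; columns are counts of allele A and allele B.
    """
    n = case_a + case_b + control_a + control_b
    row_case = case_a + case_b
    row_control = control_a + control_b
    col_a = case_a + control_a
    col_b = case_b + control_b
    if min(row_case, row_control, col_a, col_b) == 0:
        raise ValueError("a table margin is zero; the test is undefined")
    stat = (n * (case_a * control_b - case_b * control_a) ** 2
            / (row_case * row_control * col_a * col_b))
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical allele counts in a training subset.
stat, p_value = allelic_chi_squared(20, 10, 10, 20)
```

In a training/test design, the same function would be run on the training subset first, and SNPs of interest would then be re-tested in the held-out subset.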
  • a Cochran-Armitage trend test was also conducted using three genetic models: recessive, additive and dominant (Sasieni, Biometrics 53(4):1253-1261, 1997; Slager and Schaid, Hum. Hered. 52(3):149-153, 2001; Freidlin et al. Hum. Hered. 53(3):146-152, 2002).
  • the nominal p-value for multiple testing was corrected by applying the Bonferroni criterion.
  • 100,648 SNPs were available for analysis. Analyses were conducted using SAS (SAS, Inc., Cary, NC). Negative analyses were used to try to link SNP findings to neighboring genes.
  • Haplotypes were defined in these regions as described using a 4-gamete test using Haploview software (Simon et al. J. Clin. Invest. 105(3):293-300, 2000). For each region, the most probable haplotype pair per subject was estimated using Haplo.stats software (Sinnwell and Schaid, Haplo Stats, version 1.2.0, http://mayoresearch.mayo.edu/mayo/research/biostat/schaid.cfm). Haplotypes with a population frequency greater than 5%, as determined in the CardioGene cohort, were then considered for clustering.
  • haplotype cluster status (AA, AB or BB)
  • RNA extracted from human tissue (heart, lung, liver, skeletal muscle, brain, skin, carotid artery, and fetal aorta), as well as universal human RNA, was purchased from a commercial source (Stratagene, Inc., La Jolla, CA). Coronary artery specimens were obtained from freshly explanted hearts at the time of heart transplantation at the Johns Hopkins Hospital, under a protocol that was reviewed and approved by the hospital IRB to be exempt status research. Arterial samples were examined grossly for pathology, flash frozen and stored at -80°C until homogenization for RNA isolation.
  • PCR primers were generated using a web-based prediction algorithm (available on the internet at genome.wi.mit.edu/genome- softwaree/other/primer3.html) (Table 3).
  • PCR amplification reaction using DNA polymerase (HotStarTaq®, QIAgen, Valencia, CA) and the gene-specific primers (see Table 3). Reactions were carried out for 25-30 cycles of amplification, consisting of 1 minute strand separation at 95°C, 1 minute of annealing at 59°C (NOV) or 66°C (all other primers), and 2 minute extension at 72°C, with a final 10 minute extension at 72°C.
  • PCR products were run on a 1% agarose gel with ethidium bromide and the bands visualized and photographed under UV light with the UVP gel documentation system.
  • Two µl of cDNA for each gene was amplified in the presence of 300 nM forward and reverse primers, 50 nM probe, 200 µM dNTPs, 4 mM MgCl2 and 1.25 U of AmpliTaq™ Gold DNA polymerase in Taqman buffer A (Applied Biosystems) in a total volume of 50 µl. Samples were heated for 10 minutes at 95°C and amplified in 40 cycles of 15 seconds at 95°C and 60 seconds at 60°C. A positive control was amplified on each plate to verify the amplification efficiency within each experiment.
  • coronary artery specimens were either snap-frozen, for subsequent RNA isolation, or immediately fixed in 10% formalin overnight for immunohistochemical analysis.
  • Atherosclerotic tissues with calcification visible by flat plate x-ray were decalcified by immersion in 10% EDTA, pH 6.0 for 5 days. Matched unaffected coronary arteries were similarly fixed and stained concurrently.
  • Immunohistochemical staining was performed using anti-ARNTL antibody (rabbit polyclonal, Chemicon, Temecula, CA) diluted 1:200; anti-NOV antibody (goat polyclonal, R&D Systems, Tustin, CA) diluted 1:3; anti-PKP4 antibody (rabbit polyclonal) diluted 1:200; and anti-TAF4B rabbit antisera diluted 1:100.
  • Diaminobenzidine staining was performed according to standard protocols. As a negative control for staining with anti-ARNTL, NOV and PKP4 antibodies, the procedure was repeated with no primary antibody, as well as with an IgG control for the corresponding species in which the primary antibody was raised. Pre-immune rabbit sera were used for the corresponding control staining for TAF4B in each section stained.
  • haplotype analysis was performed mirroring the 2-group and 3-group design of statistical testing performed for allelic association of each SNP.
  • Haplotype analysis using a 2-group test of the assayed SNPs within a set distance of each SNP (4-gamete test, using Haploview) comparing all cases versus controls showed no significant haplotypes.
  • the two case groups were treated as non-ordered outcomes, and significant haplotypes, ranging from 2-6 SNPs, were identified around each of these SNPs.
  • none of these haplotype results included the SNPs identified in each region.
  • SNP on chromosome 7 SNP A-1745890 / rs958505
  • KCNV2 potassium channel, subfamily V, member 2
  • SNP A-1686530 / rs10515758 on chromosome 5 is 554 kb upstream from the nearest gene, EBF (early B-cell factor).
  • SNP A-1686530 / rs10515758 the highest confidence was in the association of SNP A-1686530 / rs10515758, given that it was identified using an analysis of independent testing in separate subsets of the cohort.
  • SNP A-1745890 / rs958505 also passed significance threshold after Bonferroni correction, but was not identified in any analysis in which the patient cohort was analyzed using a validation approach. Linking these two SNPs to surrounding genes was unsuccessful based on further analysis of genotype data.
  • Multivariate methods are expected to more accurately assess the genetics of complex diseases (Ritchie et al. Am. J. Hum. Genet. 69(1):138-147, 2001; Hoh and Ott. Hum. Hered. 50(1):85-89, 2000; Hoh and Ott. Nat. Rev. Genet. 4(9):701-709, 2003; Culverhouse et al. Genet. Epidemiol. 27(2):141-152, 2004). Multivariate approaches that considered SNP genotype patterns and SNP-SNP interactions (Halushka et al. Nat. Genet. 22(3):239-247, 1999) were considered, since univariate methods did not yield functional hypotheses.
  • the signals identified in the univariate analyses may be of insufficient strength to withstand a severe multiple testing correction such as the Bonferroni method, but may still be further evaluated for meaningful genotype patterns related to disease status. Furthermore, these patterns would assist in identifying candidate susceptibility regions for ISR. Multivariate approaches were first considered that address SNP-SNP interactions, as well as SNP genotype patterns (Glazier et al. Science 298(5602):2345-2349, 2002), using methodologies supported in the literature.
  • Regions for further analysis were selected if they contained more than one SNP within 250 kb of one another (see Table 5), allowing the first and last SNP within a contiguous region to define the initial boundaries for each "block."
  • the final block boundaries for further testing were defined by adding 3 kb to either end of the initial block defined by the physical positions of the SNPs identified by univariate analysis.
  • the final haplotypes were tested for association with restenosis, and after Bonferroni multiple testing correction was applied in a global manner to all 84 haplotypes tested across all regions, eight regions were identified containing haplotypes with significant association with ISR. Among these, five regions contained the genes NOV, ARNTL, PKP4, TAF4B and FLJ21986 (see Tables 5, 6, and 7).
  • Haplotype block is significant in the haplotype-based association analysis. The haplotype block is significant; however, there is no gene located in the block. Bold indicates that the gene is located in this haplotype block.
  • SNP does not comprise haplotype *SNP part of a significant haplotype block Bold: Significant but no SNP nearby
  • ARNTL and NOV staining is observed most specifically in the media of normal coronary artery, with less distinct staining for PKP4 and TAF4B (Figure 2B).
  • NOV is expressed in cellular regions of atherosclerotic plaque as well as the arterial media, as seen in the normal coronary artery (Figure 2B).
  • Movat pentachrome staining demonstrates heterogeneous staining within the neointimal lesion, with cellular regions as well as more fibrous and less cellular regions indicated as yellow staining.
  • positive staining is observed for ARNTL, NOV and PKP4 (Figure 2C).
  • time interval was examined between stent implantation and first presentation with evidence of clinical restenosis as a quantitative trait.
  • This time interval provides additional phenotypic characterization of each patient, as it is a marker of disease severity, with more severe exuberant vascular wound repair mechanisms operative in ISR occurring early after stent implantation.
  • the trait is independent of the binary ISR outcome and was analyzed in order to gain further understanding of allele copy number effects on the disease phenotype. Analysis of the time interval was first conducted as a continuous variable against each SNP in significant haplotypes in a case-only Cox regression analysis.
  • Haplotypes in the regions associated with ISR are more informative than any single SNP in explaining inter-individual variation in the time to development of ISR, supporting the need to consider multivariate approaches to this data.
  • Haplotype3 is excluded for case only analysis
  • Tables 18-29, shown below, provide the results of TTRS analysis based on haplotype clustering for the BRLMM results.
  • Bold SNPs (in the Haplotypes column) distinguish two clusters. N/A indicates an extra plot.
  • SNPs in each significant haplotype block were analyzed using the Haplo.score part of the Haplo.stats package (Schaid et al. Am. J. Hum. Genet. 70:425-434, 2002). Observed haplotypes in the CardioGene patient sample were reported as frequencies among each case group and the control group.
  • Table 30 provides a summary of the SNPs identified in each significant haplotype block, including the chromosomal location of each SNP (designated as "location") and the nucleotide change corresponding to each polymorphism. Since each patient in the study has two strands of DNA at each locus, two haplotypes were estimated per patient and tallies reflect two counts per patient. The haplotype shown in bold is associated with ISR.
  • Haplotypes in the regional haplotype analysis results were defined using "1" and "2" coding for alleles.
  • "1" corresponds to allele "A," as annotated by Affymetrix.
  • "2" corresponds to allele "B."
  • the haplotype sequences provided reflect the haplotype sequence that can be precisely ascertained by cross-referencing these data to NCBI genomic sequence data and HapMap haplotype data. This requires careful manual review of SNPs on the forward and reverse DNA strands.
  • Haplotypes identified in the CardioGene analyses may then be definitively linked to each haplotype's precise nucleotide sequence, as described by Cargill et al. (Am. J. Hum. Genet. 80(2):273-90, 2007).
  • Multi-locus models of epistasis contributing to disease have been considered for SNP-SNP interactions.
  • the absolute best multi-locus approach to a genome-wide association study such as this is unknown due to the complexities of not knowing the number of interacting loci, and the form of interaction. Therefore, the method applied was designed to investigate two-SNP interactions, recently described in a series of simulation experiments (Lin et al. Bioinformatics 20(8):1233-1240, 2004).
  • a 2-stage approach was used, in which the first stage selects SNPs with allelic association on a per-SNP basis meeting statistical significance with p < 0.01 (1154 SNPs) or p < 0.1 (10967 SNPs).
  • Clustering techniques have been well-validated and extensively applied in the analysis of microarray transcript analysis, to identify relevant patterns in genomic data.
  • the application of clustering methods to SNP genotype data has been limited, but methods have been developed and applied in specific situations, such as identifying loss of heterozygosity (Lindblad-Toh et al. Nat. Biotechnol. 18(9):1001-1005, 2000; Janne et al. Oncogene 23(15):2716-2726, 2004; Lin et al. J. Biol. Chem. 280(9):8229-8237, 2005).
  • KNN K-nearest neighbor
  • CEPH control sample A single aliquot of DNA from the CEPH longitudinal study was obtained and inserted into the 96-well plates into which DNA samples were aliquoted prior to analysis in this study.
  • the CEPH control sample therefore served as a quality control measure of the process of generating the genotype data.
  • the CEPH control sample was analyzed in 22 sets of chips, providing a rich dataset for evaluation of reproducibility of each SNP's genotype call. The concordance was calculated as:
  • NOV nephroblastoma overexpressed gene
  • ARNTL aryl hydrocarbon receptor nuclear translocator-like
  • TAF4B TAF4b RNA polymerase II, TATA box binding protein (TBP)-associated factor
  • FLJ21986 a hypothetical protein referred to as FLJ21986.
  • NOV has a known role in vascular remodeling relevant to angiogenesis and vascular wound repair previously reported in the literature (Ellis et al. Arterioscler. Thromb. Vasc. Biol. 20(8):1912-1919, 2000; Carlson et al. Nature 429(6990):446-452, 2004).
  • Each of the five genes identified is expressed in human coronary arteries, to varying degrees and with differing patterns across the spectrum of arteries that are morphologically normal, atherosclerotic and restenotic. Interestingly, this gene shows the prominent difference in expression between normal and atherosclerotic arteries.
  • EPHB1 ephrin receptor B1
  • ST18 suppression of tumorigenicity 18
  • GWA genome-wide association
  • This disclosure demonstrates the effectiveness of a method that allows full use of genome-wide SNP marker data that does not stringently correct allelic associations upfront, but does so in a secondary haplotype analysis that further explores patterns evident among rank order relationships among the univariate test results.
  • Multiple candidate susceptibility loci were identified containing five genes, each of which can be related to vascular homeostasis and alterations in disease.
  • the reported findings from univariate tests of allelic association are in keeping with the observation that non-coding regulatory variants may more often contribute to complex traits than coding sequence variants (King and Wilson Science 188(4184): 107-116, 1975; Korstanje and Paigen, Nat. Genet. 31(3):235-236, 2002; Symula et al. Nat. Genet.
  • ISR also serves as a more severe phenotypic model for vascular remodeling that occurs over longer time courses in response to more indolent vascular injuries, such as those that culminate in the development of atherosclerosis.
  • the time course of ISR delineates the two major causes of stent occlusion over the months after stent implantation, with occlusion due to ISR occurring within a year and atherosclerotic disease progression typically occurring after one year.
  • DES drug-eluting stents
  • This disclosure provides the identification of several SNPs and specific haplotypes that are linked to susceptibility to ISR.
  • the disclosure further provides use of the identified SNPs and haplotypes in methods, including diagnostic, prognostic, and predictive methods, as well as methods for screening for compounds that interact with a variant nucleotide (such as specific variants discussed herein at cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 or 18q22.3) or compounds that interact with or influence the expression of a protein encoded thereby.
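The univariate test described in the excerpts above (a chi-squared test on a 2x2 table of allele counts, followed by a Bonferroni criterion) can be illustrated with a minimal Python sketch. This is not the SAS code used in the study; the function names `allelic_chi_square` and `bonferroni_threshold` and the example counts are hypothetical and serve only to show the arithmetic.

```python
import math


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-squared test (1 degree of freedom) on a 2x2 table of allele counts.

    Rows are cases/controls; columns are counts of allele A and allele B.
    Returns (chi-squared statistic, p-value).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if 0 in (row_case, row_ctrl, col_a, col_b):
        raise ValueError("monomorphic SNP or empty group: test is undefined")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b
    )
    # Survival function of the chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni criterion."""
    return alpha / n_tests


# Hypothetical allele counts for a single SNP (two alleles per subject).
chi2, p = allelic_chi_square(case_a=120, case_b=180, ctrl_a=150, ctrl_b=364)
significant = p < bonferroni_threshold(0.05, 100_000)
```

A SNP is declared significant only if its nominal p-value falls below the family-wise level divided by the number of tests, which is why weak univariate signals rarely survive in a scan of roughly 100,000 SNPs.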

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Pathology (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Described herein is the identification of sixteen candidate susceptibility loci linked to in-stent restenosis (ISR). Of these regions, seven contain the genes NOV, ARNTL, TAF4B, PKP4, EPHB1, ST18 and FLJ21986 (which encodes a hypothetical protein). The remaining nine susceptibility loci are located in cytobands 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 13q14.3, 15q25.1 and 18q22.3. Provided are methods for identifying a subject having an increased risk of developing restenosis, comprising obtaining a nucleic acid sample from the subject and determining the nucleotide present at the chromosomal positions identified herein as part of a haplotype associated with an increased risk of developing restenosis.

Description

GENOMICS OF IN-STENT RESTENOSIS
CROSS REFERENCE TO RELATED CASE(S)
This application claims the benefit of U.S. Provisional Application No. 60/798,019, filed May 4, 2006, the entire content of which is hereby incorporated by reference.
FIELD OF THE DISCLOSURE
This disclosure relates to the diagnosis and treatment of restenosis, particularly in-stent restenosis. More particularly, it relates to the identification of single nucleotide polymorphisms (SNPs) and haplotypes linked to in-stent restenosis that are useful for the diagnosis and treatment of this disease.
BACKGROUND
Coronary artery disease (CAD) is the leading cause of death in industrialized countries, and stent implantation is a mainstay of revascularization therapy for atherosclerosis. However, in-stent restenosis (ISR), a recurrence of luminal occlusion within the stent, is a limitation of this therapy. ISR is a discrete focal vascular disease, characterized by a fibroproliferative response to the vascular "injury" induced by placement of a stent in a diseased artery. The response to injury follows a continuum in human arteries; some degree of cell proliferation occurs in all patients and can be thought of as a wound healing process. In some individuals, however, the wound healing becomes excessive, leading to exuberant vascular smooth muscle cell growth and extracellular matrix synthesis, and encroachment on the arterial lumen, resulting in a recurrence of clinical symptoms, typically within the first nine months post-procedure. Molecular and genetic studies suggest that ISR is primarily an inflammatory and proliferative disease, with distinct defined roles for various cell cycle proteins, growth factors, and inflammatory cytokines (Farb et al. Circulation 99(1):44-52, 1999; Ganesh et al. Pharmacogenomics 5(7):952-1004, 2004; Libby and Ganz N. Engl. J. Med. 337(6):418-419, 1997; Simon et al. J. Clin. Invest. 105(3):293-300, 2000; Zohlnhofer et al. Mol. Cell 7(5):1059-1069, 2001; Boehm et al. J. Clin. Invest. 114(3):419-426, 2004; McNamara et al. J. Clin. Invest. 91(1):94-98, 1993; Tanner et al. Circ. Res. 82(3):396-403, 1998; Yang et al. Proc. Natl. Acad. Sci. U.S.A. 93(15):7905-7910, 1996; Serruys et al. N. Engl. J. Med. 354(5):483-495, 2006). Many of these have defined roles in atherosclerosis, which can be characterized as the culmination of more indolent vascular injury processes. Heritability of atherosclerosis and CAD is well established (Watkins and Farrall. Nat. Rev. Genet. 7(3):163-173, 2006).
What is not known, however, is the genetic basis of a patient's response to stent deployment and susceptibility to the development of ISR. The clinical phenotype of ISR is unique, in that the exact timing of injury to the vascular wall by the stent is known in each patient, and well-defined clinical endpoints exist for the determination of ISR within a defined timeframe after stenting. These features allow precise clinical phenotyping, a crucial point in the design of genetic studies of complex diseases (Zondervan and Cardon. Nat. Rev. Genet. 5(2):89-100, 2004), and make ISR an ideal model for studying the genetic basis of vascular remodeling seen in various vasculopathies.
SUMMARY
Described herein is the identification of sixteen candidate susceptibility loci linked to in-stent restenosis (ISR). Of these regions, seven contain the following genes: NOV, ARNTL, TAF4B, PKP4, EPHB1, ST18 and FLJ21986, which encodes a hypothetical protein. Provided are methods for identifying a subject having an increased risk of developing restenosis, comprising obtaining a nucleic acid sample from the subject and determining the nucleotide present at the chromosomal positions identified herein as part of a haplotype associated with an increased risk of developing restenosis. Further provided is a method comprising determining nucleotide(s) present at the chromosomal positions identified herein in two or more of the group consisting of the PKP4 gene, the FLJ21986 gene, the NOV gene, the ARNTL gene, the TAF4B gene, the EPHB1 gene, the ST18 gene, cytoband 2p16.1, cytoband 4q31.21, cytoband 7p21.2, cytoband 1p31.1, cytoband 2p24.1, cytoband 2q22.3, cytoband 13q14.3, cytoband 15q25.1 and cytoband 18q22.3.
The foregoing and other features and advantages will become more apparent from the following detailed description of several embodiments, which proceeds with reference to the accompanying figures.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1 is a series of diagrams showing linkage disequilibrium (LD) plotted as pairwise D' values for each of the eight regions identified in the regional haplotype analysis. In each panel, the schematic of LD calculated in the 100K data is displayed above the genomic scale and the HapMap-derived LD calculations are displayed below the genomic scale. The vertical bars on the genomic scale represent SNPs used in these calculations. SNPs in the 100K dataset are depicted in grey. Higher values of D' are indicated by darker grey. Regions with high D' values but low LOD score are depicted in light grey.
Figure 2A is a digital image of expression of candidate genes in human tissues, as determined by RT-PCR (U = Universal human RNA; Co = Normal coronary artery; Ath = Atherosclerotic coronary artery; ISR = Coronary artery with in-stent restenosis; CA = Carotid artery; H = Heart; Lu = Lung; Li = Liver; SkM = Skeletal muscle; B = Brain; Skn = Skin). β-actin is included as a loading control.
Figure 2B is a series of images showing Movat pentachrome and immunostaining of ARNTL, NOV, PKP4 and TAF4B in a morphologically normal coronary artery, in an artery with atherosclerosis and in lung as a control tissue. Images are shown at 2x magnification.
Figure 2C is a series of images showing a human coronary artery with ISR stained with Movat pentachrome. Removal of stent wires, which is necessary for sectioning of tissues, results in some disruption of overall arterial architecture, with the neointima intact. Further immunostaining of this coronary artery was performed in the same manner as for other arteries. Staining in the neointima is shown at 10x, 20x and 40x magnification.
Figure 3 shows quantitative trait analysis of the eight regions identified in the primary regional haplotype analysis: 11p15.2 (Figure 3A); 18q11.2 (Figure 3B); 2q24.1 (Figure 3C); 8q24.12 (Figure 3D); 7q31.31 (Figure 3E); 2p16.1 (Figure 3F); 4q31.21 (Figure 3G); and 7p21.2 (Figure 3H). Cox regression analysis was performed, analyzing the time to the development of restenosis according to haplotype status AA, AB or BB.
Figure 4 is a flowchart showing a representative procedure for data cleanup/filtering prior to allelic association test of each SNP.
Figures 5 and 6 are flowcharts showing a representative procedure used for the genome-wide and regional haplotype association analysis for searching genetic variants of restenosis.
Figure 7 shows p-value plots for each of ABT2 (Figure 7A), ABT3 (Figure 7B), MN0 (Figure 7C), MN1 (Figure 7D), MN2 (Figure 7E), TT0 (Figure 7F), TT1 (Figure 7G) and TT2 (Figure 7H). The results of allele-based tests of association are presented using a 2-group strategy, in which all cases are compared to controls, and a 3-group strategy, in which the two case groups are treated as ordinal outcomes and compared against controls. In the 3-group analysis, two SNPs pass the significance threshold, which is adjusted using a Bonferroni correction for 96,767 tests. The results of further allele-based tests of association are presented using various approaches. Comparing all cases versus controls, the following tests were performed: (a) a trend test with a recessive disease model; (b) a trend test using an additive model; and (c) a trend test using a dominant model. Treating the two case groups as ordinal traits, and thereby conducting a 3-group analysis, the following additional tests were performed: (d) a means test - MN0 (Figure 7C); (e) a means test - MN1 (Figure 7D); and (f) a means test - MN2 (Figure 7E).
Figure 8 shows quantitative trait analysis for the following loci: 2q24.1 (Figure 8A), 7q31.31 (Figure 8B), 8q24.12 (Figure 8C), 11p15.2 (Figure 8D), 18q11.2 (Figure 8E), 2p16.1 (Figure 8F), 4q31.21 (Figure 8G), and 7p21.2 (Figure 8H).
Figure 9 is a series of graphic chromosome maps, illustrating the position of each of five candidate ISR susceptibility genes (NOV, ARNTL, PKP4, TAF4B, and FLJ21986) identified herein.
SEQUENCE LISTING
Nucleic and amino acid sequences listed herein are shown using standard letter abbreviations for nucleotide bases, and three letter code for amino acids, as defined in 37 C.F.R. 1.822. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood as included by any reference to the displayed strand. All sequence database accession numbers referenced herein are understood to refer to the version of the sequence identified by that accession number as it was available on the designated date.
SEQ ID NO: 1 is the nucleotide sequence of the ARNTL 5' RT-PCR primer.
SEQ ID NO: 2 is the nucleotide sequence of the ARNTL 3' RT-PCR primer.
SEQ ID NO: 3 is the nucleotide sequence of the FLJ21986 5' RT-PCR primer.
SEQ ID NO: 4 is the nucleotide sequence of the FLJ21986 3' RT-PCR primer.
SEQ ID NO: 5 is the nucleotide sequence of the NOV 5' RT-PCR primer.
SEQ ID NO: 6 is the nucleotide sequence of the NOV 3' RT-PCR primer.
SEQ ID NO: 7 is the nucleotide sequence of the PKP4 5' RT-PCR primer.
SEQ ID NO: 8 is the nucleotide sequence of the PKP4 3' RT-PCR primer.
SEQ ID NO: 9 is the nucleotide sequence of the TAF4B 5' RT-PCR primer.
SEQ ID NO: 10 is the nucleotide sequence of the TAF4B 3' RT-PCR primer.
SEQ ID NO: 11 is the nucleotide sequence of the β-actin 5' RT-PCR primer.
SEQ ID NO: 12 is the nucleotide sequence of the β-actin 3' RT-PCR primer.
SEQ ID NO: 13 is the nucleotide sequence of ARNTL (GenBank No. AF044288).
SEQ ID NO: 14 is the nucleotide sequence of FLJ21986 (GenBank No. AL832619.2).
SEQ ID NO: 15 is the nucleotide sequence of NOV (GenBank No. AY082381.1).
SEQ ID NO: 16 is the nucleotide sequence of PKP4 (GenBank No. BC048013.1).
SEQ ID NO: 17 is the nucleotide sequence of TAF4B (GenBank No. Y09321.1).
SEQ ID NO: 18 is the nucleotide sequence of β-actin (GenBank No. NM_001101).
SEQ ID NO: 19 is the nucleotide sequence of GAPDH (GenBank No. M17851.1).
SEQ ID NOs: 20-23 are the nucleotide sequences of the regions spanning four SNPs that are part of a significant haplotype block in the PKP4 gene.
SEQ ID NOs: 24-26 are the nucleotide sequences of the regions spanning three SNPs that are part of a significant haplotype block in the FLJ21986 gene.
SEQ ID NOs: 27-39 are the nucleotide sequences of the regions spanning thirteen SNPs that are part of a significant haplotype block in the NOV gene.
SEQ ID NOs: 40-44 are the nucleotide sequences of the regions spanning five SNPs that are part of a significant haplotype block in the ARNTL gene.
SEQ ID NOs: 45 and 46 are the nucleotide sequences of the regions spanning two SNPs that are part of a significant haplotype block in the TAF4B gene.
SEQ ID NOs: 47-49 are the nucleotide sequences of the regions spanning three SNPs that are part of a significant haplotype block in cytoband 2p16.1.
SEQ ID NOs: 50-52 are the nucleotide sequences of the regions spanning three SNPs that are part of a significant haplotype block in cytoband 4q31.21.
SEQ ID NOs: 53-59 are the nucleotide sequences of the regions spanning seven SNPs that are part of a significant haplotype block in cytoband 7p21.2.
SEQ ID NO: 60 is the nucleotide sequence of EPHB1 (GenBank No. NM_004441).
SEQ ID NO: 61 is the nucleotide sequence of ST18 (GenBank No. AB011107).
DETAILED DESCRIPTION
I. Abbreviations
ASO Allele specific oligonucleotide
ASOH Allele-specific oligonucleotide hybridization
BMS Bare metallic stent
BRLMM Bayesian Robust Linear Modeling using Mahalanobis distance
CAD Coronary artery disease
CDCV Common-disease common-variant
CEPH Centre d'Etude du Polymorphisme Humain
DASH Dynamic allele-specific hybridization
DES Drug-eluting stent
DM Dynamic model
DNA Deoxyribonucleic acid
EST Expressed sequence tag
GWA Genome-wide association
HWE Hardy-Weinberg equilibrium
ISR In-stent restenosis
KNN K-nearest neighbor
LD Linkage disequilibrium
LOD Logarithmic odds
OLA Oligonucleotide ligation assay
PCR Polymerase chain reaction
RNA Ribonucleic acid
RT-PCR Reverse transcriptase polymerase chain reaction
SNP Single nucleotide polymorphism
TTRS Time to restenosis
II. Terms
Unless otherwise noted, technical terms are used according to conventional usage. Definitions of common terms in molecular biology may be found in Benjamin Lewin, Genes V, published by Oxford University Press, 1994 (ISBN 0-19-854287-9); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0-632-02182-9); and Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 1-56081-569-8).
In order to facilitate review of the various embodiments of the disclosure, the following explanations of specific terms are provided:
Allele: Any one of a number of viable DNA codings of the same gene (sometimes the term refers to a non-gene sequence) occupying a given locus (position) on a chromosome. An individual's genotype for that gene will be the set of alleles it happens to possess. In an organism which has two copies of each of its chromosomes (a diploid organism), two alleles make up the individual's genotype. In a diploid organism, when the two copies of the gene are identical - that is, have the same allele - they are said to be homozygous for that gene. A diploid organism which has two different alleles of the gene is said to be heterozygous.
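As an illustration of the definition above, the following minimal Python sketch classifies a diploid genotype as homozygous or heterozygous and tallies allele frequencies across subjects (two alleles counted per subject). The function names and the "A"/"B" allele labels are assumptions chosen for the example, not part of the disclosure.

```python
from collections import Counter


def zygosity(genotype):
    """Classify a diploid genotype given as a pair of allele labels."""
    allele_1, allele_2 = genotype
    return "homozygous" if allele_1 == allele_2 else "heterozygous"


def allele_frequencies(genotypes):
    """Frequency of each allele across diploid genotypes (two alleles per subject)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


# Hypothetical genotypes for three subjects at one locus.
subjects = [("A", "A"), ("A", "B"), ("B", "B")]
labels = [zygosity(g) for g in subjects]
frequencies = allele_frequencies(subjects)
```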
As used herein, the process of "detecting alleles" may be referred to as "genotyping, determining or identifying an allele or polymorphism," or any similar phrase. The allele actually detected will be manifest in the genomic DNA of a subject, but may also be detectable from RNA or protein sequences transcribed or translated from this region.
Amplification: The use of a technique that increases the number of copies of a nucleic acid molecule in a sample. An example of in vitro amplification is the polymerase chain reaction (PCR), in which a biological sample obtained from a subject is contacted with a pair of oligonucleotide primers, under conditions that allow for hybridization of the primers to a nucleic acid molecule in the sample. The primers are extended under suitable conditions, dissociated from the template, and then re-annealed, extended, and dissociated to amplify the number of copies of the nucleic acid molecule. The product of amplification can be characterized by such techniques as electrophoresis, restriction endonuclease cleavage patterns, oligonucleotide hybridization or ligation, and/or nucleic acid sequencing.
Other examples of amplification methods include strand displacement amplification, as disclosed in U.S. Patent No. 5,744,311; transcription-free isothermal amplification, as disclosed in U.S. Patent No. 6,033,881; repair chain reaction amplification, as disclosed in PCT Publication No. WO 90/01069; ligase chain reaction amplification, as disclosed in EP-A-320,308; gap filling ligase chain reaction amplification, as disclosed in U.S. Patent No. 5,427,930; and NASBA™ RNA transcription-free amplification, as disclosed in U.S. Patent No. 6,025,134. An amplification method can be modified, including for example by additional steps or coupling the amplification with another protocol.
Antisense, Sense, and Antigene: Double-stranded DNA (dsDNA) has two strands, a 5' -> 3' strand, referred to as the plus strand, and a 3' -> 5' strand (the reverse complement), referred to as the minus strand. Because RNA polymerase adds nucleic acids in a 5' -> 3' direction, the minus strand of the DNA serves as the template for the RNA during transcription. Thus, the RNA formed will have a sequence complementary to the minus strand and identical to the plus strand (except that U is substituted for T).
Antisense molecules are molecules that are specifically hybridizable or specifically complementary to either RNA or the plus strand of DNA. Sense molecules are molecules that are specifically hybridizable or specifically complementary to the minus strand of DNA. Antigene molecules are either antisense or sense molecules directed to a dsDNA target.
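The strand relationships described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names (`reverse_complement`, `transcribe`, `antisense_oligo`) and the example sequence are hypothetical, and only the four unmodified DNA bases are handled.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA strand, written 5' -> 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def transcribe(plus_strand):
    """Return the RNA whose sequence matches the plus strand, with U for T."""
    return plus_strand.replace("T", "U")


def antisense_oligo(plus_strand):
    """Return a DNA oligonucleotide complementary to the plus strand."""
    return reverse_complement(plus_strand)


plus = "ATGGCC"                      # hypothetical plus-strand fragment
minus = reverse_complement(plus)     # minus (template) strand, 5' -> 3'
print(minus)
print(transcribe(plus))
print(antisense_oligo(plus))
```

In this sketch the antisense oligonucleotide has the same sequence as the minus strand, which reflects the statement above that antisense molecules are complementary to the plus strand of DNA (and to the RNA transcribed from it).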
Array: An arrangement of molecules, particularly biological macromolecules (such as polypeptides or nucleic acids) or cell or tissue samples, in addressable locations on or in a substrate. A "microarray" is an array that is miniaturized so as to require or be aided by microscopic examination for evaluation or analysis. These arrays are sometimes called DNA chips, or generally, biochips; though more formally they are referred to as microarrays, and the process of testing the gene patterns of an individual is sometimes called microarray profiling. DNA array fabrication chemistry and structure is varied, typically made up of 400,000 different features, each holding DNA from a different human gene, but some employing a solid-state chemistry to pattern as many as 780,000 individual features.
The array of molecules ("features") makes it possible to carry out a very large number of analyses on a sample at one time. In certain example arrays, one or more molecules (such as an oligonucleotide probe) will occur on the array a plurality of times (such as twice), for instance to provide internal controls. The number of addressable locations on the array can vary, for example from a few (such as three) to at least 50, at least 100, at least 200, at least 250, at least 300, at least 500, at least 600, at least 1000, at least 10,000, or more. In particular examples, an array includes nucleic acid molecules, such as oligonucleotide sequences that are at least 15 nucleotides in length, such as about 15-40 nucleotides in length, such as at least 18 nucleotides in length, at least 21 nucleotides in length, or even at least 25 nucleotides in length. In one example, the molecule includes oligonucleotides attached to the array via their 5'- or 3'-end.
Within an array, each arrayed sample is addressable, in that its location can be reliably and consistently determined within the at least two dimensions of the array. The feature application location on an array can assume different shapes. For example, the array can be regular (such as arranged in uniform rows and columns) or irregular. Thus, in ordered arrays the location of each sample is assigned to the sample at the time when it is applied to the array, and a key may be provided in order to correlate each location with the appropriate target or feature position. Often, ordered arrays are arranged in a symmetrical grid pattern, but samples could be arranged in other patterns (such as in radially distributed lines, spiral lines, or ordered clusters). Addressable arrays usually are computer readable, in that a computer can be programmed to correlate a particular address on the array with information about the sample at that position (such as hybridization or binding data, including for instance signal intensity). In some examples of computer readable formats, the individual features in the array are arranged regularly, for instance in a Cartesian grid pattern, which can be correlated to address information by a computer.
Also contemplated herein are protein-based arrays, where the probe molecules are or include proteins, or where the target molecules are or include proteins, and arrays including nucleic acids to which proteins/peptides are bound, or vice versa.
Binding or stable binding: An association between two substances or molecules, such as the hybridization of one nucleic acid molecule to another (or itself) and the association of an antibody with a peptide. An oligonucleotide molecule binds or stably binds to a target nucleic acid molecule if a sufficient amount of the oligonucleotide molecule forms base pairs or is hybridized to its target nucleic acid molecule, to permit detection of that binding. Binding can be detected by any procedure known to one skilled in the art, such as by physical or functional properties of the target:oligonucleotide complex. For example, binding can be detected functionally by determining whether binding has an observable effect upon a biosynthetic process such as expression of a gene, DNA replication, transcription, translation, and the like. Physical methods of detecting the binding of complementary strands of nucleic acid molecules include, but are not limited to, such methods as DNase I or chemical footprinting, gel shift and affinity cleavage assays, Northern blotting, dot blotting and light absorption detection procedures. For example, one method involves observing a change in light absorption of a solution containing an oligonucleotide (or an analog) and a target nucleic acid at 220 to 300 nm as the temperature is slowly increased. If the oligonucleotide or analog has bound to its target, there is a sudden increase in absorption at a characteristic temperature as the oligonucleotide (or analog) and target dissociate from each other, or melt. In another example, the method involves detecting a signal, such as a detectable label, present on one or both complementary strands.
The binding between an oligomer and its target nucleic acid is frequently characterized by the temperature (Tm) at which 50% of the oligomer is melted from its target. A higher (Tm) means a stronger or more stable complex relative to a complex with a lower (Tm).
A labeled target molecule "binds" to a nucleic acid molecule in a spot on an array if, after incubation of the (labeled) target molecule (usually in solution or suspension) with or on the array for a period of time (usually 5 minutes or more, for instance 10 minutes, 20 minutes, 30 minutes, 60 minutes, 90 minutes, 120 minutes or more, for instance overnight or even 24 hours), a detectable amount of that molecule associates with a nucleic acid feature of the array to such an extent that it is not removed by being washed with a relatively low stringency buffer (such as higher salt (such as 3 x SSC or higher), room temperature washes). Washing can be carried out, for instance, at room temperature, but other temperatures (either higher or lower) also can be used. Targets will bind probe nucleic acid molecules within different features on the array to different extents, based at least on sequence homology, and the term "bind" encompasses both relatively weak and relatively strong interactions. Thus, some binding will persist after the array is washed in a more stringent buffer (such as lower salt (such as about 0.5 to about 1.5 x SSC), 55-65°C washes).
Where the probe and target molecules are both nucleic acids, binding of the test or reference molecule to a feature on the array can be discussed in terms of the specific complementarity between the probe and the target nucleic acids. Also contemplated herein are protein-based arrays, where the probe molecules are or comprise proteins, and/or where the target molecules are or comprise proteins, and arrays comprising nucleic acids to which proteins/peptides are bound, or vice versa.
cDNA: A DNA molecule lacking internal, non-coding segments (such as introns) and regulatory sequences that determine transcription. By way of example, cDNA may be synthesized in the laboratory by reverse transcription from messenger RNA extracted from cells.
Complementarity and percentage complementarity: Molecules with complementary nucleic acids form a stable duplex or triplex when the strands bind (hybridize) to each other by forming Watson-Crick, Hoogsteen or reverse Hoogsteen base pairs. Stable binding occurs when an oligonucleotide molecule remains detectably bound to a target nucleic acid sequence under the required conditions. Complementarity is the degree to which bases in one nucleic acid strand base pair with the bases in a second nucleic acid strand. Complementarity is conveniently described by percentage, that is, the proportion of nucleotides that form base pairs between two strands or within a specific region or domain of two strands. For example, if 10 nucleotides of a 15-nucleotide oligonucleotide form base pairs with a targeted region of a DNA molecule, that oligonucleotide is said to have 66.67% complementarity to the region of DNA targeted.
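The percentage calculation in the preceding example can be reproduced with the following minimal Python sketch. The function name `percent_complementarity` and its two parameters are hypothetical and are chosen only to mirror the wording of the example (number of base-paired nucleotides, oligonucleotide length).

```python
def percent_complementarity(paired_bases, oligo_length):
    """Percentage of an oligonucleotide's bases that pair with the target."""
    if oligo_length <= 0:
        raise ValueError("oligonucleotide length must be positive")
    if not 0 <= paired_bases <= oligo_length:
        raise ValueError("paired bases must lie between 0 and the length")
    return 100.0 * paired_bases / oligo_length


# 10 of 15 nucleotides base paired, as in the example above.
print(round(percent_complementarity(10, 15), 2))  # 66.67
```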
In the present disclosure, "sufficient complementarity" means that a sufficient number of base pairs exist between an oligonucleotide molecule and a target nucleic acid sequence to achieve detectable binding. When expressed or measured by percentage of base pairs formed, the percentage complementarity that fulfills this goal can range from as little as about 50% complementarity to full (100%) complementarity. In general, sufficient complementarity is at least about 50%, for example at least about 75% complementarity, at least about 90% complementarity, at least about 95% complementarity, at least about 98% complementarity, or even 100% complementarity.
A thorough treatment of the qualitative and quantitative considerations involved in establishing binding conditions that allow one skilled in the art to design appropriate oligonucleotides for use under the desired conditions is provided by Beltz et al. (1983) Methods Enzymol 100:266-285; and by Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989.
Coronary artery disease (CAD): Also called coronary heart disease or atherosclerotic heart disease. The end result of the accumulation of atheromatous plaques within the walls of the arteries that supply the myocardium (the muscle of the heart). While the symptoms and signs of coronary heart disease are noted in the advanced state of disease, most individuals with coronary heart disease show no evidence of disease for decades as the disease progresses before the first onset of symptoms, often a "sudden" heart attack, finally arises. After decades of progression, some of these atheromatous plaques may rupture and (along with the activation of the blood clotting system) start limiting blood flow to the heart muscle. The disease is the most common cause of sudden death.
DNA (deoxyribonucleic acid): A long chain polymer which includes the genetic material of most living organisms (some viruses have genes including ribonucleic acid, RNA). The repeating units in DNA polymers are four different nucleotides, each of which includes one of the four bases (adenine, guanine, cytosine and thymine) bound to a deoxyribose sugar to which a phosphate group is attached. Triplets of nucleotides, referred to as codons, in DNA molecules code for amino acids in a polypeptide. The term codon is also used for the corresponding (and complementary) sequences of three nucleotides in the mRNA into which the DNA sequence is transcribed.
Fluorophore: A chemical compound, which when excited by exposure to a particular wavelength of light, emits light (i.e., fluoresces), for example at a different wavelength. Fluorophores can be described in terms of their emission profile, or "color." Green fluorophores, for example Cy3, FITC, and Oregon Green, are characterized by their emission at wavelengths generally in the range of 515-540 nm. Red fluorophores, for example Texas Red, Cy5 and tetramethylrhodamine, are characterized by their emission at wavelengths generally in the range of 590-690 nm. Examples of fluorophores that may be used are provided in U.S. Patent No. 5,866,366 to Nazarenko et al., and include for instance: 4-acetamido-4'-isothiocyanatostilbene-2,2'-disulfonic acid, acridine and derivatives such as acridine and acridine isothiocyanate, 5-(2'-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS), 4-amino-N-[3-vinylsulfonyl)phenyl]naphthalimide-3,5 disulfonate (Lucifer Yellow VS), N-(4-anilino-1-naphthyl)maleimide, anthranilamide, Brilliant Yellow, coumarin and derivatives such as coumarin, 7-amino-4-methylcoumarin (AMC, Coumarin 120), 7-amino-4-trifluoromethylcoumarin (Coumarin 151); cyanosine; 4',6-diamidino-2-phenylindole (DAPI); 5',5"-dibromopyrogallol-sulfonephthalein (Bromopyrogallol Red); 7-diethylamino-3-(4'-isothiocyanatophenyl)-4-methylcoumarin; diethylenetriamine pentaacetate; 4,4'-diisothiocyanatodihydro-stilbene-2,2'-disulfonic acid; 4,4'-diisothiocyanatostilbene-2,2'-disulfonic acid; 5-[dimethylamino]naphthalene-1-sulfonyl chloride (DNS, dansyl chloride); 4-(4'-dimethylaminophenylazo)benzoic acid (DABCYL); 4-dimethylaminophenylazophenyl-4'-isothiocyanate (DABITC); eosin and derivatives such as eosin and eosin isothiocyanate; erythrosin and derivatives such as erythrosin B and erythrosin isothiocyanate; ethidium; fluorescein and derivatives such as 5-carboxyfluorescein (FAM), 5-(4,6-dichlorotriazin-2-yl)aminofluorescein (DTAF), 2'7'-dimethoxy-4'5'-dichloro-6-carboxyfluorescein (JOE), fluorescein, fluorescein isothiocyanate (FITC), and QFITC (XRITC); fluorescamine; IR144; IR1446; Malachite Green isothiocyanate; 4-methylumbelliferone; ortho cresolphthalein; nitrotyrosine; pararosaniline; Phenol Red; B-phycoerythrin; o-phthaldialdehyde; pyrene and derivatives such as pyrene, pyrene butyrate and succinimidyl 1-pyrene butyrate; Reactive Red 4 (Cibacron® Brilliant Red 3B-A); rhodamine and derivatives such as 6-carboxy-X-rhodamine (ROX), 6-carboxyrhodamine (R6G), lissamine rhodamine B sulfonyl chloride, rhodamine (Rhod), rhodamine B, rhodamine 123, rhodamine X isothiocyanate, sulforhodamine B, sulforhodamine 101 and sulfonyl chloride derivative of sulforhodamine 101 (Texas Red); N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA); tetramethyl rhodamine; tetramethyl rhodamine isothiocyanate (TRITC); riboflavin; rosolic acid and terbium chelate derivatives.
Other contemplated fluorophores include GFP (green fluorescent protein), Lissamine™, diethylaminocoumarin, fluorescein chlorotriazinyl, naphthofluorescein, 4,7-dichlororhodamine and xanthene and derivatives thereof. Other fluorophores known to those skilled in the art may also be used.
Genetic predisposition or risk: Susceptibility of a subject to a genetic disease. However, such susceptibility may or may not result in actual development of the disease.
Haplotype: The genetic constitution of an individual chromosome. In diploid organisms, a haplotype contains one member of the pair of alleles for each site. A haplotype can refer to only one locus or to an entire genome. Haplotype can also refer to a set of single nucleotide polymorphisms (SNPs) found to be statistically associated on a single chromatid. In the context of the present disclosure, haplotypes described in the regional haplotype analysis results are defined using "1" and "2" coding for alleles, wherein "1" corresponds to allele "A," and "2" corresponds to allele "B." For example, a haplotype defined as 1122 corresponds to a haplotype AABB.
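The "1"/"2" allele coding described above can be decoded mechanically. The following Python sketch is illustrative only; the names `ALLELE_CODES` and `decode_haplotype` are hypothetical and do not refer to any software used in the disclosure.

```python
# Coding stated in the text: "1" corresponds to allele A, "2" to allele B.
ALLELE_CODES = {"1": "A", "2": "B"}


def decode_haplotype(coded):
    """Translate a numerically coded haplotype (e.g. '1122') to allele letters."""
    try:
        return "".join(ALLELE_CODES[digit] for digit in coded)
    except KeyError as err:
        raise ValueError(f"unexpected allele code: {err.args[0]!r}") from None


print(decode_haplotype("1122"))  # AABB
```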
Hybridization: Oligonucleotides and their analogs hybridize by hydrogen bonding, which includes Watson-Crick, Hoogsteen or reversed Hoogsteen hydrogen bonding, between complementary bases. Generally, nucleic acid consists of nitrogenous bases that are either pyrimidines (cytosine (C), uracil (U), and thymine (T)) or purines (adenine (A) and guanine (G)). These nitrogenous bases form hydrogen bonds between a pyrimidine and a purine, and the bonding of the pyrimidine to the purine is referred to as "base pairing." More specifically, A will hydrogen bond to T or U, and G will bond to C. "Complementary" refers to the base pairing that occurs between two distinct nucleic acid sequences or two distinct regions of the same nucleic acid sequence.
"Specifically hybridizable" and "specifically complementary" are terms that indicate a sufficient degree of complementarity such that stable and specific binding occurs between the oligonucleotide (or its analog) and the DNA or RNA target. The oligonucleotide or oligonucleotide analog need not be 100% complementary to its target sequence to be specifically hybridizable. An oligonucleotide or analog is specifically hybridizable when binding of the oligonucleotide or analog to the target DNA or RNA molecule interferes with the normal function of the target DNA or RNA, and there is a sufficient degree of complementarity to avoid non-specific binding of the oligonucleotide or analog to non-target sequences under conditions where specific binding is desired, for example under physiological conditions in the case of in vivo assays or systems. Such binding is referred to as specific hybridization.
Hybridization conditions resulting in particular degrees of stringency will vary depending upon the nature of the hybridization method of choice and the composition and length of the hybridizing nucleic acid sequences. Generally, the temperature of hybridization and the ionic strength (especially the Na+ and/or Mg++ concentration) of the hybridization buffer will determine the stringency of hybridization, though wash times also influence stringency. Calculations regarding hybridization conditions required for attaining particular degrees of stringency are discussed by Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989, chapters 9 and 11; and Ausubel et al. Short Protocols in Molecular Biology, 4th ed., John Wiley & Sons, Inc., 1999.
For purposes of the present disclosure, "stringent conditions" encompass conditions under which hybridization will only occur if there is less than 25% mismatch between the hybridization molecule and the target sequence. "Stringent conditions" may be broken down into particular levels of stringency for more precise definition. Thus, as used herein, "moderate stringency" conditions are those under which molecules with more than 25% sequence mismatch will not hybridize; conditions of "medium stringency" are those under which molecules with more than 15% mismatch will not hybridize, and conditions of "high stringency" are those under which sequences with more than 20% mismatch will not hybridize. Conditions of "very high stringency" are those under which sequences with more than 10% mismatch will not hybridize. The following is an exemplary set of hybridization conditions and is not meant to be limiting:
Very High Stringency (detects sequences that share 90% identity)
Hybridization: 5x SSC at 65°C for 16 hours
Wash twice: 2x SSC at room temperature (RT) for 15 minutes each
Wash twice: 0.5x SSC at 65°C for 20 minutes each
High Stringency (detects sequences that share 80% identity or greater)
Hybridization: 5x-6x SSC at 65°C-70°C for 16-20 hours
Wash twice: 2x SSC at RT for 5-20 minutes each
Wash twice: 1x SSC at 55°C-70°C for 30 minutes each
Low Stringency (detects sequences that share greater than 50% identity)
Hybridization: 6x SSC at RT to 55°C for 16-20 hours
Wash at least twice: 2x-3x SSC at RT to 55°C for 20-30 minutes each
In vitro amplification: Techniques that increase the number of copies of a nucleic acid molecule in a sample or specimen. An example of in vitro amplification is the polymerase chain reaction, in which a biological sample collected from a subject is contacted with a pair of oligonucleotide primers, under conditions that allow for the hybridization of the primers to nucleic acid template in the sample. The primers are extended under suitable conditions, dissociated from the template, and then re-annealed, extended, and dissociated to amplify the number of copies of the nucleic acid.
The product of in vitro amplification may be characterized by electrophoresis, restriction endonuclease cleavage patterns, oligonucleotide hybridization or ligation, and/or nucleic acid sequencing, using standard techniques. Other examples of in vitro amplification techniques include strand displacement amplification (see U.S. Patent No. 5,744,311); transcription-free isothermal amplification (see U.S. Patent No. 6,033,881); repair chain reaction amplification (see WO 90/01069); ligase chain reaction amplification (see EP-A-320 308); gap filling ligase chain reaction amplification (see U.S. Patent No. 5,427,930); coupled ligase detection and PCR (see U.S. Patent No. 6,027,889); and NASBA™ RNA transcription-free amplification (see U.S. Patent No. 6,025,134).
Isolated: An "isolated" biological component (such as a nucleic acid molecule, protein, or organelle) has been substantially separated or purified away from other biological components in the cell of the organism in which the component naturally occurs, such as other chromosomal and extra-chromosomal DNA and RNA, proteins and organelles.
Nucleic acid molecules and proteins that have been "isolated" include nucleic acid molecules and proteins purified by standard purification methods. The term also embraces nucleic acid molecules and proteins prepared by recombinant expression in a host cell as well as chemically synthesized nucleic acid molecules and proteins.
Label: Detectable marker or reporter molecules, which can be attached to nucleic acids. Typical labels include fluorophores, radioactive isotopes, ligands, chemiluminescent agents, metal sols and colloids, and enzymes. Methods for labeling and guidance in the choice of labels useful for various purposes are discussed, e.g., in Sambrook et al., in Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press (1989) and Ausubel et al., in Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley-Interscience (1987).
Linkage disequilibrium (LD): The non-random association of alleles at two or more loci, not necessarily on the same chromosome. LD describes a situation in which some combinations of alleles or genetic markers occur more or less frequently in a population than would be expected from a random formation of haplotypes from alleles based on their frequencies. The expected frequency of occurrence of two alleles that are inherited independently is the frequency of the first allele multiplied by the frequency of the second allele. Alleles that co-occur at expected frequencies are said to be in linkage equilibrium.
Locus: The position of a gene (or other significant sequence) on a chromosome.
Mutation: Any change of the DNA sequence within a gene or chromosome. In some instances, a mutation will alter a characteristic or trait (phenotype), but this is not always the case. Types of mutations include base substitution point mutations (for example, transitions or transversions), deletions, and insertions. Missense mutations are those that introduce a different amino acid into the sequence of the encoded protein; nonsense mutations are those that introduce a new stop codon.
In the case of insertions or deletions, mutations can be in-frame (not changing the frame of the overall sequence) or frame shift mutations, which may result in the misreading of a large number of codons (and often leads to abnormal termination of the encoded product due to the presence of a stop codon in the alternative frame).
This term specifically encompasses variations that arise through somatic mutation, for instance those that are found only in disease cells, but not constitutionally, in a given individual. Examples of such somatically-acquired variations include the point mutations that frequently result in altered function of various genes that are involved in development of cancers. This term also encompasses DNA alterations that are present constitutionally, that alter the function of the encoded protein in a readily demonstrable manner, and that can be inherited by the children of an affected individual. In this respect, the term overlaps with "polymorphism," as defined herein, but generally refers to the subset of constitutional alterations.
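The distinction drawn above between in-frame and frame shift insertions or deletions follows from the triplet nature of codons: a net change in length that is a multiple of three nucleotides leaves the downstream reading frame intact. The following Python sketch illustrates that rule; the function name `classify_indel` and its signed-length parameter are assumptions made for illustration.

```python
def classify_indel(length_change):
    """Classify an insertion or deletion within a coding sequence.

    length_change is the net number of nucleotides gained (positive,
    insertion) or lost (negative, deletion); this signed convention is
    an illustrative assumption.
    """
    if length_change == 0:
        raise ValueError("no insertion or deletion to classify")
    # Codons are triplets, so only multiples of three preserve the frame.
    return "in-frame" if length_change % 3 == 0 else "frame shift"


print(classify_indel(3))    # insertion of one whole codon
print(classify_indel(-1))   # single-nucleotide deletion
```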
Nucleic acid molecule: A polymeric form of nucleotides, which may include both sense and anti-sense strands of RNA, cDNA, genomic DNA, and synthetic forms and mixed polymers of the above. A nucleotide refers to a ribonucleotide, deoxynucleotide or a modified form of either type of nucleotide. A "nucleic acid molecule" as used herein is synonymous with "nucleic acid" and "polynucleotide." A nucleic acid molecule is usually at least 10 bases in length, unless otherwise specified. The term includes single and double stranded forms of DNA. A polynucleotide may include either or both naturally occurring and modified nucleotides linked together by naturally occurring and/or non-naturally occurring nucleotide linkages.
Nucleotide: Includes, but is not limited to, a monomer that includes a base linked to a sugar, such as a pyrimidine, purine or synthetic analogs thereof, or a base linked to an amino acid, as in a peptide nucleic acid (PNA). A nucleotide is one monomer in a polynucleotide. A nucleotide sequence refers to the sequence of bases in a polynucleotide.
Oligonucleotide: A nucleic acid molecule generally comprising a length of 300 bases or fewer. The term often refers to single stranded deoxyribonucleotides, but it can refer as well to single or double stranded ribonucleotides, RNA:DNA hybrids and double stranded DNAs, among others. The term "oligonucleotide" also includes oligonucleosides (that is, an oligonucleotide minus the phosphate) and any other organic base polymer. In some examples, oligonucleotides are about 10 to about 90 bases in length, for example, 12, 13, 14, 15, 16, 17, 18, 19 or 20 bases in length. Other oligonucleotides are about 25, about 30, about 35, about 40, about 45, about 50, about 55, about 60 bases, about 65 bases, about 70 bases, about 75 bases or about 80 bases in length. Oligonucleotides may be single stranded, for example, for use as probes or primers, or may be double stranded, for example, for use in the construction of a mutant gene. Oligonucleotides can be either sense or antisense oligonucleotides. An oligonucleotide can be modified as discussed above in reference to nucleic acid molecules. Oligonucleotides can be obtained from existing nucleic acid sources (for example, genomic or cDNA), but can also be synthetic (for example, produced by laboratory or in vitro oligonucleotide synthesis).
Operably (or Operatively) linked: A first nucleic acid sequence is operably linked with a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Generally, operably linked DNA sequences are contiguous and, where necessary to join two protein-coding regions, in the same reading frame.
Open reading frame (ORF): A series of nucleotide triplets (codons) coding for amino acids without any internal termination codons. These sequences are usually translatable into a peptide.
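As a non-limiting illustration of the ORF definition above, the following Python sketch splits a DNA sequence into codons and checks that no termination codon occurs internally. The function name `has_no_internal_stop` is hypothetical, and the stop codons listed (TAA, TAG, TGA) are those of the standard genetic code, which is assumed here; a terminal stop codon is permitted by this sketch.

```python
# Termination codons of the standard genetic code (DNA alphabet); assumed here.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_no_internal_stop(sequence):
    """Return True if no codon other than the last is a termination codon."""
    sequence = sequence.upper()
    if len(sequence) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    # Only codons before the final one count as "internal".
    return not any(codon in STOP_CODONS for codon in codons[:-1])


print(has_no_internal_stop("ATGGCCTAA"))  # stop codon only at the end
print(has_no_internal_stop("ATGTAAGCC"))  # stop codon in the middle
```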
Peptide Nucleic Acid (PNA): An oligonucleotide analog with a backbone comprised of monomers coupled by amide (peptide) bonds, such as amino acid monomers joined by peptide bonds.
Pharmaceutically acceptable carriers: The pharmaceutically acceptable carriers useful with compositions provided herein are conventional. By way of example, Martin, in Remington's Pharmaceutical Sciences, published by Mack Publishing Co., Easton, PA, 19th Edition, 1995, describes compositions and formulations suitable for pharmaceutical delivery of the molecules and agents, including but not limited to nucleotides and proteins, herein disclosed. In general, the nature of the carrier will depend on the particular mode of administration being employed. For instance, parenteral formulations usually comprise injectable fluids that include pharmaceutically and physiologically acceptable fluids such as water, physiological saline, balanced salt solutions, aqueous dextrose, glycerol or the like as a vehicle. For solid compositions (for example, powder, pill, tablet, or capsule forms), conventional non-toxic solid carriers can include, for example, pharmaceutical grades of mannitol, lactose, starch, or magnesium stearate. In addition to biologically-neutral carriers, pharmaceutical compositions to be administered can contain minor amounts of non-toxic auxiliary substances, such as wetting or emulsifying agents, preservatives, and pH buffering agents and the like, for example sodium acetate or sorbitan monolaurate.
Polymorphism: A variation in the gene sequence. The polymorphisms can be those variations (DNA sequence differences) which are generally found between individuals or different ethnic groups and geographic locations which, while having a different sequence, produce functionally equivalent gene products. The term can also refer to variants in the sequence which can lead to gene products that are not functionally equivalent. Polymorphisms also encompass variations which can be classified as alleles and/or mutations which can produce gene products which may have an altered function.
Polymorphisms also encompass variations which can be classified as alleles and/or mutations which either produce no gene product or an inactive gene product or an active gene product produced at an abnormal rate or in an inappropriate tissue or in response to an inappropriate stimulus. Further, the term is also used interchangeably with allele as appropriate.
Polymorphisms can be referred to, for instance, by the nucleotide position at which the variation exists, by the change in amino acid sequence caused by the nucleotide variation, or by a change in some other characteristic of the nucleic acid molecule or protein that is linked to the variation.
Probes and Primers: Nucleic acid probes and primers can be readily prepared based on the nucleic acid molecules provided as indicators of susceptibility to in-stent restenosis or a related disease, condition or disorder. It is also appropriate to generate probes and primers based on fragments or portions of these nucleic acid molecules, particularly in order to distinguish between and among different alleles and haplotypes within a single gene. Also appropriate are probes and primers specific for the reverse complement of these sequences, as well as probes and primers to 5' or 3' regions.
A probe comprises an identifiable, isolated nucleic acid that recognizes a target nucleic acid sequence. Probes include a nucleic acid that is attached to an addressable location, a detectable label or other reporter molecule and that hybridizes to a target sequence. Typical labels include radioactive isotopes, enzyme substrates, co-factors, ligands, chemiluminescent or fluorescent agents, haptens, and enzymes. Methods for labeling and guidance in the choice of labels appropriate for various purposes are discussed, for example, in Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989 and Ausubel et al. Short Protocols in Molecular Biology, 4th ed., John Wiley & Sons, Inc., 1999.
Primers are short nucleic acid molecules, for instance DNA oligonucleotides 10 nucleotides or more in length, that hybridize to contiguous complementary nucleotides or a sequence to be amplified. Longer DNA oligonucleotides may be about 15, 20, 25, 30 or 50 nucleotides or more in length. Primers can be annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA strand, and the primer then extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid sequence, for example, by PCR or other nucleic-acid amplification methods known in the art, as described below.
Methods for preparing and using nucleic acid probes and primers are described, for example, in Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989; Ausubel et al. Short Protocols in Molecular Biology, 4th ed., John Wiley & Sons, Inc., 1999; and Innis et al. PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc., San Diego, CA, 1990. Amplification primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, © 1991, Whitehead Institute for Biomedical Research, Cambridge, MA). One of ordinary skill in the art will appreciate that the specificity of a particular probe or primer increases with its length. Thus, in order to obtain greater specificity, probes and primers can be selected that include at least 20, 25, 30, 35, 40, 45, 50 or more consecutive nucleotides of a target nucleotide sequence.
Nucleic acid molecules may be selected that comprise at least 10, 15, 20, 25, 30, 35, 40, 50, 100, 150, 200, 250, 300 or more consecutive nucleotides of any of these or other portions of a nucleic acid molecule or a specific allele thereof, such as those disclosed herein. Thus, representative nucleic acid molecules might comprise at least 10 consecutive nucleotides of a nucleic acid sequence shown in any one of the sequences discussed or described herein, and more particularly any 10 consecutive nucleotides overlapping one of the SNPs illustrated in any of these sequences. More particularly, probes and primers in some embodiments are selected so that they overlap or reside adjacent to at least one of the SNPs indicated in the Sequence Listing or one of the Tables (such as Table 5, 6, or 7).
Purified: The term purified does not require absolute purity; rather, it is intended as a relative term. Thus, for example, a purified nucleic acid preparation is one in which the specified nucleic acid is more enriched than the nucleic acid is in its generative environment, for instance within a cell or in a biochemical reaction chamber. A preparation of substantially pure nucleic acid may be purified such that the desired nucleic acid represents at least 50% of the total nucleic acid content of the preparation. In certain embodiments, a substantially pure nucleic acid will represent at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, or at least 95% or more of the total nucleic acid content of the preparation.
Recombinant: A recombinant nucleic acid is one that has a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two otherwise separated segments of sequence. This artificial combination can be accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, such as by genetic engineering techniques.
Restenosis: The reoccurrence of stenosis (an abnormal narrowing in a blood vessel or other tubular organ or structure). As used herein, this is generally restenosis of an artery, or other blood vessel, but possibly any hollow organ that has been "unblocked". If restenosis occurs at the site where a stent has been placed in an artery, it is called in-stent restenosis (ISR).
RNA: A typically linear polymer of ribonucleic acid monomers, linked by phosphodiester bonds. Naturally occurring RNA molecules fall into three classes, messenger (mRNA, which encodes proteins), ribosomal (rRNA, components of ribosomes), and transfer (tRNA, molecules responsible for transferring amino acid monomers to the ribosome during protein synthesis). Total RNA refers to a heterogeneous mixture of all three types of RNA molecules.
Sample: A sample obtained from a plant or animal subject. As used herein, biological samples include all samples useful for genetic analysis in subjects, including, but not limited to: cells, tissues, and bodily fluids, such as blood; derivatives and fractions of blood (such as serum or plasma); extracted galls; biopsied or surgically removed tissue, including tissues that are, for example, unfixed, frozen, fixed in formalin and/or embedded in paraffin; tears; milk; skin scrapes; surface washings; urine; sputum; cerebrospinal fluid; prostate fluid; pus; bone marrow aspirates; BAL; saliva; cervical swabs; vaginal swabs; and oropharyngeal wash.
Sequence identity: The similarity between two nucleic acid sequences, or two amino acid sequences, is expressed in terms of the similarity between the sequences, otherwise referred to as sequence identity. Sequence identity is frequently measured in terms of percentage identity (or similarity or homology); the higher the percentage, the more similar the two sequences are. Homologs or orthologs of nucleic acid or amino acid sequences will possess a relatively high degree of sequence identity when aligned using standard methods. This homology will be more significant when the orthologous proteins or nucleic acids are derived from species which are more closely related (such as human and chimpanzee sequences), compared to species more distantly related (such as human and C. elegans sequences). Typically, orthologs are at least 50% identical at the nucleotide level and at least 50% identical at the amino acid level when comparing human orthologous sequences. Methods of alignment of sequences for comparison are well known. Various programs and alignment algorithms are described in: Smith & Waterman, Adv. Appl. Math. 2:482, 1981; Needleman & Wunsch, J. Mol. Biol. 48:443, 1970; Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85:2444, 1988; Higgins & Sharp, Gene 73:237-44, 1988; Higgins & Sharp, CABIOS 5:151-3, 1989; Corpet et al., Nuc. Acids Res. 16:10881-90, 1988; Huang et al., Computer Appls. Biosci. 8:155-65, 1992; and Pearson et al., Meth. Mol. Bio. 24:307-31, 1994. Altschul et al., J. Mol. Biol. 215:403-10, 1990, presents a detailed consideration of sequence alignment methods and homology calculations.
The NCBI Basic Local Alignment Search Tool (BLAST) (Altschul et al., J. Mol. Biol. 215:403-10, 1990) is available from several sources, including the National Center for Biotechnology Information (NCBI, Bethesda, MD) and on the Internet, for use in connection with the sequence analysis programs blastp, blastn, blastx, tblastn and tblastx. Each of these sources also provides a description of how to determine sequence identity using this program.
Homologous sequences are typically characterized by possession of at least 60%, 70%, 75%, 80%, 90%, 95% or at least 98% sequence identity counted over the full length alignment with a sequence using the NCBI Blast 2.0, gapped blastp set to default parameters. Queries searched with the blastn program are filtered with DUST (Hancock and Armstrong, Comput. Appl. Biosci. 10:67-70, 1994). It will be appreciated that these sequence identity ranges are provided for guidance only; it is entirely possible that strongly significant homologs could be obtained that fall outside of the ranges provided. Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences, due to the degeneracy of the genetic code. It is understood that changes in nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein.
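As a minimal sketch of the percentage-identity measure discussed above, the following Python function counts identical, non-gap columns over the full length of an existing pairwise alignment. The function name, the gap character "-", and the example fragments are assumptions made for illustration; they are not taken from the Sequence Listing, and alignment programs such as BLAST may report identity over a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity counted over the full length of a pairwise alignment.

    Both inputs must already be aligned: equal length, with '-' marking gaps.
    A column counts as a match only if both residues are equal and not a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical aligned fragments, used only to exercise the function.
seq_a = "ACGTACGT-A"
seq_b = "ACGTTCGTGA"
print(percent_identity(seq_a, seq_b))
```

Dividing by the alignment length (gaps included) is one convention among several; dividing by the shorter sequence length or by the number of ungapped columns would give different percentages for the same alignment.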
An alternative indication that two nucleic acid molecules are closely related is that the two molecules hybridize to each other under stringent conditions, as described under "specific hybridization."
Single Nucleotide Polymorphism or SNP: A DNA sequence variation, occurring when a single nucleotide (adenine (A), thymine (T), cytosine (C) or guanine (G)) in the genome differs between members of the species. As used herein, the term "single nucleotide polymorphism" (or SNP) includes mutations and polymorphisms. SNPs may fall within coding sequences (CDS) of genes or between genes (intergenic regions). SNPs within a CDS change the codon, which may or may not change the amino acid in the protein sequence. The former may constitute different alleles. The latter are called silent mutations and typically occur in the third position of the codon (called the wobble position).
Specific binding agent: An agent that binds substantially only to a defined target. Thus a protein-specific binding agent binds substantially only the specified protein. By way of example, as used herein, the term "X-protein specific binding agent" includes anti-X protein antibodies (and functional fragments thereof) and other agents (such as soluble receptors) that bind substantially only to the X protein (where "X" is a specified protein, or in some embodiments a specified domain or form of a protein, such as a particular allelic form of a protein).
Anti-X protein antibodies may be produced using standard procedures described in a number of texts, including Harlow and Lane (Antibodies, A Laboratory Manual, CSHL, New York, 1988). The determination that a particular agent binds substantially only to the specified protein may readily be made by using or adapting routine procedures. One suitable in vitro assay makes use of the Western blotting procedure (described in many standard texts, including Harlow and Lane (Antibodies, A Laboratory Manual, CSHL, New York, 1988)). Western blotting may be used to determine that a given protein binding agent, such as an anti-X protein monoclonal antibody, binds substantially only to the X protein.
Shorter fragments of antibodies can also serve as specific binding agents. For instance, Fabs, Fvs, and single-chain Fvs (SCFvs) that bind to a specified protein would be specific binding agents. These antibody fragments are defined as follows: (1) Fab, the fragment which contains a monovalent antigen-binding fragment of an antibody molecule produced by digestion of whole antibody with the enzyme papain to yield an intact light chain and a portion of one heavy chain; (2) Fab', the fragment of an antibody molecule obtained by treating whole antibody with pepsin, followed by reduction, to yield an intact light chain and a portion of the heavy chain; two Fab' fragments are obtained per antibody molecule; (3) (Fab')2, the fragment of the antibody obtained by treating whole antibody with the enzyme pepsin without subsequent reduction; (4) F(ab')2, a dimer of two Fab' fragments held together by two disulfide bonds; (5) Fv, a genetically engineered fragment containing the variable region of the light chain and the variable region of the heavy chain expressed as two chains; and (6) single chain antibody ("SCA"), a genetically engineered molecule containing the variable region of the light chain, the variable region of the heavy chain, linked by a suitable polypeptide linker as a genetically fused single chain molecule. Methods of making these fragments are routine.
Subject: Living multi-cellular vertebrate organisms, a category that includes human and non-human mammals (such as veterinary subjects).
Unless otherwise explained, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. The singular terms "a," "an," and "the" include plural referents unless context clearly indicates otherwise. Similarly, the word "or" is intended to include "and" unless the context clearly indicates otherwise. Hence "comprising A or B" means including A, or B, or A and B. It is further to be understood that all base sizes or amino acid sizes, and all molecular weight or molecular mass values, given for nucleic acids or polypeptides are approximate, and are provided for description. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including explanations of terms, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
III. Overview of Several Embodiments
Described herein is the identification of sixteen candidate susceptibility loci linked to ISR. Of these regions, seven contain the genes NOV, ARNTL, TAF4B, PKP4, FLJ21986 (encoding a hypothetical protein), EPHB1 and ST18. The remaining nine susceptibility loci, located in cytobands 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 13q14.3, 15q25.1 and 18q22.3, are not associated with a gene. For PKP4, ARNTL, cytoband 2p16.1 and cytoband 2q22.3, two different significant haplotype blocks were identified. Thus, twenty haplotypes associated with an increased risk of developing restenosis are described herein.
Provided herein are methods for identifying a subject having an increased risk of developing restenosis, comprising obtaining a nucleic acid sample from the subject and determining the nucleotide present at the chromosomal positions identified herein as part of a haplotype associated with an increased risk of developing restenosis.
In one embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 159314989, 159324358, 159328041 and 159328524 of the PKP4 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 1121 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 120477232, 120478404 and 120498610 of the FLJ21986 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 112 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 120490715, 120495854, 120496829, 120504993, 120505089, 120505599, 120506127, 120506187, 120506423, 120513265, 120513339 and 120515067 of the NOV gene in the nucleic acid sample, wherein the presence of a haplotype comprising 112221212111 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 13240687, 13241857, 13250481, 13254501 and 13255095 of the ARNTL gene in the nucleic acid sample, wherein the presence of a haplotype comprising 21211 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 22175498 and 22176091 of the TAF4B gene in the nucleic acid sample, wherein the presence of a haplotype comprising 21 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 57171286, 57186783 and 57187625 of cytoband 2p16.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 211 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 145650644, 145658039 and 145682952 of cytoband 4q31.21 in the nucleic acid sample, wherein the presence of a haplotype comprising 221 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 13506254, 13506374, 13507300, 13507512, 13507965, 13513798 and 13514074 of cytoband 7p21.2 in the nucleic acid sample, wherein the presence of a haplotype comprising 1222211 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 75060532, 75060768, 75063372 and 75064286 of cytoband 1p31.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 1112 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 57113139, 57115011, 57128636 and 57129478 of cytoband 2p16.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 2211 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 22093576 and 22113252 of cytoband 2p24.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 11 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 146511704 and 146552402 of cytoband 2q22.3 in the nucleic acid sample, wherein the presence of a haplotype comprising 22 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 146613745 and 146620324 of cytoband 2q22.3 in the nucleic acid sample, wherein the presence of a haplotype comprising 11 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 159141311, 159144184, 159151183, 159152825, 159155564, 159163035, 159180909 and 159181800 of the PKP4 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 22112122 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 136151449 and 136152557 of the EPHB1 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 21 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 53402598 and 53403770 of the ST18 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 12 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 13240687, 13241857, 13250481, 13254501, 13255061 and 13255095 of the ARNTL gene in the nucleic acid sample, wherein the presence of a haplotype comprising 212111 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 53592086, 53592319 and 53598855 of cytoband 13q14.3 in the nucleic acid sample, wherein the presence of a haplotype comprising 211 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 78367393, 78368329, 78370985 and 78371562 of cytoband 15q25.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 1221 is associated with an increased risk of developing restenosis.
In another embodiment, the method comprises obtaining a nucleic acid sample from the subject; and determining the nucleotide present at chromosomal positions 68396720, 68396879, 68396951 and 68397041 of cytoband 18q22.3 in the nucleic acid sample, wherein the presence of a haplotype comprising 1111 is associated with an increased risk of developing restenosis.
The nucleotide present at a particular chromosomal position can be determined using any one of a number of methods described herein and/or well known in the art, such as by PCR, in situ hybridization, Southern blotting, allele-specific hybridization or using an array.
The nucleic acid sample can be obtained from any one of a number of sources, including, but not limited to cells, tissues, bodily fluid, blood, blood derivatives, blood fractions (such as serum or plasma), extracted galls, biopsied or surgically removed tissue, tears, milk, skin scrapes, surface washings, urine, sputum, cerebrospinal fluid, prostate fluid, pus, bone marrow aspirates, BAL, saliva, cervical swabs, vaginal swabs and oropharyngeal wash. In one embodiment, the nucleic acid sample is obtained from a bodily fluid of the subject, such as blood or a blood fraction. In another embodiment, the nucleic acid sample is obtained from cells or tissue of the subject.
Further provided is a method comprising determining nucleotide(s) present at the chromosomal positions disclosed herein in two or more of the group consisting of the PKP4 gene, the FLJ21986 gene, the NOV gene, the ARNTL gene, the TAF4B gene, the EPHB1 gene, the ST18 gene, cytoband 2p16.1, cytoband 4q31.21, cytoband 7p21.2, cytoband 1p31.1, cytoband 2p24.1, cytoband 2q22.3, cytoband 13q14.3, cytoband 15q25.1 and cytoband 18q22.3. The nucleotide sequences for haplotypes defined using "1" and "2" or "A" and "B" can be precisely ascertained by cross-referencing the haplotype data to NCBI genomic sequence data and HapMap haplotype data. This process requires manual review of SNPs on the forward and reverse DNA strands. Haplotypes described herein can be definitely linked to a precise nucleotide sequence according to previously described methods (see, for example, Cargill et al. Am. J. Hum. Genet. 80(2):273-90, 2007).
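The haplotype definitions in this section can be read as lookups from an ordered set of chromosomal positions to a string of "1"/"2" allele codes. The Python sketch below illustrates that reading for three of the haplotypes listed above. The dictionary layout, the function names, the "(first block)" label and the subject data are hypothetical. The sketch also assumes that phased allele codes are already available for each chromosome copy and that positions are unique across the loci shown; resolving "1" and "2" to actual nucleotides still requires the cross-referencing described above.

```python
# Risk haplotypes copied from three of the embodiments above; the dictionary
# layout and the "(first block)" label are illustrative assumptions.
RISK_HAPLOTYPES = {
    "PKP4 (first block)": ((159314989, 159324358, 159328041, 159328524), "1121"),
    "TAF4B": ((22175498, 22176091), "21"),
    "cytoband 18q22.3": ((68396720, 68396879, 68396951, 68397041), "1111"),
}


def haplotype_string(positions, allele_codes):
    """Concatenate the allele codes at the given positions ('?' if untyped)."""
    return "".join(allele_codes.get(pos, "?") for pos in positions)


def loci_with_risk_haplotype(chromosome_copies):
    """Return loci where at least one phased chromosome copy carries the
    risk haplotype. Each copy maps chromosomal position -> '1' or '2'."""
    hits = []
    for locus, (positions, risk) in RISK_HAPLOTYPES.items():
        if any(
            haplotype_string(positions, chrom_copy) == risk
            for chrom_copy in chromosome_copies
        ):
            hits.append(locus)
    return hits


# Hypothetical phased genotype calls for one subject (two chromosome copies).
subject = [
    {159314989: "1", 159324358: "1", 159328041: "2", 159328524: "1",
     22175498: "1", 22176091: "1"},
    {159314989: "2", 159324358: "1", 159328041: "2", 159328524: "1",
     22175498: "2", 22176091: "1"},
]
print(loci_with_risk_haplotype(subject))
```

Untyped positions are rendered as "?" so that an incomplete genotype can never match a risk haplotype by accident; a production implementation would also key positions by chromosome.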
IV. Genomics of In-Stent Restenosis
Many vascular diseases are characterized by blood vessel blockages due to inflammation and growth of cells within the blood vessels. One such vascular disease is "in-stent restenosis" (ISR), which occurs in patients following placement of a stent within a coronary artery. ISR occurs in as many as 20-50% of patients receiving stents. In spite of its high incidence, the cause of ISR is not known.
It is proposed that some patients have a genetic susceptibility to develop in-stent restenosis, for instance by having abnormalities or peculiarities of inflammatory or growth regulatory genes and proteins. To investigate this hypothesis, a comprehensive molecular analysis of patients undergoing bare metal stent (BMS) implantation to treat coronary artery blockages has been conducted. The study is called the CardioGene Study. The rationale and study design of the CardioGene Study were published in October, 2004 (Ganesh et al, Pharmacogenomics 5(7):949-1004, 2004). The overall goal of the study is to understand genetic determinants of the responses to vascular injury that result in development of restenosis in some patients but not in others.
Briefly, 358 patients undergoing a stenting procedure at William Beaumont Hospital and the Mayo Clinic were prospectively enrolled in the study. Clinical data and blood samples were collected (1) prior to stent implantation, (2) two weeks after stent placement and (3) six months after stent placement. After twelve months, patients were triaged into two groups based on their clinical outcome: in-stent restenosis versus no in-stent restenosis. Comprehensive molecular analyses of genotype, gene expression and protein expression were conducted using genome-wide scans, microarray analysis of gene expression and proteomics. A corollary case-control analysis was also carried out involving an additional 104 cases with a history of restenosis after BMS treatment.
Using Affymetrix 100K SNP chips (which contain 116,204 single-nucleotide polymorphisms), SNPs associated with restenosis were identified, and regions with more than one SNP within 250 kb (or 1 Mb) were selected for further analysis. Haplotypes within these blocks were determined and tested for association between case and control haplotypes of restenosis. Those haplotypes with a population frequency of greater than 10% and p-value for association <0.05 were analyzed further, and ranked by population frequency. Eight regions were found to be significant, five of which contained genes: NOV, ARNTL, PKP4, TAF4B and FLJ21986. In brief outline, the haplotype analysis of SNP data was as follows:
i. Select SNPs with p<0.001 in genome-wide univariate testing (255 SNPs);
ii. SNPs in clusters (within 250 kb of one another) denoted regions for further haplotype analysis (22 physical blocks);
iii. Haplotypes were defined in these regions, filtered for population frequency > 10% and tested for association with restenosis outcome; and
iv. Eight regions were identified that contained haplotypes with significant association with restenosis (Bonferroni correction for 84 haplotypes), five of which contain the genes NOV, ARNTL, PKP4, TAF4B and FLJ21986.
In addition to the analysis described above, a secondary analysis was conducted using the Bayesian Robust Linear Modeling using Mahalanobis distance (BRLMM) algorithm (Rabbee and Speed, Bioinformatics 22(1):7-12, 2006) as described in the Example below. Using this analysis, eight additional regions were identified that contained haplotypes with significant association with restenosis. The PKP4 gene and ARNTL gene were again identified using this method; however, a different haplotype block was identified as having significant association with ISR. The BRLMM algorithm further identified the genes EPHB1 and ST18 as having significant association with ISR.
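The first analysis outlined above (a univariate p-value filter, clustering of nearby SNPs into regions, and a Bonferroni-corrected threshold for haplotype tests) can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's actual pipeline: the function names and the example p-values are hypothetical, and "within 250 kb of one another" is interpreted here as chaining consecutive SNPs on the same chromosome whose spacing does not exceed 250 kb.

```python
def select_regions(snp_pvalues, p_cutoff=0.001, max_gap=250_000):
    """Filter SNPs by univariate p-value, then group the survivors into
    regions of two or more SNPs lying within max_gap of their neighbor.

    snp_pvalues: iterable of (chromosome, position, p_value) tuples.
    Returns a list of regions, each a list of (chromosome, position).
    """
    kept = sorted((c, pos) for c, pos, p in snp_pvalues if p < p_cutoff)
    regions, current = [], []
    for chrom, pos in kept:
        # Start a new group on a chromosome change or a gap that is too large.
        if current and (chrom != current[-1][0] or pos - current[-1][1] > max_gap):
            if len(current) > 1:
                regions.append(current)
            current = []
        current.append((chrom, pos))
    if len(current) > 1:
        regions.append(current)
    return regions


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def haplotype_passes(frequency, p_value, alpha=0.05, n_tests=84):
    """Apply the frequency filter (> 10%) and the corrected p-value cutoff."""
    return frequency > 0.10 and p_value < bonferroni_threshold(alpha, n_tests)


# Hypothetical univariate results; positions reuse values from the text,
# but the p-values are invented for illustration only.
example = [
    ("2", 159314989, 0.0004),
    ("2", 159324358, 0.0002),
    ("2", 160000000, 0.0001),   # too far from its neighbors: dropped
    ("8", 120490715, 0.0009),   # only SNP left on its chromosome: dropped
    ("8", 120495854, 0.0100),   # fails the p < 0.001 filter
]
print(select_regions(example))
print(bonferroni_threshold(0.05, 84))
```

With 84 haplotypes tested at an overall alpha of 0.05, the corrected per-haplotype threshold is 0.05/84, roughly 0.0006.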
Biomarkers identified in the CardioGene Study (including for instance the SNPs, haplotypes, linked genes, and other linked loci) can be used in various ways, including, but not limited to, as diagnostic or predictive indicators to identify patients that are more or less likely to develop in-stent restenosis; as targets for therapeutic agents to reduce or prevent in-stent restenosis; and in the development of drug-eluting stents, based on compounds developed in light of the newly identified targets. Specific applications include methods for predicting in-stent restenosis that employ a specific SNP or other sequence variation, or a specific collection of such sequences (a haplotype), such as those described herein; and screening methods for identifying compounds that influence restenosis, based on an identified biomarker.
A. Genes with SNPs/Haplotypes linked to ISR
NOV: NOV (SEQ ID NO: 15; also known as CCN3 and nephroblastoma overexpressed gene; NCBI Accession No. AY082381.1; Gene Map Locus 8q24.1; molecular weight 39,164 Da) is a member of the CCN family of genes, which encode cysteine-rich, secreted proteins associated with the extracellular matrix (ECM) and function as regulatory proteins. In this family of proteins, CCN1 and CCN2 have been extensively described and are known to support cell adhesion, stimulate adhesive signaling and induce focal adhesion complexes, with important contributions to vascular homeostasis (Mo et al. Mol. Cell Biol. 22(24):8709-8720, 2002; Ivkovic et al. Development 130(12):2779-2791, 2003). NOV is highly expressed in vascular smooth muscle cells of adult arterial media. Alterations in expression patterns have been defined in a rat model of vascular injury (Ellis et al. Arterioscler. Thromb. Vasc. Biol. 20(8):1912-1919, 2000).
ARNTL: ARNTL (SEQ ID NO: 13; also known as MOP3, Bmal-1; NCBI Accession No. AF044288; Gene Map Locus 11p15; molecular weight 68,766 Da) is a basic helix-loop-helix protein that forms a heterodimer with CLOCK. ARNTL has primarily been described in its role as a regulator of circadian rhythms in mammals, with deletion of this gene in mice resulting in a loss of circadian rhythmicity and decreased locomotor activity (Bunger et al. Cell 103(7):1009-1017, 2000). The decreased activity observed as these mice age is due to a progressive and noninflammatory arthropathy, characterized by ossification of ligaments and tendons around the joints (Bunger et al. Genesis 41(3):122-132, 2005), reminiscent of pathophysiologic alterations that occur in the vascular wall with atherosclerosis. Adding further to the possible vascular role of ARNTL is the finding that ARNTL has 29% sequence homology to its namesake ARNT, which is also known as hypoxia inducible factor-1β (HIF-1β). Hypoxia inducible factors have several well-defined roles in angiogenesis and vascular remodeling processes.
PKP4: PKP4 (SEQ ID NO: 16; also known as p0071 and plakophilin 4; NCBI Accession No. BC048013.1; Gene Map Locus 2q23-q3; molecular weight 1134.3 kDa) is a member of the armadillo gene family and functions as part of the junctional plaque, which serves to cluster cadherins that in turn mediate cell to cell contact (Calkins et al. J. Biol. Chem. 278(3):1774-1783, 2003; Setzer et al. J. Invest. Dermatol. 123(3):426-433, 2004). Cadherins have been described extensively in vascular homeostasis, atherogenesis and vascular remodeling (Tzima et al. Nature 437(7057):426-431, 2005; Lambeng et al. Circ. Res. 96(3):384-391, 2005; Carmeliet et al. Cell 98(2):147-157, 1999; Freiman et al. Science 293(5537):2084-2087, 2001). As disclosed herein, this gene shows increased expression in atherosclerosis (see Figure 2).
FLJ21986: FLJ21986 (SEQ ID NO: 14) is annotated as a hypothetical protein. No prior functional data have been reported. The disclosure provided herein of vascular expression of FLJ21986 in human coronary arteries is the first specific report of expression in human tissues.
TAF4B: TAF4B (SEQ ID NO: 17; also known as TAFII105; NCBI Accession No. Y09321.1; Gene Map Locus 7q31.32; predicted molecular weight 117,595 Da) is a TATA box binding protein (TBP)-associated factor. TAF4B is a cell type-specific subunit of the transcription factor TFIID that is known to be expressed in gonadal tissues and B cells. It is thought to mediate transcription of a subset of genes required for folliculogenesis in the ovary, as well as being involved in the regulation of spermatogonial stem cell specification and proliferation (Falender et al. Genes Dev. 19(7):794-803, 2005). Interactions with NF-κB and inhibin-activin pathways, involved in TGF-beta signaling, have been described (Mengus et al. Embo J. 24(15):2753-2767, 2005; Rieder et al. N. Engl. J. Med. 352(22):2285-2293, 2005).
EPHB1: EPHB1 (SEQ ID NO: 60; also known as ephrin receptor type B1, tyrosine-protein kinase receptor EPH-2, ELK and HEK6; NCBI Accession No. NM_004441, deposited May 9, 1999; Gene Map Locus 3q22.1; predicted molecular weight 109,885 Da) is a type I membrane protein. EPHB1 functions as the receptor for members of the ephrin-B family (ephrin-B1, -B2 and -B3). It is thought to be involved in cell-cell interactions in the nervous system. The ligand-activated form interacts with GRB2, GRB10 and NCK through their respective SH2 domains.
ST18: ST18 (SEQ ID NO: 61; also known as suppression of tumorigenicity protein 18, zinc finger protein 387 and KIAA0535; NCBI Accession No. AB011107, deposited April 10, 1998; Gene Map Locus 8q11.23; predicted molecular weight 115,155 Da) is a breast cancer tumor suppressor gene encoding a zinc-finger DNA-binding protein with six fingers of the C2HC type.
B. Clinical Samples
Appropriate samples for use with the current disclosure in determining a subject's genetic predisposition to in-stent restenosis (ISR) include any conventional clinical samples, including, but not limited to, blood or blood-fractions (such as serum or plasma), mouthwashes or buccal scrapes, chorionic villus biopsy samples, semen, Guthrie cards, eye fluid, sputum, lymph fluid, urine and tissue. Most simply, blood can be drawn and DNA (or RNA) extracted from the cells of the blood. Alteration of a wild-type ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 allele, whether, for example, by point mutation or deletion, can be detected by any of the means discussed herein or well known in the art.
Techniques for acquisition of such samples are well known in the art (for example, see Schluger et al., J. Exp. Med. 176:1327-33, 1992, for the collection of serum samples). Serum or other blood fractions can be prepared in the conventional manner. For example, about 200 μL of serum can be used for the extraction of DNA for use in amplification reactions.
Once a sample has been obtained, the sample can be used directly, concentrated (for example by centrifugation or filtration), purified, or any combination thereof, and an amplification reaction optionally performed. For example, rapid DNA preparation can be performed using a commercially available kit (such as the InstaGene Matrix, BioRad, Hercules, CA; the NucliSens isolation kit, Organon Teknika, Netherlands). In one example, the DNA preparation method yields a nucleotide preparation that is accessible to, and amenable to, nucleic acid amplification.
C. Isolation of Nucleic Acid(s)
The variant elements (including SNPs and haplotypes) described herein are useful as markers, for instance to identify genetic material as being derived from a particular individual or in making assessments regarding the propensity of an individual to develop a particular disorder or condition, the ability of an individual to respond to a certain course of treatment, or in other diagnostic, prognostic and other methods described in more detail herein.
Genetic material (nucleic acids such as genomic DNA, RNA, and cDNA) suitable for use in such methods can be generated or derived from a variety of sources. For example, nucleic acid molecules (preferably genomic DNA) can be isolated from a cell from a living or deceased subject using methods well known in the art. Cells can be obtained from biological samples, for instance from tissue samples or from bodily fluid samples that include cells (such as blood, urine, semen, exudates or saliva). Detection methods of the disclosure can be used to detect variant elements in DNA in a biological sample in intact cells (for instance, using in situ hybridization) or in extracted DNA (for instance, using Southern blot hybridization).
D. Amplification of Nucleic Acid Molecules
The nucleic acid samples obtained from the subject may be amplified from the clinical sample prior to detection. In one embodiment, DNA sequences are amplified. In another embodiment, RNA sequences are amplified. Any nucleic acid amplification method can be used. In one specific, non-limiting example, polymerase chain reaction (PCR) is used to amplify the nucleic acid sequences associated with ISR. Other exemplary methods include, but are not limited to, RT-PCR and transcription-mediated amplification (TMA), cloning, polymerase chain reaction of specific alleles (PASA), ligase chain reaction and nested polymerase chain reaction.
A pair of primers can be utilized in the amplification reaction. One or both of the primers can be labeled, for example with a detectable radiolabel, fluorophore, or biotin molecule. The pair of primers may include an upstream primer (which binds 5' to the downstream primer) and a downstream primer (which binds 3' to the upstream primer). The pair of primers used in the amplification reaction can be selective primers which permit amplification of a nucleic acid involved in ISR.
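By way of a non-limiting illustration, the relationship between an upstream primer, a downstream primer and the amplified product can be sketched in silico as follows. The template and primer sequences are hypothetical and are not taken from any SEQ ID NO of this disclosure; the function names are likewise assumptions made for the example, and exact-match annealing is assumed for simplicity.

```python
# Hypothetical sequences for illustration only; not from the disclosure.
TEMPLATE = "GGATCCATGCGTACGTTAGCCTAGGCTAACGGATTCAAGCTT"
UPSTREAM_PRIMER = "ATGCGTACG"    # identical to the top strand at its binding site
DOWNSTREAM_PRIMER = "CTTGAATCC"  # complementary to the top strand, written 5'->3'


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def predict_amplicon(template: str, upstream: str, downstream: str):
    """Return the top-strand region bounded by the two primers.

    Returns None if either primer has no exact binding site, or if the
    downstream site does not lie 3' of the upstream site.
    """
    template = template.upper()
    start = template.find(upstream.upper())
    if start == -1:
        return None
    # The downstream primer anneals to the top strand, so the top strand
    # contains the reverse complement of the primer sequence.
    site = reverse_complement(downstream)
    end = template.find(site, start + 1)
    if end == -1:
        return None
    return template[start:end + len(site)]


amplicon = predict_amplicon(TEMPLATE, UPSTREAM_PRIMER, DOWNSTREAM_PRIMER)
print(amplicon, len(amplicon))
```

A real primer design would additionally consider melting temperature, mismatches and secondary structure, none of which is modeled in this sketch.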
An additional pair of primers can be included in the amplification reaction as an internal control. For example, these primers can be used to amplify a "housekeeping" nucleic acid molecule and serve to provide confirmation of appropriate amplification. In another example, a target nucleic acid molecule including primer hybridization sites can be constructed and included in the amplification reaction. One of skill in the art will readily be able to identify primer pairs to serve as internal control primers.
Amplification products may be assayed in a variety of ways, including size analysis, restriction digestion followed by size analysis, detecting specific tagged oligonucleotide primers in the reaction products, allele-specific oligonucleotide (ASO) hybridization, sequencing, hybridization, and the like.
PCR-based detection assays include multiplex amplification of a plurality of polymorphisms simultaneously. For example, it is well known in the art to select PCR primers to generate PCR products that do not overlap in size and can be analyzed simultaneously. Alternatively, it is possible to amplify different polymorphisms with primers that are differentially labeled and thus can each be detected. Other techniques are known in the art to allow multiplex analyses of a plurality of polymorphisms. A fragment of a gene may be amplified to produce copies and it may be determined whether copies of the fragment contain the particular protective polymorphism or genotype.
E. Detecting Single Nucleotide Alterations
Single nucleotide alterations, whether categorized as SNPs or new mutations, can be detected by a variety of techniques in addition to merely sequencing the target sequence. Constitutional single nucleotide alterations can arise either from new germline mutations, or can be inherited from a parent who possesses a SNP or mutation in their own germline DNA. The techniques used in evaluating either somatic or germline single nucleotide alterations include hybridization using allele-specific oligonucleotides (ASOs) (Wallace et al., CSHL Symp. Quant. Biol. 57:257-261, 1986; Stoneking et al., Am. J. Hum. Genet. 48:370-382, 1991); direct DNA sequencing (Church and Gilbert, Proc. Natl. Acad. Sci. USA 57:1991-1995, 1988); the use of restriction enzymes (Flavell et al., Cell 75:25, 1978; Geever et al., 1981); discrimination on the basis of electrophoretic mobility in gels with denaturing reagent (Myers and Maniatis, Cold Spring Harbor Symp. Quant. Biol. 57:275-284, 1986); RNase protection (Myers et al., Science 230:1242, 1985); chemical cleavage (Cotton et al., Proc. Natl. Acad. Sci. USA 55:4397-4401, 1985); and ligase-mediated detection (Landegren et al., Science 241:1077, 1988).
Allele-specific oligonucleotide hybridization (ASOH) involves hybridization of probes to the sequence, stringent washing and signal detection. Other new methods include techniques that incorporate more robust scoring of hybridization. Examples of these procedures include the ligation chain reaction (ASOH plus selective ligation and amplification), as disclosed in Wu and Wallace (Genomics 4:560-569, 1989); mini-sequencing (ASOH plus a single base extension) as discussed in Syvanen (Meth. Mol. Biol. 95:291-298, 1998); and the use of DNA chips (miniaturized ASOH with multiple oligonucleotide arrays) as disclosed in Lipshutz et al. (BioTechniques 19:442-447, 1995). Alternatively, ASOH with single- or dual-labeled probes can be merged with PCR, as in the 5'-exonuclease assay (Heid et al., Genome Res. 6:986-994, 1996), or with molecular beacons (as in Tyagi and Kramer, Nat. Biotechnol. 14:303-308, 1996).
Another technique is dynamic allele-specific hybridization (DASH), which involves dynamic heating and coincident monitoring of DNA denaturation, as disclosed by Howell et al. (Nat. Biotech. 17:87-88, 1999). A target sequence is amplified by PCR in which one primer is biotinylated. The biotinylated product strand is bound to a streptavidin-coated microtiter plate well, and the non-biotinylated strand is rinsed away with alkali wash solution. An oligonucleotide probe specific for one allele is hybridized to the target at low temperature. This probe forms a duplex DNA region that interacts with a double strand-specific intercalating dye. When subsequently excited, the dye emits fluorescence proportional to the amount of double-stranded DNA (probe-target duplex) present. The sample is then steadily heated while fluorescence is continually monitored. A rapid fall in fluorescence indicates the denaturing temperature of the probe-target duplex. Using this technique, a single-base mismatch between the probe and target results in a significant lowering of melting temperature (Tm), which can be readily detected.
Oligonucleotides specific to normal or allelic sequences can be chemically synthesized using commercially available machines. These oligonucleotides can then be labeled radioactively with isotopes (such as 32P) or non-radioactively, with tags such as biotin (Ward and Langer et al., Proc. Natl. Acad. Sci. USA 78:6633-6657, 1981), and hybridized to individual DNA samples immobilized on membranes or other solid supports by dot-blot or transfer from gels after electrophoresis. These specific sequences are visualized by methods such as autoradiography, fluorometric reactions (Landegren et al., Science 242:229-237, 1989) or colorimetric reactions (Gebeyehu et al., Nucleic Acids Res. 15:4513-4534, 1987). Using an ASO specific for a normal allele, the absence of hybridization would indicate a mutation in the particular region of the gene, or a deleted gene. In contrast, if an ASO specific for a mutant allele hybridizes to a sample, that would indicate the presence of a mutation in the region defined by the ASO.
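Merely by way of example, the interpretation of paired ASO hybridization results described above can be expressed as a small decision rule. The sketch below is an assumption-laden illustration: the normalized signal scale, the 0.5 threshold and the function name are hypothetical and would need to be calibrated for any actual assay.

```python
def call_genotype(normal_signal: float, variant_signal: float,
                  threshold: float = 0.5) -> str:
    """Interpret hybridization signals from a normal-allele ASO and a
    variant-allele ASO applied to the same sample.

    Signals are assumed to be normalized to the range 0..1; the threshold
    separating "hybridized" from "not hybridized" is a hypothetical value.
    """
    normal_bound = normal_signal >= threshold
    variant_bound = variant_signal >= threshold
    if normal_bound and variant_bound:
        return "heterozygous"
    if normal_bound:
        return "homozygous reference"
    if variant_bound:
        return "homozygous variant"
    # Neither probe hybridized: consistent with a deletion, a different
    # variant in the probed region, or a failed assay.
    return "no call"


print(call_genotype(0.9, 0.1))
print(call_genotype(0.8, 0.7))
print(call_genotype(0.05, 0.9))
```

Using both probes together, rather than the normal-allele probe alone, is what allows homozygous and heterozygous subjects to be distinguished.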
A variety of other techniques can be used to detect the mutations or other variations in DNA. Merely by way of example, see U.S. Patents No. 4,666,828; 4,801,531; 5,110,920; 5,268,267; 5,387,506; 5,691,153; 5,698,339; 5,736,330; 5,834,200; 5,922,542; and 5,998,137 for such methods. Additional methods include fluorescence polarization methods (see, for example, Kwok, Hum. Mutat., 19(4):315-23, 2002); microbead methods (such as those described by Oliphant et al. Biotechniques 2002:56-58, 60-61, 2001; and Shen et al., Genet. Eng. News, 23(6), 2003); and mass spectrometry methods (for example, see Jurinke et al., Methods Mol. Biol. 187:179-92, 2002; Amexis et al. Proc. Natl. Acad. Sci. U.S.A. 98(21):12097-102, 2001; Jurinke et al. Adv. Biochem. Eng. Biotechnol. 77:57-74, 2002; Storm et al., Meth. Mol. Biol., 212:241-262, 2002; Rodi et al., BioTechniques, 32:S62-S69, 2002; U.S. Patent No. 6,300,076; and WO 9820166).
F. Differentiation of Individuals Homozygous versus Heterozygous for SNP(s)
Since it is believed that the haplotype of ARNTL, NOV, PKP4, TAF4B or FLJ21986 can influence the ISR susceptibility of a subject, it may be beneficial to determine whether a subject is homozygous or heterozygous for SNPs within ARNTL, NOV, PKP4, TAF4B or FLJ21986.
By way of example, the oligonucleotide ligation assay (OLA), as described by Nickerson et al. (Proc. Natl. Acad. Sci. U.S.A. 87:8923-8927, 1990), allows the differentiation between individuals who are homozygous versus heterozygous for alleles indicated herein.
G. Additional Characterization of ISR Susceptibility SNPs
In order to more fully understand the ISR susceptibility resulting from SNPs described herein, and particularly to determine which are causative SNPs and how they influence or alter the activity or expression of specific molecules, additional characterization can be carried out. The following material describes representative methods useful for such characterization.
Sections of non-coding nucleic acid identified herein, particularly those identified herein as including a variant (for instance, one of the SNPs listed in Tables 5, 6 or 7) can be tested for functionality or changes in functionality between two or more alleles. For example, segments of DNA can be amplified separately from individuals homozygous for risk alleles and from individuals homozygous for non-risk alleles. Each segment is cloned upstream of a reporter gene (such as luciferase), the resulting constructs are transfected into various cell lines, such as endothelial cells or vascular smooth muscle cells, and the relative amount of luciferase reporter expression is compared. If there is a significant difference in luciferase expression between the constructs, this indicates that the SNP(s) in that segment likely affect expression of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986, or another linked or associated gene.
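As a non-limiting sketch of how such a reporter comparison might be summarized, the following computes a fold change and a Welch t statistic between replicate measurements for the two constructs. The replicate values are invented for illustration, the measurements are assumed to be already normalized (for instance to a co-transfected control reporter), and Welch's t statistic is one possible choice of test rather than a method specified in this disclosure.

```python
from math import sqrt
from statistics import mean, variance


def compare_reporter_activity(risk_allele, non_risk_allele):
    """Compare normalized reporter readings from two constructs.

    Returns (fold_change, t_statistic), where fold_change is the ratio of
    mean activities (risk / non-risk) and t_statistic is Welch's t, which
    does not assume equal variances in the two groups.
    """
    mean_risk = mean(risk_allele)
    mean_non_risk = mean(non_risk_allele)
    # Standard error of the difference between the two means.
    standard_error = sqrt(
        variance(risk_allele) / len(risk_allele)
        + variance(non_risk_allele) / len(non_risk_allele)
    )
    fold_change = mean_risk / mean_non_risk
    t_statistic = (mean_risk - mean_non_risk) / standard_error
    return fold_change, t_statistic


# Hypothetical triplicate readings (arbitrary normalized units).
risk_construct = [2.0, 2.2, 1.8]
non_risk_construct = [1.0, 1.1, 0.9]
fold, t_stat = compare_reporter_activity(risk_construct, non_risk_construct)
print(f"fold change = {fold:.2f}, Welch t = {t_stat:.2f}")
```

In practice the t statistic would be converted to a p-value using the appropriate degrees of freedom, and the experiment would be repeated across independent transfections and cell lines.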
Additional possible susceptibility SNPs in the region defined herein also can be identified. By way of example, this can be done by surveying public databases of SNPs, and by sequencing DNA from subjects developing ISR and from controls. These SNPs can then be tested for evidence of association with ISR by genotyping cases and controls, for instance using methods like those described herein. SNPs that show the strongest evidence for association may be better candidates for the causative SNP. This genotype data can also be used to test haplotypes for evidence of association with disease, to help determine whether as yet unidentified SNPs may be more strongly associated.
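For illustration only, a simple allelic case-control association test of the kind referred to above can be sketched as a 2x2 chi-square test with an odds ratio. The allele counts are hypothetical, no continuity correction or adjustment for multiple testing is applied, and the function name is an assumption made for this example.

```python
from math import erfc, sqrt


def allelic_association(case_risk, case_other, control_risk, control_other):
    """Test a 2x2 table of allele counts for association with case status.

    Returns (odds_ratio, chi_square, p_value). The chi-square statistic has
    one degree of freedom, for which the p-value equals erfc(sqrt(chi2 / 2)).
    """
    observed = [[case_risk, case_other], [control_risk, control_other]]
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    total = sum(row_totals)

    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * column_totals[j] / total
            chi_square += (observed[i][j] - expected) ** 2 / expected

    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    p_value = erfc(sqrt(chi_square / 2))
    return odds_ratio, chi_square, p_value


# Hypothetical allele counts: 100 case alleles and 100 control alleles.
odds_ratio, chi_square, p_value = allelic_association(60, 40, 40, 60)
print(f"OR = {odds_ratio:.2f}, chi2 = {chi_square:.2f}, p = {p_value:.4f}")
```

Ranking candidate SNPs by such a statistic is one way to identify those showing the strongest evidence for association; haplotype-based tests would require phased genotype data and are not covered by this sketch.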
The findings reported herein can be further strengthened by collecting and testing additional case-control samples for evidence of association of the identified SNPs and haplotype with ISR. In addition, the locations of all the identified SNPs can be compared to segments of DNA conserved across species, because SNPs located in these segments are believed to be more likely to affect gene expression or function.
It is also advantageous to determine whether SNPs found to be linked to susceptibility to ISR affect the ability of protein(s) to bind to the surrounding segment of DNA. Methods for determining binding are well known in the art, including, but not limited to, methods described herein.
V. Representative Uses of SNPs and Haplotypes
The variants (including individual SNPs and haplotypes) described herein are useful as markers or indicators in a variety of different methods. They can be used, for instance, as diagnostic or predictive indicators to identify patients that are more or less likely to suffer restenosis or in-stent restenosis; as targets for therapeutic agents to reduce or prevent in-stent restenosis; in the development of drug-eluting stents based on compounds developed in light of the targets; and in monitoring clinical trials for the purposes of predicting outcomes of developing or ongoing therapeutic or treatment regimens. The results of such methods (such as diagnostic and predictive methods) can be used to develop or recommend a course of prophylactic treatment for an individual who is identified as having a specific SNP or combination of SNPs (or a haplotype), to prescribe or develop a course of therapy after identification that a subject has or suffers from a disease or disorder, or to alter or adapt an ongoing therapeutic regimen. The SNPs and/or haplotypes may also be used in risk stratification of patients. For example, patients may be made aware of their chance for developing ISR based on whether or not they have one or more risk SNPs or haplotypes. In addition, further research can be carried out to determine what additional steps might be taken to help in preventing the development of ISR in those patients at increased risk.
Certain embodiments therefore include diagnostic methods for detecting one or more SNPs or a haplotype in a biological sample, to thereby determine whether a subject is at risk of developing a disorder or disease or condition linked to one or more of the SNPs or the haplotypes described herein, or whether the subject is afflicted with the disease, condition or disorder.
The subject methods also can be used to determine whether a subject is at risk for passing on the susceptibility to develop a disease, condition or disorder to their offspring.
Also provided are prognostic, predictive methods for determining whether a subject is at risk of developing a disease, condition or disorder that affects endothelial cell or vascular smooth muscle cell proliferation or migration, including for instance restenosis, such as ISR. For example, SNP sequences or haplotypes can be assayed in a biological sample from a subject. Such assays can be used for prognostic, diagnostic, or predictive purposes to prophylactically or therapeutically treat an individual prior to or after the onset of a disorder, disease or condition (such as ISR) associated with one or more of the SNPs/haplotypes described herein, specifically those located at cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 and 18q22.3, such as those discussed herein.
The nucleotide variants (including individual SNPs and haplotypes) provided herein also can be used for generating polynucleotide reagents. Methods are also provided for identifying or screening for compounds useful for treating or influencing or preventing a disease, disorder or condition associated with a SNP or haplotype located at cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 and 18q22.3, such as those discussed herein.
VI. Expression of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 or Other Protein Variant Polypeptides, or a Reporter Polypeptide under Control of a Variant Regulatory Sequence
The expression and purification of proteins, such as an ARNTL, NOV, PKP4, TAF4B or FLJ21986 variant protein, can be performed using standard laboratory techniques, though the techniques are preferably adapted for expression of the ARNTL, NOV, PKP4, TAF4B or FLJ21986 protein. Examples of such method adaptations are discussed or referenced herein. After expression, purified protein may be used for functional analyses, antibody production, diagnostics, and patient therapy. Studies such as these will improve understanding of the process of restenosis, which can then be translated into measures that improve patient care. Furthermore, the DNA sequences of the ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 variant cDNAs and regulatory regions, or gene or expressed sequence tag (EST) sequences contained within the genomic region described herein, can be manipulated in studies to understand the expression of the gene and the function of its product. Variant or allelic forms of a human ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 gene, including regulatory regions upstream or downstream of the encoding sequence, may be isolated based upon information contained herein, and may be studied in order to detect alterations in expression patterns in terms of relative quantities, tissue specificity and functional properties of the encoded ARNTL, NOV, PKP4, TAF4B or FLJ21986 variant protein (such as influence on endothelial cell or vascular smooth muscle cell proliferation or migration).
Partial or full-length cDNA sequences, which encode the subject protein, may be ligated into bacterial expression vectors. Methods for expressing large amounts of protein from a cloned gene introduced into Escherichia coli (E. coli) or more preferably baculovirus/Sf9 cells may be utilized for the purification, localization and functional analysis of proteins. For example, fusion proteins consisting of amino terminal peptides encoded by a portion of a gene native to the cell in which the protein is expressed (for example, an E. coli lacZ or trpE gene for bacterial expression) linked to a variant protein may be used to prepare polyclonal and monoclonal antibodies against these proteins. Thereafter, these antibodies may be used to purify proteins by immunoaffinity chromatography, in diagnostic assays to quantitate the levels of protein and to localize proteins in tissues and individual cells by immunofluorescence.
Intact native protein may also be produced in large amounts for functional studies. Methods and plasmid vectors for producing fusion proteins and intact native proteins in culture are well known in the art, and specific methods are described in Sambrook et al. (In Molecular Cloning: A Laboratory Manual, Ch. 17, CSHL, New York, 1989). Such fusion proteins may be made in large amounts, are easy to purify, and can be used to elicit antibody response. Native proteins can be produced in bacteria by placing a strong, regulated promoter and an efficient ribosome-binding site upstream of the cloned gene. If low levels of protein are produced, additional steps may be taken to increase protein production; if high levels of protein are produced, purification is relatively easy. Suitable methods are presented in Sambrook et al. (In Molecular Cloning: A Laboratory Manual, CSHL, New York, 1989) and are well known in the art. Often, proteins expressed at high levels are found in insoluble inclusion bodies. Methods for extracting proteins from these aggregates are described by Sambrook et al. (In Molecular Cloning: A Laboratory Manual, Ch. 17, CSHL, New York, 1989). Vector systems suitable for the expression of lacZ fusion genes include the pUR series of vectors (Ruther and Muller-Hill, EMBO J. 2:1791, 1983), pEX1-3 (Stanley and Luzio, EMBO J. 3:1429, 1984) and pMR100 (Gray et al., Proc. Natl. Acad. Sci. U.S.A. 79:6598, 1982). Vectors suitable for the production of intact native proteins include pKC30 (Shimatake and Rosenberg, Nature 292:128, 1981), pKK177-3 (Amann and Brosius, Gene 40:183, 1985) and pET-3 (Studier and Moffatt, J. Mol. Biol. 189:113, 1986).
Fusion proteins may be isolated from protein gels, lyophilized, ground into a powder and used as an antigen. The DNA sequence can also be transferred from its existing context to other cloning vehicles, such as other plasmids, bacteriophages, cosmids, animal viruses and yeast artificial chromosomes (YACs) (Burke et al., Science 236:806-812, 1987). These vectors may then be introduced into a variety of hosts including somatic cells, and simple or complex organisms, such as bacteria, fungi (Timberlake and Marshall, Science 244:1313-1317, 1989), invertebrates, plants (Gasser and Fraley, Science 244:1293, 1989), and animals (Pursel et al., Science 244:1281-1288, 1989), in which cells or organisms are rendered transgenic by the introduction of the heterologous cDNA.
For expression in mammalian cells, the cDNA sequence may be ligated to heterologous promoters, such as the simian virus (SV) 40 promoter in the pSV2 vector (Mulligan and Berg, Proc. Natl. Acad. Sci. USA 78:2072-2076, 1981), and introduced into cells, such as monkey COS-1 cells (Gluzman, Cell 23:175-182, 1981), to achieve transient or long-term expression. The stable integration of the chimeric gene construct may be maintained in mammalian cells by biochemical selection, such as neomycin (Southern and Berg, J. Mol. Appl. Genet. 1:327-341, 1982) and mycophenolic acid (Mulligan and Berg, Proc. Natl. Acad. Sci. U.S.A. 78:2072-2076, 1981).
DNA sequences can be manipulated with standard procedures such as restriction enzyme digestion, fill-in with DNA polymerase, deletion by exonuclease, extension by terminal deoxynucleotide transferase, ligation of synthetic or cloned DNA sequences, site-directed sequence-alteration via single-stranded bacteriophage intermediate or with the use of specific oligonucleotides in combination with PCR or other in vitro amplification.
The cDNA sequence (or portions derived from it) or a minigene (a cDNA with an intron and its own promoter) may be introduced into eukaryotic expression vectors by conventional techniques. These vectors are designed to permit the transcription of the cDNA in eukaryotic cells by providing regulatory sequences that initiate and enhance the transcription of the cDNA and ensure its proper splicing and polyadenylation. Vectors containing the promoter and enhancer regions of the SV40 or long terminal repeat (LTR) of the Rous Sarcoma virus and polyadenylation and splicing signal from SV40 are readily available (Mulligan et al., Proc. Natl. Acad. Sci. U.S.A. 78:2072-2076, 1981; Gorman et al., Proc. Natl. Acad. Sci. U.S.A. 79:6777-6781, 1982). The level of expression of the cDNA can be manipulated with this type of vector, either by using promoters that have different activities (for example, the baculovirus pAC373 can express cDNAs at high levels in S. frugiperda cells (Summers and Smith, In Genetically Altered Viruses and the Environment, Fields et al. (Eds.) 22:319-328, CSHL Press, Cold Spring Harbor, New York, 1985)) or by using vectors that contain promoters amenable to modulation, for example, the glucocorticoid-responsive promoter from the mouse mammary tumor virus (Lee et al., Nature 294:228, 1982). The expression of the cDNA can be monitored in the recipient cells 24 to 72 hours after introduction (transient expression).
In addition, some vectors contain selectable markers such as the gpt (Mulligan and Berg, Proc. Natl. Acad. Sci. U.S.A. 78:2072-2076, 1981) or neo (Southern and Berg, J. Mol. Appl. Genet. 1:327-341, 1982) bacterial genes. These selectable markers permit selection of transfected cells that exhibit stable, long-term expression of the vectors (and therefore the cDNA). The vectors can be maintained in the cells as episomal, freely replicating entities by using regulatory elements of viruses such as papilloma (Sarver et al., Mol. Cell Biol. 1:486, 1981) or Epstein-Barr (Sugden et al., Mol. Cell Biol. 5:410, 1985). Alternatively, one can also produce cell lines that have integrated the vector into genomic DNA. Both of these types of cell lines produce the gene product on a continuous basis. One can also produce cell lines that have amplified the number of copies of the vector (and therefore of the cDNA as well) to create cell lines that can produce high levels of the gene product (Alt et al., J. Biol. Chem. 253:1357, 1978).
The transfer of DNA into eukaryotic, in particular human or other mammalian, cells is now a conventional technique. The vectors are introduced into the recipient cells as pure DNA, for example, by transfection using precipitation with calcium phosphate (Graham and van der Eb, Virology 52:466, 1973) or strontium phosphate (Brash et al., Mol. Cell Biol. 7:2013, 1987); electroporation (Neumann et al., EMBO J. 1:841, 1982); lipofection (Felgner et al., Proc. Natl. Acad. Sci. U.S.A. 84:7413, 1987); DEAE dextran (McCutchan et al., J. Natl. Cancer Inst. 41:351, 1968); microinjection (Mueller et al., Cell 15:579, 1978); protoplast fusion (Schafner, Proc. Natl. Acad. Sci. USA 77:2163-2167, 1980); or pellet guns (Klein et al., Nature 327:70, 1987). Alternatively, the cDNA, or fragments thereof, can be introduced by infection with virus vectors. Systems have been developed that use, for example, retroviruses (Bernstein et al., Gen. Engr'g 7:235, 1985), adenoviruses (Ahmad et al., J. Virol. 57:267, 1986), or Herpes virus (Spaete et al., Cell 30:295, 1982). Protein encoding sequences can also be delivered to target cells in vitro via non-infectious systems, for instance liposomes.
These eukaryotic expression systems can be used for studies of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 variant encoding nucleic acids and mutant forms of these molecules, ARNTL, NOV, PKP4, TAF4B or FLJ21986 variant proteins and mutant forms of these proteins, as well as altered regulatory sequences of these genes or variants of the other genes or ESTs or other sequences located in the region of cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 and 18q22.3, discussed herein. The eukaryotic expression systems may also be used to study the function of the normal complete protein, specific portions of the protein, or of naturally occurring or artificially produced mutant proteins, as well as regulatory regions. Using the above techniques, the expression vectors containing an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 sequence or cDNA (or a sequence or cDNA corresponding to a gene or EST or other sequence located in the region at cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 and 18q22.3, described herein), or fragments or variants or mutants thereof, can be introduced into human cells, mammalian cells from other species or non-mammalian cells as desired. The choice of cell is determined by the purpose of the treatment. For example, monkey COS cells (Gluzman, Cell 23:175-182, 1981) that produce high levels of the SV40 T antigen and permit the replication of vectors containing the SV40 origin of replication may be used. Similarly, Chinese hamster ovary (CHO), mouse NIH 3T3 fibroblasts or human fibroblasts or lymphoblasts may be used.
The present disclosure thus encompasses recombinant vectors that comprise all or part of an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 variant gene or cDNA sequences, or a regulatory sequence thereof, for expression in a suitable host. The DNA is operatively linked in the vector to an expression control sequence in the recombinant DNA molecule so that a polypeptide can be expressed, or the regulatory sequence is operatively linked to a reporter gene. The expression control sequence may be selected from the group consisting of sequences that control the expression of genes of prokaryotic or eukaryotic cells and their viruses and combinations thereof. The expression control sequence may be specifically selected from the group consisting of the lac system, the trp system, the tac system, the trc system, major operator and promoter regions of phage lambda, the control region of fd coat protein, the early and late promoters of SV40, promoters derived from polyoma, adenovirus, retrovirus, baculovirus and simian virus, the promoter for 3-phosphoglycerate kinase, the promoters of yeast acid phosphatase, the promoter of the yeast alpha-mating factors and combinations thereof.
The host cell, which may be transfected with the vector of this disclosure, may be selected from the group consisting of E. coli, Pseudomonas, Bacillus subtilis, Bacillus stearothermophilus or other bacilli; other bacteria; yeast; fungi; insect; mouse or other animal; or plant hosts; or human tissue cells.
It is appreciated that for mutant or variant ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 DNA sequences, similar systems are employed to express and produce the mutant product. In addition, fragments of an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein can be expressed essentially as detailed above. Such fragments include individual ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein domains or sub-domains, as well as shorter fragments such as peptides. Protein fragments having therapeutic properties may be expressed in this manner also, including for instance substantially soluble fragments.
VII. Production of Protein Specific Binding Agents
Monoclonal or polyclonal antibodies may be produced to either a wildtype or reference protein or specific allelic forms of these proteins, for instance particular portions that contain a differential amino acid encoded by a SNP and therefore may provide a distinguishing epitope, for instance antibodies produced to an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein or peptide. Optimally, antibodies raised (generated) against these proteins or peptides would specifically detect the protein or peptide with which the antibodies are generated. That is, an antibody generated to a specified target protein or a fragment thereof would recognize and bind that protein and would not substantially recognize or bind to other proteins found in target cells, for instance human cells. In some embodiments, an antibody is specific for (or measurably preferentially binds to) an epitope in a variant protein (such as an allele of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 as described herein) versus the reference protein, or vice versa.
The determination that an antibody specifically detects a target protein or form of the target protein is made by any one of a number of standard immunoassay methods; for instance, the Western blotting technique (Sambrook et al., In Molecular Cloning: A Laboratory Manual, CSHL, New York, 1989). To determine that a given antibody preparation (such as one produced in a mouse) specifically detects the target protein by Western blotting, total cellular protein is extracted from human cells (for example, lymphocytes) and electrophoresed on a sodium dodecyl sulfate-polyacrylamide gel. The proteins are then transferred to a membrane (for example, nitrocellulose) and the antibody preparation is incubated with the membrane. After washing the membrane to remove non-specifically bound antibodies, the presence of specifically bound antibodies is detected by the use of an anti-mouse antibody conjugated to an enzyme such as alkaline phosphatase. Application of an alkaline phosphatase substrate 5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium results in the production of a dense blue compound by immunolocalized alkaline phosphatase. Antibodies that specifically detect the target protein will, by this technique, be shown to bind to the target protein band (which will be localized at a given position on the gel determined by its molecular weight). Non-specific binding of the antibody to other proteins may occur and may be detectable as a weak signal on the Western blot. The non-specific nature of this binding will be recognized by one skilled in the art by the weak signal obtained on the Western blot relative to the strong primary signal arising from the specific antibody-target protein binding.
Substantially pure ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein or protein fragment (peptide) suitable for use as an immunogen may be isolated from the transfected or transformed cells as described above. Concentration of protein or peptide in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms per milliliter. Monoclonal or polyclonal antibody to the protein can then be prepared as follows:
A. Monoclonal Antibody Production by Hybridoma Fusion
Monoclonal antibody to epitopes of the target protein identified and isolated as described can be prepared from murine hybridomas according to the classical method of Kohler and Milstein (Nature 256:495-497, 1975) or derivative methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein over a period of a few weeks. The mouse is then sacrificed, and the antibody-producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess un-fused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall (Meth. Enzymol. 70:419-439, 1980), and derivative methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Harlow and Lane (Antibodies, A Laboratory Manual, CSHL, New York, 1988).
B. Polyclonal Antibody Production by Immunization
Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by immunizing suitable animals with the expressed protein, which can be unmodified or modified to enhance immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and the host species. For example, small molecules tend to be less immunogenic than others and may require the use of carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with either inadequate or excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis et al. (J. Clin. Endocrinol. Metab. 33:988-991, 1971).
Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony et al. (In Handbook of Experimental Immunology, Wier, D. (ed.) chapter 19. Blackwell, 1973). Plateau concentration of antibody is usually in the range of about 0.1 to 0.2 mg/ml of serum (about 12 μM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, for example, by Fisher (Manual of Clinical Immunology, Ch. 42, 1980).
C. Antibodies Raised against Synthetic Peptides
A third approach to raising antibodies against a specific protein or peptide (such as a peptide that is specific to a variant ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986, such as those described herein) is to use one or more synthetic peptides synthesized on a commercially available peptide synthesizer based upon the predicted amino acid sequence of the protein or peptide. Polyclonal antibodies can be generated by injecting these peptides into, for instance, rabbits or mice.
D. Antibodies Raised by Injection of Encoding Sequence
Antibodies may be raised against proteins and peptides by subcutaneous injection of a DNA vector that expresses the desired protein or peptide, or a fragment thereof, into laboratory animals, such as mice. Delivery of the recombinant vector into the animals may be achieved using a hand-held form of the Biolistic system (Sanford et al., Particulate Sci. Technol. 5:27-37, 1987) as described by Tang et al. (Nature 356:152-154, 1992). Expression vectors suitable for this purpose may include those that express a protein-encoding sequence (for instance, a sequence encoding ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986) under the transcriptional control of either the human β-actin promoter or the cytomegalovirus (CMV) promoter.
Antibody preparations prepared according to these protocols are useful in quantitative immunoassays which determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively or qualitatively to identify the presence of antigen in a biological sample; or for immunolocalization of the specified protein.
Optionally, antibodies, such as ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986-specific monoclonal antibodies, can be humanized by methods known in the art. Antibodies with a desired binding specificity can be commercially humanized (Scotgene, Scotland, UK; Oxford Molecular, Palo Alto, CA).
E. Antibodies Specific for Specific Protein Variants
Antibodies can be produced that specifically recognize protein variants (and peptides derived therefrom). In particular, production of antibodies (and fragments and engineered versions thereof) that recognize at least one variant protein with a higher affinity than they recognize a corresponding reference protein is beneficial, as the resultant antibodies can be used in analysis, diagnosis and treatment (for example, inhibition or enhancement of protein action, such as, for instance, inhibition or enhancement of a biological activity of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986), as well as in study and examination of the proteins themselves.
In particular embodiments, it is beneficial to generate antibodies from a peptide taken from a variation-specific region of the target protein. By way of example, such regions include any peptide (usually four or more amino acids in length) that overlaps with one or more of the SNP-encoded variants in a coding sequence described herein. Longer peptides also can be used, and in some instances will produce a stronger or more reliable immunogenic response. Thus, it is contemplated in some embodiments that more than four amino acids are used to elicit the immune response, for instance, at least 5, at least 6, at least 8, at least 10, at least 12, at least 15, at least 18, at least 20, at least 25, or more, such as 30, 40, 50, or even longer peptides. Also, it will be understood by those of ordinary skill that it is beneficial in some instances to include adjuvants and other immune response enhancers, including passenger peptides or proteins, when using peptides to induce an immune response for production of antibodies.
Embodiments are not limited to antibodies that recognize epitopes containing the actual mutation identified in each variant. Instead, it is contemplated that variant-specific antibodies also may each recognize an epitope located anywhere throughout the specified variant molecule, which epitopes are changed in conformation and/or availability because of the mutation. Antibodies directed to any of these variant-specific epitopes are also encompassed herein. By way of example, the following references provide descriptions of methods for making antibodies specific to mutant proteins: Hills et al. (Int. J. Cancer 63:537-543, 1995); Reiter & Maihle (Nucleic Acids Res. 24:4050-4056, 1996); Okamoto et al. (Br. J. Cancer 73:1366-1372, 1996); Nakayashiki et al. (Jpn. J. Cancer Res. 91:1035-1043, 2000); Gannon et al. (EMBO J. 9:1595-1602, 1990); Wong et al. (Cancer Res. 46:6029-6033, 1986); and Carney et al. (J. Cell Biochem. 32:207-214, 1986). Similar methods can be employed to generate antibodies specific to specific protein variants, including variants of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 or another protein encoded by a gene or EST or other sequence in the region at cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 and 18q22.3 discussed herein.
VIII. Microarrays
In particular examples, methods for detecting a polymorphism in the ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 genes use the arrays disclosed herein. Such arrays can include nucleic acid molecules. In one example, the array includes nucleic acid oligonucleotide probes that can hybridize to polymorphic ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 gene sequences, such as those polymorphisms discussed herein. Certain of such arrays (as well as the methods described herein) can include other polymorphisms associated with risk or protection from developing ISR, as well as other sequences, such as one or more probes that recognize one or more housekeeping genes.
The arrays, herein termed "ISR detection arrays," are used to determine the genetic susceptibility of a subject to developing ISR. In one example, a set of oligonucleotide probes is attached to the surface of a solid support for use in detection of a polymorphism in the ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 genes, such as those amplified nucleic acid sequences obtained from the subject. Additionally, if an internal control nucleic acid sequence was amplified in the amplification reaction (see above), an oligonucleotide probe can be included to detect the presence of this amplified nucleic acid molecule. The oligonucleotide probes bound to the array can specifically bind sequences amplified in an amplification reaction (such as under high stringency conditions). Oligonucleotides comprising at least 15, 20, 25, 30, 35, 40 or more consecutive nucleotides of the ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 genes may be used.
The methods and apparatus in accordance with the present disclosure take advantage of the fact that under appropriate conditions oligonucleotides form base-paired duplexes with nucleic acid molecules that have a complementary base sequence. The stability of the duplex is dependent on a number of factors, including the length of the oligonucleotides, the base composition, and the composition of the solution in which hybridization is effected. The effects of base composition on duplex stability may be reduced by carrying out the hybridization in particular solutions, for example in the presence of high concentrations of tertiary or quaternary amines.
The thermal stability of the duplex is also dependent on the degree of sequence similarity between the sequences. By carrying out the hybridization at temperatures close to the anticipated Tm's of the type of duplexes expected to be formed between the target sequences and the oligonucleotides bound to the array, the rate of formation of mis-matched duplexes may be substantially reduced.
The length of each oligonucleotide sequence employed in the array can be selected to optimize binding of target ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 nucleic acid sequences. An optimum length for use with a particular ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 nucleic acid sequence under specific screening conditions can be determined empirically. Thus, the length for each individual element of the set of oligonucleotide sequences included in the array can be optimized for screening. In one example, oligonucleotide probes are from about 20 to about 35 nucleotides in length or about 25 to about 40 nucleotides in length.
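As a rough illustration of how candidate probe lengths might be shortlisted before empirical testing, the sketch below cuts probes of about 20 to 35 nucleotides around a polymorphic position and attaches an approximate melting temperature to each. The context sequence, the SNP index and the GC-content melting temperature approximation are assumptions made for illustration only; they are not taken from this disclosure, and the optimum length would still be determined empirically.

```python
def estimate_tm(seq: str) -> float:
    """Rough melting temperature (deg C) from a common GC-content approximation.

    This formula is an illustrative assumption, not a method of the disclosure.
    """
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def candidate_probes(context: str, snp_index: int, min_len: int = 20, max_len: int = 35):
    """Return (length, probe, estimated Tm) for probes roughly centered on the SNP."""
    results = []
    for length in range(min_len, max_len + 1):
        start = snp_index - length // 2
        if start < 0 or start + length > len(context):
            continue  # probe would run off the available sequence
        probe = context[start:start + length]
        results.append((length, probe, estimate_tm(probe)))
    return results


# Hypothetical 41-nt context with the polymorphic base at index 20.
context = "ACGT" * 10 + "A"
for length, probe, tm in candidate_probes(context, 20):
    print(length, probe, round(tm, 1))
```

A shortlist of this kind only narrows the candidates; hybridization behavior under the actual screening conditions decides which length is used.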
The oligonucleotide probe sequences forming the array can be directly linked to the support, for example via the 5'- or 3'-end of the probe. In one example, the oligonucleotides are bound to the solid support by the 5' end. However, one of skill in the art can determine whether the use of the 3' end or the 5' end of the oligonucleotide is suitable for bonding to the solid support. In general, the internal complementarity of an oligonucleotide probe in the region of the 3' end and the 5' end determines binding to the support. Alternatively, the oligonucleotide probes can be attached to the support by non-ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 sequences such as oligonucleotides or other molecules that serve as spacers or linkers to the solid support.
In another example, an array includes protein sequences, which include at least one ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein (or genes, cDNAs or other polynucleotide molecules including one of the listed sequences, or a fragment thereof), or a fragment of such protein, or an antibody specific to such a protein or protein fragment. The proteins or antibodies forming the array can be directly linked to the support. Alternatively, the proteins or antibodies can be attached to the support by spacers or linkers to the solid support.
Abnormalities in ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and FLJ21986 proteins can be detected using, for instance, an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and/or FLJ21986 protein-specific binding agent, which in some instances will be detectably labeled. In certain examples, therefore, detecting an abnormality includes contacting a sample from the subject with an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and/or FLJ21986 protein-specific binding agent; and detecting whether the binding agent is bound by the sample and thereby measuring the levels of the ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and/or FLJ21986 protein present in the sample, in which a difference in the level of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and/or FLJ21986 protein in the sample, relative to the level of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and/or FLJ21986 protein found in an analogous sample from a subject not predisposed to developing ISR, or a standard ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and/or FLJ21986 protein level in analogous samples from a subject not having a predisposition for developing ISR, is an abnormality in that ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 and/or FLJ21986 molecule.
In particular examples, the microarray material is formed from glass (silicon dioxide). Suitable silicon dioxide types for the solid support include, but are not limited to: aluminosilicate, borosilicate, silica, soda lime, zinc titania and fused silica (for example see Schena, Microarray Analysis. John Wiley & Sons, Inc, Hoboken, New Jersey, 2003). The attachment of nucleic acids to the surface of the glass can be achieved by methods known in the art, for example by surface treatments that form from an organic polymer. Particular examples include, but are not limited to: polypropylene, polyethylene, polybutylene, polyisobutylene, polybutadiene, polyisoprene, polyvinylpyrrolidine, polytetrafluoroethylene, polyvinylidene difluoride, polyfluoroethylene-propylene, polyethylenevinyl alcohol, polymethylpentene, polychlorotrifluoroethylene, polysulfones, hydroxylated biaxially oriented polypropylene, aminated biaxially oriented polypropylene, thiolated biaxially oriented polypropylene, ethylene acrylic acid, ethylene methacrylic acid, and blends of copolymers thereof (see U.S. Patent No. 5,985,567, herein incorporated by reference), organosilane compounds that provide chemically active amine or aldehyde groups, epoxy or polylysine treatment of the microarray. Another example of a solid support surface is polypropylene.
In general, suitable characteristics of the material that can be used to form the solid support surface include: being amenable to surface activation such that upon activation, the surface of the support is capable of covalently attaching a biomolecule such as an oligonucleotide thereto; amenability to "in situ" synthesis of biomolecules; being chemically inert such that the areas on the support not occupied by the oligonucleotides are not amenable to non-specific binding, or when non-specific binding occurs, such materials can be readily removed from the surface without removing the oligonucleotides.
In one example, the surface treatment is amine-containing silane derivatives. Attachment of nucleic acids to an amine surface occurs via interactions between negatively charged phosphate groups on the DNA backbone and positively charged amino groups (Schena, Microarray Analysis. John Wiley & Sons, Inc, Hoboken, New Jersey, 2003). In another example, reactive aldehyde groups are used as surface treatment. Attachment to the aldehyde surface is achieved by the addition of a 5'-amine group or amino linker to the DNA of interest. Binding occurs when the nonbonding electron pair on the amine linker acts as a nucleophile that attacks the electropositive carbon atom of the aldehyde group.
A wide variety of array formats can be employed in accordance with the present disclosure. One example includes a linear array of oligonucleotide bands, generally referred to in the art as a dipstick. Another suitable format includes a two-dimensional pattern of discrete cells (such as 4096 squares in a 64 by 64 array). As is appreciated by those skilled in the art, other array formats including, but not limited to, slot (rectangular) and circular arrays are equally suitable for use (see U.S. Patent No. 5,981,185, herein incorporated by reference). In one example, the array is formed on a polymer medium, which is a thread, membrane or film. An example of an organic polymer medium is a polypropylene sheet having a thickness on the order of about 1 mil (0.001 inch) to about 20 mil, although the thickness of the film is not critical and can be varied over a fairly broad range. Particularly disclosed for preparation of arrays are biaxially oriented polypropylene (BOPP) films; in addition to their durability, BOPP films exhibit a low background fluorescence. In a particular example, the array is a solid phase, Allele-Specific Oligonucleotides (ASO) based nucleic acid array.
The array formats of the present disclosure can be included in a variety of different types of formats. A "format" includes any format to which the solid support can be affixed, such as microtiter plates, test tubes, inorganic sheets, dipsticks, and the like. For example, when the solid support is a polypropylene thread, one or more polypropylene threads can be affixed to a plastic dipstick-type device; polypropylene membranes can be affixed to glass slides. The particular format is, in and of itself, unimportant. All that is necessary is that the solid support can be affixed thereto without affecting the functional behavior of the solid support or any biopolymer absorbed thereon, and that the format (such as the dipstick or slide) is stable to any materials into which the device is introduced (such as clinical samples and hybridization solutions).
The arrays of the present disclosure can be prepared by a variety of approaches. In one example, oligonucleotide or protein sequences are synthesized separately and then attached to a solid support (see U.S. Patent No. 6,013,789, herein incorporated by reference). In another example, sequences are synthesized directly onto the support to provide the desired array (see U.S. Patent No. 5,554,501, herein incorporated by reference). Suitable methods for covalently coupling oligonucleotides and proteins to a solid support and for directly synthesizing the oligonucleotides or proteins onto the support are known to those working in the field; a summary of suitable methods can be found in Matson et al., Anal. Biochem. 217:306-10, 1994. In one example, the oligonucleotides are synthesized onto the support using conventional chemical techniques for preparing oligonucleotides on solid supports (see, for example, PCT Publication Nos. WO 85/01051 and WO 89/10977, or U.S. Patent No. 5,554,501, each of which is herein incorporated by reference).
A suitable array can be produced using automated means to synthesize oligonucleotides in the cells of the array by laying down the precursors for the four bases in a predetermined pattern. Briefly, a multiple-channel automated chemical delivery system is employed to create oligonucleotide probe populations in parallel rows (corresponding in number to the number of channels in the delivery system) across the substrate. Following completion of oligonucleotide synthesis in a first direction, the substrate can then be rotated by 90° to permit synthesis to proceed within a second (2°) set of rows that are now perpendicular to the first set. This process creates a multiple-channel array whose intersection generates a plurality of discrete cells.
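The two-pass, rotate-by-90° scheme described above can be modeled in a few lines. The sketch below is a simplified illustration under stated assumptions: each channel is assumed to lay down one fixed sequence along its whole row (first pass) or column (second pass), and chemical details such as synthesis direction are ignored. The function name and the choice of trimers are hypothetical.

```python
from itertools import product


def synthesize_grid(first_pass, second_pass):
    """Model a two-pass orthogonal synthesis.

    first_pass[i]  : sequence laid down by channel i along row i
    second_pass[j] : sequence added by channel j along column j after the
                     substrate is rotated by 90 degrees
    Cell (i, j) therefore carries first_pass[i] followed by second_pass[j].
    """
    return [[row_seq + col_seq for col_seq in second_pass] for row_seq in first_pass]


# Hypothetical example: 64 channels, each delivering one of the 64 possible trimers.
trimers = ["".join(bases) for bases in product("ACGT", repeat=3)]
grid = synthesize_grid(trimers, trimers)
cells = [probe for row in grid for probe in row]
print(len(grid), len(grid[0]), len(cells))
```

With 64 channels in each direction the intersections form a 64 by 64 pattern of discrete cells, matching the two-dimensional format mentioned earlier in this section.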
In particular examples, the oligonucleotide probes on the array include one or more labels that permit detection of oligonucleotide probe:target sequence hybridization complexes.
IX. Kits
The present disclosure provides for kits that can be used to determine whether a subject, such as an otherwise healthy human subject, is genetically predisposed to ISR. Such kits allow one to determine if a subject has one or more genetic mutations or polymorphisms in ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 gene sequences.
The kits contain reagents useful for determining the presence or absence of at least one polymorphism in a subject's ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 genes, such as probes or primers that selectively hybridize to an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 polymorphic sequence identified herein. Such kits can be used with the methods described herein to determine a subject's ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 genotype or haplotype.
Oligonucleotide probes and/or primers may be supplied in the form of a kit for use in detection of a specific ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 sequence, such as a SNP or haplotype described herein, in a subject. In such a kit, an appropriate amount of one or more of the oligonucleotide primers is provided in one or more containers. The oligonucleotide primers may be provided suspended in an aqueous solution or as a freeze-dried or lyophilized powder, for instance. The container(s) in which the oligonucleotide(s) are supplied can be any conventional container that is capable of holding the supplied form, for instance, microfuge tubes, ampoules, or bottles. In some applications, pairs of primers may be provided in pre-measured single use amounts in individual, typically disposable, tubes or equivalent containers. With such an arrangement, the sample to be tested for the presence of an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 polymorphism can be added to the individual tubes and amplification carried out directly. The amount of each oligonucleotide primer supplied in the kit can be any appropriate amount, depending for instance on the market to which the product is directed. For instance, if the kit is adapted for research or clinical use, the amount of each oligonucleotide primer provided would likely be an amount sufficient to prime several PCR amplification reactions. Those of ordinary skill in the art know the amount of oligonucleotide primer that is appropriate for use in a single amplification reaction. General guidelines may for instance be found in Innis et al. (PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc., San Diego, CA, 1990), Sambrook et al. (In Molecular Cloning: A Laboratory Manual. Cold Spring Harbor, New York, 1989), and Ausubel et al. (In Current Protocols in Molecular Biology. Greene Publ. Assoc. and Wiley-Intersciences, 1992).
A kit may include more than two primers, in order to facilitate the in vitro amplification of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986-encoding sequences, for instance a specific target ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 gene or the 5' or 3' flanking region thereof.
In some embodiments, kits may also include the reagents necessary to carry out nucleotide amplification reactions, including, for instance, DNA sample preparation reagents, appropriate buffers (such as polymerase buffer), salts (for example, magnesium chloride), and deoxyribonucleotides (dNTPs).
Kits may in addition include either labeled or unlabeled oligonucleotide probes for use in detection of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 polymorphisms or haplotypes. In certain embodiments, these probes will be specific for a potential polymorphic site that may be present in the target amplified sequences. The appropriate sequences for such a probe will be any sequence that includes one or more of the identified polymorphic sites, such that the probe is complementary to a polymorphic site and the surrounding ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 sequence. By way of example, such probes are of at least 6 nucleotides in length, and the polymorphic site occurs at any position within the length of the probe. It is often beneficial to use longer probes, in order to ensure specificity. Thus, in some embodiments, the probe is at least 8, at least 10, at least 12, at least 15, at least 20, at least 30 nucleotides or longer.
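Because the polymorphic site may sit at any position within the probe, a fixed probe length corresponds to a family of overlapping windows on the target. The following sketch, offered only as an illustration, enumerates those windows and returns the complementary probe for each; the target sequence, site index and function names are hypothetical and not drawn from this disclosure.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def probes_covering_site(target: str, site: int, length: int):
    """List (start, probe) pairs for every window of `length` nt that contains `site`.

    Each probe is the reverse complement of its window, so it is complementary
    to the polymorphic site and the surrounding target sequence.
    """
    if length < 6:
        raise ValueError("probes are at least 6 nucleotides in length")
    if not 0 <= site < len(target):
        raise ValueError("site must be an index within the target")
    first = max(0, site - length + 1)       # leftmost window still containing the site
    last = min(site, len(target) - length)  # rightmost window that fits in the target
    return [(start, reverse_complement(target[start:start + length]))
            for start in range(first, last + 1)]


# Hypothetical 20-nt target with the polymorphic base at index 10.
target = "ACGTACGTACGTACGTACGT"
for start, probe in probes_covering_site(target, 10, 8):
    print(start, probe)
```

Longer probe lengths yield more candidate windows per site, which is one practical reason a designer might enumerate them before choosing one.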
It may also be advantageous to provide in the kit one or more control sequences for use in the amplification reactions. The design of appropriate positive control sequences is well known to one of ordinary skill in the appropriate art. By way of example, control sequences may comprise human (or non-human) ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 nucleic acid molecule(s) with known sequence at one or more target SNP positions, such as those described herein. Controls may also comprise non-ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 nucleic acid molecules.
In some embodiments, kits may also include some or all of the reagents necessary to carry out RT-PCR in vitro amplification reactions, including, for instance, RNA sample preparation reagents (including for example, an RNase inhibitor), appropriate buffers (for example, polymerase buffer), salts (for example, magnesium chloride), and deoxyribonucleotides (dNTPs).
Such kits may in addition include either labeled or unlabeled oligonucleotide probes for use in detection of the in vitro amplified target sequences. The appropriate sequences for such a probe will be any sequence that falls between the annealing sites of the two provided oligonucleotide primers, such that the sequence the probe is complementary to is amplified during the PCR reaction. In certain embodiments, these probes will be specific for a potential polymorphism that may be present in the target amplified sequences.
It may also be advantageous to provide in the kit one or more control sequences for use in the RT-PCR reactions. The design of appropriate positive control sequences is well known to one of ordinary skill in the appropriate art.
Kits for the detection or analysis of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein expression (such as over- or under-expression, or expression of a specific isoform) are also encompassed. Such kits may include at least one target protein specific binding agent (for example, a polyclonal or monoclonal antibody or antibody fragment that specifically recognizes an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein, or a specific polymorphic form of an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein) and may include at least one control (such as a determined amount of target ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein, or a sample containing a determined amount of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein). The ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986-protein specific binding agent and control may be contained in separate containers. The antibodies may have the ability to distinguish between polymorphic forms of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein.
ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein or isoform expression detection kits may also include a means for detecting ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986:binding agent complexes, for instance the agent may be detectably labeled. If the detectable agent is not labeled, it may be detected by second antibodies or protein A, for example, which may also be provided in some kits in one or more separate containers. Such techniques are well known.
Additional components in specific kits may include instructions for carrying out the assay. Instructions will allow the tester to determine ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 expression level. Reaction vessels and auxiliary reagents such as chromogens, buffers, enzymes, etc. may also be included in the kits. The instructions can provide calibration curves or charts to compare with the determined (for example, experimentally measured) values.
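One way a tester might compare a measured value with a calibration curve is piecewise-linear interpolation between the standards. The sketch below is a minimal illustration under that assumption; the standard concentrations, signal values and units are invented for the example and are not data from this disclosure.

```python
def interpolate_concentration(signal: float, standards):
    """Estimate concentration from a measured signal using a calibration curve.

    standards: iterable of (concentration, signal) pairs from known controls.
    Linear interpolation is applied between the two standards that bracket
    the measured signal; values outside the calibrated range are rejected.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return c0
            return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)
    raise ValueError("signal lies outside the calibrated range")


# Hypothetical standards: (concentration in ng/ml, assay signal in arbitrary units).
standards = [(0.0, 0.0), (10.0, 0.5), (20.0, 1.0)]
print(interpolate_concentration(0.75, standards))
```

Rejecting out-of-range signals, rather than extrapolating, is a deliberate choice here: a value beyond the last standard would more safely be re-assayed after dilution.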
Also provided are kits that allow differentiation between individuals who are homozygous versus heterozygous for specific SNPs (or haplotypes) of the ARNTL, NOV, PKP4, TAF4B or FLJ21986 genes as described herein. Examples of such kits provide the materials necessary to perform oligonucleotide ligation assays (OLA), as described in Nickerson et al., Proc. Natl. Acad. Sci. U.S.A. 87:8923-8927, 1990. In specific embodiments, these kits contain one or more microtiter plate assays, designed to detect polymorphism(s) in an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 sequence of a subject, as described herein. Instructions in these kits will allow the tester to determine whether a specified ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 allele is present, and whether it is homozygous or heterozygous. It may also be advantageous to provide in the kit one or more control sequences for use in the OLA reactions. The design of appropriate positive control sequences is well known to one of ordinary skill in the appropriate art.
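The homozygous versus heterozygous determination can be pictured as a decision on two allele-specific signals from one sample well. The sketch below is a simplified, hypothetical scoring rule; the threshold value and the signal scale are assumptions for illustration and are not parameters given in this disclosure or in the cited OLA method.

```python
def call_genotype(signal_allele_1: float, signal_allele_2: float,
                  threshold: float = 0.2) -> str:
    """Classify a sample from two allele-specific assay signals.

    An allele is scored as present when its signal reaches `threshold`
    (a hypothetical cut-off that would be set from control reactions).
    """
    present_1 = signal_allele_1 >= threshold
    present_2 = signal_allele_2 >= threshold
    if present_1 and present_2:
        return "heterozygous"
    if present_1:
        return "homozygous allele 1"
    if present_2:
        return "homozygous allele 2"
    return "no call"


# Hypothetical signals for three samples.
for signals in [(0.9, 0.05), (0.6, 0.7), (0.02, 0.03)]:
    print(signals, call_genotype(*signals))
```

In practice the cut-off would be derived from the positive and negative control sequences supplied with the kit rather than fixed in advance.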
The kit may involve the use of a number of assay formats including those involving nucleic acid binding, such as binding to filters, beads, or microtiter plates and the like. Techniques may include dot blots, RNA blots, DNA blots, PCR, restriction fragment length polymorphism (RFLP), and the like.
Microarray-based kits are also provided. These microarray kits may be of use in genotyping analyses. In general, these kits include one or more oligonucleotides provided immobilized on a substrate, for example at an addressable location. The kit also includes instructions, usually written instructions, to assist the user in probing the array. Such instructions can optionally be provided on a computer readable medium.
Kits may additionally include one or more buffers for use during assay of the provided array. For instance, such buffers may include a low stringency wash, a high stringency wash, and/or a stripping solution. These buffers may be provided in bulk, where each container of buffer is large enough to hold sufficient buffer for several probing or washing or stripping procedures. Alternatively, the buffers can be provided in pre-measured aliquots, which would be tailored to the size and style of array included in the kit. Certain kits may also provide one or more containers in which to carry out array-probing reactions. Kits may in addition include one or more containers of detector molecules, such as antibodies or probes (or mixtures of antibodies, mixtures of probes, or mixtures of the antibodies and probes), for detecting biomolecules captured on the array. The kit may also include either labeled or unlabeled control probe molecules, to provide for internal tests of either the labeling procedure or probing of the array, or both. The control probe molecules may be provided suspended in an aqueous solution or as a freeze-dried or lyophilized powder, for instance. The container(s) in which the controls are supplied can be any conventional container that is capable of holding the supplied form, for instance, microfuge tubes, ampoules, or bottles. In some applications, control probes may be provided in pre-measured single use amounts in individual, typically disposable, tubes or equivalent containers. The amount of each control probe supplied in the kit can be any particular amount, depending for instance on the market to which the product is directed. For instance, if the kit is adapted for research or clinical use, sufficient control probe(s) likely will be provided to perform several controlled analyses of the array. Likewise, where multiple control probes are provided in one kit, the specific probes provided will be tailored to the market and the accompanying kit.
In certain embodiments, a plurality of different control probes will be provided in a single kit, each control probe being from a different type of specimen found on an associated array (for example, in a kit that provides both eukaryotic and prokaryotic specimens, a prokaryote-specific control probe and a separate eukaryote-specific control probe may be provided).
In some embodiments of the current disclosure, kits may also include the reagents necessary to carry out one or more probe-labeling reactions. The specific reagents included will be chosen in order to satisfy the end user's needs, depending on the type of probe molecule (for example, DNA or RNA) and the method of labeling (for example, radiolabel incorporated during probe synthesis, attachable fluorescent tag, etc.).
Further kits are provided for the labeling of probe molecules for use in assaying arrays provided herein. Such kits may optionally include an array to be assayed by the so labeled probe molecules.
X. Transgenic Animals
A. Knockout and Overexpression Transgenic Animals
Mutant organisms that under-express or over-express one or more specific alleles of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 or another protein encoded in the region at cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 and 18q22.3, such as those discussed herein, or that express a protein (such as a reporter protein) under the control of a regulatory sequence from that region, are useful for research. Such mutants allow insight into the physiological and/or psychological role of this genomic region, and more particularly the role of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 (and/or another protein encoded by a gene or EST or other sequence in the region at cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 and 18q22.3, discussed herein) in a healthy and/or pathological organism. These "mutant organisms" are "genetically engineered," meaning that information in the form of nucleotides has been transferred into the mutant's genome at a location, or in a combination, in which it would not normally exist. Nucleotides transferred in this way are said to be "non-native." For example, a non-ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 promoter inserted upstream of an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986-encoding sequence would be non-native (as would an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 promoter inserted upstream of a non-ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 encoding sequence). An extra copy of a gene (or cDNA) on a plasmid, transformed into a cell, also would be non-native.
Mutants may be, for example, produced from mammals, such as mice or rats, that either express, over-express, or under-express a specific allelic variant or haplotype or diplotype of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986, or that do not express ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 at all. Over-expression mutants are made by increasing the number of specified genes in the organism, or by introducing a specific allele into the organism under the control of a constitutive or inducible or viral promoter such as the mouse mammary tumor virus (MMTV) promoter or the whey acidic protein (WAP) promoter or the metallothionein promoter. Mutants that under-express a protein (such as ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 or another protein encoded by a gene or EST or other sequence at any one of cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 and 18q22.3, such as those discussed herein), or that do not express a specific allelic variant, may be made by using an inducible or repressible promoter, or by deleting the target gene, or by destroying or limiting the function of the target gene, for instance by disrupting the gene by transposon insertion.
Antisense genes or molecules (such as siRNAs) may be engineered into the organism, under a constitutive or inducible promoter, to decrease or prevent expression of a specific target gene (such as ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986), as known to those of ordinary skill in the art.
A mutant mouse over-expressing a heterologous protein (such as a variant ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 protein) may be made by constructing a plasmid having an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 allele encoding sequence driven by a promoter, such as the mouse mammary tumor virus (MMTV) promoter or the whey acidic protein (WAP) promoter. This plasmid may be introduced into mouse oocytes by microinjection. Alternatively, expression of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 or another gene (such as a reporter gene) can be driven by regulatory sequences from ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986, including specifically regions developed based on the SNP-containing sequences described herein. The oocytes are implanted into pseudopregnant females, and the litters are assayed for insertion of the transgene. Multiple strains containing the transgene are then available for study. WAP is quite specific for mammary gland expression during lactation, and MMTV is expressed in a variety of tissues including mammary gland, salivary gland and lymphoid tissues. Many other promoters might be used to achieve various patterns of expression, such as the metallothionein promoter.
An inducible system may be created in which the subject expression construct is driven by a promoter regulated by an agent that can be fed to the mouse, such as tetracycline. Such techniques are well known in the art.
A mutant knockout animal (for example, a mouse) from which a specific gene is deleted can be made by removing all or some of the coding regions of the gene from embryonic stem cells. The methods of creating deletion mutations by using a targeting vector have been described (Thomas and Capecchi, Cell 51:503-512, 1987).
B. Knock-in Organisms
In addition to knock-out systems, it is also beneficial to generate "knock-ins" that have lost expression of the native protein but have gained expression of a different, usually mutant or identified allelic form of the same protein, or expression of the native protein under control of an altered regulatory sequence. By way of example, ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 (including variants thereof identified and discussed herein) can be expressed in a knockout background in order to provide model systems for studying the effects of these mutants. In particular embodiments, the resultant knock-in organisms provide systems for studying restenosis, particularly ISR, influence on endothelial or vascular smooth muscle cell proliferation, migration, and so forth. In addition, it is contemplated that knock-in organisms can be generated in which a reporter gene, or ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 itself, is expressed under the influence of a non-coding variant sequence described herein. In specific embodiments, expression of the same protein is driven by risk alleles at SNPs compared to non-risk alleles. Those of ordinary skill in the relevant art know methods of producing knock-in organisms. See, for instance, Rane et al. (Mol. Cell Biol. 22: 644-656, 2002); Sotillo et al. (EMBO J. 20: 6637-6647, 2001); Luo et al. (Oncogene 20: 320-328, 2001); Tomasson et al. (Blood 93: 1707-1714, 1999); Voncken et al. (Blood 86: 4603-4611, 1995); Andrae et al. (Mech. Dev. 107: 181-185, 2001); Reinertsen et al. (Gene Expr. 6: 301-314, 1997); Huang et al. (Mol. Med. 5: 129-137, 1999); Reichert et al. (Blood 97: 1399-1403, 2001); and Huettner et al. (Nat. Genet. 24: 57-60, 2000), by way of example.
XI. Screening Assays for Compounds that Modulate Expression or Activity of a Target
The following assays are designed to identify compounds that interact with (for example, bind to) a variant form of an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986; compounds that interact with (bind to) intracellular proteins that interact with such a variant form; compounds that interfere with the interaction of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 with transmembrane or intracellular proteins involved in signal transduction; and compounds which modulate the activity of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 (i.e., modulate the level of gene expression) or modulate the level of activity of a variant form of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986. Assays may additionally be utilized which identify compounds which bind to ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 regulatory sequences (such as promoter sequences) and which may modulate gene expression (see, for example, Platt, J. Biol. Chem. 269:28558-28562, 1994). It is contemplated that these assays also can be used to identify compounds that interact in any of the ways listed above with another gene, regulatory sequence, gene corresponding with an EST, or protein encoded thereby, from cytoband 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 or 18q22.3, described herein as being linked to susceptibility to ISR.
The compounds which may be screened in accordance with the disclosure include, but are not limited to, peptides, antibodies and fragments thereof, and other organic compounds (for example, peptidomimetics, small molecules) that bind to one or more variant sequences (including variant regulatory sequences or encoding sequences) as described herein and either mimic the activity triggered by the natural ligand (for example, agonists) or inhibit the activity triggered by the natural ligand (for example, antagonists); as well as peptides, antibodies or fragments thereof, and other organic compounds that mimic a variant (or a portion thereof) and bind to and "neutralize" the natural ligand.
Such compounds may include, but are not limited to, peptides such as, for example, soluble peptides, including, but not limited to, members of random peptide libraries (see, for example, Lam et al., Nature 354:82-84, 1991; Houghten et al., Nature 354:84-86, 1991) and combinatorial chemistry-derived molecular libraries made of D- and/or L-configuration amino acids, phosphopeptides (including, but not limited to, members of random or partially degenerate, directed phosphopeptide libraries; see, for example, Songyang et al., Cell 72:767-778, 1993), antibodies (including, but not limited to, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab')2 and Fab expression library fragments, and epitope-binding fragments thereof), and small organic or inorganic molecules.
Other compounds which can be screened in accordance with the disclosure include, but are not limited to, small organic molecules that are able to gain entry into an appropriate cell and affect the expression of an ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 gene or some other gene involved in a related signal transduction pathway (such as by interacting with the regulatory region or transcription factors involved in gene expression); or such compounds that affect the activity of a variant ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 or the activity of some other intracellular factor involved in the signal transduction pathway.
Computer modeling and searching technologies permit identification of compounds, or the improvement of already identified compounds, that can modulate expression or activity of a variant target protein. Having identified such a compound or composition, the active/binding/effector sites or regions are identified. Such active sites typically might be ligand binding sites, such as the interaction domains of a molecule with a variant ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 itself or a sequence encoding the protein or regulating the expression thereof, or the interaction domains of a molecule with a specific allelic variant in comparison to the interaction domains of that molecule with another variant of the protein. The active site can be identified using methods known in the art including, for example, from the amino acid sequences of peptides, from the nucleotide sequences of nucleic acids, or from study of complexes of the relevant compound or composition with its natural ligand. In the latter case, chemical methods can be used to find the active site by finding where on the factor the complexed ligand is found. Next, the three dimensional geometric structure of the active site is determined. This can be done by known methods that can determine a complete molecular structure. On the other hand, solid or liquid phase NMR can be used to determine certain intra-molecular distances. Any other experimental method of structure determination can be used to obtain partial or complete geometric structures, such as high resolution electron microscopy. The geometric structures may be measured with a complexed ligand, natural or artificial, which may increase the accuracy of the active site structure determined. In another embodiment, the structure of the specified target protein is compared to that of a "variant" of the specified protein and, rather than solve the entire structure, the structure is solved for the protein domains that are changed.
If an incomplete or insufficiently accurate structure is determined, the methods of computer based numerical modeling can be used to complete the structure or improve its accuracy. Any recognized modeling method may be used, including parameterized models specific to particular biopolymers such as proteins or nucleic acids, molecular dynamics models based on computing molecular motions, statistical mechanics models based on thermal ensembles, or combined models. For most types of models, standard molecular force fields, representing the forces between constituent atoms and groups, are necessary, and can be selected from force fields known in physical chemistry. The incomplete or less accurate experimental structures can serve as constraints on the complete and more accurate structures computed by these modeling methods.
Finally, having determined the structure of the active site, either experimentally, by modeling, or by a combination, candidate modulating compounds can be identified by searching databases containing compounds along with information on their molecular structure. Such a search seeks compounds having structures that match the determined active site structure and that interact with the groups defining the active site. Such a search can be manual, but is preferably computer assisted. The compounds found from this search are potential ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986-modulating compounds.
Alternatively, these methods can be used to identify improved modulating compounds from an already known modulating compound or ligand. The composition of the known compound can be modified and the structural effects of modification can be determined using the experimental and computer modeling methods described above applied to the new composition. The altered structure is then compared to the active site structure of the compound to determine if an improved fit or interaction results. In this manner systematic variations in composition, such as by varying side groups, can be quickly evaluated to obtain modified modulating compounds or ligands of improved specificity or activity.
In another embodiment, the structure of a specified protein or nucleic acid sequence, such as a regulatory sequence, (the reference form) is compared to that of a variant protein or sequence (encoded by a different allele of the same protein, or a variant non-coding nucleic acid sequence such as a regulatory sequence containing one or more SNPs). Then, potential inhibitors (or enhancers) are designed that bring about a structural change in the reference form so that it resembles the variant form. Or, potential mimics are designed that bring about a structural change in the variant form so that it resembles another variant form, or the form of the reference receptor. In the case of nucleic acid sequences (including for instance regulatory sequences), the inhibitors, enhancers, or mimics may influence the binding of one or more other proteins to the nucleic acid sequence, for instance in a way that affects the transcription of an encoding sequence that is operably linked to that nucleic acid sequence. Further experimental and computer modeling methods useful to identify modulating compounds based upon identification of the active sites of compounds, various variants of the ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986, regulatory regions thereof, and other sequences or proteins encoded for in the region at cytoband 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 or 18q22.3, described herein, and related transduction and transcription factors will be apparent to those of skill in the art.
Examples of molecular modeling systems are the CHARMM and QUANTA programs (Polygen Corporation, Waltham, Mass.). CHARMM performs the energy minimization and molecular dynamics functions. QUANTA performs the construction, graphic modeling and analysis of molecular structure. QUANTA allows interactive construction, modification, visualization, and analysis of the behavior of molecules with each other.
A number of articles review computer modeling of drugs interactive with specific proteins (such as Rotivinen et al., Acta Pharmaceutica Fennica 97: 159-166, 1988; Ripka, New Scientist 54-57, 1988; McKinlay and Rossmann, Annu. Rev. Pharmacol. Toxicol. 29: 111-122, 1989; Perry and Davies, QSAR: Quantitative Structure-Activity Relationships in Drug Design pp. 189-193, 1989 (Alan R. Liss, Inc.); Lewis and Dean, Proc. R. Soc. Lond. 236: 125-140 and 141-162, 1989; and, with respect to a model receptor for nucleic acid components, Askew et al., J. Am. Chem. Soc. 111: 1082-1090, 1989). Other computer programs that screen and graphically depict chemicals are available from companies such as BioDesign, Inc. (Pasadena, Calif.), Allelix, Inc. (Mississauga, Ontario, Canada) and Hypercube, Inc. (Cambridge, Ontario). Although these are primarily designed for application to drugs specific to particular proteins, they can be adapted to design of drugs specific to regions of DNA or RNA, once that region is identified. Although described above with reference to design and generation of compounds which could alter binding, one could also screen libraries of known compounds, including natural products or synthetic chemicals, and biologically active materials, including proteins, for compounds which are inhibitors or activators. Compounds identified via assays such as those described herein may be useful, for example, in elaborating the biological function of a variant ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 gene product, and for designing therapeutic molecules useful in the diagnosis and/or treatment of restenosis, more specifically ISR, influence on endothelial or vascular smooth muscle cell proliferation or migration, and so forth.
XII. In Vitro Screening Assays for Compounds that Bind to a Nucleotide Variant
In vitro systems may be designed to identify compounds capable of interacting with a variant protein or nucleic acid sequence identified by the SNPs described herein. Compounds identified using such systems may be useful, for example, in modulating the activity of "wild type" (reference) and/or "variant" gene products (such as ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986); in elaborating the biological function of such proteins; in screens for identifying compounds that disrupt normal protein-protein or protein-nucleic acid interactions; or to study or characterize the regulation of gene expression, for instance expression of ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 or a reporter protein linked to a regulatory sequence from ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 or another gene or EST or other sequence from cytoband 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 or 18q22.3. One type of assay that can be used to identify compounds that bind to a variant molecule (such as a variant protein, peptide, or nucleic acid) involves preparing a reaction mixture of a variant molecule and a test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex which can be removed and/or detected in the reaction mixture. The molecular species used can vary depending upon the goal of the screening assay. For example, where agonists or antagonists of a protein are sought, the full length protein (for example, ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986), or a soluble truncated portion thereof, or a fusion protein containing a variant peptide fused to a protein or polypeptide that affords advantages in the assay system (such as labeling, isolation of the resulting complex, etc.) can be utilized.
Where compounds that interact with a nucleic acid sequence, such as a regulatory or putative regulatory sequence, are sought to be identified, oligonucleotides corresponding to a variant sequence (containing at least one SNP position as discussed herein) and fusion nucleic acid molecules containing a variant sequence can be used. The screening assays can be conducted in a variety of ways. For example, one method to conduct such an assay involves anchoring a variant molecule (such as a protein, polypeptide, peptide or fusion protein, or nucleic acid) or the test substance(s), onto a solid phase and detecting variant molecule/test compound complexes anchored on the solid phase at the end of the reaction. In one embodiment of such a method, the variant molecule(s) may be anchored onto a solid surface, and the test compound(s), which is not anchored, may be labeled, either directly or indirectly.
In practice, microtiter plates may conveniently be utilized as the solid phase. The anchored component may be immobilized by non-covalent or covalent attachments. Non-covalent attachment may be accomplished by simply coating the solid surface (or a portion thereof) with a solution containing the protein (or nucleic acid) and drying. Alternatively, an immobilized specific binding agent, such as an antibody, preferably a monoclonal antibody, specific for the protein to be immobilized may be used to anchor the protein to the solid surface. The surfaces may be prepared in advance and stored. In order to conduct the assay, the nonimmobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (such as by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously nonimmobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously nonimmobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; for example, using a labeled antibody specific for the previously nonimmobilized component (the antibody, in turn, may be directly labeled or indirectly labeled with a labeled anti-Ig antibody).
Alternatively, a reaction can be conducted in a liquid phase, the reaction products separated from unreacted components, and complexes detected. Such detection can involve using an immobilized binding agent specific for the variant molecule (such as an antibody or other binding agent specific for a variant protein, polypeptide, peptide or fusion protein (for instance, ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986)) or specific for the test compound, to anchor or capture any complexes formed in solution, and a labeled antibody (or other binding agent) specific for the other component of the possible complex to detect anchored complexes. Alternatively, cell-based assays, membrane vesicle-based assays and membrane fraction-based assays can be used to identify compounds that interact with a variant molecule. To this end, cell lines that express a variant molecule, such as a variant ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986 encoding sequence or a regulatory sequence variant or other non-coding sequence variant (or combination of two or more variants) or cell lines (for example, COS cells, CHO cells, HEK293 cells, etc.) that have been genetically engineered to express a variant (for example, by transfection or transduction of protein encoding DNA) can be used. Interaction of the test compound with, for example, a variant protein (such as ARNTL, NOV, PKP4, TAF4B, EPHB1, ST18 or FLJ21986) expressed by the host cell, or a variant nucleic acid sequence present in the host cell, can be determined by comparison or competition with a host cell not treated with the compound, or treated with another compound, or by examining one or more biological characteristics linked to the variant (such as endothelial or vascular smooth muscle cell proliferation or migration, for instance restenosis or ISR).
A variant molecule, such as a variant nucleic acid or polypeptide (such as those described herein) may be employed in a screening process for compounds which bind the variant molecule and which activate (agonists) or inhibit activation (antagonists) of the molecule or one linked thereto. Thus, variant molecules described herein also may be used to assess the binding of small molecule substrates and ligands in, for example, cells, cell-free preparations, chemical libraries, and natural product mixtures. These substrates and ligands may be natural substrates and ligands or may be structural or functional mimetics (see Coligan et al. Current Protocols in Immunology 1 (2): Chapter 5, 1991).
In general, such screening procedures involve providing appropriate cells that express a polypeptide of the present disclosure, or a reporter polypeptide operably linked to a non-coding variant nucleic acid found at cytoband 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 or 18q22.3, such as those discussed herein. Such cells include cells from mammals, insects, yeast, and bacteria. In particular, a polynucleotide regulatory sequence or polynucleotide encoding the polypeptide is employed to transfect cells to thereby express a variant molecule. The cell expressing the variant polypeptide or variant nucleic acid is then contacted with a test compound to observe binding, stimulation or inhibition of a functional response.
The technique may also be employed for screening of compounds which activate a molecule of the present disclosure by contacting such cells with compounds to be screened and determining whether such compound generates a signal, i.e., activates the polypeptide or reporter polypeptide.
Another method involves screening for compounds which are antagonists, and thus inhibit activation of a molecule of the present disclosure, by determining inhibition of binding of labeled ligand, such as a factor that binds to a nucleic acid of the disclosure, to cells expressing the variant molecule or a reporter gene operably linked to a non-coding nucleic acid (such as a regulatory region). Such a method involves transfecting a eukaryotic cell with a DNA encoding a variant molecule such that the cell expresses the molecule (or expresses a reporter gene under the control of a non-coding region containing a variant SNP or haplotype as described herein). The cell is then contacted with a potential antagonist in the presence of a labeled form of a ligand or binding factor. The ligand/factor can be labeled, for example, with radioactivity. The amount of labeled ligand/factor bound to the variant molecule is measured, for example, by measuring radioactivity associated with transfected cells or with membranes or another fraction from these cells. If the compound binds to the variant molecule, the binding of labeled ligand/factor to the variant is inhibited as determined by a reduction of labeled ligand/factor that binds.
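By way of illustration only, the reduction in bound labeled ligand/factor might be expressed as a percent inhibition, as sketched below. The sketch assumes that bound label is measured as radioactivity counts with and without the test compound; the function name percent_inhibition and the optional nonspecific-binding correction are hypothetical additions and are not taken from the disclosure.

```python
def percent_inhibition(bound_with_compound, bound_without_compound,
                       nonspecific=0.0):
    """Percent reduction in specific binding of labeled ligand/factor.

    bound_with_compound: counts measured in the presence of the test compound.
    bound_without_compound: counts measured in its absence (control).
    nonspecific: counts attributed to nonspecific binding (assumed input).
    """
    specific_control = bound_without_compound - nonspecific
    if specific_control <= 0:
        raise ValueError("control shows no specific binding")
    specific_test = bound_with_compound - nonspecific
    return 100.0 * (1.0 - specific_test / specific_control)
```

For example, with hypothetical counts of 600 in the presence of the compound, 1100 in its absence, and 100 attributed to nonspecific binding, specific binding falls from 1000 to 500 counts, which corresponds to 50 percent inhibition.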
XIII. Pharmaceutical Preparations and Methods of Administration
Therapeutic compounds and agents can be administered directly to the mammalian subject for modulation of activity of ARNTL, NOV, PKP4, TAF4B, EPHBl, ST18 or FLJ21986 activity or expression, or the activity or expression of another gene, EST, or protein encoded by a gene or EST found in cytobands 2q24.1, 7q31.31, 8q24.12, l ip 15.2, 18ql l.2, 2pl6.1, 4q31.21, 7p21.2, Ip31.1, 2p24.1, 2q22.3, 3q22.1, 8ql l.23, 13ql4.3, 15q25.1 or 18q22.3, such as those discussed herein. Administration is by any of the routes normally used for introducing a modulator compound into ultimate contact with the tissue to be treated. The compounds or agents, alone or accompanied by one or more additional therapeutic agents, are administered in any suitable manner, optionally with pharmaceutically acceptable carrier(s). Suitable methods of administering such compounds/agents are available and well known to those of ordinary skill in the art, and, although more than one route can be used to administer a particular composition, a particular route can often provide a more immediate and more effective reaction than another route.
Pharmaceutically acceptable carriers are determined in part by the particular composition being administered, as well as by the particular method used to administer the composition. Accordingly, there is a wide variety of suitable formulations of pharmaceutical compositions of the present disclosure (see, for example, Remington's Pharmaceutical Sciences, 17th ed. 1985).
Formulations suitable for administration include aqueous and non-aqueous solutions, isotonic sterile solutions, which can contain antioxidants, buffers, bacteriostats, and solutes that render the formulation isotonic, and aqueous and non-aqueous sterile suspensions that can include suspending agents, solubilizers, thickening agents, stabilizers, and preservatives.
Compositions can be administered, for example, orally, parenterally, intrathecally, and so forth.
The formulations of compounds can be presented in unit-dose or multi-dose sealed containers, such as ampoules and vials. Solutions and suspensions can be prepared from sterile powders, granules, and tablets of the kind previously described. The compounds/agents also can be optionally administered as part of a prepared food or drug. The dose administered to a subject, in the context of the present disclosure, should be sufficient to effect a beneficial response in the subject over time. The dose will be determined by the efficacy of the particular compound/agent employed and the condition of the subject, as well as the body weight or surface area of the area to be treated, and whether the subject is being treated prophylactically or after the identification and diagnosis of a specific disease, condition, or disorder. The size of the dose also may be influenced by the existence, nature, and extent of any adverse side-effects that accompany the administration of a particular compound in a particular subject.
In determining the effective amounts of the modulator to be administered, a physician may evaluate circulating plasma levels of the modulator, modulator toxicities, and the production of anti-modulator antibodies. In general, the dose equivalent of a modulator is from about 1 ng/kg to 10 mg/kg for a typical subject. For administration, therapeutic compounds of the present disclosure can be administered at a rate determined by the LD50 of the modulator, and the side effects of the inhibitor at various concentrations, as applied to the mass and overall health of the subject.
Administration can be accomplished via single or divided doses.
The following examples are provided to illustrate certain particular features and/or embodiments. These examples should not be construed to limit the disclosure to the particular features or embodiments described.
EXAMPLES
To investigate the genetic basis of in-stent restenosis (ISR), a case control association study was designed. In consideration of the common-disease common-variation (CDCV) hypothesis, an association analysis was expected to be more powerful than linkage analysis for the detection of common disease alleles that confer modest disease risks for ISR (Risch and Merikangas Science 273 (5281): 1516-1517, 1996). A genome-wide association approach was chosen, since the technologies are now available to assay SNPs at high density across the genome, and a truly unbiased approach would afford the best opportunity to identify genetic determinants of a complex disease in which the genetics are essentially unknown (Van Steen et al. Nat Genet. 37(7):683-691, 2005).
Materials and Methods
Study Subjects and Phenotypic Data
Patients were enrolled in the CardioGene Study (Ganesh et al., Pharmacogenomics. 5(7):952-1004, 2004) at the time of cardiac catheterization or clinical evaluation when a history of in-stent restenosis was noted. Patients were either enrolled in a prospective study (case 1) after de novo bare metal stent implantation (n=60 cases, n=298 controls) or in a case enrichment group (case 2; n=104), in which patients with a history of in-stent restenosis were enrolled to supplement the number of cases for analysis. In the prospective group, consecutive patients presenting to the catheterization laboratory were screened and enrolled without selection. Historical cases were required to have at least two instances of in-stent restenosis to maximize the chance that these patients had an inherent predisposition to ISR as opposed to patients who are not particularly predisposed but may have developed ISR as a result of excessive injury to the artery during stent implantation. In-stent restenosis was clinically defined as any target vessel revascularization or the development of symptomatic ischemia in the target vessel territory (ischemic symptoms or a positive functional ischemia test), as previously described (Ganesh et al. Pharmacogenomics. 5(7):952-1004, 2004). Basic clinical parameters relevant to ISR outcomes were not different between the two groups (see Table 1). For analyses using time to restenosis (TTRS), the number of days between stent placement and the first objective evidence of ischemia, using either stress test or cardiac catheterization, was recorded for each patient. For case 2 patients, the most recent bare metallic stent implanted in a naïve coronary artery lesion was designated as the index stent, for the purpose of establishing TTRS (see Figure 3).
Figure imgf000069_0001
SNP Genotyping and Quality Control
DNA was isolated from blood using commercially available methods to extract genomic DNA from leukocytes (Qiagen, Valencia, CA). Genomic DNA also was obtained from the Centre d'Etude du Polymorphisme Humain (CEPH) cohort, for use as a processing control and evaluation of genotype data reproducibility. Each individual was genotyped using the Affymetrix GeneChip Mapping 100K Set of microarrays, which consists of two chips (XbaI and HindIII) with approximately 50,000 SNPs on each chip. Genomic DNA (250 ng) was digested with two restriction enzymes and processed according to the Affymetrix protocol. Image analysis of each chip was obtained with a laser scanner and raw image data were processed using GeneChip™ DNA Analysis Software (GDAS; Affymetrix, Santa Clara, CA). The CEPH control sample was inserted in between patient samples and analyzed on 22 sets of chips. Using the default settings of GDAS to implement the dynamic model (DM) genotype-calling algorithm, the median call rate per chip was 98.47%. The GDAS algorithm allows for increased stringency of the genotype calls. For the analyses presented, the GDAS genotype calling p-value stringency was increased from the default value of 0.25 to 0.05 (see Table 2), and a call rate of 85% was needed on each chip from each patient for inclusion in the study. Reproducibility was assessed using the 22 CEPH chips run from the same CEPH individual, and 99.79% concordance was noted across all SNPs genotyped. X-chromosome heterozygosity was used to confirm gender identity for each patient. As a further check for internal consistency between the two chips run for each patient, the 31 SNPs on both the Xba and Hind chips were checked for matching of the genotypes across the two chips per patient. Among the cohorts, 438 patients had adequate DNA and chips. STRUCTURE was used to evaluate each patient's genotype data. For 28 patients, the self-reported race did not match the STRUCTURE output and these patients were removed. Additionally, given the low frequency of non-Caucasian patients in this study and known variances in allele frequencies between different races, non-Caucasian patients were excluded from analysis. This left 407 patients for analysis, with 150 cases (51 patients from case 1 and 99 patients from case 2) and 257 controls.
A secondary analysis was conducted using the newer Bayesian Robust Linear Modeling using Mahalanobis distance (BRLMM) algorithm (Rabbee and Speed, Bioinformatics 22(1):7-12, 2006) to call genotypes from the raw chip image data, which improves heterozygote call rates. The median call rate using this algorithm was 99.19%, and all subsequent analysis steps were conducted using the same methods used in the primary analysis of the DM-based data set.
Various quality control measures were performed to ensure that the genotype data analyzed would represent only those SNPs for which the data quality would support meaningful conclusions (see Figure 4). Quality control of the genotype data was performed in a step-wise manner. First, patients with a chip call rate < 85% were excluded. Next, SNPs were removed from the analysis if the SNP was called in less than 25% of CEPH control replicate chips (22 sets of chips; 82 SNPs removed). To remove uninformative SNPs, SNPs with a minor allele frequency <0.8% in the cohort were removed (6963 SNPs), monomorphic SNPs (855 SNPs) were removed, and SNPs with no homozygotes in the cohort were removed (9 SNPs). Finally, SNPs with a minor allele frequency < 1% were removed (7816 SNPs) and SNPs on the X chromosome were excluded. At this stage, 104,581 SNPs were available for analysis. To remove SNP genotypes likely due to error, Hardy-Weinberg equilibrium (HWE) tests were calculated using exact tests for each of 104,581 SNPs on all 407 samples, using 4.78e-7 (0.05/104,581) as the threshold for filtering based on departure from Hardy-Weinberg equilibrium. After this final stage of filtering, 99,523 SNPs were available for analysis.
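The Hardy-Weinberg filtering step can be illustrated with a minimal Python sketch. It assumes a standard exact test that conditions on the observed allele counts and sums the probabilities of all heterozygote counts no more likely than the observed one; the text does not state which exact-test implementation was used, so the function name `hwe_exact_p` and this particular formulation are illustrative assumptions. Only the threshold 0.05/104,581 is taken from the description above.

```python
from math import exp, lgamma, log

# Bonferroni-style filtering threshold quoted in the text: 0.05 / 104,581.
HWE_THRESHOLD = 0.05 / 104_581


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact HWE p-value from genotype counts (hypothetical helper).

    Conditions on the allele counts and sums the probabilities of every
    heterozygote count that is no more likely than the observed one.
    """
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab          # copies of allele A
    n_b = 2 * n - n_a              # copies of allele B

    def log_prob(het):
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * log(2.0)
                + lgamma(n + 1) - lgamma(hom_a + 1)
                - lgamma(het + 1) - lgamma(hom_b + 1)
                + lgamma(n_a + 1) + lgamma(n_b + 1) - lgamma(2 * n + 1))

    # Heterozygote counts must share the parity of the allele count.
    het_values = range(n_a % 2, min(n_a, n_b) + 1, 2)
    log_probs = [log_prob(h) for h in het_values]
    observed = log_prob(n_ab)
    total = sum(exp(lp) for lp in log_probs)
    tail = sum(exp(lp) for lp in log_probs if lp <= observed + 1e-9)
    return min(1.0, tail / total)


def passes_hwe(n_aa, n_ab, n_bb, threshold=HWE_THRESHOLD):
    """A SNP is kept when its exact p-value is not below the threshold."""
    return hwe_exact_p(n_aa, n_ab, n_bb) >= threshold
```

Under these assumptions, a SNP with genotype counts of 25/50/25 is retained, whereas a SNP with no heterozygotes among 100 samples (50/0/50) falls below the threshold and is filtered out.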
Figure imgf000071_0001
Statistical analysis
Univariate tests of Allelic Association: In the primary analysis, the 99,523 SNPs from 407 patients' chips were analyzed using genotypes determined by the DM algorithm. First, univariate tests of allelic association were conducted using a training/test set approach. The cohort was randomly divided into two subsets, and tests of allelic association of each SNP were conducted in the training subset. The test set was used as an independent validation set of findings identified in the training set. Association of each SNP with disease status in the training set was tested by constructing a 2x2 contingency table of allele frequencies and conducting a chi-squared test. A Cochran-Armitage trend test was also conducted using three genetic models: recessive, additive and dominant (Sasieni, Biometrics 53(4):1253-1261, 1997; Slager and Schaid, Hum. Hered. 52(3):149-153, 2001; Freidlin et al. Hum. Hered. 53(3):146-152, 2002). The nominal p-values were corrected for multiple testing by applying the Bonferroni criterion. In a secondary analysis using the BRLMM algorithm to call genotypes, 100,648 SNPs were available for analysis. Analyses were conducted using SAS (SAS, Inc., Cary, NC). Negative analyses were used to try to link SNP findings to neighboring genes.
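The two univariate tests named above can be sketched in a few lines of Python. This is an illustration, not the SAS code used in the study: the function names, the ordering of genotype counts as (AA, AB, BB), and the score vectors chosen for the recessive, additive and dominant models are assumptions. The p-value for one degree of freedom is obtained from the complementary error function.

```python
from math import erfc, sqrt

# Assumed genotype scores (AA, AB, BB) for the three genetic models.
MODEL_WEIGHTS = {
    "recessive": (0, 0, 1),
    "additive": (0, 1, 2),
    "dominant": (0, 1, 1),
}


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variable with 1 df."""
    return erfc(sqrt(x / 2.0))


def allelic_test(cases, controls):
    """Chi-squared test on the 2x2 table of allele counts."""
    a_case, b_case = 2 * cases[0] + cases[1], cases[1] + 2 * cases[2]
    a_ctrl, b_ctrl = 2 * controls[0] + controls[1], controls[1] + 2 * controls[2]
    n = a_case + b_case + a_ctrl + b_ctrl
    denom = ((a_case + b_case) * (a_ctrl + b_ctrl)
             * (a_case + a_ctrl) * (b_case + b_ctrl))
    if denom == 0:
        return 0.0, 1.0
    chi2 = n * (a_case * b_ctrl - b_case * a_ctrl) ** 2 / denom
    return chi2, chi2_sf_1df(chi2)


def trend_test(cases, controls, model="additive"):
    """Cochran-Armitage trend test on genotype counts (AA, AB, BB)."""
    w = MODEL_WEIGHTS[model]
    r_tot, s_tot = sum(cases), sum(controls)
    n = r_tot + s_tot
    col = [r + s for r, s in zip(cases, controls)]
    t = sum(w[i] * (cases[i] * s_tot - controls[i] * r_tot) for i in range(3))
    var = (r_tot * s_tot / n) * (
        sum(w[i] ** 2 * col[i] * (n - col[i]) for i in range(3))
        - 2 * sum(w[i] * w[j] * col[i] * col[j]
                  for i in range(3) for j in range(i + 1, 3))
    )
    if var == 0:
        return 0.0, 1.0
    chi2 = t * t / var
    return chi2, chi2_sf_1df(chi2)


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)
```

For example, `allelic_test((10, 20, 30), (30, 20, 10))` gives a chi-squared statistic of about 26.67, and `bonferroni(p, 99_523)` would then adjust the nominal p-value for the number of SNPs tested in the primary analysis.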
Regional Haplotype Analysis: The results of univariate tests of allelic association were used as a screen to identify loci which were selectively tested for haplotype association by selecting SNPs with uncorrected p-value <0.001. Genomic regions were selected by the presence of two or more of these SNPs within 250 kb of one another. Haplotypes were defined in the regions using a 4-gamete test (Barrett et al. Bioinformatics 21(2):263-265, 2005). Haplotypes with a population frequency >10% were selected to provide a non-overlapping set of haplotypes, favoring those with the largest population frequency. The final set of haplotypes was tested for association with ISR and a Bonferroni multiple testing correction was applied globally, correcting for the overall number of haplotype association tests performed.
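The idea behind the 4-gamete test can be sketched as follows. The sketch assumes phased, biallelic haplotypes encoded as strings of '0' and '1', a minimum gamete frequency of 1%, and a simple greedy left-to-right block extension; Haploview itself works from estimated haplotype frequencies and its exact block-building rules may differ, so the function names and parameters here are illustrative only.

```python
from collections import Counter


def all_four_gametes(haps, i, j, min_freq=0.01):
    """True when all four two-SNP gametes occur at or above min_freq.

    Observing all four gametes for a pair of biallelic SNPs is taken as
    evidence of historical recombination between them.
    """
    counts = Counter((h[i], h[j]) for h in haps)
    n = len(haps)
    return sum(1 for c in counts.values() if c / n >= min_freq) == 4


def four_gamete_blocks(haps, min_freq=0.01):
    """Greedy partition of consecutive SNPs into blocks.

    A block is extended until the next SNP shows all four gametes with
    any SNP already in the block. Returns (first_index, last_index) pairs.
    """
    n_snps = len(haps[0])
    blocks, start = [], 0
    for k in range(1, n_snps):
        if any(all_four_gametes(haps, i, k, min_freq) for i in range(start, k)):
            blocks.append((start, k - 1))
            start = k
    blocks.append((start, n_snps - 1))
    return blocks
```

With the four haplotypes "000", "001", "110" and "111", the first two SNPs show only two gametes and stay in one block, while the third SNP shows all four gametes with the first and starts a new block.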
Cox regression Analysis of Haplotypes and Time-to-ISR: Haplotypes were defined in these regions as described using a 4-gamete test using Haploview software (Simon et al. J. Clin. Invest. 105(3):293-300, 2000). For each region, the most probable haplotype pair per subject was estimated using Haplo.stats software (Sinnwell and Schaid, Haplo Stats, version 1.2.0, http://mayoresearch.mayo.edu/mayo/research/biostat/schaid.cfm). Haplotypes with a population frequency greater than 5%, as determined in the CardioGene cohort, were then considered for clustering. Based on a hierarchical clustering using a distance metric defined by the number of differences in genotypes, two distinct groups of haplotypes (A and B) were formed for each region. Cox regression analysis was then performed to examine the effect of haplotype cluster status (AA, AB or BB) on the time to development of ISR.
Biological Validation/Gene expression studies
RT-PCR
Total RNA extracted from human tissue (heart, lung, liver, skeletal muscle, brain, skin, carotid artery, and fetal aorta), as well as universal human RNA, was purchased from a commercial source (Stratagene, Inc., La Jolla, CA). Coronary artery specimens were obtained from freshly explanted hearts at the time of heart transplantation at the Johns Hopkins Hospital, under a protocol that was reviewed and approved by the hospital IRB to be exempt status research. Arterial samples were examined grossly for pathology, flash frozen and stored at -80°C until homogenization for RNA isolation. Total RNA was isolated using Trizol (Invitrogen, Carlsbad, CA) and reverse transcribed (1 μg) using poly(dT) and avian myeloblastosis virus reverse transcriptase (Omniscript, Qiagen) at 37°C for 60 min. cDNA was diluted 1:5 and stored at -20°C. PCR primers were generated using a web-based prediction algorithm (available on the internet at genome.wi.mit.edu/genome-softwaree/other/primer3.html) (Table 3). 5 μL of cDNA was then used as template in a PCR amplification reaction using DNA polymerase (HotStarTaq®, Qiagen, Valencia, CA) and the gene-specific primers (see Table 3). Reactions were carried out for 25-30 cycles of amplification, consisting of 1 minute strand separation at 95°C, 1 minute of annealing at 59°C (NOV) or 66°C (all other primers), and 2 minute extension at 72°C, with a final 10 minute extension at 72°C. PCR products were run on a 1% agarose gel with ethidium bromide and the bands visualized and photographed under UV light with the UVP gel documentation system.
Table 3: RT-PCR Primers and Conditions
Figure imgf000073_0001
Quantitative real-time PCR
Levels of mRNA expression for each of the five genes as well as an endogenous reference (GAPDH) were measured by quantitative real-time RT-PCR based on TaqMan chemistry using an ABI PRISM 7700 sequence detector (Applied Biosystems, Foster City, CA, USA). Specific primers and probes were designed using GenBank accession codes listed in Table 4. All PCR reactions had an amplification efficiency of at least 95% and were very specific, showing only a single PCR product on gel. One hundred nanograms of genomic DNA served as a negative control and did not result in product amplification for any of these reactions, confirming the specificity of these reactions for RNA detection. Two μl of cDNA for each gene was amplified in the presence of 300 nM forward and reverse primers, 50 nM probe, 200 μM dNTPs, 4 mM MgCl2 and 1.25 U of AmpliTaq Gold™ DNA polymerase in TaqMan buffer A (Applied Biosystems) in a total volume of 50 μl. Samples were heated for 10 minutes at 95°C and amplified in 40 cycles of 15 seconds at 95°C and 60 seconds at 60°C. A positive control was amplified on each plate to verify the amplification efficiency within each experiment. The average Ct-value was used to calculate mRNA expression levels of the PCR targets relative to the expression level of the reference gene by the comparative Ct method, using the equation: $\text{relative expression} = 2^{-[C_t(\text{target}) - C_t(\text{reference gene})]} \times 100$. Results were exported to Excel and analyzed via the standard curve method.
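The comparative Ct calculation given above can be written out as a short Python sketch. The function name and the use of replicate lists are illustrative assumptions; the formula itself is the one stated in the text, applied to the average Ct of each gene.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference):
    """Relative expression (percent of reference) by the comparative Ct method.

    ct_target and ct_reference are lists of replicate Ct values; the
    average of each is used, as described in the text.
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    return 2.0 ** (-delta_ct) * 100.0
```

For instance, a target with an average Ct of 25 measured against a reference with an average Ct of 20 gives 2^-5 x 100 = 3.125, that is, about 3% of the reference expression level.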
Table 4: Quantitative primers used for RT-PCR
Figure imgf000074_0001
Immunohistochemistry
After harvesting, coronary artery specimens were either snap-frozen, for subsequent RNA isolation, or immediately fixed in 10% formalin overnight for immunohistochemical analysis. Atherosclerotic tissues with calcification visible by flat plate x-ray were decalcified by immersion in 10% EDTA, pH 6.0 for 5 days. Matched unaffected coronary arteries were similarly fixed and stained concurrently. Immunohistochemical staining was performed using anti-ARNTL antibody (rabbit polyclonal, Chemicon, Temecula, CA) diluted 1:200; anti-NOV antibody (goat polyclonal, R&D Systems, Tustin, CA) diluted 1:3; anti-PKP4 antibody (rabbit polyclonal) diluted 1:200; and anti-TAF4B rabbit antisera diluted 1:100. Diaminobenzidine staining was performed according to standard protocols. As a negative control for staining with anti-ARNTL, NOV and PKP4 antibodies, the procedure was repeated with no primary antibody, as well as with an IgG control for the corresponding species in which the primary antibody was raised. Pre-immune rabbit sera were used as the corresponding control staining for TAF4B for each section stained.
Results
Association of Specific SNPs with In-stent Restenosis
A database was generated of 116,204 SNPs across the genome using the Affymetrix 100K Mapping Assay. Patients were enrolled prospectively (n=357 cases) from the time of stenting, and clinical follow-up was ascertained at 6 and 12 months, to determine whether or not ISR had occurred (see Table 1). To enrich the case pool, additional cases (n=104) were enrolled with a history of ISR that had occurred two or more times (case 2 group). The 96,767 SNPs that passed quality control checks, were informative and were located on autosomal chromosomes were analyzed.
Univariate methods: Allelic association of each SNP with disease status was tested in each subgroup. A p-value of 5 x 10^-8 (p=0.05 after Bonferroni correction for 1 million independent tests) has been proposed as a conservative threshold for significance of association in a genome-wide study (Farb et al. Circulation 99(1):44-52, 1999). Seeking to minimize false positives, the Bonferroni correction method was used to deal with the multiple testing problem. In the training set, one SNP (SNP_A-1674604 / rs4979254) passed the Bonferroni corrected significance threshold (p=0.0416, adjusting for 96,382 SNPs). The allelic association of this single SNP alone was then tested in the test set and found not to be significant (p=0.796).
Similarly, the reverse procedure was conducted, testing allelic association in the test set, and no SNPs were identified as significant after Bonferroni correction. Given that multiple testing correction using Bonferroni correction is severe when adjusting for 96,382 SNPs tested in the discovery process, the 5 SNPs closest to the significance threshold were additionally tested in each subset of the data and tested in the other subset at a Bonferroni-adjusted statistical significance threshold of p=0.01, in either direction. Upon validation testing, no SNPs were significant past this threshold. Finally, using the independent training and test subset results of allelic association testing, an analysis was performed using Fisher's combination of the p-values from each data set in which allelic association of all SNPs was tested, Bonferroni corrected for 96,767 SNPs. After correction, one SNP was identified on chromosome 5 (SNP_A-1686530 / rs10515758) (p=0.0086). Repartitioning of the cohort three times confirmed significance of this SNP.
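Fisher's combination of the training-set and test-set p-values can be sketched as follows. The sketch assumes the standard form of Fisher's method, in which minus twice the sum of the log p-values follows a chi-squared distribution with 2k degrees of freedom for k independent tests; the function names are illustrative, and the closed-form tail probability used here is valid because the degrees of freedom are even.

```python
from math import exp, log


def fisher_combined_p(pvalues):
    """Combine independent p-values with Fisher's method.

    X = -2 * sum(ln p) is chi-squared with 2k df under the null; for even
    df the tail is exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(pvalues)
    half_x = -sum(log(p) for p in pvalues)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return exp(-half_x) * total


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)
```

As a worked example, two independent p-values of 0.05 combine to about 0.0175. In the setting described above, the combined p-value for each SNP would then be passed to `bonferroni(p, 96_767)`.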
To maximize power for testing univariate allelic association (Skol et al. Nat. Genet. 38(2):209-213, 2006), all 96,767 SNPs were tested for association among the entire cohort (see Figure 7), considering the two case groups as independent and ordinal. A means test was performed with a recessive model and two SNPs were identified that were significant after Bonferroni correction: SNP_A-1745890 / rs958505 (p=0.019), and SNP_A-1686530 / rs10515758 (p=0.0353). Using various haplotype and LD-based methods, these SNPs could not be related to surrounding genes. Since indirect association is the presumed mode of detecting signals in a study such as this, and to further investigate the regions surrounding these two SNPs, haplotype analysis was performed mirroring the 2-group and 3-group design of statistical testing performed for allelic association of each SNP. Haplotype analysis using a 2-group test of the assayed SNPs within a set distance of each SNP (4-gamete test, using Haploview) comparing all cases versus controls showed no significant haplotypes. Using a 3-group analysis, the two case groups were treated as non-ordered outcomes, and significant haplotypes, ranging from 2-6 SNPs, were identified around each of these SNPs. However, none of these haplotype results included the SNPs identified in each region. To further understand LD in these regions, another series of haplotype analyses was performed using HapMap-defined haplotype data to define blocks. Two blocks were identified, 240 kb (SNP_A-1745890 / rs958505) and 7 kb (SNP_A-1686530 / rs10515758), with one in each region corresponding to the SNPs identified.
Furthermore, univariate analysis of the surrounding SNPs demonstrated that none of the neighboring SNPs had p-values close to significance (QQ plots), suggesting that the signal identified in this region was constrained to a relatively narrow locus. The haplotype analysis was extended to evaluate the relationships of the identified SNPs to neighboring genes. The SNP on chromosome 9, SNP_A-1745890 / rs958505, is 32 kb downstream from the nearest gene, KCNV2 (potassium channel, subfamily V, member 2), and SNP_A-1686530 / rs10515758 on chromosome 5 is 554 kb upstream from the nearest gene, EBF (early B-cell factor). Taking these data in sum, the highest confidence was in the association of SNP_A-1686530 / rs10515758, given that it was identified using an analysis of independent testing in separate subsets of the cohort. SNP_A-1745890 / rs958505 also passed the significance threshold after Bonferroni correction, but was not identified in any analysis in which the patient cohort was analyzed using a validation approach. Linking these two SNPs to surrounding genes was unsuccessful based on further analysis of genotype data.
Therefore, no clear functional hypotheses could be supported by the results of univariate analysis of allelic association in the described genome-wide association study. The limitations of such univariate testing relate to power: the ability to detect single SNPs passing a Bonferroni-corrected significance threshold in a case control association study is limited, and in complex disease it is expected that multiple loci contribute to disease, each with a subtle overall effect. This may be especially true in this study of ISR, in which controls had no evidence of ISR at a defined endpoint assessment but had underlying CAD of sufficient severity to require revascularization therapy, as did all cases.
Multivariate methods: Multivariate methods are expected to more accurately assess the genetics of complex diseases (Ritchie et al. Am. J. Hum. Genet. 69(1): 138-147, 2001; Hoh and Ott. Hum. Hered. 50(1):85-89, 2000; Hoh and Ott. Nat. Rev. Genet. 4(9):701-709, 2003; Culverhouse et al. Genet. Epidemiol. 27(2): 141-152, 2004). Multivariate approaches that considered SNP genotype patterns and SNP-SNP interactions (Halushka et al. Nat. Genet. 22(3):239-247, 1999) were considered, since univariate methods did not yield functional hypotheses. Given that the CDCV hypothesis is based upon the notion that multiple genes may each be subtly contributing to disease, it was hypothesized that the signals identified in the univariate analyses may be of insufficient strength to withstand a severe multiple testing correction such as the Bonferroni method, but may still be further evaluated for meaningful genotype patterns related to disease status. Furthermore, these patterns would assist in identifying candidate susceptibility regions for ISR. Multivariate approaches were first considered that address SNP-SNP interactions, as well as SNP genotype patterns (Glazier et al. Science 298(5602):2345-2349, 2002), using methodologies supported in the literature.
Regional Haplotypes analysis: To test gene associations, two main approaches have been advocated: univariate, including chi-squared testing of allelic association, and multivariate methods that include haplotype association, with the haplotypes built upon unphased multilocus genotypes. In a genome-wide association study, both approaches entail significant multiple testing. A novel step-wise approach was developed and applied that uses univariate tests of allelic association as a screen to identify loci which are then selectively tested for haplotype association. It is believed that multiple alleles are involved in ISR and the genome-wide strategy described herein was designed to allow assessment of linkage disequilibrium at multiple loci and to map disease loci and genes in an unbiased manner (Lander and Schork Science 265(5181):2037-2048, 1994; Jorde, Genome Res. 10(10): 1435-1444, 2000; Risch, Nature 405(6788):847-856, 2000; Abecasis et al. Am. J. Hum. Genet. 68(6):1463-1474, 2001).
A regional haplotype analysis was undertaken, focused on those regions with multiple SNPs meeting a minimum p-value threshold of 0.001 in univariate testing (see Figures 5 and 6, as well as Tables 5, 6 and 7). The results of four chi-squared tests of association that were performed on each SNP were first sorted, comparing all cases versus all controls in the cohort, using an allele-based test, and 3 trend tests with recessive, additive and dominant models. For each SNP, the minimum p-value was reported. SNPs with a minimum p-value <0.001 were selected, yielding a set of 255 SNPs. Regions for further analysis were selected if they contained more than one SNP within 250 kb of one another (see Table 5), allowing the first and last SNP within a contiguous region to define the initial boundaries for each "block." The final block boundaries for further testing were defined by adding 3 kb to either end of the initial block defined by the physical positions of the SNPs identified by univariate analysis.
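The region-selection screen described above can be sketched in Python. The input format (tuples of chromosome, base-pair position and minimum p-value), the function name, and the decision to chain consecutive qualifying SNPs whenever neighbors lie within 250 kb are assumptions made for illustration; the cut-off of 0.001, the 250 kb distance and the 3 kb padding are taken from the text.

```python
def candidate_regions(snps, p_cutoff=1e-3, max_gap=250_000, pad=3_000):
    """Select padded regions containing two or more qualifying SNPs.

    snps: iterable of (chromosome, position, min_p), where min_p is the
    smallest p-value across the allelic and trend tests for that SNP.
    Returns a list of (chromosome, start, end, n_snps).
    """
    hits = sorted((chrom, pos) for chrom, pos, p in snps if p < p_cutoff)
    regions, cluster = [], []

    def flush():
        # Keep only clusters with more than one qualifying SNP.
        if len(cluster) >= 2:
            chrom = cluster[0][0]
            start = max(0, cluster[0][1] - pad)
            end = cluster[-1][1] + pad
            regions.append((chrom, start, end, len(cluster)))

    for chrom, pos in hits:
        if cluster and (chrom != cluster[-1][0] or pos - cluster[-1][1] > max_gap):
            flush()
            cluster = []
        cluster.append((chrom, pos))
    flush()
    return regions
```

Applied to two qualifying SNPs on chromosome 8 at positions 1,000,000 and 1,200,000, the sketch returns a single region spanning 997,000 to 1,203,000; an isolated qualifying SNP elsewhere yields no region.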
SNP genotypes for all 100K SNPs that passed the original data cleaning, prior to chi-squared testing, and were located within the final physical block, were included as input for haplotype analysis. Haplotype structure within each block was then determined using Haploview (Barrett et al. Bioinformatics 21(2):263-265, 2005) software to conduct a 4-gamete test. Those haplotypes with a population frequency greater than 10% were then filtered to provide a non-redundant list of haplotypes, giving priority to the haplotype with the largest population frequency in the case of multiple haplotypes in the same region with p<0.05. (See Figures 5 and 6, as well as Tables 5, 6 and 7). The final haplotypes were tested for association with restenosis, and after Bonferroni multiple testing correction was applied in a global manner to all 84 haplotypes tested across all regions, eight regions were identified containing haplotypes with significant association with ISR. Among these, five regions contained the genes NOV, ARNTL, PKP4, TAF4B and FLJ21986 (see Tables 5, 6, and 7).
Further strengthening the findings of the regional haplotype analysis is the result of a repeat analysis conducted using 1 Mb (see Table 6) as the SNP-SNP physical distance threshold, rather than 250 kb, which identified a haplotype block containing SNP_A-1686530 / rs10515758 on chromosome 5, previously identified by univariate testing for allelic association. (See Figures 5 and 6, as well as Tables 5, 6 and 7; see also Table 31, which lists the top 1000 SNPs identified throughout this analysis as being linked to ISR.)
To investigate the possibility of artifactual results of the haplotype analysis, further post hoc evaluation of allele frequency and deviations from Hardy-Weinberg equilibrium (HWE) of the SNPs in these haplotypes was conducted. HWE checks were used since the initial criterion for the chi-squared analysis was set at a non-stringent value. The examination of allele frequencies showed three SNPs (SNP_A-1718483 / rs10505358, SNP_A-1755122 / rs1461692, SNP_A-1695875 / rs10505360) with minor allele frequencies less than 5% in cases, controls or the overall population in the CardioGene cohort. These three SNPs were identified in a haplotype at the 5' region of the NOV gene in the chromosome 8q24.12 region. These SNPs were also found to have low allele frequencies in CEPH control individuals, with heterozygosity <5%, thus indicating that these were likely to be the true allele frequencies of these SNPs rather than genotyping error. These three SNPs did not diminish the significance of the haplotype in this region, and did not impact the significance of the 3' haplotype associated with the NOV gene, as none of the three SNPs are part of that haplotype.
Since the extent of LD within a region identified through a genome-wide association study is of importance in relating the identified signals with a distinct region (Zhu et al. Genome Res. 13(2): 173-181, 2003), D' was calculated across the regions identified in this analysis using both the 100K data as well as HapMap Caucasians. This analysis confirms a high degree of LD in all of the identified regions, with some regions showing particularly strong LD (see Figure 1).
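For reference, the pairwise D' statistic mentioned above can be computed as in the following sketch. It assumes that the frequency of the two-locus haplotype and the two allele frequencies are already available (in practice these are estimated from unphased genotypes), and it reports the absolute value of D'. The function name and argument names are illustrative.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute normalized linkage disequilibrium |D'| for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0          # a monomorphic locus carries no LD information
    return abs(d / d_max)
```

Two loci whose alleles always travel together give |D'| = 1, while independent loci (haplotype frequency equal to the product of the allele frequencies) give |D'| = 0.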
Figure imgf000080_0001
Figure imgf000081_0001
Figure imgf000082_0001
*The haplotype block is significant in the haplotype-based association analysis.
^The haplotype block is significant; however, there is no gene located in the block.
Bold indicates that the gene is located in this haplotype block.
Figure imgf000083_0001
Figure imgf000084_0001
Figure imgf000085_0001
Figure imgf000086_0001
*The haplotype block is significant in the haplotype-based association analysis.
^The haplotype block is significant; however, there is no gene located in the block.
Bold indicates that the gene is located in this haplotype block.
Figure imgf000087_0001
Figure imgf000088_0001
Figure imgf000089_0001
Figure imgf000090_0001
SNP does not comprise haplotype
*SNP part of a significant haplotype block
Bold: Significant but no SNP nearby
Vascular Expression of Candidate Genes for ISR
To develop functional hypotheses regarding the candidate genes identified in the primary regional haplotypes analysis, vascular expression by RT-PCR of the five genes in human coronary artery tissues as well as a survey of other human tissues was first defined (Figure 2A). All five genes are expressed in normal human coronary arteries, with NOV demonstrating the highest level of relative expression and greatest specificity for arterial tissue. Comparing the expression of these genes in arteries with evidence of atherosclerosis, all five genes are again expressed, with NOV again highly expressed in normal coronary artery. Antibodies have been developed for NOV, ARNTL, PKP4 and TAF4B, and immunostaining in normal, atherosclerotic and restenotic arteries demonstrates specific expression patterns for each gene (Figures 2B-D). ARNTL and NOV staining is observed most specifically in the media of normal coronary artery, with less distinct staining for PKP4 and TAF4B (Figure 2B). In the atherosclerotic artery, NOV is expressed in cellular regions of atherosclerotic plaque as well as the arterial media, as seen in the normal coronary artery (Figure 2B). In a stented human coronary artery with significant neointimal hyperplasia within the stent, Movat pentachrome staining demonstrates heterogeneous staining within the neointimal lesion, with cellular regions as well as more fibrous and less cellular regions indicated as yellow staining. Within the cellular regions of the neointimal lesion, positive staining is observed for ARNTL, NOV and PKP4 (Figure 2C).
Secondary Quantitative Trait Analysis
To further understand the relationship of the identified regions to the ISR phenotype, the time interval between stent implantation and first presentation with evidence of clinical restenosis was examined as a quantitative trait. This time interval provides additional phenotypic characterization of each patient, as it is a marker of disease severity, with more severe exuberant vascular wound repair mechanisms operative in ISR occurring early after stent implantation. The trait is independent of the binary ISR outcome and was analyzed in order to gain further understanding of allele copy number effects on the disease phenotype. The time interval was first analyzed as a continuous variable against each SNP in significant haplotypes in a case-only Cox regression analysis. No significant trends were identified in this analysis (see Figure 7), and since these are regions in which significant haplotypes are associated with ISR, the tests are limited in that dependencies between SNPs are not considered. Using methods previously described (Marchini et al. Nat. Genet. 37(4):413-417, 2005), haplotypes were analyzed in each of the eight candidate susceptibility regions for haplotype association with time to restenosis. First, genealogic trees were constructed on the basis of the number of differences between haplotypes in each region, using the UPGMA (unweighted pair group method using arithmetic averages) clustering method and using the proportion of SNPs with the same allele as a similarity measure to perform hierarchical cluster analysis. Clustered haplotypes were designated as A and B clusters, so that further analysis was conducted on A/A, A/B or B/B haplotype status in a Cox regression. All eight haplotypes were significant in this analysis, suggesting that copy number at polymorphic sites in these regions plays a role in the development of ISR (see Tables 8-15 as well as Figure 8A-H).
The region on chromosome 18 cytoband q11.2 shows the strongest association between haplotype status and the rapidity of onset of ISR. As a control for this analysis, haplotypes that were not significantly associated with ISR were additionally tested in a Cox regression, demonstrating no significant association between the time to development of ISR and haplotype status.
Haplotypes in the regions associated with ISR are more informative than any single SNP in explaining inter-individual variation in the time to development of ISR, supporting the need to consider multivariate approaches to this data.
Tables 8-15 are shown below. Bolded SNPs (in the Haplotypes column) distinguish two clusters. N/A indicates an extra plot.
Table 8: 2q24.1 (PKP4)
Figure imgf000092_0001
Figure imgf000092_0002
Figure imgf000092_0003
Figure imgf000093_0002
Haplotype3 is excluded for case only analysis
Table 10: 8q24.12 (NOV)
Figure imgf000093_0003
Figure imgf000093_0004
Table 11: 11p15.2 (ARNTL)
Figure imgf000093_0005
Figure imgf000093_0006
Figure imgf000093_0001
Figure imgf000093_0007
Table 13: 2p16.1
Figure imgf000093_0008
Figure imgf000094_0001
Table 14: 4q31.21
Figure imgf000094_0002
Figure imgf000094_0003
Table 15: 7p21.2
Figure imgf000094_0004
Figure imgf000094_0005
Secondary Analysis of Genotypes Called Using the BRLMM Algorithm
In univariate tests of allelic association, no significant associations in the BRLMM-derived genotype data set were observed after Bonferroni correction. The regional haplotype approach was applied in a similar manner and derived partly overlapping results with the regional haplotype analysis applied to the DM-based genotype data (Table 16). An analysis of the top significant SNPs is shown in Table 17. The relationship between haplotypes identified in this analysis and the time to ISR in a Cox regression was also tested, which demonstrated significance of all of the additional regions identified (p<0.003; see Tables 18-29).
Table 16: Primary and Secondary Analysis using DM and BRLMM Genotype Calls
Figure imgf000095_0001
Figure imgf000096_0001
Figure imgf000097_0001
Figure imgf000098_0001
Figure imgf000099_0001
Figure imgf000100_0001
Figure imgf000101_0001
Figure imgf000102_0001
Figure imgf000103_0001
Figure imgf000104_0001
Tables 18-29, shown below, provide the results of TTRS analysis based on haplotype clustering for the BRLMM results. Bold SNPs (in the Haplotypes column) distinguish two clusters. N/A indicates an extra plot.
Figure imgf000105_0001
Figure imgf000105_0002
Table 19: 3q22.1 EPHB1
Figure imgf000105_0003
Figure imgf000105_0004
Table 20: 8q11.23 ST18
Figure imgf000105_0005
Figure imgf000105_0006
Table 21: 11p15.2 ARNTL
Figure imgf000105_0007
Figure imgf000106_0001
Table 22: 18q11.2 (TAF4B)
Figure imgf000106_0002
Figure imgf000106_0003
Table 23: 1p31.1
Figure imgf000106_0004
Figure imgf000106_0005
Table 24: 2p16.1
Figure imgf000106_0006
Figure imgf000106_0007
Table 25: 2q22.3
Figure imgf000106_0008
Figure imgf000107_0001
Figure imgf000107_0002
Table 26: 7p21.2
Figure imgf000107_0003
Figure imgf000107_0004
Table 27: 13q14.3
Figure imgf000107_0005
Figure imgf000107_0006
Table 28: 15q25.1
Figure imgf000107_0007
Figure imgf000107_0008
Table 29: 18q22.3
Figure imgf000108_0001
Figure imgf000108_0002
Correlating Haplotypes with a Nucleotide Sequence
SNPs in each significant haplotype block were analyzed using the haplo.score function of the haplo.stats package (Schaid et al. Am. J. Hum. Genet. 70:425-434, 2002). Observed haplotypes in the CardioGene patient sample were reported as frequencies among each case group and the control group. Table 30 provides a summary of the SNPs identified in each significant haplotype block, including the chromosomal location of each SNP (designated as "location") and the nucleotide change corresponding to each polymorphism. Since each patient in the study has two strands of DNA at each locus, two haplotypes were estimated per patient and tallies reflect two counts per patient. The haplotype shown in bold is associated with ISR. Haplotypes in the regional haplotype analysis results were defined using "1" and "2" coding for alleles. For the haplo.score analysis, "1" corresponds to allele "A," as annotated by Affymetrix. Accordingly, "2" corresponds to allele "B." The haplotype sequences provided reflect the haplotype sequence that can be precisely ascertained by cross-referencing these data to NCBI genomic sequence data and HapMap haplotype data. This requires careful manual review of SNPs on the forward and reverse DNA strands. Haplotypes identified in the CardioGene analyses may then be definitively linked to each haplotype's precise nucleotide sequence, such as has been described by Cargill et al. (Am. J. Hum. Genet. 80(2):273-90, 2007).
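As a small illustration of the coding convention and the two-counts-per-patient tally described above, the following sketch translates "1"/"2" haplotype strings into "A"/"B" allele labels and counts haplotypes across patients. The patient data and function names are hypothetical. Translating "A"/"B" into actual nucleotides requires the Affymetrix annotation and strand review mentioned in the text, and is not attempted here.

```python
from collections import Counter

# "1" corresponds to allele "A" and "2" to allele "B", per the coding described in the text.
ALLELE_CODE = {"1": "A", "2": "B"}

def decode(haplotype):
    """Translate a haplotype such as '1121' into allele labels such as 'AABA'."""
    return "".join(ALLELE_CODE[code] for code in haplotype)

def tally_haplotypes(patients):
    """Count haplotypes across patients; each patient contributes two counts."""
    counts = Counter()
    for haplotype_pair in patients:
        counts.update(haplotype_pair)
    return counts

# Hypothetical estimated haplotype pairs for three patients.
patients = [("1121", "1121"), ("1121", "2212"), ("2212", "1111")]
counts = tally_haplotypes(patients)
total = sum(counts.values())
frequencies = {hap: n / total for hap, n in counts.items()}
print(decode("1121"), counts, frequencies)
```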
Figure imgf000109_0001
Figure imgf000110_0001
Figure imgf000111_0001
Figure imgf000112_0001
Figure imgf000113_0001
Figure imgf000114_0001
Figure imgf000115_0001
Figure imgf000116_0001
Figure imgf000117_0001
Figure imgf000118_0001
Evaluation of Epistasis
Multi-locus models of epistasis contributing to disease have been considered for SNP-SNP interactions. The absolute best multi-locus approach to a genome-wide association study such as this is unknown, because neither the number of interacting loci nor the form of interaction is known. Therefore, the method applied was designed to investigate two-SNP interactions, recently described in a series of simulation experiments (Lin et al. Bioinformatics 20(8):1233-1240, 2004). A 2-stage approach was used, in which the first stage selects SNPs with allelic association on a per-SNP basis meeting statistical significance with p<0.01 (1154 SNPs) or p<0.1 (10967 SNPs). The selected SNPs were then tested for all possible permutations, and after correction for multiple testing, no 2-way SNP interactions were significant (p=0.721, and p=1.0, in the respective analyses). Additionally, a modification of the approach suggested by Marchini et al. (Marchini et al. Nat. Genet. 37(4):413-417, 2005) was used to test all 2-way SNP interactions, with no significant interactions identified. Given that these relatively simple approaches to epistasis, in which only 2-way interactions are considered, were negative, and that any possible underlying interaction model in restenosis is completely unknown, analyses of more complex interactions were not pursued further.
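To give a sense of the multiple-testing burden in the second stage, the following sketch counts the two-SNP tests implied by each first-stage SNP set and the corresponding Bonferroni-adjusted threshold. It assumes that each unordered pair of selected SNPs is tested once and that a family-wise alpha of 0.05 is used; the text states neither, so both are illustrative assumptions.

```python
from math import comb

def pairwise_tests(n_snps):
    """Number of unordered two-SNP combinations among n_snps selected SNPs."""
    return comb(n_snps, 2)

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

# SNP counts from the first stage: p<0.01 (1154 SNPs) and p<0.1 (10967 SNPs).
for n_snps in (1154, 10967):
    n_tests = pairwise_tests(n_snps)
    print(n_snps, n_tests, bonferroni_threshold(0.05, n_tests))
```

Under these assumptions the two stages imply 665,281 and 60,132,061 pairwise tests, respectively, which illustrates why a first-stage filter is applied before testing interactions.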
SNP Signature Analysis
Clustering techniques have been well-validated and extensively applied in the analysis of microarray transcript data, to identify relevant patterns in genomic data. The application of clustering methods to SNP genotype data has been limited, but methods have been developed and applied in specific situations, such as identifying loss of heterozygosity (Lindblad-Toh et al. Nat. Biotechnol. 18(9):1001-1005, 2000; Janne et al. Oncogene 23(15):2716-2726, 2004; Lin et al. J. Biol. Chem. 280(9):8229-8237, 2005).
A K-nearest neighbor (KNN) analysis of the dataset was undertaken to identify SNPs that separate cases from controls. Since the predictive value of a set of SNPs would ideally be developed in an independent subset of the data and then tested for sensitivity and specificity in the remaining data, a training/test set approach was again employed. No SNP patterns were identified that had predictive measures better than chance in the validation set. The decrease in power in the application of a training/test set approach may be limiting, so an analysis of the entire cohort was undertaken with a 10-fold cross-validation method. This also showed no clear predictive pattern. Using a cross-validation method, power can be diminished both in the 90% of the data in which the models are first developed and in validation testing in the final 10%. Therefore, a final analysis was undertaken on the entire cohort of patients simultaneously. This showed that 20 SNPs separated cases and controls, with sensitivity 68.7% and specificity 72.8%. To determine the likelihood that these results may be observed due to chance alone, a permutation test was performed 1000 times in which case and control labels were permuted. In 45% of permutations, the observed sensitivity and specificity were greater than the values obtained on the actual data. This corresponds to a p-value of 0.45, and it was therefore concluded that this analysis did not reveal a predictive pattern of SNPs that reliably separates cases from controls.
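The permutation logic described above can be sketched as follows. This is a simplified illustration, not the analysis pipeline used in the study: it computes sensitivity and specificity from binary labels and predictions, and derives an empirical p-value as the fraction of permuted-label statistics that are at least as large as the observed statistic. The function names and the single-statistic simplification are assumptions; in the analysis described, the classifier would be refit for each permutation and both sensitivity and specificity would be compared.

```python
import random

def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels, with 1 = case and 0 = control."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def permuted_statistics(labels, score_fn, n_permutations=1000, seed=0):
    """Evaluate score_fn on repeatedly shuffled copies of the case/control labels."""
    rng = random.Random(seed)
    shuffled = list(labels)
    results = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        results.append(score_fn(list(shuffled)))
    return results

def permutation_p_value(observed, null_values):
    """Fraction of permutation statistics at least as large as the observed one."""
    exceed = sum(1 for value in null_values if value >= observed)
    return exceed / len(null_values)

# If 450 of 1000 permutations match or beat the observed statistic, p = 0.45.
print(permutation_p_value(0.70, [0.75] * 450 + [0.60] * 550))
```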
SNP Genotyping and Quality Control
A single aliquot of DNA from the CEPH longitudinal study was obtained and inserted into the 96-well plates into which DNA samples were aliquoted prior to analysis in this study. The CEPH control sample therefore served as a quality control measure of the process of generating the genotype data. The CEPH control sample was analyzed in 22 sets of chips, providing a rich dataset for evaluation of reproducibility of each SNP's genotype call. The concordance was calculated as:
C = (Number of calls for the most frequent genotype) / (Number of replicates - Number of No Calls)
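As an illustration, concordance for one SNP can be computed from its replicate calls as the count of the most frequent genotype divided by the number of replicates that produced a call. The sketch below assumes replicate calls are given as a list of genotype strings and that a missing call is marked with the hypothetical token "NoCall"; neither representation is specified in the text.

```python
from collections import Counter

def concordance(calls, no_call="NoCall"):
    """Most frequent genotype count divided by (replicates - no calls)."""
    called = [c for c in calls if c != no_call]
    if not called:
        return None  # concordance is undefined when every replicate is a no-call
    most_frequent_count = Counter(called).most_common(1)[0][1]
    return most_frequent_count / len(called)

# Hypothetical SNP typed on 22 replicate chips: 20 "AB", 1 "AA", 1 no-call.
replicate_calls = ["AB"] * 20 + ["AA"] + ["NoCall"]
print(concordance(replicate_calls))  # 20 / 21
```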
Discussion
A genome-wide association study of the complex and common cardiovascular disease ISR was conducted. Using univariate methods, two SNPs were identified with significant allelic association with ISR, after Bonferroni correction. However, the associations of these two SNPs did not extend into surrounding regions containing genes. Thus, specific functional hypotheses regarding these two SNPs and possible resulting alterations in gene function in ISR were not generated. A multivariate approach to the data was therefore undertaken, analyzing those SNPs meeting a less conservative threshold for allelic association. This expanded set of SNPs was examined for clustering of multiple SNPs within set physical distances of one another. Regions of 250 kb and 1 Mb were analyzed for haplotypes contained within the regions identified. The final list of nonredundant haplotypes was then tested for association with ISR. Since multiple haplotypes were tested for association, a global Bonferroni correction was applied, correcting for all haplotypes tested across all regions. Eight candidate susceptibility regions for ISR were identified using this method, five of which contain genes. All five genes are expressed in human arterial tissue (Figures 2 and 10).
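The step of finding multiple associated SNPs within a set physical distance of one another can be sketched as a simple grouping of sorted chromosomal positions. The grouping rule used here (a region starts at one SNP and extends to every later SNP within the window of that starting SNP, and regions with fewer than two SNPs are dropped) is an assumption made for illustration; the text does not specify the exact rule. The positions are hypothetical.

```python
def cluster_by_distance(positions, window, min_snps=2):
    """Group positions on one chromosome into regions no wider than `window` base pairs."""
    regions = []
    current = []
    for pos in sorted(positions):
        # Close the current region once the next SNP falls outside the window.
        if current and pos - current[0] > window:
            if len(current) >= min_snps:
                regions.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_snps:
        regions.append(current)
    return regions

# Hypothetical positions (bp) of SNPs passing the less conservative threshold.
snp_positions = [100_000, 180_000, 320_000, 900_000, 2_500_000, 2_600_000]
regions_250kb = cluster_by_distance(snp_positions, window=250_000)
regions_1mb = cluster_by_distance(snp_positions, window=1_000_000)
print(regions_250kb)
print(regions_1mb)
```

Haplotypes would then be estimated within each region and tested for association, with the Bonferroni correction applied across all haplotypes in all regions.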
The genes identified by this study are NOV, ARNTL, TAF4B, PKP4 and a hypothetical protein referred to as FLJ21986. ARNTL, TAF4B, PKP4 and FLJ21986 have not been previously studied in the context of vascular injury responses, but NOV has a known role in vascular remodeling relevant to angiogenesis and vascular wound repair previously reported in the literature (Ellis et al. Arterioscler. Thromb. Vasc. Biol. 20(8):1912-1919, 2000; Carlson et al. Nature 429(6990):446-452, 2004). Each of the five genes identified is expressed in human coronary arteries, to varying degrees and with differing patterns across the spectrum of arteries that are morphologically normal, atherosclerotic and restenotic. Interestingly, this gene shows the prominent difference in expression between normal and atherosclerotic arteries.
In a secondary analysis, a relationship between the time to development of ISR and haplotype status was identified, suggesting a possible allele dose effect. Further, in another secondary analysis of the data in which genotypes were called using the BRLMM algorithm, eight regions were identified that were not identified in the primary analysis of DM-based genotype calls. The BRLMM algorithm increases heterozygote calls and is increasingly used for primary analysis of Affymetrix 100K genotype-calling but may lead to increased errors of those genotypes called. Therefore, this analysis is considered secondary. Two of the additional regions contain the genes EPHB1 and ST18. EPHB1 (ephrin receptor B1) is a member of a class of receptors that has been well-described in a number of developmental processes, including angiogenesis and tumor neovascularization. ST18 (suppression of tumorigenicity 18) is a zinc finger protein with little functional annotation in the literature.
As genome-wide association (GWA) studies are expected to play a central role in the identification of genetic variants responsible for common human diseases, technical and informatic methodologies have been developed for whole-genome association analyses (Bonnen et al. Nat. Genet. 38(2):214-217, 2006). With advances in genotyping technology, GWA studies using hundreds of thousands of SNP markers are now possible. These tools are available to scientists who aim to study the genetic basis of complex diseases and now face the daunting challenge of data analysis. While the technologies used herein have been applied in other published studies (Klein et al. Science 308(5720):385-389, 2005; Herbert et al. Science 312(5771):279-283, 2006; Maraganore et al. Am. J. Hum. Genet. 77(5):685-693, 2005) and GWA has been conducted using less dense marker strategies (Tamiya et al. Hum. Mol. Genet. 14(16):2305-2321, 2005), this disclosure is the first application of a truly genome-wide case control association approach using high density SNP markers for the study of a complex cardiovascular disease.
In GWA studies, the concern that multiple testing will yield false positive results is valid, and this is the rationale for applying stringent correction methods. However, when analysis is undertaken on these large datasets for the purpose of identifying multiple loci, each with a hypothesized subtle contribution to disease risk, as is expected to be the case in genetically complex diseases, corrections of univariate tests may be prone to type 2 errors, rejecting true positives while attempting to minimize false positives. Given these analytic caveats, one goal of disease-specific studies such as this one is to identify candidate susceptibility loci and genes, which serve as the basis for further hypothesis-driven investigation of candidate regions (Mackay, Nat. Rev. Genet. 2(1):11-20, 2001). This defines the "phase I" portion of genetic studies.
This disclosure demonstrates the effectiveness of a method that allows full use of genome-wide SNP marker data that does not stringently correct allelic associations upfront, but does so in a secondary haplotype analysis that further explores patterns evident among rank order relationships among the univariate test results. Multiple candidate susceptibility loci were identified containing five genes, each of which can be related to vascular homeostasis and alterations in disease. The reported findings from univariate tests of allelic association are in keeping with the observation that non-coding regulatory variants may more often contribute to complex traits than coding sequence variants (King and Wilson, Science 188(4184):107-116, 1975; Korstanje and Paigen, Nat. Genet. 31(3):235-236, 2002; Symula et al. Nat. Genet. 23(2):241-244, 1999). In the case of identifying noncoding regulatory variants, interpreting the consequences of these sequence variants is complex, in contrast to findings in coding regions, where the effect of polymorphisms on resulting protein sequence or RNA splicing is more readily tested in the basic laboratory. Tests can be performed of noncoding variants associated with a complex disease, including those variants identified in gene regulatory and intergenic regions, using techniques such as bacterial artificial chromosome transgenesis. Dissecting the molecular consequences of specific sequence variation is not as straightforward as designing experiments to evaluate the significance of variants identified within protein coding sequences (Mackay, Nat. Rev. Genet. 2(1):11-20, 2001; Moliterno, Comprehensive Cardiovascular Medicine. Philadelphia: Lippincott-Raven, 1998). A multivariate approach to the data was therefore taken, relying on established methodologies for assessing haplotypes.
Identification of multiple candidate susceptibility regions for ISR, some of which contain genes with distinct vascular expression patterns and prior scientific data supporting specific hypothesis generation in the pathogenesis of ISR, is in keeping with the common-disease common-variant (CDCV) hypothesis of genetically complex diseases. Further, relationships are demonstrated at polymorphic sites between copy number and a marker of disease severity (the time between stent implantation and the development of the disease). The unique characteristics of the ISR clinical phenotype, with a precisely known time of injury to the vascular wall and well-defined clinical endpoints within established timeframes, make ISR more amenable to genetic analysis compared to many other complex traits (Zondervan and Cardon, Nat. Rev. Genet. 5(2):89-100, 2004; Thomas et al. Am. J. Hum. Genet. 77(3):337-345, 2005). ISR also serves as a more severe phenotypic model for vascular remodeling that occurs over longer time courses in response to more indolent vascular injuries, such as those that culminate in the development of atherosclerosis. After stent implantation, the time course of ISR delineates the two major causes of stent occlusion over the months after stent implantation, with occlusion due to ISR occurring within a year and atherosclerotic disease progression typically occurring after one year. The time course of these events with drug-eluting stents (DES) is less clear, favoring the research strategy of selectively studying patients who received bare metallic, or non-drug-eluting, stents (BMS). Several considerations in the design and execution of the described study emphasize the need to use clinical phenotype data to maximize power and to apply analytic methods that are best-suited to the structure of the data, when statistical power cannot be increased by unlimited patient enrollment and genotyping.
To reduce the likelihood of bias and cryptic population structure (stratification), patients were enrolled prospectively without selection, at two enrollment sites in the United States with high volume cardiac catheterization facilities. With this study design, cases and controls had similar underlying atherosclerotic coronary artery disease (Zondervan and Cardon, Nat. Rev. Genet. 5(2):89-100, 2004). All patients enrolled had coronary atherosclerosis of sufficient severity to warrant invasive revascularization therapy with a stent. Therefore, all individuals in this study represent the more severe end of the spectrum of atherosclerosis, and this study specifically sought to examine differences specific to the development of ISR against the background of severe CAD. Patients receiving BMS before the widespread approval and use of DES in the United States were studied. ISR occurs in approximately 10-30% of patients receiving BMS and 4-12% of patients receiving DES. Patients (357) were enrolled prior to the widespread use of DES at the enrollment centers for this study in late 2003. At that time and since then, over 90% of percutaneous interventions involve stents, the vast majority of which involve the use of DES. Since DES became available, BMS use is more often limited to lesions whose characteristics require stent features that are not manufactured with a drug-eluting component. Therefore, enrollment of patients treated with BMS after DES entered the picture would have introduced a significant selection bias to the study, given that more complicated lesions tend to have higher rates of ISR and would have constituted the majority of lesions treated with BMS. Since clinical study design is of critical importance in genetic association studies (Zondervan and Cardon, Nat. Rev. Genet. 5(2):89-100, 2004), enrollment was closed when DES were approved for use in the United States.
Among the 357 patients enrolled prospectively, 58 developed ISR. To ensure that the study would have adequate power for a range of genotype relative risks and allele frequencies, additional cases were enrolled with a history of ISR. Such enrichment strategies may have both advantages and disadvantages (Clayton and McKeigue, Lancet 358(9290):1356-1360, 2001; Pritchard et al. Genetics 155(2):945-959, 2000), but when appropriately applied, enriched ascertainment schemes may increase power by increasing the relative contribution of disease alleles in cases (Zondervan and Cardon, Nat. Rev. Genet. 5(2):89-100, 2004).
Further phenotype-driven analyses were applied, using the continuous trait of time elapsed between stent placement and the development of ISR. These analyses aim to increase confidence in specific findings using all available information related to the clinical phenotype of ISR, thereby furthering the basis for specific functional hypothesis generation regarding the genes identified in the analyses described herein.
This disclosure provides the identification of several SNPs and specific haplotypes that are linked to susceptibility to ISR. The disclosure further provides use of the identified SNPs and haplotypes in methods, including diagnostic, prognostic, and predictive methods, as well as methods for screening for compounds that interact with a variant nucleotide (such as specific variants discussed herein at cytobands 2q24.1, 7q31.31, 8q24.12, 11p15.2, 18q11.2, 2p16.1, 4q31.21, 7p21.2, 1p31.1, 2p24.1, 2q22.3, 3q22.1, 8q11.23, 13q14.3, 15q25.1 or 18q22.3) or compounds that interact with or influence the expression of a protein encoded thereby.
It will be apparent that the precise details of the methods described may be varied or modified without departing from the spirit of the described subject matter. We claim all such modifications and variations that fall within the scope and spirit of the claims below.

Claims

1. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 159314989, 159324358, 159328041 and 159328524 of the PKP4 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 1121 is associated with an increased risk of developing restenosis.
2. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 120477232, 120478404 and 120498610 of the FLJ21986 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 112 is associated with an increased risk of developing restenosis.
3. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 120490715, 120495854, 120496829, 120504993, 120505089, 120505599, 120506127, 120506187, 120506423, 120513265, 120513339 and 120515067 of the NOV gene in the nucleic acid sample, wherein the presence of a haplotype comprising 112221212111 is associated with an increased risk of developing restenosis.
4. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 13240687, 13241857, 13250481, 13254501, 13254501 and 13255095 of the ARNTL gene in the nucleic acid sample, wherein the presence of a haplotype comprising 21211 is associated with an increased risk of developing restenosis.
5. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 22175498 and 22176091 of the TAF4B gene in the nucleic acid sample, wherein the presence of a haplotype comprising 21 is associated with an increased risk of developing restenosis.
6. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 57171286, 57186783 and 57187625 of cytoband 2p16.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 211 is associated with an increased risk of developing restenosis.
7. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 145650644, 145658039 and 145682952 of cytoband 4q31.21 in the nucleic acid sample, wherein the presence of a haplotype comprising 221 is associated with an increased risk of developing restenosis.
8. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 13506254, 13506374, 13507300, 13507512, 13507965, 13513798 and 13514074 of cytoband 7p21.2 in the nucleic acid sample, wherein the presence of a haplotype comprising 1222211 is associated with an increased risk of developing restenosis.
9. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 75060532, 75060768, 75063372 and 75064286 of cytoband 1p31.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 1112 is associated with an increased risk of developing restenosis.
10. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 57113139, 57115011, 57128636 and 57129478 of cytoband 2p16.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 2211 is associated with an increased risk of developing restenosis.
11. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 22093576 and 22113252 of cytoband 2p24.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 11 is associated with an increased risk of developing restenosis.
12. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 146511704 and 146552402 of cytoband 2q22.3 in the nucleic acid sample, wherein the presence of a haplotype comprising 22 is associated with an increased risk of developing restenosis.
13. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 146613745 and 146620324 of cytoband 2q22.3 in the nucleic acid sample, wherein the presence of a haplotype comprising 11 is associated with an increased risk of developing restenosis.
14. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 159141311, 159144184, 159151183, 159152825, 159155564, 159163035, 159180909 and 159181800 of the PKP4 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 22112122 is associated with an increased risk of developing restenosis.
15. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 136151449 and 136152557 of the EPHB1 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 21 is associated with an increased risk of developing restenosis.
16. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 53402598 and 53403770 of the ST18 gene in the nucleic acid sample, wherein the presence of a haplotype comprising 12 is associated with an increased risk of developing restenosis.
17. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and
(ii) determining the nucleotide present at chromosomal positions 13240687, 13241857, 13250481, 13254501, 13255061 and 13255095 of the ARNTL gene in the nucleic acid sample, wherein the presence of a haplotype comprising 212111 is associated with an increased risk of developing restenosis.
18. A method for identifying a subj ect having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and (ii) determining the nucleotide present at chromosomal positions 53592086,
53592319 and 53598855 of cytoband 13ql4.3 in the nucleic acid sample, wherein the presence of a haplotype comprising 211 is associated with an increased risk of developing restenosis.
19. A method for identifying a subj ect having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and (ii) determining the nucleotide present at chromosomal positions 78367393, 78368329, 78370985 and 78371562 of cytoband 15q25.1 in the nucleic acid sample, wherein the presence of a haplotype comprising 1221 is associated with an increased risk of developing restenosis.
20. A method for identifying a subject having an increased risk of developing restenosis, comprising:
(i) obtaining a nucleic acid sample from the subject; and (ii) determining the nucleotide present at chromosomal positions 68396720, 68396879, 68396951 and 68397041 of cytoband 18q22.3 in the nucleic acid sample, wherein the presence of a haplotype comprising 1111 is associated with an increased risk of developing restenosis.
21. The method of any one of the preceding claims wherein the nucleotide present at a chromosomal position is determined by PCR, in situ hybridization, Southern blotting, allele-specific hybridization or using an array.
22. Method of any one of claims 1-20 wherein the nucleic acid sample is obtained from a bodily fluid of the subject.
23. The method of claim 22 wherein the bodily fluid is blood or a blood fraction.
24. The method of any one of claims 1-20 wherein the nucleic acid sample is obtained from cells or tissue of the subject.
25. The method of any one of claims 1-20, comprising determining nucleotide(s) present in two or more of the group consisting of the PKP4 gene, the FLJ21986 gene, the NOV gene, the ARNTL gene, the TAF4B gene, the EPHB1 gene, the ST18 gene, cytoband 2p16.1, cytoband 4q31.21, cytoband 7p21.2, cytoband 1p31.1, cytoband 2p24.1, cytoband 2q22.3, cytoband 13q14.3, cytoband 15q25.1 and cytoband 18q22.3.
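The haplotype comparison recited in the ARNTL claim and in claims 18-20 can be illustrated with a minimal sketch. It is not part of the claims and makes several assumptions: allele calls are already phased per chromosome copy and already coded as "1" or "2" (the coding scheme is not defined in this section), "comprising" is treated as an exact match over the listed positions, and the names RISK_HAPLOTYPES, haplotype_string and risk_loci are hypothetical.

```python
# Positions and risk haplotypes copied from the claims above.
# Assumption: alleles are coded "1"/"2"; the coding is defined elsewhere.
RISK_HAPLOTYPES = {
    "ARNTL": ((13240687, 13241857, 13250481,
               13254501, 13255061, 13255095), "212111"),
    "13q14.3": ((53592086, 53592319, 53598855), "211"),
    "15q25.1": ((78367393, 78368329, 78370985, 78371562), "1221"),
    "18q22.3": ((68396720, 68396879, 68396951, 68397041), "1111"),
}


def haplotype_string(calls, positions):
    """Join allele codes in position order; return None if any call is missing."""
    codes = [calls.get(p) for p in positions]
    if any(c not in ("1", "2") for c in codes):
        return None
    return "".join(codes)


def risk_loci(phased_calls):
    """Return loci where at least one chromosome copy matches the risk haplotype.

    phased_calls maps a locus name to a list of {position: allele_code}
    dicts, one dict per chromosome copy (hypothetical input layout).
    """
    hits = []
    for locus, (positions, risk) in RISK_HAPLOTYPES.items():
        for copy in phased_calls.get(locus, []):
            if haplotype_string(copy, positions) == risk:
                hits.append(locus)
                break  # one matching copy is enough for this locus
    return hits
```

A sample with calls 2, 1, 1 at the three 13q14.3 positions on one chromosome copy would be reported for that locus; a locus with any missing call is skipped rather than guessed.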
PCT/US2007/068293 2006-05-04 2007-05-04 Genomics of in-stent restenosis WO2007131202A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US79801906P 2006-05-04 2006-05-04
US60/798,019 2006-05-04

Publications (2)

Publication Number Publication Date
WO2007131202A2 (en) 2007-11-15
WO2007131202A3 (en) 2008-08-14

Family

ID=38668606

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2007/068293 WO2007131202A2 (en) 2006-05-04 2007-05-04 Genomics of in-stent restenosis

Country Status (1)

Country Link
WO (1) WO2007131202A2 (en)

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
GANESH SANTHI K ET AL: "Rationale and study design of the CardioGene Study: genomics of in-stent restenosis" PHARMACOGENOMICS, ASHLEY PUBLICATIONS, GB, vol. 5, no. 7, October 2004 (2004-10), pages 952-1004, XP009096518 ISSN: 1462-2416 *
KOCH WERNER ET AL: "Apolipoprotein E gene polymorphisms and thrombosis and restenosis after coronary artery stenting" JOURNAL OF LIPID RESEARCH, BETHESDA, MD, US, vol. 45, no. 12, December 2004 (2004-12), pages 2221-2226, XP009096509 ISSN: 0022-2275 *
KOCH WERNER ET AL: "Tumor necrosis factor-alpha, lymphotoxin-alpha, and interleukin-10 gene polymorphisms and restenosis after coronary artery stenting" CYTOKINE, ACADEMIC PRESS LTD, PHILADELPHIA, PA, US, vol. 24, no. 4, 21 November 2003 (2003-11-21), pages 161-171, XP009096508 ISSN: 1043-4666 *
MONRAATS PASCALLE S ET AL: "Genetic inflammatory factors predict restenosis after percutaneous coronary interventions" CIRCULATION, AMERICAN HEART ASSOCIATION, DALLAS, TX, US, vol. 112, no. 16, 18 October 2005 (2005-10-18), pages 2417-2425, XP009096503 ISSN: 0009-7322 *
MONRAATS PASCALLE S ET AL: "Tumor necrosis factor alpha G(-244)A(-238) haplotype is associated with lower risk of restenosis after percutaneous coronary intervention" CIRCULATION, AMERICAN HEART ASSOCIATION, DALLAS, TX, US, vol. 110, no. 17 Suppl S, October 2004 (2004-10), page 95, XP009096524 ISSN: 0009-7322 *
NIESSNER ALEXANDER ET AL: "Fractalkine receptor polymorphisms V249I and T280M as genetic risk factors for restenosis" THROMBOSIS AND HAEMOSTASIS, STUTTGART, DE, vol. 94, no. 6, December 2005 (2005-12), pages 1251-1256, XP009096519 ISSN: 0340-6245 *
ZEE ROBERT Y L ET AL: "TP53 haplotype-based analysis and incidence of post-angioplasty restenosis" HUMAN GENETICS, BERLIN, DE, vol. 114, no. 4, March 2004 (2004-03), pages 386-390, XP009096507 ISSN: 0340-6717 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2402458A1 (en) * 2009-02-24 2012-01-04 Fina Biotech, S.L.U. Genetic markers of the risk of suffering from restenosis
JP2012521744A (en) * 2009-02-24 2012-09-20 フィナ、バイオテク、エセ.エレ.ウ. Genetic markers for the risk of restenosis
EP2402458A4 (en) * 2009-02-24 2013-01-02 Fina Biotech Slu Genetic markers of the risk of suffering from restenosis
TWI803994B (en) * 2021-09-29 2023-06-01 國立成功大學 Methods and kits for evaluating the risk of diseases or conditions associated with atherosclerosis

Also Published As

Publication number Publication date
WO2007131202A3 (en) 2008-08-14

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07797345

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07797345

Country of ref document: EP

Kind code of ref document: A2