DISEASE REVERSING TARGETS
Background of the Invention
This application claims the benefit of prior U.S. provisional application 60/130,129, filed April 20, 1999.
Summary of the Invention
The inventors have discovered that the administration of a drug, e.g., a steroid, which treats a disorder, e.g., a disorder characterized by fibrosis, e.g., a kidney disorder, can be used to identify genes and drugs involved in healing or reversing a symptom of the disorder.
In general, the invention features a method of selecting a gene, e.g., a disease state reversal-related gene. As used herein, a disease state reversal-related gene is one which is expressed, e.g., preferentially or specifically expressed, in healing or remodeling tissue. The method includes: providing a nucleic acid sample from a cell, tissue or animal which exhibits a reversed symptom, e.g., less fibrosis, or increased function; and selecting a gene which is expressed, e.g., differentially expressed, in said sample, thereby selecting a gene, e.g., a disease state reversal-related gene.
Differentially expressed can mean differentially expressed as compared to a control, e.g., a normal or wild type, tissue.
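As a rough illustration of the selection step, the following Python sketch flags genes whose expression in a sample exhibiting a reversed symptom differs from that in a control. The gene names, expression values, pseudocount, and two-fold threshold are hypothetical and are not taken from this application; any suitable measure of differential expression could be substituted.

```python
import math


def select_differentially_expressed(treated, control, min_abs_log2_fc=1.0, pseudocount=1.0):
    """Return {gene: log2 fold change} for genes that differ between samples.

    `treated` and `control` map gene names to expression levels. The
    threshold and pseudocount are illustrative assumptions only.
    """
    selected = {}
    for gene in sorted(set(treated) & set(control)):
        # The pseudocount keeps the ratio defined when a level is zero.
        ratio = (treated[gene] + pseudocount) / (control[gene] + pseudocount)
        log2_fc = math.log2(ratio)
        if abs(log2_fc) >= min_abs_log2_fc:
            selected[gene] = log2_fc
    return selected


# Hypothetical expression levels (arbitrary units).
treated = {"geneA": 799.0, "geneB": 99.0, "geneC": 24.0}
control = {"geneA": 99.0, "geneB": 99.0, "geneC": 199.0}
selected = select_differentially_expressed(treated, control)
```

In this sketch a positive value indicates increased expression in the reversed-symptom sample and a negative value indicates decreased expression, so both directions of modulation are retained.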
In a preferred embodiment, the cell, tissue or animal has received a treatment, e.g., treatment with a steroid.
In a preferred embodiment, the sample is from neural, e.g., retinal or nerve, kidney, heart, or endocrine, e.g., pancreatic, tissue. In a preferred embodiment, the sample is from a tissue which exhibits fibrosis.
In a preferred embodiment, the sample is from the kidney, e.g., it is from glomerulus tissue. In a preferred embodiment the sample is from an animal, e.g., an animal model for a disease. The model can be an animal model for diabetes, an animal model for a kidney disorder, or an animal model in which fibrosis occurs. In a particularly preferred embodiment the animal is an animal model for familial lipoidnephropathy. The sample can also be from a tissue from such an animal, e.g., an animal described herein. The sample can also be from a human, e.g., a normal human or one afflicted with a disease state.
The treatment can be administered to an animal. In other embodiments the treatment is administered to tissue or cells, in vitro.
In a preferred embodiment, the method further includes determining if a selected gene is expressed in a human tissue, e.g., a normal or disease state tissue. For example, the method can include determining if a selected gene is expressed in normal or diseased kidney, e.g., glomeruli, tissue, or tissue associated with fibrosis. This method can be used to further evaluate the role of a selected gene in a disease. In a preferred embodiment more than one, e.g., 2, 5, 10, or 20, or more, selected gene is analyzed. In a preferred embodiment, the method further includes expressing a polypeptide encoded by a selected gene. The method can also include characterizing the polypeptide, e.g., by a physical characteristic such as molecular weight or sequence, or by a biological or biochemical characteristic such as enzymatic activity, e.g., phosphatase or phosphorylase activity, or by determining if the polypeptide is the substrate for a reaction, e.g., a phosphorylation.
In a preferred embodiment the tissue is from a transgenic animal, e.g., a transgenic mouse. In a preferred embodiment the selected gene is misexpressed in healing, remodeling, treated, or disease state tissue.
In a preferred embodiment the sample is from a cultured cell, e.g., a genetically engineered cell. Such cells include cells which have been modified, e.g., by the introduction of a construct which directs the production of a selected gene. The construct can be introduced by viral vector, e.g., retroviral or adenoviral vector. The construct can include a coding sequence functionally coupled to a promoter, e.g., a tissue specific promoter, e.g., a kidney or glomerulus specific promoter. Methods described herein can be used to find a selected gene which inhibits, or is correlated with the inhibition of, tissue growth, e.g., unwanted fibrosis.
Methods of the invention allow for the identification of genes, selected genes, the expression of which is modulated, e.g., increased or decreased, in healing or remodeling tissue, or in tissue which has been treated (e.g., tissue from an animal which has been administered a treatment) with a drug or drug candidate. Accordingly, the invention includes a population of nucleic acids, the expression of which is modulated in such tissue. The population can be placed on a substrate or otherwise disposed such that individual selected nucleic acids are positionally distinguishable. For example, they can be placed on a two-dimensional array, e.g., a chip, a titer plate, or a membrane, and used to profile gene expression from tissue, e.g., normal, disease state, or treated tissue. For example, RNA, or cDNA, from control and treated tissue can be used to determine the effect of a treatment, e.g., a drug candidate, on expression of selected genes. Similar methods can be performed with proteins. For example, proteins encoded by selected genes can be analyzed or used as probes of the components of control and experimental tissue.
The invention also includes preparations of selected nucleic acids, and arrays thereof, described herein.
Some disorders, e.g., diabetes, respond to treatment with a reversal of symptoms, e.g., a reversal of nerve and kidney damage. Such reversal can take many years and is thought to involve tissue remodeling. The kidney disorder familial lipoidnephropathy gives rise to fibrosis in the glomeruli. Steroid treatment can reverse this damage. The methods described herein allow the identification and use of gene products which are positively correlated with recovery, e.g., to speed the process of disease reversal by administering a gene or gene product identified with a method described herein.
Other embodiments are within the following description and the claims.
Detailed Description
A "heterologous promoter", as used herein is a promoter which is not naturally associated with a gene or a purified nucleic acid.
A "purified" or "substantially pure" or isolated "preparation" of a polypeptide, as used herein, means a polypeptide that has been separated from other proteins, lipids, and nucleic acids with which it naturally occurs. Preferably, the polypeptide is also separated from substances, e.g., antibodies or gel matrix, e.g., polyacrylamide, which are used to purify it. Preferably, the polypeptide constitutes at least 10, 20, 50, 70, 80 or 95% dry weight of the purified preparation. Preferably, the preparation contains: sufficient polypeptide to allow protein sequencing; at least 1, 10, or 100 μg of the polypeptide; or at least 1, 10, or 100 mg of the polypeptide.
A "purified preparation of cells", as used herein, refers to, in the case of plant or animal cells, an in vitro preparation of cells and not an entire intact plant or animal. In the case of cultured cells or microbial cells, it consists of a preparation of at least 10% and more preferably 50% of the subject cells.
The terms "peptides", "proteins", and "polypeptides" are used interchangeably herein.
As used herein, the term "transgene" means a nucleic acid sequence, which is partly or entirely heterologous, i.e., foreign, to the transgenic animal or cell into which it is introduced, or, is homologous to an endogenous gene of the transgenic animal or cell into which it is introduced, but which is designed to be inserted, or is inserted, into the animal's genome in such a way as to alter the genome of the cell into which it is inserted (e.g., it is inserted at a location which differs from that of the natural gene or its insertion results in a knockout). A transgene can include one or more transcriptional regulatory sequences and any other nucleic acid, such as introns, that may be necessary for optimal expression of the selected nucleic acid, all operably linked to the selected nucleic acid, and may include an enhancer sequence.
As used herein, the term "transgenic cell" refers to a cell containing a transgene.
As used herein, a "transgenic animal" is any animal in which one or more, and preferably essentially all, of the cells of the animal includes a transgene. The transgene can be introduced into the cell, directly or indirectly by introduction into a precursor of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with a recombinant virus. This molecule may be integrated within a chromosome, or it may be extrachromosomally replicating DNA.
As used herein, the term "tissue-specific promoter" means a DNA sequence that serves as a promoter, i.e., regulates expression of a selected DNA sequence operably linked to the promoter, and which effects expression of the selected DNA sequence in specific cells of a tissue. The term also covers so-called "leaky" promoters, which regulate expression of a selected DNA primarily in one tissue, but cause expression in other tissues as well.
"Misexpression", as used herein, refers to a non-wild type pattern of gene expression, at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over or under expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of decreased expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology, transgenic biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are described in the literature. See, for example, Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press: 1989); DNA Cloning, Volumes I and II (D. N. Glover ed., 1985); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Mullis et al. U.S. Patent No. 4,683,195; Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); Transcription And Translation (B. D. Hames & S. J. Higgins eds. 1984); Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells And Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the treatise, Methods In Enzymology (Academic Press, Inc., N.Y.); Gene Transfer Vectors For Mammalian Cells (J. H. Miller and M. P. Calos eds., 1987, Cold Spring Harbor Laboratory); Methods In Enzymology, Vols. 154 and 155 (Wu et al. eds.); Immunochemical Methods In Cell And Molecular Biology (Mayer and Walker, eds., Academic Press, London, 1987); Handbook Of Experimental Immunology, Volumes I-IV (D. M. Weir and C. C. Blackwell, eds., 1986); Manipulating the Mouse Embryo (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986).
The methods described herein provide for the identification and evaluation of genes (and the protein products thereof) which are related to a disease state. These selected genes or proteins can serve as a point of intervention or as a diagnostic or drug discovery tool.
Expression Vectors, Host Cells and Genetically Engineered Cells
In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid identified by a method described herein.
A vector can include a nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term "regulatory sequence" includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides.
In another aspect, the invention provides a host cell which includes a nucleic acid molecule identified by a method described herein. The terms "host cell" and "recombinant host cell" are used interchangeably herein. Such terms refer not only to the particular subject cell but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.
Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms "transformation" and "transfection" are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.
A host cell of the invention can be used to produce (i.e., express) a protein. Accordingly, the invention further provides methods for producing a protein identified by a method described herein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a protein has been introduced) in a suitable medium such that a protein is produced. In another embodiment, the method further includes isolating a protein from the medium or the host cell.
The invention provides methods for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to proteins encoded by a nucleic acid identified by a method described herein, have a stimulatory or inhibitory effect on, for example, expression or activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.
In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds which bind to or modulate the activity of a protein encoded by a nucleic acid identified by a method described herein or polypeptide or a biologically active portion thereof.
The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art.
Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. U.S.A. 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233.
In one embodiment, an assay is a cell-based assay, or an assay which includes cell-free cellular components, in which a protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate its activity is determined.
The ability of the test compound to modulate binding of a protein encoded by a nucleic acid identified by a method described herein to a compound, e.g., a substrate, or to bind to the gene can also be evaluated.
Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used, e.g., to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate the gene with a disease; or (ii) identify or evaluate an individual from a minute biological sample (tissue typing).
Genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the gene sequences will yield an amplified fragment.
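The primer preparation step can be sketched in Python as follows. This is a minimal illustration, assuming the forward primer is simply taken from the 5' end of a sequence and the reverse primer is the reverse complement of its 3' end; the template sequence is hypothetical, and practical primer design would also weigh melting temperature, GC content, and specificity, none of which is modeled here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def end_primers(sequence, length=20):
    """Return (forward, reverse) primers from the 5' and 3' ends."""
    if not 15 <= length <= 25:
        raise ValueError("primer length is expected to be 15-25 bases")
    sequence = sequence.upper()
    if len(sequence) < 2 * length:
        raise ValueError("sequence too short for two non-overlapping primers")
    forward = sequence[:length]
    # The reverse primer anneals to the sense strand, so it is the
    # reverse complement of the last `length` bases.
    reverse = reverse_complement(sequence[-length:])
    return forward, reverse


# Hypothetical template sequence, for illustration only.
template = "ATGGCCAAGTTCGACCTGAAGGGTACCTTGCAGGATCCGTTAACGCTAGCTGA"
forward, reverse = end_primers(template, length=15)
```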
Sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Patent 5,272,057).
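To make the RFLP principle concrete, the following Python sketch performs an in silico digest of two short sequences and compares the resulting fragment lengths. The sequences are hypothetical; the recognition site GAATTC with a cut after the first base corresponds to EcoRI and is used only as an example enzyme. The sketch models a linear molecule and ignores the blotting and probing steps.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at each site.

    `cut_offset` is the number of bases into the site at which the
    enzyme cuts (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical alleles: the second carries a point mutation (A to G)
# that destroys the downstream recognition site.
allele_1 = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
allele_2 = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TT"

fragments_1 = digest(allele_1)
fragments_2 = digest(allele_2)
```

Here the loss of one site merges two fragments into one longer fragment, which is the kind of band difference that distinguishes individuals in this technique.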
Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the nucleotide sequences described herein can be used to prepare two PCR primers from the 5' and 3' ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.
Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences are used, a more appropriate number of primers for positive individual identification would be 500-2,000.
If a panel of reagents from nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.
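The use of an identification database can be sketched as a simple profile lookup. In this Python illustration each individual is represented by the alleles observed at a panel of markers; the marker names, alleles, and individual identifiers are hypothetical, and a real comparison would need to account for sequencing error and for the statistical weight of each marker.

```python
def identify(sample_profile, database):
    """Return IDs whose stored alleles match the sample at every typed marker."""
    matches = []
    for individual_id, profile in database.items():
        if all(profile.get(marker) == allele
               for marker, allele in sample_profile.items()):
            matches.append(individual_id)
    return sorted(matches)


# Hypothetical identification database built from a panel of three markers.
database = {
    "individual_1": {"marker_1": "A", "marker_2": "G", "marker_3": "T"},
    "individual_2": {"marker_1": "A", "marker_2": "C", "marker_3": "T"},
}

# Profile obtained from a small tissue sample typed at two markers.
tissue_sample = {"marker_1": "A", "marker_2": "C"}
matches = identify(tissue_sample, database)
```

A sample typed at too few markers can match more than one individual, which is why a larger panel is needed for positive identification.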
The present invention also includes diagnostic assays, prognostic assays, and methods of monitoring clinical trials, which are used for prognostic (predictive) purposes to thereby treat an individual.
Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes a protein encoded by a nucleic acid identified by a method described herein.
Such disorders include, e.g., a disorder associated with the misexpression of a protein encoded by a nucleic acid identified by a method described herein.
The method includes one or more of the following: detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of a gene identified by a method described herein, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5' control region; detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of a gene identified by a method described herein; detecting, in a tissue of the subject, the misexpression of a gene identified by a method described herein, at the mRNA level, e.g., detecting a non-wild type level of an mRNA; detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a protein encoded by a nucleic acid identified by a method described herein.
In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the gene; an insertion of one or more nucleotides into the gene; a point mutation, e.g., a substitution of one or more nucleotides of the gene; or a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.
For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from the gene or naturally occurring mutants thereof or 5' or 3' flanking sequences naturally associated with the gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.
In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of the gene.
Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.
In preferred embodiments the method includes determining the structure of the gene, an abnormal structure being indicative of risk for the disorder.
In preferred embodiments the method includes contacting a sample from the subject with an antibody to the protein or a nucleic acid which hybridizes specifically with the gene. These and other embodiments are discussed below.
The presence, level, or absence of protein or nucleic acid identified by a method described herein in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting the protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes the protein such that the presence of protein or nucleic acid is detected in the biological sample. The term "biological sample" includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the gene; measuring the amount of protein encoded by the gene; or measuring the activity of the protein encoded by the gene.
The level of mRNA corresponding to the gene in a cell can be determined both by in situ and by in vitro formats.
In another aspect, the invention features a method of analyzing a plurality of capture probes. The method can be used, e.g., to analyze gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence; contacting the array with a nucleic acid identified by a method described herein, or with a protein encoded by such a nucleic acid, e.g., a preferably purified nucleic acid, polypeptide, or antibody; and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.
The capture probes can be a set of nucleic acids from a selected sample, e.g., a sample of nucleic acids derived from a control or non-stimulated tissue or cell.
The method can include contacting the nucleic acid, polypeptide, or antibody with a first array having a plurality of capture probes and a second array having a different plurality of capture probes. The results of each hybridization can be compared, e.g., to analyze differences in expression between a first and second sample. The first plurality of capture probes can be from a control sample, e.g., a wild type, normal, or non-diseased, non-stimulated, sample, e.g., a biological fluid, tissue, or cell sample. The second plurality of capture probes can be from an experimental sample, e.g., a mutant type, at risk, disease-state or disorder-state, or stimulated, sample, e.g., a biological fluid, tissue, or cell sample.
The method can be used to detect SNPs, as described above.
In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a PKC-conditioned cell or subject; contacting the array with one or more inquiry probes, wherein an inquiry probe can be a nucleic acid, polypeptide, or antibody; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which is not PKC-conditioned; and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.
In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a first sample from a cell or subject which exhibits a reversed symptom, e.g., has been treated with a steroid; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not exhibit a reversed symptom, e.g., has not been treated with a steroid; and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used the plurality of addresses with capture probes should be present on both arrays.
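The comparison of binding between the two samples can be sketched as an address-by-address ratio. In this Python illustration each array address is a hypothetical (row, column) pair mapped to a label signal; the signal values, the two-fold threshold, and the signal floor are assumptions made for the example and are not specified in this application.

```python
def compare_binding(first_sample, second_sample, min_fold=2.0, floor=1.0):
    """Classify each array address by the ratio of first to second signal.

    Both arguments map an address, e.g., (row, column), to a signal
    intensity. The same addresses must be present for both samples.
    """
    if set(first_sample) != set(second_sample):
        raise ValueError("both samples must be read at the same addresses")
    calls = {}
    for address in sorted(first_sample):
        # Clamp very low signals to a floor so the ratio is always defined.
        ratio = max(first_sample[address], floor) / max(second_sample[address], floor)
        if ratio >= min_fold:
            calls[address] = "higher in first sample"
        elif ratio <= 1.0 / min_fold:
            calls[address] = "higher in second sample"
        else:
            calls[address] = "similar"
    return calls


# Hypothetical label signals at three addresses.
steroid_treated = {(0, 0): 1200.0, (0, 1): 310.0, (1, 0): 45.0}
untreated = {(0, 0): 300.0, (0, 1): 290.0, (1, 0): 400.0}
calls = compare_binding(steroid_treated, untreated)
```

The check that both samples cover the same addresses mirrors the requirement that, when different arrays are used, the addresses with capture probes be present on both.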
In another aspect, the invention features a method of analyzing a nucleic acid, or a protein encoded by a nucleic acid, identified by a method described herein, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a nucleic acid or amino acid sequence; and comparing the sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze the sequence.
The method can include evaluating the sequence identity between a sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the internet.
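A simple form of the identity evaluation can be sketched as follows. This Python illustration assumes the query and database sequences are already aligned and of equal length, which sidesteps the alignment step that a real database search would perform; the sequences and entry names are hypothetical, and the in-memory dictionary stands in for a remote database.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two pre-aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def best_match(query, database):
    """Return (entry name, percent identity) for the closest database entry."""
    scored = [(percent_identity(query, seq), name) for name, seq in database.items()]
    identity, name = max(scored)
    return name, identity


# Hypothetical database of pre-aligned sequences.
sequence_database = {
    "entry_1": "ACGTTCGTAC",
    "entry_2": "TTTTTTTTTT",
}
name, identity = best_match("ACGTACGTAC", sequence_database)
```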
In another aspect, the invention features a set of oligonucleotides, useful, e.g., for identifying SNPs, or identifying specific alleles of a nucleic acid identified by a method described herein. The set includes a plurality of oligonucleotides, each of which has a different nucleotide at an interrogation position, e.g., an SNP or the site of a mutation. In a preferred embodiment, the oligonucleotides of the plurality are identical in sequence with one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide which hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide which hybridizes to a second allele.
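Such a set can be sketched in Python by varying a single position of a reference oligonucleotide. The reference sequence, the zero-based position, and the label names below are hypothetical placeholders; the sketch produces equal-length oligonucleotides and does not model hybridization.

```python
def interrogation_oligos(reference, position, labels=None):
    """Return one oligonucleotide per base at the interrogation position.

    `position` is a zero-based index into `reference`. Each entry carries
    a distinct (hypothetical) label so that signals can be told apart.
    """
    if not 0 <= position < len(reference):
        raise ValueError("interrogation position lies outside the reference")
    if labels is None:
        labels = {"A": "label_1", "C": "label_2", "G": "label_3", "T": "label_4"}
    reference = reference.upper()
    oligos = []
    for base in "ACGT":
        sequence = reference[:position] + base + reference[position + 1:]
        oligos.append({"base": base, "sequence": sequence, "label": labels[base]})
    return oligos


# Hypothetical 15-base reference with the interrogation position at index 7.
oligo_set = interrogation_oligos("GATTACAGATTACAG", position=7)
```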
The methods described herein provide for the identification and evaluation of genes (and the protein products thereof) which are related to a disease state. These selected genes or proteins can serve as a point of intervention or as a diagnostic or drug discovery tool.
Other embodiments are within the following claims.
What is claimed is: